citee-methodology/CHANGELOG.md
Jacek Kubas f76cf2858b v1.0.0 — initial Citee Index Methodology release
Foundational public methodology for the first open public ranking of brand
visibility in AI search results (ChatGPT, Perplexity, Gemini, Claude).

This release establishes the framework — no rankings have been computed
or published yet. First scan cycle: late May 2026 (private validation).
First public ranking publication target: August 2026, after 3 validation
cycles.

Includes:
- methodology.json: machine-readable formulas, weights, policies
- README.md: human-readable overview + open/closed boundary
- CHANGELOG.md: versioning policy + v1.0.0 release notes
- taxonomy.md: tier system + 11 PL pilot categories
- LICENSE: MIT
- .gitignore: closed operational data (exact prompts, anti-gaming thresholds)
- prompts/README.md: 6-stage prompt curation process
- prompts/example-swiece-sojowe-pl.md: illustrative framework for first category

Strategic principles:
- Algorithm-first, no advisory board
- Open methodology + closed exact prompts (Goodhart's Law defense)
- No retroactive changes (FIDE 2024 lesson)
- No pay-to-play, hard rule (Moody's / Forbes 30 Under 30 lessons)
- Subjective opinion disclaimer (Gartner v. NetScout 2020 First Amendment shield)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:25:56 +02:00


Changelog

All notable changes to Citee Index Methodology are documented in this file.

The format is based on Keep a Changelog; versioning follows Semantic Versioning, adapted for methodology:

  • MAJOR (2.0.0) — fundamental scoring formula change, weight rebalance, category redefinition
  • MINOR (1.1.0) — new prompt types, new cross-signals, new model added, anti-gaming rule additions
  • PATCH (1.0.1) — documentation fixes, clarifications, additional examples, typos

Important: No retroactive changes. Methodology updates apply to FUTURE cycles only. Cycles published before a version bump are not recomputed.
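
The bump rules and the no-retroactive-changes policy above can be sketched in Python. This is only an illustration under stated assumptions, not part of the methodology: the change labels in `BUMP_LEVEL` and the functions `bump` and `version_in_force` are hypothetical names introduced here.

```python
from datetime import date
from typing import Dict, List

# Hypothetical change labels, grouped by the bump level listed above.
BUMP_LEVEL: Dict[str, str] = {
    "scoring_formula_change": "major",
    "weight_rebalance": "major",
    "category_redefinition": "major",
    "new_prompt_type": "minor",
    "new_cross_signal": "minor",
    "new_model": "minor",
    "anti_gaming_rule_addition": "minor",
    "documentation_fix": "patch",
    "clarification": "patch",
    "additional_example": "patch",
    "typo": "patch",
}
_RANK = {"patch": 0, "minor": 1, "major": 2}


def bump(version: str, changes: List[str]) -> str:
    """Return the next version implied by the most significant change."""
    if not changes:
        return version
    major, minor, patch = (int(part) for part in version.split("."))
    level = max((BUMP_LEVEL[c] for c in changes), key=_RANK.__getitem__)
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


def version_in_force(releases: Dict[str, date], cycle_date: date) -> str:
    """Pick the latest version released on or before the cycle date.

    A cycle keeps this version permanently, so a later bump never
    triggers a recomputation of an already published cycle.
    """
    eligible = [(released, v) for v, released in releases.items() if released <= cycle_date]
    if not eligible:
        raise ValueError("no methodology version released before this cycle")
    return max(eligible)[1]
```

For example, `bump("1.0.0", ["new_model", "typo"])` returns `"1.1.0"`, because the most significant change in the list decides the level.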


[1.0.0] — 2026-05-03

Initial public release. Foundational methodology. No public ranking yet — first publication scheduled August 2026 after 3-month validation period.

Added

  • Scoring formula: CiteeScore = sum(mention_score_per_model * model_weight) * (1 + cross_signal_bonus), normalized to 0-100 per category
  • Model weighting for PL market: ChatGPT 0.45, Perplexity 0.25, Gemini 0.20, Claude 0.10 (Claude joins the pilot in Q4 2026; see methodology.json for the rationale)
  • Mention score per model: position (0.4) + prominence (0.3) + sentiment (0.15) + citation depth (0.15)
  • 5 prompt types with weights:
    • Buying intent (2.0) — 30% of pool
    • Comparison (1.5) — 25%
    • Specific need (1.5) — 20%
    • Informational (0.3) — 15%
    • Brand-direct (0.3) — 10%
  • 4 cross-signals with maximum total bonus +20%:
    • Wikidata entry (≥90 days, ≥5 triples): +5%
    • Trustpilot/Opineo (>50 reviews, ≥4.0 average, no review bombing): +5%
    • Reddit organic mentions (>10 in niche subreddit, account age + karma weighted): +5%
    • Google AI Overviews presence (verified via SerpAPI): +5%
  • Anti-gaming protections: rank-jump flag (>30), fresh Wikidata exclusion (<90 days), review bombing exclusion, sock puppet detection (Reddit), prompt injection scrape filters (CSS hidden text, off-screen content, font-size:0)
  • Honeypot brand mechanism for detecting AI training data circular logic and unauthorized scraping
  • Statistical methodology: 95% confidence intervals via bootstrap resampling, overlapping CIs reported as tied (no false precision), 100 prompts × 3 models × 2 repetitions = 600 queries per category per cycle in pilot
  • Tier system:
    • Tier 1 — large markets (>1000 brands, >100M PLN GMV) — monthly scan
    • Tier 2 — medium markets (100-1000 brands, 10-100M PLN GMV) — quarterly scan
    • Tier 3 — niche markets (<100 brands, <10M PLN GMV) — semi-annual scan
  • 11 pilot categories (PL, all Tier 2): kosmetyki naturalne, suplementy / nutricosmetyki, diety pudełkowe, premium pet food, kawa specialty, czekolada rzemieślnicza, kursy programowania / IT bootcampy, kliniki estetyczne / dermatologia, fitness studios premium, kosmetyki dla mężczyzn, świece sojowe
  • Publication policy: 3-month validation period before first public ranking. Hybrid format: Top 10 as public HTML (SEO-indexed), full ranking of 100 brands as a PDF behind an email gate. robots.txt disallows GPTBot, ClaudeBot, PerplexityBot, CCBot, and Google-Extended on full-data endpoints.
  • Right to reply: each brand profile page includes "Brand response" section, moderated for factual accuracy, 30-day response window per cycle
  • Monetization policy: ranked brands NEVER pay Citee directly (hard rule). Revenue from Citee Pro SaaS (paid by shops optimizing visibility, not ranked brands), Industry Reports (paid by agencies/media), and Sponsored Custom Research (commissioned for category research, not brand-specific)
  • Prompt curation process (6 stages): persona generator → prompt brainstormer → reality check (Google Trends, Reddit, Quora) → multi-agent validation (3 critics) → pilot test run → human approval
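
The scoring formula, model weights, mention-score components, and cross-signal cap listed above can be sketched in Python. This is a minimal sketch under stated assumptions, not the reference implementation (see methodology.json for the authoritative definitions). Assumptions: each mention component is already scaled to [0, 1]; normalization to 0-100 divides by the highest raw score in the category; prompt-type weights are left out because this changelog does not say how they enter the formula. The function names are hypothetical.

```python
from typing import Dict

# Weights copied from the v1.0.0 notes above.
MODEL_WEIGHTS = {"chatgpt": 0.45, "perplexity": 0.25, "gemini": 0.20, "claude": 0.10}
COMPONENT_WEIGHTS = {
    "position": 0.4,
    "prominence": 0.3,
    "sentiment": 0.15,
    "citation_depth": 0.15,
}
BONUS_PER_SIGNAL = 0.05  # each of the 4 cross-signals adds +5%
MAX_BONUS = 0.20         # maximum total bonus of +20%


def mention_score(components: Dict[str, float]) -> float:
    """Weighted sum of the four components (assumed to lie in [0, 1])."""
    return sum(w * components.get(name, 0.0) for name, w in COMPONENT_WEIGHTS.items())


def raw_score(per_model: Dict[str, Dict[str, float]], signals_met: int) -> float:
    """sum(mention_score_per_model * model_weight) * (1 + cross_signal_bonus)."""
    base = sum(MODEL_WEIGHTS[model] * mention_score(c) for model, c in per_model.items())
    bonus = min(signals_met * BONUS_PER_SIGNAL, MAX_BONUS)
    return base * (1 + bonus)


def normalize(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale raw scores to 0-100 within one category (assumed rule: divide by the max)."""
    top = max(raw.values(), default=0.0)
    if top == 0:
        return {brand: 0.0 for brand in raw}
    return {brand: 100.0 * score / top for brand, score in raw.items()}
```

A brand with full marks on every component for all four models and all four cross-signals gets a raw score of 1.0 × 1.2 = 1.2 before normalization. Models missing from `per_model` contribute nothing to the sum.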

Notes

This is v1.0.0 — methodology release only. No ranking has been computed or published. Foundational document establishing the framework.

First scan cycle planned: late May 2026 (private validation). First public ranking publication target: August 2026 (after 3 validation cycles).


Pre-history

The project began as the "AIO Visibility" module within the LMW Pulse SaaS in March 2026. It pivoted to the standalone product citee.ai in May 2026, after market analysis showed no global competitor publishing public AI visibility rankings (27+ SaaS dashboards tracked, but zero public rankings).

The strategic shift from an advisory-board-driven model (Gartner / Forbes 30 Under 30 pattern) to an algorithm-first model (Glassdoor / Trustpilot / FIDE / PageRank pattern) was decided on 2026-05-03, based on the principle: "the tool must defend itself, not by authority."