Foundational public methodology for the first open public ranking of brand visibility in AI search results (ChatGPT, Perplexity, Gemini, Claude). This release establishes the framework — no rankings have been computed or published yet. First scan cycle: late May 2026 (private validation). First public ranking publication target: August 2026, after 3 validation cycles. Includes: - methodology.json: machine-readable formulas, weights, policies - README.md: human-readable overview + open/closed boundary - CHANGELOG.md: versioning policy + v1.0.0 release notes - taxonomy.md: tier system + 11 PL pilot categories - LICENSE: MIT - .gitignore: closed operational data (exact prompts, anti-gaming thresholds) - prompts/README.md: 6-stage prompt curation process - prompts/example-swiece-sojowe-pl.md: illustrative framework for first category Strategic principles: - Algorithm-first, no advisory board - Open methodology + closed exact prompts (Goodhart's Law defense) - No retroactive changes (FIDE 2024 lesson) - No pay-to-play, hard rule (Moody's / Forbes 30 Under 30 lessons) - Subjective opinion disclaimer (Gartner v. NetScout 2020 First Amendment shield) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5.2 KiB
Citee Index Methodology
Open methodology for the first public ranking of brand visibility in AI search results.
citee.ai · Methodology page · Forgejo · GitHub mirror
What this is
Citee Index measures how brands appear in AI-generated answers across major LLM-powered search systems (ChatGPT with web search, Perplexity, Gemini, Claude). The ranking is published quarterly per category and country.
This repository contains the complete public methodology — formulas, model weights, prompt-type distribution, cross-signal definitions, and the prompt curation process. Every change is committed publicly with rationale.
This is NOT:
- A SaaS dashboard (that's Citee Pro, separate product)
- A list of paid placements (zero pay-to-play, hard rule in
methodology.json) - A static document — methodology evolves through versioned releases (see
CHANGELOG.md)
Why open
Three reasons:
- Reproducibility. Anyone can audit our scoring against the public raw query log.
- Cryptographic timestamping. Git history is immutable — we cannot retroactively edit the methodology to hide a bug.
- Subjective opinion shield. Open formula + public versioning establishes that scores are "expressions of opinion based on observed AI model outputs," not factual claims (legal precedent: Gartner v. NetScout, Connecticut Supreme Court 2020).
What's in this repo
| File | Purpose |
|---|---|
methodology.json |
Machine-readable methodology — formulas, weights, thresholds, policies |
CHANGELOG.md |
Version history with rationale for each change |
taxonomy.md |
Category list, tier system, scan cadence per tier |
prompts/README.md |
Prompt curation process (6 stages, multi-agent validation) |
prompts/example-*.md |
Example prompt frameworks per category (illustrative — exact strings remain closed to prevent Goodhart's Law) |
tools/prompt_curation/ |
Code for the multi-agent prompt curation pipeline |
LICENSE |
MIT |
What's NOT here (and why)
Some operational details remain closed:
- Exact prompt strings — disclosing the exact 100 prompts per category would let vendors optimize their pages specifically against our queries (Goodhart's Law). We publish the distribution by type (40% buying intent, 25% comparison, 20% specific need, 15% informational, 10% brand-direct) and example patterns, not exact strings. 20% of the prompt pool rotates quarterly.
- Anti-gaming thresholds — specific burst-detection cutoffs, sock puppet karma thresholds, and review-bombing pattern signatures are closed. We publish the categories (rank-jump flag at >30 ranks, fresh-Wikidata excluded <90 days, etc.) but not exact numbers.
- Honeypot brand details — disclosure would defeat the purpose. The honeypot is documented as existing in
methodology.jsonfor transparency. - First-party telemetry from Free Checker — aggregated weights from this telemetry feed into model weighting, but raw user data remains closed (GDPR).
These categories of closed information are explicitly listed in methodology.json so the boundary between open and closed is itself transparent.
Versioning policy
- No retroactive changes. Methodology updates apply to future cycles only. If we change the model weighting formula in v1.1, scores for cycles published before v1.1 are not retroactively recomputed (lesson from FIDE 2024 backlash, "stealing rating points").
- Quarterly major reviews + ad-hoc minor patches. Major reviews happen at the start of each quarter. Minor patches (typos, clarifications, additional examples) anytime — versioned as v1.0.1, v1.0.2, etc.
- Every change has a public commit with rationale. No silent edits.
Citation
If you cite Citee Index methodology in academic work, journalism, or business reports:
Citee Index Methodology v1.0.0 (2026-05-03).
LMW Commerce / Citee. https://github.com/lmwcommerce/citee-methodology
Contributing
Issues welcome — open one if you spot:
- Methodological flaws or statistical issues
- Errors in formulas or definitions
- Missing edge cases in anti-gaming
- Documentation typos or unclear sections
Pull requests considered for documentation, code in tools/, and example frameworks. Methodology changes themselves are decided internally based on quarterly review + community feedback. Every accepted methodology change is credited in CHANGELOG.md.
License
MIT. See LICENSE.
You're free to use this methodology, fork it, build on it, replicate it, criticize it. We only ask: if you publish a competing ranking, don't claim it's reproduced from Citee data without running the formulas yourself. Methodology is open; our raw query log is the source of truth.
Maintained by: LMW Commerce · Jacek Kubas Contact: hello@citee.ai