citee-methodology/README.md
Jacek Kubas f76cf2858b v1.0.0 — initial Citee Index Methodology release
Foundational public methodology for the first open public ranking of brand
visibility in AI search results (ChatGPT, Perplexity, Gemini, Claude).

This release establishes the framework — no rankings have been computed
or published yet. First scan cycle: late May 2026 (private validation).
First public ranking publication target: August 2026, after 3 validation
cycles.

Includes:
- methodology.json: machine-readable formulas, weights, policies
- README.md: human-readable overview + open/closed boundary
- CHANGELOG.md: versioning policy + v1.0.0 release notes
- taxonomy.md: tier system + 11 PL pilot categories
- LICENSE: MIT
- .gitignore: closed operational data (exact prompts, anti-gaming thresholds)
- prompts/README.md: 6-stage prompt curation process
- prompts/example-swiece-sojowe-pl.md: illustrative framework for first category

Strategic principles:
- Algorithm-first, no advisory board
- Open methodology + closed exact prompts (Goodhart's Law defense)
- No retroactive changes (FIDE 2024 lesson)
- No pay-to-play, hard rule (Moody's / Forbes 30 Under 30 lessons)
- Subjective opinion disclaimer (Gartner v. NetScout 2020 First Amendment shield)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:25:56 +02:00


# Citee Index Methodology
> Open methodology for the first public ranking of brand visibility in AI search results.
[**citee.ai**](https://citee.ai) · [**Methodology page**](https://citee.ai/methodology) · [Forgejo](https://git.lmwcommerce.com/citee/citee-methodology) · [GitHub mirror](https://github.com/lmwcommerce/citee-methodology)
---
## What this is
Citee Index measures how brands appear in AI-generated answers across major LLM-powered search systems (ChatGPT with web search, Perplexity, Gemini, Claude). The ranking is published quarterly per category and country.
This repository contains the **complete public methodology** — formulas, model weights, prompt-type distribution, cross-signal definitions, and the prompt curation process. Every change is committed publicly with rationale.
**This is NOT:**
- A SaaS dashboard (that's [Citee Pro](https://citee.ai/pro), a separate product)
- A list of paid placements (zero pay-to-play, hard rule in [`methodology.json`](./methodology.json))
- A static document — methodology evolves through versioned releases (see [`CHANGELOG.md`](./CHANGELOG.md))
## Why open
Three reasons:
1. **Reproducibility.** Anyone can audit our scoring against the public raw query log.
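2. **Cryptographic timestamping.** Git history is tamper-evident: every commit hash covers the commit's content and its parent commit, so a retroactive edit to the methodology would change all later hashes and show up against any existing clone or mirror. We cannot quietly rewrite history to hide a bug.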
3. **Subjective opinion shield.** An open formula with public versioning supports the position that scores are "expressions of opinion based on observed AI model outputs," not factual claims (legal precedent: *NetScout Systems, Inc. v. Gartner, Inc.*, Connecticut Supreme Court, 2020).
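
To make the tamper-evidence point concrete, here is a minimal sketch of how Git derives an object ID from file content. It assumes the repository uses Git's default SHA-1 object format; the function name `git_blob_sha1` is illustrative and not part of this repository. Changing a single byte of a file changes its ID, which in turn changes the tree and commit hashes built on top of it.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the ID Git assigns to a blob with this content (SHA-1 object format)."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = b"test content\n"
edited = b"test content!\n"  # a one-character edit

print(git_blob_sha1(original))
print(git_blob_sha1(original) == git_blob_sha1(edited))  # False
```

The first printed value should match what `git hash-object` reports for a file with the same bytes, so anyone with a clone can check a published file against the ID recorded in history.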
## What's in this repo
| File | Purpose |
|---|---|
| [`methodology.json`](./methodology.json) | Machine-readable methodology — formulas, weights, thresholds, policies |
| [`CHANGELOG.md`](./CHANGELOG.md) | Version history with rationale for each change |
| [`taxonomy.md`](./taxonomy.md) | Category list, tier system, scan cadence per tier |
| [`prompts/README.md`](./prompts/README.md) | Prompt curation process (6 stages, multi-agent validation) |
| [`prompts/example-*.md`](./prompts/) | Example prompt frameworks per category (illustrative — exact strings remain closed to prevent Goodhart's Law) |
| [`tools/prompt_curation/`](./tools/prompt_curation/) | Code for the multi-agent prompt curation pipeline |
| [`LICENSE`](./LICENSE) | MIT |
## What's NOT here (and why)
Some operational details remain closed:
- **Exact prompt strings.** Disclosing the exact 100 prompts per category would let vendors optimize their pages specifically against our queries (Goodhart's Law). We publish the **distribution by type** (40% buying intent, 25% comparison, 20% specific need, 15% informational, 10% brand-direct) and **example patterns**, not exact strings. 20% of the prompt pool rotates quarterly.
- **Anti-gaming thresholds.** Specific burst-detection cutoffs, sock puppet karma thresholds, and review-bombing pattern signatures are closed. We publish the categories of checks and a few headline cutoffs (rank-jump flag at >30 ranks, Wikidata entries younger than 90 days excluded, etc.), but not the remaining exact numbers.
- **Honeypot brand details.** Disclosure would defeat the purpose. The existence of the honeypot is documented in [`methodology.json`](./methodology.json) for transparency.
- **First-party telemetry from Free Checker.** Aggregated weights from this telemetry feed into model weighting, but raw user data remains closed (GDPR).

These categories of closed information are explicitly listed in [`methodology.json`](./methodology.json) so the boundary between open and closed is itself transparent.
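
The published figures can be sanity-checked with a few lines of Python. This is a sketch only: the dictionary copies the prompt-type shares exactly as listed in this section, and the names `LISTED_SHARES`, `total_share`, and `rotated_per_quarter` are illustrative, not part of the repository. As listed, the five shares total 110 rather than 100, so treat [`methodology.json`](./methodology.json) as the authoritative source for the distribution.

```python
# Prompt-type shares as listed in this README, in percent of prompts per category.
LISTED_SHARES = {
    "buying intent": 40,
    "comparison": 25,
    "specific need": 20,
    "informational": 15,
    "brand-direct": 10,
}

PROMPTS_PER_CATEGORY = 100  # pool size stated in this README
ROTATION_PERCENT = 20       # share of the pool rotated each quarter


def total_share(shares: dict) -> int:
    """Sum the percentage shares; a consistent distribution totals 100."""
    return sum(shares.values())


def rotated_per_quarter(pool_size: int, rotation_percent: int) -> int:
    """Number of prompts replaced in one quarterly rotation."""
    return pool_size * rotation_percent // 100


print(total_share(LISTED_SHARES))                                   # 110
print(rotated_per_quarter(PROMPTS_PER_CATEGORY, ROTATION_PERCENT))  # 20
```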
## Versioning policy
- **No retroactive changes.** Methodology updates apply to **future cycles only**. If we change the model weighting formula in v1.1, scores for cycles published before v1.1 are not retroactively recomputed (lesson from FIDE 2024 backlash, "stealing rating points").
- **Quarterly major reviews + ad-hoc minor patches.** Major reviews happen at the start of each quarter. Minor patches (typos, clarifications, additional examples) can land at any time and are versioned as v1.0.1, v1.0.2, etc.
- **Every change has a public commit with rationale.** No silent edits.
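
One way to read the no-retroactive-changes rule: each cycle stays tied to the methodology version that was in force when it was published. The sketch below illustrates that lookup. The v1.0.0 date comes from this README; the v1.1.0 entry and the function name `version_for_cycle` are hypothetical and used only for illustration.

```python
from bisect import bisect_right
from datetime import date

# Release history, oldest first. Only v1.0.0 is real;
# the v1.1.0 entry is a hypothetical future release.
RELEASES = [
    (date(2026, 5, 3), "v1.0.0"),
    (date(2026, 10, 1), "v1.1.0"),  # hypothetical
]


def version_for_cycle(published: date) -> str:
    """Return the methodology version in force on a cycle's publication date."""
    release_dates = [released for released, _ in RELEASES]
    # Index of the first release strictly after the publication date.
    i = bisect_right(release_dates, published)
    if i == 0:
        raise ValueError("cycle predates the first methodology release")
    return RELEASES[i - 1][1]


print(version_for_cycle(date(2026, 8, 15)))   # v1.0.0
print(version_for_cycle(date(2026, 11, 15)))  # v1.1.0
```

Under this reading, a cycle published in August keeps its v1.0.0 scores even after a later release changes the weighting.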
## Citation
If you cite Citee Index methodology in academic work, journalism, or business reports:
```
Citee Index Methodology v1.0.0 (2026-05-03).
LMW Commerce / Citee. https://github.com/lmwcommerce/citee-methodology
```
## Contributing
Issues are welcome. Open one if you spot:
- Methodological flaws or statistical issues
- Errors in formulas or definitions
- Missing edge cases in anti-gaming
- Documentation typos or unclear sections

Pull requests are considered for documentation, code in `tools/`, and example frameworks. **Methodology changes themselves are decided internally**, based on quarterly review and community feedback. Every accepted methodology change is credited in `CHANGELOG.md`.
## License
MIT. See [`LICENSE`](./LICENSE).
You're free to use this methodology, fork it, build on it, replicate it, criticize it. We only ask: if you publish a competing ranking, **don't claim it's reproduced from Citee data without running the formulas yourself.** Methodology is open; our raw query log is the source of truth.
---
**Maintained by:** [LMW Commerce](https://lmwcommerce.com) · Jacek Kubas
**Contact:** hello@citee.ai