v1.0.0 — initial Citee Index Methodology release

Foundational public methodology for the first open public ranking of brand visibility in AI search results (ChatGPT, Perplexity, Gemini, Claude).

This release establishes the framework only; no rankings have been computed or published yet.
First scan cycle: late May 2026 (private validation).
First public ranking publication target: August 2026, after 3 validation cycles.

Includes:
- methodology.json: machine-readable formulas, weights, policies
- README.md: human-readable overview + open/closed boundary
- CHANGELOG.md: versioning policy + v1.0.0 release notes
- taxonomy.md: tier system + 11 PL pilot categories
- LICENSE: MIT
- .gitignore: closed operational data (exact prompts, anti-gaming thresholds)
- prompts/README.md: 6-stage prompt curation process
- prompts/example-swiece-sojowe-pl.md: illustrative framework for first category

Strategic principles:
- Algorithm-first, no advisory board
- Open methodology + closed exact prompts (Goodhart's Law defense)
- No retroactive changes (FIDE 2024 lesson)
- No pay-to-play, hard rule (Moody's / Forbes 30 Under 30 lessons)
- Subjective opinion disclaimer (NetScout v. Gartner 2020 First Amendment shield)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:25:56 +02:00 · commit f76cf2858b
8 changed files with 884 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@ -0,0 +1,54 @@
 # OS / editor cruft
 .DS_Store
 Thumbs.db
 *.swp
 *.swo
 *~
 .vscode/
 .idea/
 # OneDrive sync conflicts (just in case repo ends up under OneDrive accidentally)
 *-Bob.*
 *conflict*
 # Python
 __pycache__/
 *.py[cod]
 *$py.class
 .venv/
 venv/
 env/
 *.egg-info/
 .pytest_cache/
 # Closed operational data — exact prompt strings remain CLOSED to prevent
 # Goodhart's Law (when a measure becomes a target, it ceases to be a measure).
 # Public examples and frameworks live in prompts/ at the repo root.
 prompts/swiece-sojowe-pl/
 prompts/kosmetyki-naturalne-pl/
 prompts/suplementy-nutricosmetyki-pl/
 prompts/diety-pudelkowe-pl/
 prompts/premium-pet-food-pl/
 prompts/kawa-specialty-pl/
 prompts/czekolada-rzemieslnicza-pl/
 prompts/kursy-programowania-bootcampy-pl/
 prompts/kliniki-estetyczne-dermo-pl/
 prompts/fitness-studios-premium-pl/
 prompts/kosmetyki-meskie-pl/
 # Closed anti-gaming thresholds (private values, public categories documented)
 anti_gaming/private_thresholds.json
 anti_gaming/honeypot_brand.json
 # First-party telemetry from Free Checker (GDPR — raw user data closed)
 telemetry/raw/
 # Output of scan cycles (raw query logs are public via API but not in repo)
 output/
 scans/
 # Secrets
 .env
 .env.*
 *.key
 secrets.json
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -0,0 +1,61 @@
 # Changelog
 All notable changes to Citee Index Methodology are documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/), versioning follows [Semantic Versioning](https://semver.org/) adapted for methodology:
 - **MAJOR** (`2.0.0`) — fundamental scoring formula change, weight rebalance, redefinition of categories
 - **MINOR** (`1.1.0`) — new prompt types, new cross-signals, new model added, anti-gaming rule additions
 - **PATCH** (`1.0.1`) — documentation fixes, clarifications, additional examples, typos
 **Important:** No retroactive changes. Methodology updates apply to FUTURE cycles only. Cycles published before a version bump are not recomputed.
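The bump rules above can be expressed as a small lookup. This is a minimal sketch, not part of the published tooling: the change-kind labels and the `next_version` helper are hypothetical names chosen for illustration.

```python
from enum import Enum


class Bump(Enum):
    MAJOR = "major"
    MINOR = "minor"
    PATCH = "patch"


# Change kinds follow the MAJOR/MINOR/PATCH list above.
# The string labels themselves are hypothetical.
CHANGE_KIND_TO_BUMP = {
    "scoring_formula_change": Bump.MAJOR,
    "weight_rebalance": Bump.MAJOR,
    "category_redefinition": Bump.MAJOR,
    "new_prompt_type": Bump.MINOR,
    "new_cross_signal": Bump.MINOR,
    "new_model": Bump.MINOR,
    "anti_gaming_rule_added": Bump.MINOR,
    "documentation_fix": Bump.PATCH,
}


def next_version(current: str, kinds: list[str]) -> str:
    """Return the next version given the kinds of change in a release."""
    if not kinds:
        raise ValueError("a release needs at least one change")
    major, minor, patch = (int(part) for part in current.split("."))
    bumps = {CHANGE_KIND_TO_BUMP[kind] for kind in kinds}
    # The most significant change in the release decides the bump.
    if Bump.MAJOR in bumps:
        return f"{major + 1}.0.0"
    if Bump.MINOR in bumps:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, `next_version("1.0.0", ["documentation_fix"])` yields `"1.0.1"`, while a release that mixes a new model with documentation fixes moves to the next MINOR version.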
 ---
 ## [1.0.0] — 2026-05-03
 Initial public release. Foundational methodology. **No public ranking yet** — first publication scheduled August 2026 after 3-month validation period.
 ### Added
 - **Scoring formula:** `CiteeScore = sum(mention_score_per_model * model_weight) * (1 + cross_signal_bonus)`, normalized to 0-100 per category
 - **Model weighting** for PL market: ChatGPT 0.45, Perplexity 0.25, Gemini 0.20, Claude 0.10 (Claude is scheduled to join in Q4 2026; the pilot runs on the other three models, see `methodology.json` for rationale)
 - **Mention score per model:** position (0.4) + prominence (0.3) + sentiment (0.15) + citation depth (0.15)
 - **5 prompt types** with weights:
  - Buying intent (2.0) — 30% of pool
  - Comparison (1.5) — 25%
  - Specific need (1.5) — 20%
  - Informational (0.3) — 15%
  - Brand-direct (0.3) — 10%
 - **4 cross-signals** with maximum total bonus +20%:
  - Wikidata entry (≥90 days, ≥5 triples): +5%
  - Trustpilot/Opineo (>50 reviews, ≥4.0 average, no review bombing): +5%
  - Reddit organic mentions (>10 in niche subreddit, account age + karma weighted): +5%
  - Google AI Overviews presence (verified via SerpAPI): +5%
 - **Anti-gaming protections:** rank-jump flag (>30 ranks in a single cycle), fresh Wikidata exclusion (<90 days), review bombing exclusion, sock puppet detection (Reddit), prompt injection scrape filters (CSS hidden text, off-screen content, font-size:0)
 - **Honeypot brand** mechanism for detecting AI training data circular logic and unauthorized scraping
 - **Statistical methodology:** 95% confidence intervals via bootstrap resampling, overlapping CIs reported as tied (no false precision), 100 prompts × 3 models × 2 repetitions = 600 queries per category per cycle in pilot
 - **Tier system:**
  - Tier 1 — large markets (>1000 brands, >100M PLN GMV) — monthly scan
  - Tier 2 — medium markets (100-1000 brands, 10-100M PLN GMV) — quarterly scan
  - Tier 3 — niche markets (<100 brands, <10M PLN GMV) — semi-annual scan
 - **11 pilot categories (PL, all Tier 2):** kosmetyki naturalne, suplementy / nutricosmetyki, diety pudełkowe, premium pet food, kawa specialty, czekolada rzemieślnicza, kursy programowania / IT bootcampy, kliniki estetyczne / dermatologia, fitness studios premium, kosmetyki dla mężczyzn, świece sojowe
 - **Publication policy:** 3-month validation period before first public ranking. Hybrid format — Top 10 public HTML (SEO indexed), full ranking 100 brands as PDF behind email gate. `robots.txt` disallow for GPTBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended on full-data endpoints.
 - **Right to reply:** each brand profile page includes "Brand response" section, moderated for factual accuracy, 30-day response window per cycle
 - **Monetization policy:** ranked brands NEVER pay Citee directly (hard rule). Revenue from Citee Pro SaaS (paid by shops optimizing visibility, not ranked brands), Industry Reports (paid by agencies/media), and Sponsored Custom Research (commissioned for category research, not brand-specific)
 - **Prompt curation process** (6 stages): persona generator → prompt brainstormer → reality check (Google Trends, Reddit, Quora) → multi-agent validation (3 critics) → pilot test run → human approval
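The statistical item in the list above (95% bootstrap intervals, overlapping intervals reported as tied) could be implemented roughly as follows. This is a sketch under assumptions: the source does not say which bootstrap variant is used, so a percentile bootstrap over per-query scores is assumed, and the tie rule shown (a brand joins the current rank group when its interval overlaps the group leader's) is only one possible reading of "overlapping CIs reported as tied". All function names are hypothetical.

```python
import random
import statistics


def bootstrap_ci(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def rank_with_ties(intervals):
    """Rank brands by mean score; overlapping intervals share a rank.

    `intervals` maps brand -> (mean, ci_low, ci_high).
    Returns a list of (rank, brand) in descending score order.
    """
    ordered = sorted(intervals.items(), key=lambda kv: kv[1][0], reverse=True)
    ranked = []
    leader_low = None
    rank = 0
    for position, (brand, (_mean, low, high)) in enumerate(ordered, start=1):
        # Start a new rank group when this interval lies entirely below
        # the lower bound of the current group leader.
        if leader_low is None or high < leader_low:
            rank = position
            leader_low = low
        ranked.append((rank, brand))
    return ranked
```

With intervals `A = (80, 75, 85)`, `B = (78, 73, 83)` and `C = (60, 55, 65)`, brands A and B share rank 1 and C is ranked 3.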
 ### Notes
 This is **v1.0.0 — methodology release only**. No ranking has been computed or published. Foundational document establishing the framework.
 First scan cycle planned: late May 2026 (private validation).
 First public ranking publication target: August 2026 (after 3 validation cycles).
 ---
 ## Pre-history
 Project began as "AIO Visibility" module within LMW Pulse SaaS in March 2026. Pivoted to standalone product `citee.ai` in May 2026 after market analysis showed no global competitor publishing public AI visibility rankings (27+ tracked SaaS dashboards but zero public rankings).
 Strategic shift from advisory-board-driven model (Gartner / Forbes 30 Under 30 pattern) to algorithm-first model (Glassdoor / Trustpilot / FIDE / PageRank pattern) decided 2026-05-03 based on principle: "the tool must defend itself, not by authority."
--- a/LICENSE
+++ b/LICENSE
@ -0,0 +1,40 @@
 MIT License
 Copyright (c) 2026 LMW Commerce / Jacek Kubas
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in all
 copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 ---
 Note on Citee Index data:
 While this methodology is MIT-licensed and freely usable, the Citee Index
 itself (the published rankings, raw query logs, and brand-level scores) is
 provided under a separate data license described at
 https://citee.ai/data-license. The methodology being open does not imply
 that derived datasets from Citee scans are public domain.
 Disclaimer regarding scoring:
 Citee Index scores represent expressions of opinion based on observed AI
 model outputs at specific points in time. They are not factual claims about
 the relative quality, popularity, or merit of any brand. The methodology is
 a framework for converting observed AI outputs into a comparable index;
 reasonable people could construct alternative methodologies that produce
 different rankings.
--- a/README.md
+++ b/README.md
@ -0,0 +1,85 @@
 # Citee Index Methodology
 > Open methodology for the first public ranking of brand visibility in AI search results.
 [**citee.ai**](https://citee.ai) · [**Methodology page**](https://citee.ai/methodology) · [Forgejo](https://git.lmwcommerce.com/citee/citee-methodology) · [GitHub mirror](https://github.com/lmwcommerce/citee-methodology)
 ---
 ## What this is
 Citee Index measures how brands appear in AI-generated answers across major LLM-powered search systems (ChatGPT with web search, Perplexity, Gemini, Claude). The ranking is published quarterly per category and country.
 This repository contains the **complete public methodology** — formulas, model weights, prompt-type distribution, cross-signal definitions, and the prompt curation process. Every change is committed publicly with rationale.
 **This is NOT:**
 - A SaaS dashboard (that's [Citee Pro](https://citee.ai/pro), separate product)
 - A list of paid placements (zero pay-to-play, hard rule in [`methodology.json`](./methodology.json))
 - A static document — methodology evolves through versioned releases (see [`CHANGELOG.md`](./CHANGELOG.md))
 ## Why open
 Three reasons:
 1. **Reproducibility.** Anyone can audit our scoring against the public raw query log.
 2. **Cryptographic timestamping.** Git history is hash-chained and mirrored publicly, so a retroactive edit to the methodology would change commit IDs and be detectable. We cannot quietly rewrite it to hide a bug.
 3. **Subjective opinion shield.** Open formula + public versioning establishes that scores are "expressions of opinion based on observed AI model outputs," not factual claims (legal precedent: *NetScout Systems v. Gartner*, Connecticut Supreme Court 2020).
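To make the reproducibility point concrete, here is a minimal sketch of recomputing a score from logged observations. The scales and weights are the published values from `methodology.json`; the observation dict layout and all function names are hypothetical, ranks beyond 10 are assumed to score 0, and the sketch scores one observation per model (aggregation across prompts and prompt-type weights is left out because the published formula does not spell it out).

```python
# Published values from methodology.json (PL market).
MODEL_WEIGHTS_PL = {"chatgpt": 0.45, "perplexity": 0.25, "gemini": 0.20, "claude": 0.10}
POSITION_SCALE = {1: 1.0, 2: 0.7, 3: 0.5}
PROMINENCE_SCALE = {
    "passing_mention": 0.3,
    "listed_with_description": 0.6,
    "actively_recommended": 1.0,
}
SENTIMENT_SCALE = {"positive": 0.2, "neutral": 0.0, "negative_or_caveated": -0.3}
CITATION_SCALE = {"direct_link_to_brand_site": 1.0, "mention_only_no_link": 0.5}


def position_score(rank):
    if rank is None:
        return 0.0
    if rank in POSITION_SCALE:
        return POSITION_SCALE[rank]
    # Ranks 4-10 share one value; ranks beyond 10 are not specified in the
    # source and are treated as 0.0 here (assumption).
    return 0.3 if 4 <= rank <= 10 else 0.0


def mention_score(obs):
    """`obs` is a hypothetical dict: rank, prominence, sentiment, citation."""
    if obs["rank"] is None:
        return 0.0  # assumption: an unmentioned brand scores 0 overall
    return (
        0.4 * position_score(obs["rank"])
        + 0.3 * PROMINENCE_SCALE[obs["prominence"]]
        + 0.15 * SENTIMENT_SCALE[obs["sentiment"]]
        + 0.15 * CITATION_SCALE[obs["citation"]]
    )


def raw_citee_score(obs_by_model, signals_met, weights=MODEL_WEIGHTS_PL):
    """sum(mention_score * model_weight) * (1 + cross_signal_bonus)."""
    bonus = min(0.05 * signals_met, 0.20)  # +5% per signal, capped at +20%
    base = sum(mention_score(obs) * weights[m] for m, obs in obs_by_model.items())
    return base * (1 + bonus)


def normalize(raw_scores):
    """Scale so the top brand in the category is 100, others proportional."""
    top = max(raw_scores.values())
    if top <= 0:
        return {brand: 0.0 for brand in raw_scores}
    return {brand: 100 * score / top for brand, score in raw_scores.items()}
```

A brand ranked first, actively recommended, with positive sentiment and a direct link has a mention score of 0.4 + 0.3 + 0.03 + 0.15 = 0.88 per model; with all four cross-signals the raw score across four models is 0.88 × 1.2 = 1.056.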
 ## What's in this repo
 | File | Purpose |
 |---|---|
 | [`methodology.json`](./methodology.json) | Machine-readable methodology — formulas, weights, thresholds, policies |
 | [`CHANGELOG.md`](./CHANGELOG.md) | Version history with rationale for each change |
 | [`taxonomy.md`](./taxonomy.md) | Category list, tier system, scan cadence per tier |
 | [`prompts/README.md`](./prompts/README.md) | Prompt curation process (6 stages, multi-agent validation) |
 | [`prompts/example-*.md`](./prompts/) | Example prompt frameworks per category (illustrative — exact strings remain closed to prevent Goodhart's Law) |
 | [`tools/prompt_curation/`](./tools/prompt_curation/) | Code for the multi-agent prompt curation pipeline |
 | [`LICENSE`](./LICENSE) | MIT |
 ## What's NOT here (and why)
 Some operational details remain closed:
 - **Exact prompt strings** — disclosing the exact 100 prompts per category would let vendors optimize their pages specifically against our queries (Goodhart's Law). We publish the **distribution by type** (30% buying intent, 25% comparison, 20% specific need, 15% informational, 10% brand-direct) and **example patterns**, not exact strings. 20% of the prompt pool rotates quarterly.
 - **Anti-gaming thresholds** — specific burst-detection cutoffs, sock puppet karma thresholds, and review-bombing pattern signatures are closed. We publish the rule categories and their headline thresholds (rank-jump flag at >30 ranks, fresh Wikidata entries excluded <90 days, etc.) but not the finer-grained detection parameters.
 - **Honeypot brand details** — disclosure would defeat the purpose. The honeypot is documented as existing in [`methodology.json`](./methodology.json) for transparency.
 - **First-party telemetry from Free Checker** — aggregated weights from this telemetry feed into model weighting, but raw user data remains closed (GDPR).
 These categories of closed information are explicitly listed in [`methodology.json`](./methodology.json) so the boundary between open and closed is itself transparent.
 ## Versioning policy
 - **No retroactive changes.** Methodology updates apply to **future cycles only**. If we change the model weighting formula in v1.1, scores for cycles published before v1.1 are not retroactively recomputed (lesson from FIDE 2024 backlash, "stealing rating points").
 - **Quarterly reviews + ad-hoc patch releases.** Methodology reviews happen at the start of each quarter. Patch releases (typos, clarifications, additional examples) can ship anytime, versioned as v1.0.1, v1.0.2, etc.
 - **Every change has a public commit with rationale.** No silent edits.
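The "future cycles only" rule amounts to looking up the methodology version in force on a cycle's date and never recomputing older cycles. A minimal sketch follows; the `methodology_version_for` helper is a hypothetical name, and the v1.1.0 entry is an invented placeholder used only to show the lookup (only v1.0.0, dated 2026-05-03, comes from this repository).

```python
from bisect import bisect_right
from datetime import date

RELEASES = [
    (date(2026, 5, 3), "1.0.0"),   # from CHANGELOG.md
    (date(2026, 10, 1), "1.1.0"),  # hypothetical future release, illustration only
]


def methodology_version_for(cycle_date, releases=RELEASES):
    """Return the version released most recently on or before `cycle_date`."""
    release_dates = [released for released, _ in releases]
    index = bisect_right(release_dates, cycle_date)
    if index == 0:
        raise ValueError("cycle predates the first methodology release")
    return releases[index - 1][1]
```

A cycle dated 2026-08-15 stays on `"1.0.0"` even after the later release exists; only cycles dated on or after the later release date pick up the new version.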
 ## Citation
 If you cite Citee Index methodology in academic work, journalism, or business reports:
 ```
 Citee Index Methodology v1.0.0 (2026-05-03).
 LMW Commerce / Citee. https://github.com/lmwcommerce/citee-methodology
 ```
 ## Contributing
 Issues welcome — open one if you spot:
 - Methodological flaws or statistical issues
 - Errors in formulas or definitions
 - Missing edge cases in anti-gaming
 - Documentation typos or unclear sections
 Pull requests considered for documentation, code in `tools/`, and example frameworks. **Methodology changes themselves are decided internally** based on quarterly review + community feedback. Every accepted methodology change is credited in `CHANGELOG.md`.
 ## License
 MIT. See [`LICENSE`](./LICENSE).
 You're free to use this methodology, fork it, build on it, replicate it, criticize it. We only ask: if you publish a competing ranking, **don't claim it's reproduced from Citee data without running the formulas yourself.** Methodology is open; our raw query log is the source of truth.
 ---
 **Maintained by:** [LMW Commerce](https://lmwcommerce.com) · Jacek Kubas
 **Contact:** hello@citee.ai
--- a/methodology.json
+++ b/methodology.json
@ -0,0 +1,270 @@
 {
  "version": "1.0.0",
  "released": "2026-05-03",
  "name": "Citee Index Methodology",
  "description": "Public methodology for Citee Index — the first open public ranking of brand visibility in AI search results (ChatGPT, Perplexity, Gemini, Claude).",
  "license": "MIT",
  "repository": "https://git.lmwcommerce.com/citee/citee-methodology",
  "mirror": "https://github.com/lmwcommerce/citee-methodology",
  "homepage": "https://citee.ai/methodology",
  "philosophy": {
    "approach": "algorithm-first",
    "principles": [
      "Open methodology, public versioning (every change committed publicly)",
      "Reproducibility — anyone can replicate scores from raw query log",
      "No pay-to-play — ranked brands never pay Citee directly. Hard rule in ToS.",
      "Subjective opinion disclaimer — scores are expressions of opinion based on observed AI model outputs (First Amendment shield, NetScout v. Gartner 2020)",
      "No retroactive changes — methodology updates apply to FUTURE cycles only (FIDE 2024 backlash lesson)",
      "Confidence intervals — overlapping CIs reported as 'tied', no false precision",
      "Annual transparency report — manipulation patterns detected, anti-gaming actions taken"
    ]
  },
  "scoring": {
    "formula": "CiteeScore(brand, category, country, month) = sum(mention_score_per_model * model_weight) * (1 + cross_signal_bonus)",
    "normalization": "Raw score 0-120 normalized to 0-100 per category (top brand = 100, others proportional)",
    "ranking": "Sort by CiteeScore descending. Brands with overlapping confidence intervals reported as tied."
  },
  "models": {
    "weighting_basis": "Each model weighted by its share of AI search traffic per region. Weights revised quarterly using 3 public data sources (OpenRouter rankings, Similarweb free tier, Statcounter/IAB Polska/Mobirank reports) plus first-party Free Checker telemetry.",
    "weights": {
      "PL": {
        "chatgpt": {
          "weight": 0.45,
          "model_version": "gpt-4o-search-2026-04",
          "rationale": "Largest user share PL based on OpenRouter + Similarweb data"
        },
        "perplexity": {
          "weight": 0.25,
          "model_version": "sonar-pro-2026-03",
          "rationale": "Growing power user segment, search-native architecture"
        },
        "gemini": {
          "weight": 0.20,
          "model_version": "gemini-2.0-pro",
          "rationale": "Google embed + AI Overviews coverage"
        },
        "claude": {
          "weight": 0.10,
          "model_version": "claude-sonnet-2026-q1",
          "rationale": "Niche but growing; scheduled to join in Q4 2026, not part of the three-model pilot",
          "status": "added_q4_2026"
        }
      }
    },
    "pilot_models": ["chatgpt", "perplexity", "gemini"],
    "claude_addition_planned": "2026-Q4"
  },
  "mention_score_per_model": {
    "formula": "mention_score = (position * 0.4) + (prominence * 0.3) + (sentiment * 0.15) + (citation_depth * 0.15)",
    "range": "0.0 - 1.0",
    "components": {
      "position": {
        "weight": 0.4,
        "scale": {
          "rank_1": 1.0,
          "rank_2": 0.7,
          "rank_3": 0.5,
          "rank_4_to_10": 0.3,
          "not_mentioned": 0.0
        }
      },
      "prominence": {
        "weight": 0.3,
        "scale": {
          "passing_mention": 0.3,
          "listed_with_description": 0.6,
          "actively_recommended": 1.0
        }
      },
      "sentiment": {
        "weight": 0.15,
        "scale": {
          "positive": 0.2,
          "neutral": 0.0,
          "negative_or_caveated": -0.3
        }
      },
      "citation_depth": {
        "weight": 0.15,
        "scale": {
          "direct_link_to_brand_site": 1.0,
          "mention_only_no_link": 0.5
        }
      }
    }
  },
  "prompt_types": {
    "rationale": "Different prompt types reflect different stages of buyer funnel. Buying intent prompts weighted higher because they correlate with revenue impact.",
    "weights": {
      "buying": {
        "weight": 2.0,
        "examples_pattern": "Where to buy [category] premium / Best place to buy [category]",
        "share_of_pool": "30%"
      },
      "comparison": {
        "weight": 1.5,
        "examples_pattern": "Best [category] / Top [category] handmade / [Brand A] vs [Brand B]",
        "share_of_pool": "25%"
      },
      "specific_need": {
        "weight": 1.5,
        "examples_pattern": "[Category] with [specific attribute] / [Category] for [specific use case]",
        "share_of_pool": "20%"
      },
      "informational": {
        "weight": 0.3,
        "examples_pattern": "What is [category] / How does [category] work",
        "share_of_pool": "15%"
      },
      "brand_direct": {
        "weight": 0.3,
        "examples_pattern": "[Brand X] reviews / Opinions about [Brand X]",
        "share_of_pool": "10%"
      }
    },
    "pool_size_per_category": 100,
    "pool_rotation": "20% of prompts rotate quarterly. Distribution by type published. Exact strings remain CLOSED to prevent Goodhart's Law (when a measure becomes a target, it ceases to be a measure)."
  },
  "cross_signals": {
    "rationale": "Cross-signals provide a reality check: does the brand exist outside AI training data? A brand with a high AI score but zero cross-signals may be a content spam farm rather than a real entity.",
    "max_total_bonus": 0.20,
    "signals": {
      "wikidata_entry": {
        "bonus": 0.05,
        "criteria": "Brand has Wikidata entry, minimum 5 triples (instance_of, country, founder OR founded_date, official_website, ISNI), entry age >= 90 days",
        "anti_gaming": "Entries < 90 days old excluded to prevent rapid-deployment manipulation"
      },
      "trustpilot_or_opineo": {
        "bonus": 0.05,
        "criteria": "Reviews count > 50, average rating > 4.0, no review bombing detected (review burst > 50 in 30 days = excluded)"
      },
      "reddit_organic_mentions": {
        "bonus": 0.05,
        "criteria": "Organic mentions in niche subreddit > 10, account_age + karma weighted, sock puppet detection applied (new accounts < 30 days excluded)"
      },
      "google_ai_overviews_presence": {
        "bonus": 0.05,
        "criteria": "Brand cited in Google AI Overviews response for at least one tracked prompt in category, verified via SerpAPI"
      }
    }
  },
  "anti_gaming": {
    "public_thresholds": {
      "rank_jump_flag": "Brand jumping > 30 ranks in single cycle triggers anomaly review and one-cycle score freeze",
      "fresh_wikidata_excluded": "< 90 days",
      "review_bombing_excluded": "> 50 reviews in 30 days from new accounts",
      "sock_puppet_excluded": "Reddit accounts < 30 days old or karma < threshold"
    },
    "private_thresholds": {
      "rationale": "Specific burst detection thresholds, sock puppet karma cutoffs, and pattern matching rules remain CLOSED to prevent gaming. Available to legal/regulatory authorities upon request.",
      "categories": [
        "burst_detection_thresholds",
        "sock_puppet_karma_cutoffs",
        "review_bombing_pattern_signatures",
        "prompt_injection_detection_signatures"
      ]
    },
    "honeypot_brand": {
      "active": true,
      "rationale": "Fictional brand inserted at a predetermined ranking position to detect AI training data circular logic and unauthorized scraping. A model citing the honeypot brand is evidence that it was trained on Citee data without attribution.",
      "details": "CLOSED — disclosure would defeat purpose"
    },
    "prompt_injection_defense": {
      "scrape_filters": [
        "Strip CSS hidden text (display:none, visibility:hidden, color:white-on-white)",
        "Strip off-screen positioned content (left:-9999px, etc.)",
        "Strip font-size:0 and opacity:0 elements",
        "Detect and exclude content in noscript that contradicts visible content"
      ],
      "consequence": "Brands using prompt injection excluded from current cycle + publicly named in annual transparency report"
    }
  },
  "statistical_methodology": {
    "queries_per_cycle": {
      "prompts_per_category": 100,
      "models": "3 in pilot (ChatGPT, Perplexity, Gemini), 4 from Q4 2026 (+ Claude)",
      "repetitions_per_prompt": 2,
      "total_per_category_per_cycle": "100 * 3 * 2 = 600 (pilot), 100 * 4 * 2 = 800 (post Q4 2026)"
    },
    "confidence_intervals": "95% CI computed via bootstrap resampling. Brands with overlapping CIs reported as tied — no false precision.",
    "minimum_brands_per_category": 20,
    "tied_score_handling": "If CI(A) overlaps CI(B), both reported at same rank with '=' indicator"
  },
  "scan_cadence": {
    "tier_1_large_markets": {
      "frequency": "monthly",
      "criteria": ">1000 brands visible, >100M PLN GMV"
    },
    "tier_2_medium_markets": {
      "frequency": "quarterly",
      "criteria": "100-1000 brands, 10-100M PLN GMV"
    },
    "tier_3_niche_markets": {
      "frequency": "semi-annually",
      "criteria": "<100 brands, <10M PLN GMV"
    },
    "current_pilot_tier": "all categories in pilot are Tier 2 (quarterly)"
  },
  "publication_policy": {
    "validation_period_before_first_publication": "3 months / 3 cycles minimum",
    "first_public_ranking": "August 2026 (target)",
    "format": "Hybrid — Top 10 public HTML (SEO indexed), full ranking 100 brands as PDF behind email gate",
    "ai_crawler_policy": {
      "robots_txt_disallow": ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Google-Extended"],
      "endpoints_protected": ["/api/ranking-full", "/index/*/full.pdf"],
      "rationale": "Prevents AI training data circular logic. Hybrid approach (top 10 public, long tail protected) balances SEO with measurement integrity."
    },
    "right_to_reply": "Each brand profile page includes 'Brand response' section. Brands can submit response (moderated for factual accuracy) within 30 days of cycle publication."
  },
  "monetization_policy": {
    "ranked_brands_pay_zero": true,
    "rationale": "Issuer-pays model fundamentally compromises ranking credibility (Moody's $864M settlement, Forbes 30 Under 30 fraud roundup). Citee Index revenue comes from indirect channels only.",
    "approved_revenue_sources": [
      "Citee Pro SaaS (199-449 PLN/mo) — paid by shops optimizing their visibility, NOT by ranked brands",
      "Industry Reports (999-2999 PLN/quarter) — paid by agencies, media, market research firms",
      "Sponsored Custom Research (9990-29990 PLN) — commissioned by media/agency for category research, NOT brand-specific"
    ],
    "prohibited": [
      "Brand profile upgrades (paid premium listing)",
      "Verified badges (annual fee for ranking participation)",
      "Awards sponsored by ranked brands",
      "Any direct payment from ranked entity to Citee"
    ]
  },
  "categories_pilot_2026": {
    "country": "PL",
    "tier": "Tier 2 (quarterly scan)",
    "list": [
      "kosmetyki-naturalne",
      "suplementy-nutricosmetyki",
      "diety-pudelkowe",
      "premium-pet-food",
      "kawa-specialty",
      "czekolada-rzemieslnicza",
      "kursy-programowania-bootcampy",
      "kliniki-estetyczne-dermo",
      "fitness-studios-premium",
      "kosmetyki-meskie",
      "swiece-sojowe"
    ],
    "expansion_plan": {
      "Q3_2026": "Add Tier 1 PL categories (kosmetyki ogólne, odzież dziecięca, dom & ogród, elektronika audio, biuro)",
      "Q4_2026": "DACH expansion — pilot 5 categories DE",
      "2027_Q1": "CEE expansion (CZ, SK, HU, RO)"
    }
  },
  "changelog_reference": "See CHANGELOG.md for version history. Methodology evolves through public commits with rationale. NO retroactive changes — modifications apply to FUTURE cycles only."
 }
--- a/prompts/README.md
+++ b/prompts/README.md
@ -0,0 +1,172 @@
 # Prompt Curation Process
 > How Citee Index builds and validates the prompt pool per category. The 6-stage process that prevents the "garbage in, garbage out" failure mode.
 ---
 ## Why this matters
 If the prompt pool is junk ("dyfuzory do włosów ranking" [hair diffusers ranking], "wąski do samochodu" [narrow for a car]), the ranking is junk. Prompt quality is the single most important upstream input to ranking integrity.
 This process exists to ensure every prompt in the active pool meets two tests:
 1. **Real buyer test** — would an actual buyer of this category type this query into ChatGPT/Perplexity?
 2. **Reality check** — does this query appear in actual search/discussion data (Google Trends, Reddit, Quora)?
 Prompts failing either test are excluded.
 ## The 6 stages
 ```
 Stage 1: Persona Generator       (AI)
   ↓ 5–10 buyer personas per category
 Stage 2: Prompt Brainstormer     (AI per persona)
   ↓ 200–300 raw prompts
 Stage 3: Reality Check            (Google Trends / Reddit / Quora / AnswerThePublic)
   ↓ ~150 prompts with verified search demand
 Stage 4: Multi-agent Validation  (3 critic agents in parallel)
   ↓ ~120 prompts after critique
 Stage 5: Pilot Test Run           (10-prompt sample × 3 models)
   ↓ ~110 prompts that produce stable, sensible AI outputs
 Stage 6: Human Approval           (founder + category expert)
   ↓ FINAL POOL: 100 prompts
 ```
 ### Stage 1 — Persona Generator
 Claude generates 5–10 buyer personas per category. Each persona has:
 - Demographics (age, location, income bracket)
 - Pain points (what they're trying to solve)
 - Decision factors (price, ingredients, brand, reviews, certifications)
 - Vocabulary (how they actually talk — formal vs colloquial, technical vs lay)
 Example for Świece sojowe PL:
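 - "30+ kobieta kupująca prezent dla mamy" (woman 30+ buying a gift for her mother)
 - "Self-care millennial 25–35 po pracy" (self-care millennial, 25–35, after work)
 - "Wnętrzarz minimalistyczne mieszkanie" (interior designer, minimalist apartment)
 - "Mężczyzna kupujący prezent walentynkowy" (man buying a Valentine's Day gift)
 - "Mama małych dzieci szukająca bezpiecznego zapachu" (mother of small children looking for a safe scent)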
 ### Stage 2 — Prompt Brainstormer
 For each persona, Claude generates 30–50 prompts in the voice of that persona — "how would I phrase this question to ChatGPT?" Total per category: ~200–300 raw prompts.
 Distribution target by type (enforced at this stage):
 - Buying intent (weight 2.0): 30%
 - Comparison (weight 1.5): 25%
 - Specific need (weight 1.5): 20%
 - Informational (weight 0.3): 15%
 - Brand-direct (weight 0.3): 10%
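The distribution target above translates into per-type prompt counts for any pool size. A minimal sketch, assuming a largest-remainder rounding rule (the source does not say how fractional counts are rounded); `allocate` and the type keys are hypothetical names.

```python
# Shares from the list above, in whole percent.
TYPE_SHARES_PCT = {
    "buying_intent": 30,
    "comparison": 25,
    "specific_need": 20,
    "informational": 15,
    "brand_direct": 10,
}


def allocate(total, shares_pct=TYPE_SHARES_PCT):
    """Split `total` prompts across types using the largest-remainder method."""
    if sum(shares_pct.values()) != 100:
        raise ValueError("shares must sum to 100%")
    # Integer arithmetic avoids floating-point rounding surprises.
    counts = {t: (pct * total) // 100 for t, pct in shares_pct.items()}
    leftover = total - sum(counts.values())
    # Hand the remaining prompts to the types with the largest remainders;
    # ties keep the order in which the types are listed.
    by_remainder = sorted(
        shares_pct, key=lambda t: (shares_pct[t] * total) % 100, reverse=True
    )
    for prompt_type in by_remainder[:leftover]:
        counts[prompt_type] += 1
    return counts
```

For the final pool, `allocate(100)` gives 30 / 25 / 20 / 15 / 10. For a raw pool of 250 prompts the exact targets are 75 / 62.5 / 50 / 37.5 / 25, so one of the two half-prompts is rounded up.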
 ### Stage 3 — Reality Check
 Each prompt cross-referenced against real-world data:
 | Source | Method | Threshold |
 |---|---|---|
 | **Google Trends API** | PL queries past 12 months | minimum search volume present |
 | **Google Search Console** (where available) | Real search queries to brand sites we have access to | inspirational source for vocabulary |
 | **Reddit search** | r/Polska_Marka, niche subreddits | actual user phrasing |
 | **Quora PL** | Questions asked in category | real curiosity patterns |
 | **AnswerThePublic** | Public scraping of "people also ask" | discovery of long-tail patterns |
 | **People Also Ask (Google)** | For top category queries | semantic neighbors |
 Prompts with zero/marginal real-world signal are removed. ~300 → ~150.
 ### Stage 4 — Multi-agent Validation
 Three AI critic agents review the list in parallel:
 **Agent A — "Real buyer critique"**
 Persona-grounded review. Each persona "reads" the prompts and flags ones that don't sound natural for that persona. Prompts marked unnatural by 2+ personas are removed.
 **Agent B — "Methodology critic"**
 Statistical and structural review. Checks:
 - Prompt type distribution stays within ±5% of target
 - No subcategory over/under-represented
 - Vocabulary diversity (we're not repeating the same phrasing)
 - Length distribution reasonable (no 50-word prompts, no 2-word prompts)
 **Agent C — "Vendor exploit hunter"**
 Anti-gaming review. Identifies prompts that are too easy to game by content marketing fluff:
 - Generic informational queries that any vendor can write a blog post for
 - Prompts where AI answer is dominated by Wikipedia (vendor can edit Wikipedia)
 - Prompts where answer comes from one Reddit post (vendor can write that post)
 Each agent produces a list of flagged prompts. Anything flagged by 2+ agents is removed. ~150 → ~120.
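The two mechanical rules in this stage, removal on 2+ agent flags and Agent B's ±5% distribution check, can be sketched as follows. The function names and data shapes are hypothetical, and "±5%" is read here as five percentage points of the pool, which is an assumption.

```python
from collections import Counter


def remove_flagged(prompts, flags_by_agent, min_agents=2):
    """Drop prompts flagged by at least `min_agents` distinct critic agents."""
    votes = Counter(
        prompt
        for flagged in flags_by_agent.values()
        for prompt in set(flagged)  # one vote per agent per prompt
    )
    return [p for p in prompts if votes[p] < min_agents]


def distribution_within_tolerance(prompt_types, targets, tolerance=0.05):
    """Check that each type's share is within `tolerance` of its target share."""
    total = len(prompt_types)
    if total == 0:
        return False
    counts = Counter(prompt_types)
    return all(
        abs(counts[t] / total - share) <= tolerance for t, share in targets.items()
    )
```

With flags `{"A": ["p1", "p2"], "B": ["p2"], "C": ["p3", "p2"]}`, only `p2` reaches two votes and is removed; `p1` and `p3` stay in the pool.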
 ### Stage 5 — Pilot Test Run
 The ~120 candidate prompts get a sample test:
 - Pick 10 prompts (stratified across types)
 - Run on ChatGPT-search, Perplexity Sonar, Gemini Pro
 - Each prompt × 3 models = 30 outputs
 **Reject criteria:**
 - AI returns "I don't know" or "this depends on your preferences" (no actionable brand mentions)
 - Outputs across 3 models have zero overlap (prompt produces incoherent/random results)
 - AI returns a list of countries/categories instead of brands (prompt was misinterpreted)
 Prompts failing pilot are flagged for revision or removal. ~120 → ~110.
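The sampling step and the zero-overlap reject criterion could look like the sketch below. It rests on assumptions: "stratified across types" is implemented as a round-robin draw over the types, and "zero overlap" is read as no brand being shared by any two models. Both helper names are hypothetical.

```python
import random


def stratified_sample(prompts_by_type, k=10, seed=0):
    """Draw up to `k` prompts, cycling through the types so each is represented."""
    rng = random.Random(seed)
    # Shuffle a copy of each type's prompts so the draw within a type is random.
    pools = {t: rng.sample(ps, len(ps)) for t, ps in prompts_by_type.items()}
    picked = []
    while len(picked) < k and any(pools.values()):
        for prompt_type in pools:
            if pools[prompt_type] and len(picked) < k:
                picked.append(pools[prompt_type].pop())
    return picked


def has_zero_overlap(brands_by_model):
    """True when no brand is mentioned by more than one model."""
    brand_sets = [set(brands) for brands in brands_by_model.values()]
    return not any(
        a & b for i, a in enumerate(brand_sets) for b in brand_sets[i + 1:]
    )
```

With five prompt types and `k=10`, the round-robin draw picks two prompts per type, provided each type has at least two candidates.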
 ### Stage 6 — Human Approval
 The founder + category expert review the final ~110 candidates and select the production 100.
 **Founder always reviews.** For categories outside founder's domain knowledge, a paid expert reviewer (1–2 hours, $50–100) is engaged:
 | Category | Expert profile |
 |---|---|
 | Kosmetyki naturalne | Beauty product manager / freelance marketer |
 | Suplementy / nutricosmetyki | Nutritionist / DTC supplement marketer |
 | Diety pudełkowe | Fitness coach / dietitian |
 | Premium pet food | Pet specialty store owner / dog trainer |
 | Kawa specialty | Coffee blogger / barista trainer |
 | Czekolada rzemieślnicza | Food blogger / chocolate-focused content creator |
 | Kursy programowania | Bootcamp graduate / hiring manager |
 | Kliniki estetyczne | Dermatologist or aesthetic medicine consultant |
 | Fitness studios | Personal trainer / gym manager |
 | Kosmetyki męskie | Men's grooming influencer / DTC marketer |
 | Świece sojowe | Founder + JAKULO customer service data |
 The final 100 prompts are committed to the closed `prompts/{slug}/` directory (gitignored). A public example framework is committed to `prompts/example-{slug}.md` (this repo) showing the structure and 5–10 illustrative examples per type — but **not the exact production strings**.
 ## Quarterly refresh — 20% rotation
 Every quarter, the curation pipeline runs in refresh mode:
 1. **Trend check** — Google Trends API: which prompts have lost relative search volume?
 2. **New patterns** — Reddit/Quora scrape: what new question patterns have emerged?
 3. **New entrants** — scan model outputs from past quarter: what brands appeared in answers but aren't in our brand catalog?
 4. **Generate replacements** — Stages 1–5 for the rotation set
 5. **Human approval** — founder reviews the proposed 20 swaps in 5–10 minutes
 This defends against Goodhart's Law: as the prompt pool becomes known to vendors (through reverse-engineering or leaks), rotating 20% of it per quarter keeps vendors from permanently optimizing against our exact queries.
 ## Cost per category
 | Stage | API cost | Human cost |
 |---|---|---|
 | 1 — Persona Generator | ~$0.50 (Claude) | — |
 | 2 — Prompt Brainstormer | ~$1.50 (Claude) | — |
 | 3 — Reality Check | $0 (free APIs) | — |
 | 4 — Multi-agent Validation | ~$3 (Claude × 3 critics) | — |
 | 5 — Pilot Test Run | ~$5 (10 prompts × 3 models = 30 outputs) | — |
 | 6 — Human Approval | — | ~30 min founder + 1–2h expert ($50–100 for non-founder categories) |
 | **Total per category** | **~$10** | **~30 min + $50–100 for expert categories** |
 For 11 pilot categories: ~$110 API + ~5.5 hours founder time + ~$500–1,000 for expert reviewers (10 non-founder categories at $50–100 each).
 ## Quarterly refresh cost
 Per category per quarter: ~$3 API + 5 minutes founder review.
 For 11 categories: ~$35 API + 1 hour founder time per quarter.
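The totals can be reproduced from the per-stage figures. This is plain arithmetic on the numbers in the tables above; the only assumption is that 10 of the 11 pilot categories need an expert reviewer (every category except Świece sojowe, which the founder covers).

```python
CATEGORIES = 11
EXPERT_CATEGORIES = 10  # assumption: all pilot categories except Świece sojowe

# API cost in USD for Stages 1-5 (Stage 6 has no API cost).
API_PER_STAGE = {1: 0.50, 2: 1.50, 3: 0.0, 4: 3.0, 5: 5.0}

api_per_category = sum(API_PER_STAGE.values())   # 10.0
api_total = api_per_category * CATEGORIES        # 110.0
founder_hours = 0.5 * CATEGORIES                 # 5.5 (30 min per category)
expert_low = 50 * EXPERT_CATEGORIES              # 500
expert_high = 100 * EXPERT_CATEGORIES            # 1000

# Quarterly refresh: ~$3 API and 5 founder minutes per category.
refresh_api = 3 * CATEGORIES                     # 33
refresh_founder_minutes = 5 * CATEGORIES         # 55
```

The quarterly figures of ~$35 and ~1 hour are these values rounded up.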
 ## Why this is published openly
 We publish the **process** because the integrity of the ranking depends on the integrity of the prompts, and external review of the process is the strongest defense against the "your prompts are garbage" attack.
 We do NOT publish the **exact strings** because of Goodhart's Law: known prompts get optimized against and stop measuring organic AI search behavior.
 The boundary between "open process" and "closed strings" is itself documented openly.
--- a/prompts/example-swiece-sojowe-pl.md
+++ b/prompts/example-swiece-sojowe-pl.md
@ -0,0 +1,100 @@
 # Example prompt framework — Świece sojowe PL
 > Illustrative framework showing how a category prompt pool is structured. Exact production strings remain CLOSED in `prompts/swiece-sojowe-pl/` (gitignored).
 This document is **public** to demonstrate the curation process and prompt-type distribution. It is **not** the actual production prompt list.
 ---
 ## Distribution
 100 prompts total, distributed by type:
 | Type | Count | Weight | Share |
 |---|---|---|---|
 | Buying intent | 30 | 2.0 | 30% |
 | Comparison | 25 | 1.5 | 25% |
 | Specific need | 20 | 1.5 | 20% |
 | Informational | 15 | 0.3 | 15% |
 | Brand-direct | 10 | 0.3 | 10% |
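Count and weight together determine how much each type can contribute. The sketch below only multiplies the two columns of the table; how the weights enter the final score is defined in `methodology.json` and is not reproduced here, so treat "weight mass" as an illustrative term rather than a methodology concept.

```python
# (count, weight) per prompt type, copied from the distribution table.
DISTRIBUTION = {
    "Buying intent": (30, 2.0),
    "Comparison": (25, 1.5),
    "Specific need": (20, 1.5),
    "Informational": (15, 0.3),
    "Brand-direct": (10, 0.3),
}

weight_mass = {t: n * w for t, (n, w) in DISTRIBUTION.items()}
total_mass = sum(weight_mass.values())  # 135.0
share_pct = {t: round(100 * m / total_mass, 1) for t, m in weight_mass.items()}
# Buying intent: 44.4, Comparison: 27.8, Specific need: 22.2,
# Informational: 3.3, Brand-direct: 2.2
```

Buying-intent prompts are 30% of the pool by count but about 44% by weight, while informational and brand-direct prompts together make up 25% of the pool and under 6% of the weight.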
 ## Personas referenced
 - "30+ kobieta kupująca prezent dla mamy"
 - "Self-care millennial 25–35 po pracy"
 - "Wnętrzarz minimalistyczne mieszkanie"
 - "Mężczyzna kupujący prezent walentynkowy"
 - "Mama małych dzieci szukająca bezpiecznego zapachu"
 - "Eko-świadomy konsument 30+"
 - "Hostess kupująca świece dla agroturystyki"
 ## Buying intent (30 prompts × 2.0 weight) — illustrative examples
 These prompts signal active purchase intent. Highest weight because they correlate directly with revenue impact for ranked brands.
 - "Gdzie kupić premium ręcznie robioną świecę sojową na prezent dla mamy"
 - "Polska marka świec sojowych z certyfikatem ekologicznym do 200 zł"
 - "Świeca sojowa w eleganckim opakowaniu jako prezent firmowy"
 - "Gdzie zamówić zestaw prezentowy z polskich świec sojowych handmade"
 - *(...26 more, exact strings closed)*
 ## Comparison (25 prompts × 1.5 weight) — illustrative examples
 Decision-stage queries. User is comparing brands or making a choice.
 - "JAKULO vs Naturaodpauli — która polska marka świec sojowych lepsza"
 - "Najlepsze polskie świece sojowe handmade 2026 ranking"
 - "Polskie świece sojowe premium — porównanie najpopularniejszych marek"
 - *(...22 more, exact strings closed)*
 ## Specific need (20 prompts × 1.5 weight) — illustrative examples
 Specific use cases or attributes — buyer knows what they want.
 - "Świeca sojowa o zapachu wanilii i bursztynu w średnim rozmiarze"
 - "Długo paląca naturalna świeca sojowa do sypialni 60 godzin"
 - "Świeca sojowa bezzapachowa dla osoby z alergią na zapachy"
 - *(...17 more, exact strings closed)*
 ## Informational (15 prompts × 0.3 weight) — illustrative examples
 Research-stage queries. Lower weight because easily gamed by content marketing fluff.
 - "Czym różni się świeca sojowa od parafinowej"
 - "Jak rozpoznać prawdziwie sojową świecę"
 - "Czy świece sojowe są zdrowe i bezpieczne"
 - *(...12 more, exact strings closed)*
 ## Brand-direct (10 prompts × 0.3 weight) — illustrative examples
 Direct brand queries. Lower weight because a brand appearing in answers to queries about itself is the baseline expectation, not a value-add.
 - "JAKULO opinie 2026 czy warto kupować"
 - "Co sądzą o polskiej marce świec Naturaodpauli"
 - *(...8 more, exact strings closed)*
 ## Anti-patterns (excluded)
 The following types of prompts are explicitly excluded during Stage 4 (Vendor exploit hunter critic):
 | Pattern | Reason | Example |
 |---|---|---|
 | Single-word | No buyer intent, ambiguous | "świeczki", "świece" |
 | Hobbyist / DIY | Off-topic for retail | "DIY świece sojowe w domu" |
 | B2B retail | Not consumer-facing | "hurtownia świec sojowych Warszawa" |
 | Brand-agnostic generic | Easy content marketing target | "co to świeca sojowa" |
 | Price-only without category context | Too vague | "tania świeca" |
 | Off-topic technicality | Signals hobby crafting, not retail intent | "knot bawełniany do świec wymiary" |
 | Polish typos at scale | Not real query patterns | "swieczka sojova" (single typo OK if frequent in real data) |
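Some of these patterns are cheap to catch before the critic agents run. The following is a hypothetical pre-screen, not the Stage 4 critic itself: the marker words are taken only from the example column above and would need a real keyword list in practice.

```python
# Hypothetical marker words, drawn from the table's examples only.
ANTI_PATTERN_MARKERS = {
    "hobbyist_diy": ("diy",),
    "b2b_retail": ("hurtownia",),
}

def prescreen(prompt):
    """Return an anti-pattern label for obvious cases, else None."""
    words = prompt.lower().split()
    if len(words) <= 1:
        return "single_word"
    for label, markers in ANTI_PATTERN_MARKERS.items():
        if any(marker in words for marker in markers):
            return label
    return None
```

Patterns such as "brand-agnostic generic" or "too vague" depend on judgment and are left to the critic and the human reviewer.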
 ## Quarterly rotation policy
 Each quarter, 20 prompts (20% of pool) are rotated:
 - 20 retired (lowest real-world search signal in past 90 days, OR known to be gamed)
 - 20 added (new patterns from Reddit/Quora/trends, new persona refinements, new product attributes emerging)
 Rotation log is committed to `prompts/swiece-sojowe-pl/rotation_log.md` (closed) with rationale per swap.
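The retirement rule can be sketched as a sort. This assumes each prompt record carries a 90-day `search_signal` value and a `gamed` flag (both field names are hypothetical), and that gamed prompts are retired before low-signal ones when there are more candidates than slots; the source states the two criteria but not their priority.

```python
def select_retirements(prompts, n_retire):
    """prompts: list of dicts with 'id', 'search_signal' (float), 'gamed' (bool).

    Returns the ids to retire: gamed prompts first, then lowest search signal.
    """
    gamed = [p for p in prompts if p["gamed"]]
    clean = sorted(
        (p for p in prompts if not p["gamed"]),
        key=lambda p: p["search_signal"],
    )
    return [p["id"] for p in (gamed + clean)[:n_retire]]
```

Each returned id would then get a rationale entry in the rotation log.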
 ---
 **This framework is illustrative.** The actual 100 production prompts evolve with each quarterly cycle and are not published as exact strings — only the distribution, personas, anti-patterns, and example patterns are public.
--- a/taxonomy.md
+++ b/taxonomy.md
@ -0,0 +1,102 @@
 # Citee Index Taxonomy
 > Category list, tier system, and scan cadence per tier. Live document — updated as new categories are added or existing ones are reclassified.
 **Last updated:** 2026-05-03 (v1.0.0)
 ---
 ## Tier system
 Categories are classified by **market depth** (number of visible brands) and **GMV** (PLN annual e-commerce volume in category).
 | Tier | Criteria | Scan cadence | Brands tracked per cycle |
 |---|---|---|---|
 | **Tier 1 — Large** | >1000 brands visible, >100M PLN GMV | Monthly | 100 |
 | **Tier 2 — Medium** | 100–1000 brands, 10–100M PLN GMV | Quarterly | 50–100 |
 | **Tier 3 — Niche** | <100 brands, <10M PLN GMV | Semi-annual | 20–50 |
 Cross-cutting categories (e.g., "Polish DTC brands Top 100", "Polish handmade Top 50") are published **annually** as flagship reports.
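The tier table can be read as a two-signal classifier. The sketch below is one interpretation: the table does not say what happens when brand count and GMV point to different tiers, so this version assumes the smaller-market tier wins, and treats the boundary values (exactly 100 or 1000 brands, 10M or 100M PLN) as Tier 2.

```python
SCAN_CADENCE = {1: "monthly", 2: "quarterly", 3: "semi-annual"}

def _level(value, low, high):
    """Map a single signal to tier 1 (large), 2 (medium), or 3 (niche)."""
    if value > high:
        return 1
    if value >= low:
        return 2
    return 3

def classify_tier(visible_brands, gmv_pln_millions):
    """Return the tier number; on disagreement, the smaller-market tier wins."""
    brand_tier = _level(visible_brands, 100, 1000)
    gmv_tier = _level(gmv_pln_millions, 10, 100)
    return max(brand_tier, gmv_tier)
```

For example, a category with 1,500 visible brands but only 50M PLN GMV is classified as Tier 2 under this assumption and scanned quarterly.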
 ## Country coverage
 | Country | Status | First publication |
 |---|---|---|
 | **PL** (Poland) | Pilot — 11 categories | August 2026 |
 | **DE** (Germany) | Planned Q4 2026 | — |
 | **AT** (Austria) | Planned with DE | — |
 | **CH** (Switzerland) | Planned with DE | — |
 | **CZ** (Czech Republic) | Planned Q1 2027 | — |
 | **SK** (Slovakia) | Planned Q1 2027 | — |
 | **HU** (Hungary) | Planned Q2 2027 | — |
 | **RO** (Romania) | Planned Q2 2027 | — |
 | **DK / SE / NO / FI** (Nordic) | Planned Q3 2027 | — |
 | FR / ES / IT | Year 2 | — |
 | UK / US (English-speaking) | **Not in roadmap.** Heavy competition (Profound, Otterly, AthenaHQ). Citee focuses on markets where local language and brand knowledge create a defensible moat. | — |
 ---
 ## PL — pilot categories (Tier 2)
 All 11 launch categories scan quarterly. Selection criteria: each ranked brand is a potential Citee Pro customer (DTC consumer brands or service businesses with marketing budgets), and no category overlaps with B2B SaaS competitors of LMW Commerce.
 | # | Category slug | Display name | Sample brands | Notes |
 |---|---|---|---|---|
 | 1 | `kosmetyki-naturalne-pl` | Kosmetyki naturalne | Resibo, Tołpa, Yope, Bielenda, Dr Irena Eris, Vianek, Lirene | Premium DTC, brand-conscious vendors |
 | 2 | `suplementy-nutricosmetyki-pl` | Suplementy / nutricosmetyki | Olimp, Trec, OstroVit, Allnutrition, Health Labs Care, NaturDay, Pharmovit | DTC growth segment, marketing-heavy |
 | 3 | `diety-pudelkowe-pl` | Diety pudełkowe / catering dietetyczny | Maczfit, Nice To Fit You, Fit&Easy, BistroBox, Light Box | Subscription DTC, high LTV |
 | 4 | `premium-pet-food-pl` | Premium pet food / akcesoria | Brit Care, Acana, Animonda, Royal Canin, Josera, Belcando, Pies Pisany | Loyal customers, premium pricing |
 | 5 | `kawa-specialty-pl` | Kawa specialty / gourmet | Coffee Plant, Bonjour Cafe, Etno Cafe, Hard Beans, Coffeedesk, Cafezal | Vocal community, Reddit-rich |
 | 6 | `czekolada-rzemieslnicza-pl` | Czekolada rzemieślnicza / premium | Manufaktura Czekolady, Mount Blanc, Wawel premium, Ujejski, Wedel exclusive | Premium DTC lifestyle |
 | 7 | `kursy-programowania-bootcampy-pl` | Kursy programowania / IT bootcampy | Kodilla, Coders Lab, Boring Owl, Future Collars, SDA, WSB Online, Akademia Górska | High-LTV edutech, growing 2026 |
 | 8 | `kliniki-estetyczne-dermo-pl` | Kliniki estetyczne / dermatologia premium | Klinika La Perla, Medilumi, Klinika Holistic, Estetica, Dermika | Service business, leadgen-driven |
 | 9 | `fitness-studios-premium-pl` | Fitness studios / personal training | Calypso, Pure Jatomi, Fabric Health Club, niche premium studios | Subscription model, vocal community |
 | 10 | `kosmetyki-meskie-pl` | Kosmetyki dla mężczyzn / męska pielęgnacja | Bartholomew, Chytry Lis, Onlomen, Ziaja Yego, Nivea Men premium, Lirene Men | Growing 2026 segment, new DTC entrants |
 | 11 | `swiece-sojowe-pl` | Świece sojowe | JAKULO, Naturaodpauli, Bookiet, Triny, Aromatowo, Yush, Oskiknot, LemonGlas, Paleta Smaków, Bennovate | Pilot test bed + LMW founder pet category |
 ## Excluded from pilot — explicit rationale
 | Category | Why excluded |
 |---|---|
 | Wino / alkohol PL | Polish "Ustawa o wychowaniu w trzeźwości" art. 13¹ restricts alcohol advertising — regulatory risk too high |
 | Rękodzieło / handmade / Etsy crowd | Margins of 20–30%; micro-businesses won't pay 449 PLN/mo for visibility tools |
 | B2B SaaS (CRM, marketing automation, e-commerce platforms) | LMW Commerce competes in an adjacent space, so these vendors won't pay a competitor (Citee); they also have their own visibility tools |
 | Hosting / domeny | Vendors with own marketing teams, low conversion to Pro SaaS |
 | Banki / ubezpieczenia / fintech B2B | Agencies may buy reports, but the ranked brands won't buy Pro: banks already have enterprise marketing tools |
 ## Q3-Q4 2026 expansion candidates (Tier 1, monthly scan)
 To be added after pilot validation:
 - Kosmetyki ogólne (mainstream, not just naturalne; a very large market)
 - Odzież dziecięca DTC
 - Dom & ogród / wyposażenie wnętrz
 - Elektronika audio premium
 - Akcesoria biurowe / papiernicze B2B-light
 ## Cross-cutting flagship reports (annual)
 - "Citee Index — Polski DTC e-commerce Top 100" (year-end)
 - "Citee Index — Polski handmade Top 50" (year-end)
 - "Citee Index — Polski D2C lifestyle ecosystem" (mid-year)
 These reports cut across categories to identify the strongest brand presences in AI search overall, regardless of vertical.
 ---
 ## Adding a new category — checklist
 Before adding a category to active scan, the following must be true:
 1. **Market depth** — at least 20 brands with internet presence visible in PL e-commerce
 2. **AI search demand** — Google Trends data confirms users search for category-related queries
 3. **Buyer profile** — ranked brands fit the Citee Pro customer persona, OR there is a clear agency/media buyer for reports
 4. **No regulatory risk** — category is not subject to advertising restrictions (alcohol, gambling, prescription pharma, etc.)
 5. **Prompt curation feasible** — buyer personas identifiable, decision factors articulable, expert reviewer available if outside founder's domain knowledge
 6. **Category integrated with brand catalog** — minimum 30 brands cataloged with normalized names (handling variations: "JAKULO" vs "Jakulo" vs "jakulo.pl")
 When all 6 are true, category enters the pilot validation cycle (3 cycles minimum before public publication).
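Checklist item 6 depends on name normalization. A minimal sketch of how the three variants in the example could collapse to one catalog key; the list of stripped domain suffixes is an assumption, and diacritics, legal suffixes, and sub-brands are deliberately not handled:

```python
import re

# Assumed set of domain suffixes to strip; extend as the catalog requires.
_TLD_SUFFIX = re.compile(r"\.(pl|com|eu)$")

def normalize_brand(raw):
    """Collapse case and domain-style variants of a brand name to one key."""
    name = raw.strip().lower()
    name = re.sub(r"^(https?://)?(www\.)?", "", name)  # drop URL scheme and www
    name = name.rstrip("/")
    return _TLD_SUFFIX.sub("", name)
```

With this, `"JAKULO"`, `"Jakulo"`, and `"jakulo.pl"` all map to the key `"jakulo"`, so mentions in model outputs can be counted against a single catalog entry.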
 ## Versioning
 Changes to this taxonomy are tracked in [CHANGELOG.md](./CHANGELOG.md). Adding a new category, reclassifying a tier, or removing a category constitutes a MINOR version bump. Adding a new country or fundamentally revising the tier system is a MAJOR version bump.
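The bump rules above map directly to a lookup table. A small sketch, assuming semantic-versioning-style resets (a MAJOR bump zeroes minor and patch, a MINOR bump zeroes patch); the change-type keys are hypothetical labels, and the authoritative policy is in CHANGELOG.md:

```python
# Hypothetical change-type labels mapped to the bump level stated above.
BUMP_FOR_CHANGE = {
    "add_category": "minor",
    "reclassify_tier": "minor",
    "remove_category": "minor",
    "add_country": "major",
    "revise_tier_system": "major",
}

def next_version(current, change):
    """Return the next version string for a taxonomy change."""
    major, minor, _patch = (int(x) for x in current.lstrip("v").split("."))
    if BUMP_FOR_CHANGE[change] == "major":
        return f"v{major + 1}.0.0"
    return f"v{major}.{minor + 1}.0"
```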