citee-methodology/tools/prompt_curation/4_validation_agents.py
Jacek Kubas 03a397343e Phase 1: brand catalog (soy candles PL) + prompt curation pipeline
DATA — Public reference datasets for methodology:
- data/README.md: schema + format definitions for brand catalogs
- data/swiece-sojowe-pl/brand_catalog.json: 35 tracked brands (33 manufacturers + 2 importers) + 5 excluded marketplaces/resellers
- data/swiece-sojowe-pl/brand_catalog.md: human-readable companion
- data/swiece-sojowe-pl/market_metadata.json: GMV estimate, personas, seasonality, expected dynamics

TOOLS — 6-stage prompt curation pipeline (Python 3.12+):
- tools/prompt_curation/README.md: process documentation + cost estimates
- tools/prompt_curation/config.py: tunable parameters per stage
- tools/prompt_curation/.env.example: required API keys template
- tools/prompt_curation/requirements.txt: dependencies
- tools/prompt_curation/1_persona_generator.py: Claude generates 7 buyer personas
- tools/prompt_curation/2_prompt_brainstormer.py: per persona × 30 prompts in voice
- tools/prompt_curation/3_reality_checker.py: Google Trends + Reddit cross-check
- tools/prompt_curation/4_validation_agents.py: 3 critic agents async (real_buyer/methodology/exploit_hunter)
- tools/prompt_curation/5_pilot_test_runner.py: sample × 3 LLM models pre-flight
- tools/prompt_curation/6_human_review_export.py: CSV export for founder approval
- tools/prompt_curation/7_finalize.py: post-approval → closed prompts/{cat}/v{N}.json
- tools/prompt_curation/pipeline.py: orchestrator (stages 1–6, then human review, then 7)

GITIGNORE — Fixed .env.* exclusion to allow .env.example.

This commit completes Phase 1. Stage outputs (data/{cat}/personas.json,
raw_prompts.json, validated_prompts.json, critic_review.json, pilot_test_results.json,
for_human_review.csv) are runtime artifacts — public when committed, derived from
public methodology + public brand catalog. Final approved prompt strings in
prompts/{cat}/v{N}.json remain CLOSED (gitignored, anti-Goodhart's Law).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:40:12 +02:00

244 lines
7.8 KiB
Python

"""Stage 4: Multi-agent Validation.

Three Claude critic agents review prompts in parallel:
- Agent A: Real buyer critique (does this sound like real persona phrasing?)
- Agent B: Methodology critic (statistical balance, distribution, vocabulary)
- Agent C: Vendor exploit hunter (which prompts are too easy to game by content marketing?)

Prompts flagged by N+ agents (default: 2) are removed.
"""
from __future__ import annotations

import argparse
import asyncio
import json
import os
from pathlib import Path

from anthropic import AsyncAnthropic

from config import CONFIG

AGENT_A_PROMPT = """You are reviewing a list of prompts that buyer personas would supposedly type into ChatGPT/Perplexity/Gemini when researching purchases in **{category}**.

Your job: identify prompts that DON'T sound natural for any realistic Polish e-commerce buyer.

Flag prompts that:
1. Are too formal/academic (no buyer phrases queries like a research paper)
2. Are too long (real users don't type 30-word queries)
3. Are too short / generic (single words or 2-word phrases)
4. Use vocabulary no real Polish buyer would use
5. Are buyer-impossible (e.g., asking about specs only a B2B buyer would care about, in a B2C context)

Here are the {prompt_count} prompts to review:

{prompts_list}

Output a JSON object listing the flagged prompt IDs (use the 0-based index as the ID) and a reason for each:
```json
{{
  "flagged_indices": [3, 7, 12],
  "reasons": {{
    "3": "Too formal - no real buyer types like this",
    "7": "Single word, no buying intent",
    "12": "B2B language in B2C context"
  }}
}}
```

Only output JSON. No prose."""

AGENT_B_PROMPT = """You are a methodology critic for a Polish e-commerce AI visibility ranking project.

Review this prompt list for **statistical and structural issues**:

Target distribution per the methodology:
- buying: 30% (weight 2.0)
- comparison: 25% (weight 1.5)
- specific_need: 20% (weight 1.5)
- informational: 15% (weight 0.3)
- brand_direct: 10% (weight 0.3)

Total prompts: {prompt_count}

Flag issues:
1. Type distribution off by >10% from target
2. Vocabulary too repetitive (same phrases recurring)
3. Subcategory bias (e.g., 80% prompts about prezenty, 20% about everything else)
4. Length distribution unreasonable (all prompts are very long or very short)
5. Missing realistic buyer scenarios (e.g., no prompts about specific occasions, sizes, attributes)

Prompts list:

{prompts_list}

Output:
```json
{{
  "flagged_indices": [...],
  "reasons": {{...}},
  "structural_issues": [
    "Type 'comparison' is over-represented at 35% (target 25%)",
    "20+ prompts mention 'prezent dla mamy' - too repetitive"
  ]
}}
```

Only JSON output."""

AGENT_C_PROMPT = """You are a vendor exploit hunter for a Polish e-commerce AI visibility ranking.

Your job: identify prompts that are TOO EASY for a vendor to game by content marketing fluff.

A prompt is "exploitable" if:
1. The answer can be dominated by writing one good blog post
2. The answer comes primarily from Wikipedia (vendors can edit Wikipedia)
3. The answer is brand-agnostic (any vendor can position to win it via SEO content)
4. The prompt would be answered by listing Wikipedia / blog content rather than specific brand recommendations

We WANT prompts where:
- AI must recommend specific brands (with real reviews, real authority, multi-source citation)
- Prompt requires real product positioning, not just content production
- Multiple sources (reviews, Reddit, brand sites) need to align for ranking

Flag prompts that are too gameable:

{prompts_list}

Output:
```json
{{
  "flagged_indices": [...],
  "reasons": {{
    "5": "Generic 'co to świeca sojowa' - easily gamed by Wikipedia + blog post",
    "12": "Brand-agnostic 'jak działa świeca sojowa' - content marketing fluff target"
  }}
}}
```

Only JSON output."""
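

async def run_agent(client: AsyncAnthropic, prompt: str) -> dict:
    """Send one critic prompt and parse the JSON object in the reply."""
    # Note: every critic currently uses the model configured under
    # "real_buyer_critique", whichever agent prompt is passed in.
    response = await client.messages.create(
        model=CONFIG.critic_models["real_buyer_critique"],
        max_tokens=4000,
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.content[0].text.strip()
    # Strip an optional Markdown fence, with or without the "json" tag.
    text = text.removeprefix("```json").removeprefix("```").removesuffix("```")
    return json.loads(text.strip())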


async def run_three_critics(prompts: list[dict], category_display_name: str) -> dict:
    """Format the prompt list once and run the three critics concurrently."""
    client = AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

    # Format prompts for review; the enumerate index doubles as the prompt ID.
    prompts_text = "\n".join(
        f"{i}. [{p['type']}] {p['prompt']}" for i, p in enumerate(prompts)
    )

    agent_a = AGENT_A_PROMPT.format(
        category=category_display_name,
        prompt_count=len(prompts),
        prompts_list=prompts_text,
    )
    agent_b = AGENT_B_PROMPT.format(
        prompt_count=len(prompts),
        prompts_list=prompts_text,
    )
    agent_c = AGENT_C_PROMPT.format(
        prompts_list=prompts_text,
    )

    # Run 3 agents in parallel; gather() returns results in argument order.
    print("[Stage 4] Running 3 critic agents in parallel...")
    results = await asyncio.gather(
        run_agent(client, agent_a),
        run_agent(client, agent_b),
        run_agent(client, agent_c),
    )

    return {
        "agent_a_real_buyer": results[0],
        "agent_b_methodology": results[1],
        "agent_c_exploit_hunter": results[2],
    }


def aggregate_flags(critic_results: dict, total_prompts: int) -> dict:
    """Count how many agents flagged each prompt index."""
    flag_counts: dict[int, list[str]] = {}
    for agent_name, result in critic_results.items():
        reasons = result.get("reasons", {})
        # De-duplicate so an index repeated by one agent counts as one vote.
        for idx in dict.fromkeys(result.get("flagged_indices", [])):
            reason = reasons.get(str(idx), "no reason given")
            flag_counts.setdefault(idx, []).append(f"{agent_name}: {reason}")

    flagged_for_removal = sorted(
        idx for idx, votes in flag_counts.items()
        if len(votes) >= CONFIG.flagged_by_n_critics_to_remove
    )

    return {
        "flag_counts_by_prompt": flag_counts,
        "flagged_for_removal": flagged_for_removal,
        "removal_threshold_critics": CONFIG.flagged_by_n_critics_to_remove,
        "total_prompts": total_prompts,
        "total_removed": len(flagged_for_removal),
        "total_kept": total_prompts - len(flagged_for_removal),
    }


def main() -> None:
    parser = argparse.ArgumentParser(description="Multi-agent validation of prompts.")
    parser.add_argument("--category", required=True)
    parser.add_argument("--display-name", required=True)
    args = parser.parse_args()

    data_dir = Path(__file__).parent.parent.parent / "data" / args.category
    validated_file = data_dir / "validated_prompts.json"
    if not validated_file.exists():
        raise FileNotFoundError(f"Run 3_reality_checker.py first. Missing: {validated_file}")

    with open(validated_file, "r", encoding="utf-8") as f:
        validated_data = json.load(f)

    # Filter out reality-check failures first
    candidates = [p for p in validated_data["validated_prompts"] if p["reality_signal"] != "fail"]
    print(f"[Stage 4] Reviewing {len(candidates)} prompts (post-reality-check)...")

    critic_results = asyncio.run(run_three_critics(candidates, args.display_name))
    aggregation = aggregate_flags(critic_results, len(candidates))

    # Set lookup keeps the filter below linear in the number of candidates.
    removed = set(aggregation["flagged_for_removal"])
    output = {
        "category": args.category,
        "input_count": len(candidates),
        "critic_results": critic_results,
        "aggregation": aggregation,
        "kept_prompts": [p for i, p in enumerate(candidates) if i not in removed],
    }

    output_file = data_dir / "critic_review.json"
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(output, f, ensure_ascii=False, indent=2)

    print(f"[Stage 4] ✅ Saved {output_file}")
    print(f"[Stage 4] Removed: {aggregation['total_removed']}, Kept: {aggregation['total_kept']}")


if __name__ == "__main__":
    main()
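
The removal rule described in the module docstring (a prompt is dropped once N or more critics flag it, default 2) can be tried in isolation without any API calls. The sketch below is illustrative only: `prompts_to_remove` and the `example` data are hypothetical and not part of the pipeline, the threshold is passed as an argument instead of being read from `CONFIG.flagged_by_n_critics_to_remove`, and each critic is assumed to count at most once per prompt.

```python
from collections import Counter


def prompts_to_remove(critic_results: dict[str, dict], threshold: int = 2) -> list[int]:
    """Return sorted prompt indices flagged by at least `threshold` critics.

    Hypothetical helper for illustration; the real pipeline reads the
    threshold from CONFIG instead of taking it as a parameter.
    """
    votes: Counter[int] = Counter()
    for result in critic_results.values():
        # set() ensures one critic contributes at most one vote per prompt.
        votes.update(set(result.get("flagged_indices", [])))
    return sorted(idx for idx, n in votes.items() if n >= threshold)


# Made-up critic output using the same keys as run_three_critics().
example = {
    "agent_a_real_buyer": {"flagged_indices": [3, 7]},
    "agent_b_methodology": {"flagged_indices": [7, 12]},
    "agent_c_exploit_hunter": {"flagged_indices": [3, 7, 7]},
}

print(prompts_to_remove(example))  # [3, 7]
```

With the default threshold, prompt 12 survives because only one critic flagged it, while prompts 3 and 7 are removed. Raising the threshold to 3 would remove only prompt 7.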