DATA — Public reference datasets for methodology:
- data/README.md: schema + format definitions for brand catalogs
- data/swiece-sojowe-pl/brand_catalog.json: 35 tracked brands (33 manufacturers + 2 importers) + 5 excluded marketplaces/resellers
- data/swiece-sojowe-pl/brand_catalog.md: human-readable companion
- data/swiece-sojowe-pl/market_metadata.json: GMV estimate, personas, seasonality, expected dynamics
TOOLS — 6-stage prompt curation pipeline (Python 3.12+):
- tools/prompt_curation/README.md: process documentation + cost estimates
- tools/prompt_curation/config.py: tunable parameters per stage
- tools/prompt_curation/.env.example: required API keys template
- tools/prompt_curation/requirements.txt: dependencies
- tools/prompt_curation/1_persona_generator.py: Claude generates 7 buyer personas
- tools/prompt_curation/2_prompt_brainstormer.py: 30 prompts per persona, written in that persona's voice
- tools/prompt_curation/3_reality_checker.py: Google Trends + Reddit cross-check
- tools/prompt_curation/4_validation_agents.py: 3 async critic agents (real_buyer/methodology/exploit_hunter)
- tools/prompt_curation/5_pilot_test_runner.py: pre-flight run of a prompt sample against 3 LLMs
- tools/prompt_curation/6_human_review_export.py: CSV export for founder approval
- tools/prompt_curation/7_finalize.py: post-approval → closed prompts/{cat}/v{N}.json
- tools/prompt_curation/pipeline.py: orchestrator (stages 1–6, then human review, then 7)
GITIGNORE — Fixed .env.* exclusion to allow .env.example.
This commit completes Phase 1 (Faza 1). Stage outputs (data/{cat}/personas.json,
raw_prompts.json, validated_prompts.json, critic_review.json, pilot_test_results.json,
for_human_review.csv) are runtime artifacts: public when committed, derived from the
public methodology and the public brand catalog. Final approved prompt strings in
prompts/{cat}/v{N}.json remain CLOSED (gitignored, as a guard against Goodhart's law).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Prompt Curation Pipeline

> Multi-stage pipeline for curating production prompts per category. Translates the 6-stage methodology process from `prompts/README.md` into runnable code.

---

## Pipeline overview

```
1_persona_generator.py   → data/{category}/personas.json
2_prompt_brainstormer.py → data/{category}/raw_prompts.json
3_reality_checker.py     → data/{category}/validated_prompts.json
4_validation_agents.py   → data/{category}/critic_review.json
5_pilot_test_runner.py   → data/{category}/pilot_test_results.json
6_human_review_export.py → data/{category}/for_human_review.csv
7_finalize.py            → prompts/{category}/v{N}.json (CLOSED)
```

Each stage is idempotent and can be re-run with cached intermediate outputs.
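
One way to get this idempotent, cache-backed behaviour is to have every stage check for its output file before doing any work. The sketch below is illustrative only: `run_stage`, its `force` flag, and the fake persona stage are hypothetical names, not the helpers used in `pipeline.py`.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable


def run_stage(output_path: Path, compute: Callable[[], Any], force: bool = False) -> Any:
    """Return the cached stage output if it exists; otherwise compute and store it."""
    if output_path.exists() and not force:
        # Cache hit: re-running the stage costs nothing and returns the same data.
        return json.loads(output_path.read_text(encoding="utf-8"))
    result = compute()
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(
        json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return result


def demo() -> tuple[Any, Any, int]:
    """Run a fake Stage 1 twice and report how often the expensive step ran."""
    calls = 0

    def fake_persona_stage() -> list[dict[str, str]]:
        nonlocal calls
        calls += 1  # stands in for a paid API call
        return [{"id": "p1", "label": "gift buyer"}]

    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "data" / "swiece-sojowe-pl" / "personas.json"
        first = run_stage(out, fake_persona_stage)
        second = run_stage(out, fake_persona_stage)  # served from the cached file
    return first, second, calls
```

Passing `force=True` would recompute and overwrite the cached file, which is one plausible way to refresh a single stage without touching the others.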

## Tech stack

- **Python 3.12+**
- **Anthropic SDK** (`anthropic>=0.50.0`) — Claude for persona generation, brainstorming, critic agents
- **OpenAI SDK** (`openai>=1.50.0`) — GPT-4o-search for pilot test runs
- **Google Generative AI** (`google-generativeai>=0.8.0`) — Gemini for pilot test runs
- **httpx** for Perplexity API
- **pandas** for CSV export to human reviewer
- **pytrends** for Google Trends API (free, unofficial)
- **praw** for Reddit search (requires Reddit OAuth app)

## Usage

```bash
# 1. Set up environment variables (see .env.example)
cp .env.example .env
# Edit .env with API keys

# 2. Run pipeline for a category
python pipeline.py --category swiece-sojowe-pl

# Or run individual stages
python 1_persona_generator.py --category swiece-sojowe-pl
python 2_prompt_brainstormer.py --category swiece-sojowe-pl
# ... etc.

# 3. Run Stage 6, then review the CSV manually + approve in human_review tool
python 6_human_review_export.py --category swiece-sojowe-pl
# Open data/{category}/for_human_review.csv in a spreadsheet
# Mark approved/rejected/edited
# Save back as for_human_review_decided.csv

# 4. Finalize
python 7_finalize.py --category swiece-sojowe-pl
# Outputs: prompts/{category}/v1.json (gitignored, closed)
```
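
The hand-off between the human review step and `7_finalize.py` can be pictured as a filter over the decided CSV. The following is a minimal sketch under stated assumptions: the column names `prompt` and `decision` are hypothetical (the real schema of `for_human_review_decided.csv` is not documented here), treating `edited` rows as accepted is an assumption, and it uses the standard-library `csv` module rather than pandas so that it runs without dependencies.

```python
import csv
import io
from typing import TextIO


def approved_prompts(
    csv_file: TextIO,
    text_column: str = "prompt",       # hypothetical column name
    decision_column: str = "decision", # hypothetical column name
) -> list[str]:
    """Return prompt texts whose review decision is 'approved' or 'edited'."""
    kept = []
    for row in csv.DictReader(csv_file):
        decision = (row.get(decision_column) or "").strip().lower()
        # Assumption: an edited row carries the reviewer's final wording.
        if decision in {"approved", "edited"}:
            kept.append(row[text_column].strip())
    return kept


# In-memory stand-in for the decided CSV, with made-up rows.
sample = io.StringIO(
    "prompt,decision\n"
    "best soy candles for a gift,approved\n"
    "cheap candles,rejected\n"
    "soy candle brands made in Poland,edited\n"
)
print(approved_prompts(sample))
```

With a real file, the same function could be called on `open(path, newline="", encoding="utf-8")`.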

## Cost per category (estimated)

| Stage | API used | Cost |
|---|---|---|
| 1 — Persona Generator | Claude Sonnet | ~$0.50 |
| 2 — Prompt Brainstormer | Claude Sonnet | ~$1.50 |
| 3 — Reality Checker | Free APIs (Trends, Reddit, Quora) | $0 |
| 4 — Validation Agents (3 critics) | Claude Sonnet × 3 | ~$3 |
| 5 — Pilot Test Runner (10 prompts × 3 models) | GPT-4o + Perplexity + Gemini | ~$5 |
| 6 — Human Review Export | (no API) | $0 |
| 7 — Finalize | (no API) | $0 |
| **TOTAL** | | **~$10** |

For 11 pilot categories: ~$110.
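
The totals follow directly from the per-stage estimates. A small sketch of the arithmetic, using only the figures from the table (the dictionary keys and function name are illustrative):

```python
# Estimated cost per stage in USD, copied from the table above.
STAGE_COST_USD = {
    "1_persona_generator": 0.50,
    "2_prompt_brainstormer": 1.50,
    "3_reality_checker": 0.00,
    "4_validation_agents": 3.00,
    "5_pilot_test_runner": 5.00,
    "6_human_review_export": 0.00,
    "7_finalize": 0.00,
}


def estimated_cost(categories: int = 1) -> float:
    """Estimated API spend for running the full pipeline on `categories` categories."""
    return sum(STAGE_COST_USD.values()) * categories


print(estimated_cost())    # one category
print(estimated_cost(11))  # the 11 pilot categories
```

These are estimates only; changing the pilot sample size or prompts per persona in `config.py` would presumably shift the Stage 2, 4, and 5 figures.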

## Configuration

See `config.py` for tunable parameters per stage:
- Number of personas (default: 7)
- Prompts per persona (default: 30)
- Type distribution targets (40/25/20/15/10 weights → buying/comparison/specific/info/brand-direct)
- Pilot sample size (default: 10)
- Critic agent thresholds (flagged-by-N agents → remove)
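
To make these parameters concrete, here is a hedged sketch of how such a configuration might be modelled. It is not the contents of `config.py`: the class name, field names, and the critic threshold value of 2 are assumptions. Note that the listed weights sum to 110, so the sketch treats them as relative weights and normalizes them rather than reading them as percentages.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CurationConfig:
    num_personas: int = 7
    prompts_per_persona: int = 30
    pilot_sample_size: int = 10
    # Hypothetical value: the source only says "flagged-by-N agents -> remove".
    critic_flag_threshold: int = 2
    # Relative weights per prompt type, as listed in the README.
    type_weights: dict[str, float] = field(
        default_factory=lambda: {
            "buying": 40,
            "comparison": 25,
            "specific": 20,
            "info": 15,
            "brand-direct": 10,
        }
    )

    def raw_prompt_count(self) -> int:
        """Prompts produced by Stage 2 before any filtering."""
        return self.num_personas * self.prompts_per_persona

    def type_shares(self) -> dict[str, float]:
        """Weights normalized so that the shares sum to 1."""
        total = sum(self.type_weights.values())
        return {name: weight / total for name, weight in self.type_weights.items()}


def keep_prompt(flag_count: int, config: CurationConfig) -> bool:
    """Keep a prompt unless at least `critic_flag_threshold` critics flagged it."""
    return flag_count < config.critic_flag_threshold


config = CurationConfig()
print(config.raw_prompt_count())
print(config.type_shares())
```

With the defaults, Stage 2 would produce 7 × 30 = 210 raw prompts per category before the reality check and critic filtering.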

## Quarterly rotation mode

```bash
python pipeline.py --category swiece-sojowe-pl --mode rotation
```

In rotation mode:
- Reads existing `prompts/{category}/v{N}.json`
- Identifies the 20 prompts with the lowest real-world signal in the past 90 days (via Stage 3 scan)
- Generates 20 replacements (Stages 1–5 for the refresh set)
- Outputs `prompts/{category}/v{N+1}.json` (CLOSED)
- Logs swap decisions to `prompts/{category}/rotation_log.md` (CLOSED)
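
The swap step described above can be sketched as a rank-and-replace over the current prompt set. This is an illustration only: the `rotate` function, the shape of the `signal` mapping (prompt text to a numeric 90-day score), and the assumption that prompts are unique are all hypothetical, and the actual scoring comes from the Stage 3 scan.

```python
def rotate(
    prompts: list[str],
    signal: dict[str, float],
    replacements: list[str],
    n_swap: int = 20,
) -> tuple[list[str], list[tuple[str, str]]]:
    """Retire the n_swap lowest-signal prompts and append replacements.

    Returns the new prompt list and a swap log of (retired, replacement) pairs.
    Prompts missing from `signal` are treated as having zero signal.
    """
    ranked = sorted(prompts, key=lambda p: signal.get(p, 0.0))
    retired = ranked[:n_swap]
    if len(replacements) < len(retired):
        raise ValueError("not enough replacement prompts for the refresh set")
    retired_set = set(retired)
    kept = [p for p in prompts if p not in retired_set]  # preserves original order
    incoming = replacements[: len(retired)]
    return kept + incoming, list(zip(retired, incoming))


# Tiny made-up example with n_swap=2 instead of the default 20.
new_set, swap_log = rotate(
    prompts=["a", "b", "c", "d"],
    signal={"a": 5.0, "b": 1.0, "c": 3.0, "d": 0.5},
    replacements=["x", "y"],
    n_swap=2,
)
print(new_set)
print(swap_log)
```

The returned swap log is the kind of information that could be written to `rotation_log.md`, one line per retired and replacement pair.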

---

**Status:** v0.1 — initial scaffold. Implementation in progress as part of Citee Index pilot phase (May–August 2026).