Synthesis preparation script generates structured prompts with mandatory unit_id citations

Source type: obs · Harvested: 2026-05-03 · Original date: 2026-05-03T12:26:37.877Z · Metadata: {"project":"lunhsiangyuan","type":"feature","obs_id":65013}


obs/65013 · feature · 2026-05-03T12:26:37.877Z

Implemented the synthesis preparation pipeline, the critical bridge between harvest and narrative generation. The 03-synthesize-prep.ts script operationalizes gpt-5.5’s “forced unit_id attribution” recommendation by generating structured prompts intended to make hallucination structurally impossible. The script filters manifest units by topic using topic-router, reads the actual content from source files (respecting anchor boundaries for heading-delimited sections), bundles the existing wiki as incremental context, and assembles a comprehensive synthesis prompt with embedded hard rules.

Four non-negotiable constraints enforce provenance discipline: mandatory [📎 unit-id] citations at the end of every claim or paragraph, facts drawn strictly from the supplied material, append-only modification of the wiki, and explicit conflict detection without auto-resolution. The prompt also includes Karpathy narrative style guidelines (causal reasoning over fact lists, problem-first framing, 300-600 word daily increments) and a self-check list.

The first test run, on the medical-oncology topic (2026-05-02), bundled 127 units into a 364 KB structured prompt: 88 observations, 1 memory file, and 38 AAI Wiki entries. Per the primary session insight, this materializes the architectural shift from “a prompt that hopes the AI doesn’t hallucinate” to “schema-enforced attribution where the AI could not make things up even if it tried”.
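The note does not say whether the generated narrative is validated after the model responds. As a minimal sketch of how the first two hard rules could be checked mechanically, the following assumes the `[📎 unit-id-1, unit-id-2]` marker format quoted in the note; `checkCitations` and `CitationReport` are hypothetical names, not part of 03-synthesize-prep.ts.

```typescript
// Hypothetical post-check for a synthesized draft; not the script's actual code.
interface CitationReport {
  uncitedParagraphs: number[]; // 0-based indices of paragraphs that do not end in a citation
  unknownUnitIds: string[];    // cited ids that are absent from the material list
}

function checkCitations(draft: string, knownUnitIds: Set<string>): CitationReport {
  // Split on blank lines and drop empty fragments.
  const paragraphs = draft
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const uncitedParagraphs: number[] = [];
  const unknown = new Set<string>();

  paragraphs.forEach((paragraph, index) => {
    // Rule 1: the paragraph must end with a marker (one trailing full stop is tolerated).
    if (!/\[📎\s*[^\]]+\]\s*[.。]?$/.test(paragraph)) {
      uncitedParagraphs.push(index);
    }
    // Rule 2: every cited id must belong to the supplied material.
    const marker = /\[📎\s*([^\]]+)\]/g; // fresh regex per paragraph, so lastIndex starts at 0
    let match: RegExpExecArray | null;
    while ((match = marker.exec(paragraph)) !== null) {
      for (const raw of (match[1] ?? "").split(",")) {
        const id = raw.trim();
        if (id.length > 0 && !knownUnitIds.has(id)) unknown.add(id);
      }
    }
  });

  return { uncitedParagraphs, unknownUnitIds: Array.from(unknown) };
}
```

A check like this only confirms that citations are present and point at real units. It cannot confirm that a cited unit actually supports the claim, so it complements the prompt rules rather than replacing them.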

Concepts: [“how-it-works”,“pattern”,“why-it-exists”,“problem-solution”]

Facts: [“scripts/03-synthesize-prep.ts implements synthesis prompt generation: reads manifest.json, filters units by topic, bundles unit content, and writes the structured prompt to wiki/{topic}/_synthesis-prompt-{date}.md”,“The prompt enforces four hard rules: (1) every claim must cite its unit_id as [📎 unit-id-1, unit-id-2], (2) no fabrication beyond the provided units, (3) append-only, with no rewriting of the existing wiki, (4) conflicts are listed in a possible_conflicts section with excerpts and a 0-1 confidence score, without auto-resolution”,“The readUnitContent() function extracts content from source_file, using the anchor for heading-delimited sections and falling back to the full file if there is no anchor”,“The prompt includes the existing wiki as context for incremental appends and falls back to “(此 topic 尚無現有 wiki,這是首次合成)” (“this topic has no existing wiki yet; this is the first synthesis”) if wiki/{topic}/index.md does not exist”,“Karpathy narrative style guidelines are embedded: start with “why” (what problem is solved), use causal relationships rather than fact bullets, include examples, write in Traditional Chinese with English technical terms, 300-600 words per daily increment”,“A self-check checklist is included: a unit_id at the end of every claim, all cited unit_ids present in the material list, no external facts, no rewriting of the old wiki”,“The first execution, for medical-oncology on 2026-05-02, found 127 units (88 obs + 1 memory + 38 aai_wiki) and generated a 364286-byte prompt”]
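The anchor behaviour of readUnitContent() might look roughly like the sketch below. Only the function name, `source_file`, and `anchor` come from the note. The `Unit` shape, the `extractSection` helper, the assumption that an anchor is the heading text, and the rule that a section ends at the next heading of the same or a higher level are all illustrative guesses.

```typescript
import { readFileSync } from "node:fs";

// Assumed shape of a manifest unit; only source_file and anchor are named in the note.
interface Unit {
  unit_id: string;
  source_file: string;
  anchor?: string; // assumed here to be the heading text of the section
}

// Hypothetical helper: return the section under the heading whose text equals `anchor`,
// ending just before the next heading of the same or a higher level.
function extractSection(markdown: string, anchor: string): string | null {
  const lines = markdown.split(/\r?\n/);
  let start = -1;
  let level = 0;
  for (let i = 0; i < lines.length; i++) {
    const heading = /^(#{1,6})\s+(.*)$/.exec(lines[i] ?? "");
    if (!heading) continue;
    const depth = (heading[1] ?? "").length;
    if (start === -1) {
      if ((heading[2] ?? "").trim() === anchor.trim()) {
        start = i;
        level = depth;
      }
    } else if (depth <= level) {
      return lines.slice(start, i).join("\n").trim();
    }
  }
  // Anchor found but no later sibling heading: the section runs to the end of the file.
  return start === -1 ? null : lines.slice(start).join("\n").trim();
}

function readUnitContent(unit: Unit): string {
  const text = readFileSync(unit.source_file, "utf8");
  if (!unit.anchor) return text; // no anchor: whole file, as the note describes
  // Assumption: an anchor that cannot be found also falls back to the whole file.
  return extractSection(text, unit.anchor) ?? text;
}
```

This sketch treats any line starting with `#` as a heading, so a `#` comment inside a fenced code block would be misread as a section boundary. A real implementation would likely need to track fence state.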


