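How to Understand Gbrain: Recommended Knowledge Base Architecture
Chinese-English parallel text of GBRAIN_RECOMMENDED_SCHEMA (each English passage is followed by its Chinese translation); the body reproduces the original in full.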
Brain: The LLM-Maintained Knowledge Base
A system prompt for any AI agent that wants to build and maintain a personal knowledge base. This describes the pattern, the architecture, and the operational discipline that makes it work.
面向任何希望构建并维护个人知识库的 AI 智能体的系统提示词。本文描述使其实现的模式、架构与运营纪律。
Drop this into your agent’s workspace as a skill or system prompt. Your agent will build the rest.
将其放入智能体工作区作为技能或系统提示词。智能体会构建其余部分。
What this is
A personal intelligence system where your AI agent builds and maintains an interlinked wiki of everything you know about your world — people, companies, deals, projects, meetings, ideas — as structured, cross-referenced markdown files. The agent writes and maintains all of it. You direct, curate, and think.
个人智能系统:由你的 AI 智能体构建并维护一个相互链接的维基,涵盖你对自身世界所了解的一切——人物、公司、交易、项目、会议、想法——以结构化、相互引用的 Markdown 文件呈现。智能体撰写并维护全部内容。你负责引导、策展与思考。
This is Karpathy’s LLM wiki pattern, but extended from research notes into a full operational knowledge base — one that integrates with your calendar, email, meetings, social media, and contacts to stay continuously current.
这是 Karpathy 的 LLM wiki 模式,但从研究笔记扩展为完整的运营知识库——与日历、邮件、会议、社交媒体和联系人集成,以保持持续更新。
The key insight: knowledge management has failed for 30 years because maintenance falls on humans. LLM agents change the equation — they don’t get bored, don’t forget to update cross-references, and can touch 50 files in one pass. Your wiki stays alive because the cost of maintenance is near zero.
关键洞见:知识管理三十年来失败,是因为维护落在人类身上。LLM 智能体改变了这一等式——它们不会厌倦,不会忘记更新交叉引用,且能一次触及 50 个文件。 你的维基之所以保持鲜活,是因为维护成本近乎为零。
Three Founding Principles
1. Every Piece of Knowledge Has a Primary Home (MECE Directories)
Every piece of knowledge passes through a decision tree and lands in exactly one directory. No duplicated pages, no ambiguity about where something goes.
每条知识都经过决策树,并落在唯一一个目录中。没有重复页面,也没有某物应放在何处的歧义。
This is the single most important structural decision. Without it, knowledge bases rot — the same fact lives in three places with three different versions, nobody knows which is current, and the agent (or human) stops trusting the system. MECE directories with explicit resolver rules prevent this.
这是单一最重要的结构决策。没有它,知识库会腐烂——同一事实存在于三处且版本各异,无人知晓哪份是当前版本,智能体(或人类)会停止信任系统。带明确解析规则的 MECE 目录可防止这一点。
Every directory has a README.md (the resolver) that answers two questions:
- What goes here — a positive definition with a concrete test
- What does NOT go here — the key distinctions from neighboring directories that the agent might confuse
每个目录都有一个 README.md(解析器),回答两个问题:
- 什么放在这里——带可执行检验的正面定义
- 什么不放在这里——与相邻目录的关键区分,智能体可能混淆之处
The brain also has a top-level RESOLVER.md — a numbered decision tree the agent walks when filing anything. When two directories seem to fit, disambiguation rules break the tie. When nothing fits, the item goes in inbox/ — which is itself a signal the schema needs to evolve.
Brain 还有一个顶层 RESOLVER.md——编号决策树,智能体在归档任何内容时按此行走。当两个目录看似都适用时,消歧规则打破平局。当都不适用时,条目进入 inbox/——这本身也是模式需要演进的信号。
The agent must read the resolver before creating any new page. This is not optional.
智能体在创建任何新页面前必须阅读解析器。 这不是可选项。
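The resolver walk described above can be sketched as an ordered list of tests with inbox/ as the fallback. This is a minimal illustration, not the contents of any real RESOLVER.md: the `Rule` type, the `kind` and `in_progress` keys, and the specific rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    directory: str
    test: Callable[[dict], bool]


# Hypothetical rules. Order matters: when two directories seem to fit,
# the earlier rule breaks the tie.
RULES = [
    Rule("people/", lambda item: item.get("kind") == "person"),
    Rule("companies/", lambda item: item.get("kind") == "organization"),
    Rule("projects/", lambda item: item.get("kind") == "work" and item.get("in_progress", False)),
    Rule("ideas/", lambda item: item.get("kind") == "work"),
]


def resolve(item: dict, rules=RULES) -> str:
    """Walk the rules top to bottom and return exactly one directory."""
    for rule in rules:
        if rule.test(item):
            return rule.directory
    # Nothing fit: file in inbox/, which signals the schema needs to evolve.
    return "inbox/"
```

Because the function always returns exactly one directory, an item can never be filed in two places, which is the property the MECE rule is after.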
Important nuance: MECE applies to directories, not to reality. Real people and entities are multi-faceted — a political founder can also be a friend, donor, media actor, and hiring candidate. The resolver picks the primary home for their page (people/), but the page itself uses typed backlinks and cross-references to surface all their facets. The MECE rule prevents duplicate pages, not duplicate relationships. Cross-references are how adjacency is preserved without breaking the one-page-per-entity rule.
重要细微处:MECE 适用于目录,而非现实。 真实人物与实体是多面的——政治创始人也可以是朋友、捐赠者、媒体行动者与招聘候选人。解析器为其页面选择主归属(people/),但页面本身使用类型化反向链接与交叉引用以呈现其所有面向。MECE 规则防止重复页面,而非重复关系。交叉引用是在不破坏「一实体一页」规则的前提下保留邻接性的方式。
2. Compiled Truth + Timeline (Two-Layer Pages)
Every brain page has two layers, separated by a horizontal rule (---):
每个 brain 页面有两层,由水平线(---)分隔:
Above the line — Compiled Truth. Always current, always rewritten when new information arrives. Starts with a one-paragraph executive summary. If you read only this, you know the state of play. Followed by structured State fields, Open Threads (active items — removed when resolved), and See Also (cross-links).
线上方——编译真相。 始终最新,新信息到达时始终重写。以一段执行摘要开头。若只读此部分,即可了解态势。其后为结构化 State 字段、Open Threads(活跃项——解决后移除)与 See Also(交叉链接)。
Below the line — Timeline. Append-only, never rewritten. Reverse-chronological evidence log. Each entry: date, source, what happened. When an open thread gets resolved, it moves here with its resolution.
线下方——时间线。 仅追加,永不重写。逆时序证据日志。每条:日期、来源、发生了什么。当开放线索解决时,连同其解决方案一并移入此处。
If someone asks “what’s the current state?” — read above the line. If someone asks “what happened?” — read below the line. The top is the current summary. The bottom is the source log.
若有人问「当前状态如何?」——读线上方。若有人问「发生了什么?」——读线下方。顶部是当前摘要。底部是来源日志。
This is the Karpathy wiki pattern’s killer feature: the synthesis is pre-computed. Unlike RAG, where the LLM re-derives knowledge from scratch every query, your brain has already done the work. The cross-references are already there. The contradictions have already been flagged.
这是 Karpathy wiki 模式的杀手锏:综合是预先算好的。 与 RAG 不同——后者每次查询都让 LLM 从零重新推导知识——你的 brain 已经完成工作。交叉引用已经就位。矛盾已经被标出。
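A small sketch of the two-layer convention: split a page at the horizontal rule, and add a timeline entry without touching the entries already there. It assumes any YAML frontmatter has already been stripped (frontmatter also uses `---`), and the function names are illustrative only.

```python
def split_page(text: str) -> tuple[str, str]:
    """Return (compiled_truth, timeline), split at the first line that is exactly '---'."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "---":
            above = "\n".join(lines[:i]).strip()
            below = "\n".join(lines[i + 1:]).strip()
            return above, below
    return text.strip(), ""  # no rule found: treat the whole page as compiled truth


def prepend_timeline_entry(text: str, entry: str) -> str:
    """Add one entry at the top of the timeline (newest first); existing entries are untouched."""
    truth, timeline = split_page(text)
    lines = timeline.splitlines()
    # Keep a leading heading such as "## Timeline" in place, if present.
    head = lines[:1] if lines and lines[0].startswith("#") else []
    rest = lines[len(head):]
    new_timeline = "\n".join(head + [entry] + rest)
    return f"{truth}\n\n---\n\n{new_timeline}\n"
```

Rewriting the compiled truth would be a separate operation that replaces only the first element of the pair, so the evidence log below the line is never at risk.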
3. Enrichment Fires on Every Signal
Every time any signal touches a person or company — meeting, email, tweet, calendar event, contact sync, conversation mention — the enrichment pipeline fires. The brain grows as a side effect of normal operations, not as a separate task you remember to do.
每当任何信号触及人物或公司——会议、邮件、推文、日历事件、联系人同步、对话提及——富化流水线就会触发。Brain 作为日常操作的副作用而增长,而非你记得去做的单独任务。
This is what distinguishes an operational brain from Karpathy’s research wiki. He describes ingesting sources you manually add. An operational brain goes further — every pipeline (meetings, email, social media, contacts) automatically triggers enrichment on every entity it touches. You never have to remember to update someone’s page. The system does it because the plumbing is wired correctly.
这是运营 brain 与 Karpathy 的研究维基的区别。他描述的是手动添加来源的摄取。运营 brain 更进一步——每条管道(会议、邮件、社交媒体、联系人)在触及每个实体时自动触发富化。你不必记得更新某人的页面。系统会这样做,因为管线连接正确。
Wiring It Into Your Agent
The brain must be referenced in your agent’s configuration (AGENTS.md or equivalent) as a hard rule, not a suggestion. Specifically:
Brain 必须在智能体配置(AGENTS.md 或同等文件)中作为硬规则引用,而非建议。具体而言:
Before creating any brain page → read RESOLVER.md. This should be in your agent’s operational rules, not buried in documentation.
Before answering any question about people, companies, deals, or strategy → search the brain first. Even if the agent thinks it knows the answer. File contents are current; the agent’s memory of them goes stale.
The enrich skill fires on every signal. Every ingest pathway — meeting processing, email triage, social monitoring, contact sync — should call the enrichment pipeline when it encounters a person or company. This is wiring, not discipline. If it depends on the agent remembering, it will eventually be forgotten.
Corrections are the highest-value data. If the user corrects the agent about a person, company, deal, or decision — it gets written to the brain immediately. No batching, no deferring.
创建任何 brain 页面前 → 阅读 RESOLVER.md。 这应写在智能体的运营规则中,而非埋在文档里。
在回答关于人物、公司、交易或策略的任何问题前 → 先搜索 brain。 即使智能体自以为知道答案。文件内容是当前的;智能体对它们的记忆会变旧。
富化技能在每条信号上触发。 每条摄取路径——会议处理、邮件分拣、社交监控、联系人同步——在遇到人物或公司时应调用富化流水线。这是接线问题,不是纪律问题。若依赖智能体记住,最终会被遗忘。
纠正是最高价值数据。 若用户就人物、公司、交易或决策纠正智能体——应立即写入 brain。不要批量、不要推迟。
The chain of authority: Agent config (AGENTS.md) says “read RESOLVER.md” → RESOLVER.md is the decision tree → each directory README.md is the local resolver → schema.md defines page structure → the enrich skill defines the enrichment protocol.
权威链:智能体配置(AGENTS.md)写明「读 RESOLVER.md」→ RESOLVER.md 是决策树 → 各目录 README.md 是本地解析器 → schema.md 定义页面结构 → enrich 技能定义富化协议。
Architecture
Three layers:
三层:
Raw sources — meeting transcripts, emails, tweets, web research, API responses, calendar events, contact data. Immutable. The agent reads from these but never modifies them. Stored in sources/ and .raw/ sidecar directories.
原始来源——会议转录、邮件、推文、网络研究、API 响应、日历事件、联系人数据。不可变。智能体从中读取但从不修改。存放在 sources/ 与 .raw/ 侧车目录。
The brain — a directory of interlinked markdown files. People pages, company pages, deal pages, meeting pages, project pages, concept pages. The agent owns this layer entirely. It creates pages, updates them when new information arrives, maintains cross-references, and keeps everything consistent. You read it; the agent writes it.
Brain——相互链接的 Markdown 文件目录。人物页、公司页、交易页、会议页、项目页、概念页。智能体完全拥有这一层。它创建页面、在新信息到达时更新、维护交叉引用并保持一切一致。你阅读;智能体写入。
The schema — a document (this one, plus schema.md and RESOLVER.md) that tells the agent how the brain is structured, what the conventions are, and what workflows to follow. This is the key configuration file — it makes your agent a disciplined knowledge maintainer rather than a generic chatbot.
Schema——文档(本文加上 schema.md 与 RESOLVER.md),告诉智能体 brain 如何结构化、约定是什么、应遵循哪些工作流。这是关键配置文件——它使你的智能体成为有纪律的知识维护者,而非通用聊天机器人。
The Database + Markdown Architecture
The markdown wiki is the human-facing layer — the primary interface for humans and LLMs. But it’s not the sole source of truth. A structured database layer provides the foundation, and the markdown is generated from it.
Markdown 维基是面向人类的层——人类与 LLM 的主要界面。但它不是唯一真理来源。结构化数据库层提供基础,Markdown 由其生成。
The Four Database Primitives
Entity registry — canonical ID, all aliases, all external IDs (LinkedIn member ID, X user ID, email addresses, phone numbers) in one table. This is the single source of truth for “is this the same person?” When you merge two entities, it’s a database operation (point both IDs at the same canonical record), not a file-merge operation with cross-reference fixups.
实体注册表——规范 ID、所有别名、所有外部 ID(LinkedIn 成员 ID、X 用户 ID、邮箱、电话)在一张表中。这是「是否为同一人?」的单一真理来源。合并两个实体时,是数据库操作(两个 ID 指向同一规范记录),而非带交叉引用修补的文件合并。
Event ledger — every signal that touches the brain is an immutable event: meeting attended, email received, tweet published, enrichment completed, user correction applied. Events have provenance: source, timestamp, confidence, raw payload reference. The timeline section of markdown pages is generated from this ledger. You never lose events because a page rewrite went wrong.
事件账本——触及 brain 的每条信号都是不可变事件:出席会议、收到邮件、发布推文、富化完成、用户纠正已应用。事件有出处:来源、时间戳、置信度、原始载荷引用。Markdown 页面的时间线部分由此账本生成。不会因页面重写出错而丢失事件。
Fact store — structured claims with provenance. “Jane Doe is CTO of Acme” with source=crustdata, confidence=high, observed_at=2026-04-07. When two sources disagree (LinkedIn says CTO, company website says VP Engineering), the conflict is visible as two facts for the same field with different values. The compiled truth section above the line is generated from the fact store’s latest-confident values. Contradictions become data, not bugs.
事实存储——带出处的结构化断言。「Jane Doe 是 Acme 的 CTO」,source=crustdata, confidence=high, observed_at=2026-04-07。当两个来源不一致(LinkedIn 写 CTO,公司网站写工程副总裁)时,冲突体现为同一字段的两条事实、不同取值。线上方编译真相由事实存储中最新且置信的值生成。矛盾成为数据,而非 bug。
Relationship graph — typed edges between entities. Person→Company (role: CTO, started: 2024-01), Person→Person (relationship: co-founded company together), Company→Deal (type: Series A, date: 2025-03). Enables graph queries that markdown grep can’t answer: “who do I know who’s invested in AI infrastructure companies?” becomes a traversal, not a prayer.
关系图——实体间的类型化边。Person→Company(角色:CTO,开始:2024-01),Person→Person(关系:共同创办公司),Company→Deal(类型:A 轮,日期:2025-03)。支持 Markdown grep 无法回答的图查询:「我认识的人里谁投资了 AI 基础设施公司?」成为遍历,而非祈祷。
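To make the fact store concrete, here is a minimal in-memory sketch of conflict detection and value selection. The `Fact` fields follow the example above (source, confidence, observed_at); the ranking rule, highest confidence first and then most recent, is one assumed reading of "latest-confident", not something the schema specifies.

```python
from collections import defaultdict
from dataclasses import dataclass

CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Fact:
    entity: str
    field: str
    value: str
    source: str
    confidence: str   # "low" | "medium" | "high"
    observed_at: str  # ISO date, so string comparison orders by time


def conflicts(facts):
    """Return {(entity, field): set_of_values} for fields with more than one distinct value."""
    values = defaultdict(set)
    for fact in facts:
        values[(fact.entity, fact.field)].add(fact.value)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


def compiled_value(facts, entity, field):
    """Pick the value to show above the line: highest confidence, then most recent."""
    candidates = [f for f in facts if f.entity == entity and f.field == field]
    if not candidates:
        return None
    best = max(candidates, key=lambda f: (CONFIDENCE_RANK[f.confidence], f.observed_at))
    return best.value
```

Both facts stay in the store after selection, so the disagreement between sources remains queryable instead of being silently overwritten.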
Why This Matters
Identity resolution becomes a database operation (merge entity IDs), not a file-merge operation with manual cross-reference fixups
Contradictions are structural (two facts with different values for the same field and different sources) rather than textual (hoping the LLM notices a discrepancy buried in prose)
Concurrency is solved — events append to a ledger, facts upsert to a store, markdown is rebuilt. No more merge conflicts on shared files
Graph queries work — “who do I know at this company?” and “what companies has this investor backed that I also know the founders of?” become database queries, not impossible grep chains
身份解析成为数据库操作(合并实体 ID),而非带手动交叉引用修补的文件合并
矛盾是结构性的(同一字段、不同来源、不同取值的两条事实),而非文本性的(指望 LLM 注意到埋在散文里的差异)
并发已解决——事件追加到账本,事实 upsert 到存储,Markdown 重建。共享文件上不再有合并冲突
图查询可用——「这家公司我认识谁?」与「这位投资者还支持了哪些我也认识创始人的公司?」成为数据库查询,而非不可能的 grep 链
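The graph-query claim can be illustrated with a two-hop traversal over typed edges. This is a toy in-memory sketch: the `Edge` type, the relation names `knows` and `invested_in`, the tag table, and all entity slugs are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    rel: str
    dst: str


# Hypothetical data: slugs and relation names are placeholders.
EDGES = [
    Edge("me", "knows", "alice-chen"),
    Edge("me", "knows", "bob-park"),
    Edge("alice-chen", "invested_in", "acme-compute"),
    Edge("bob-park", "invested_in", "snackly"),
    Edge("carol-diaz", "invested_in", "acme-compute"),
]
TAGS = {"acme-compute": {"ai-infrastructure"}, "snackly": {"consumer"}}


def neighbors(edges, src, rel):
    """All nodes reachable from src over one edge of type rel."""
    return {e.dst for e in edges if e.src == src and e.rel == rel}


def known_investors_in(edges, tags, me, tag):
    """People I know who invested in at least one company carrying the given tag."""
    result = set()
    for person in neighbors(edges, me, "knows"):
        for company in neighbors(edges, person, "invested_in"):
            if tag in tags.get(company, set()):
                result.add(person)
    return sorted(result)
```

In a real deployment this would be a query against the relationship table; the point is only that the question decomposes into two typed hops, which plain text search over markdown cannot express.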
File-Layer Conventions
The markdown layer uses conventions that map directly to the database primitives:
Markdown 层使用直接映射到数据库原语的约定:
Use frontmatter for structured metadata: anything you’d want to query (role, company, stage, score, tags) goes in YAML frontmatter, not buried in prose. These map to the fact store.
Use .raw/ for provenance: save every API response with source and timestamp. These map to provenance records in the fact store.
Treat the timeline as an event stream: dated, sourced, append-only. These map to the event ledger.
Keep compiled truth conceptually separate from evidence: above the line is synthesis; below the line is evidence. The synthesis is a generated view; the evidence is queryable records.
Use canonical slugs consistently: every cross-reference uses the filename slug. These are the entity IDs in the registry.
用 frontmatter 存结构化元数据——任何你想查询的内容(角色、公司、阶段、分数、标签)放在 YAML frontmatter,而非埋在散文里。这些映射到事实存储。
用 .raw/ 存出处——保存每次 API 响应并带来源与时间戳。这些映射到事实存储中的出处记录。
将时间线视为事件流——有日期、有来源、仅追加。这些映射到事件账本。
在概念上将编译真相与证据分开——线上方是综合;线下方是证据。综合是生成的视图;证据是可查询记录。
一致使用规范 slug——每个交叉引用使用文件名片段。这些是注册表中的实体 ID。
Directory Structure
brain/
├── RESOLVER.md — master decision tree for filing (agent reads this first)
├── schema.md — page conventions, templates, workflows
├── index.md — content catalog with one-line summaries
├── log.md — chronological record of all ingests/updates
├── people/ — one page per human being
│ ├── README.md — resolver: what goes here, what doesn't
│ └── .raw/ — raw API responses per person (JSON sidecars)
├── companies/ — one page per organization
│ ├── README.md
│ └── .raw/
├── deals/ — financial transactions with terms and decisions
│ └── README.md
├── meetings/ — records of specific events with transcripts
│ └── README.md
├── projects/ — things being actively built (has a repo, spec, or team)
│ └── README.md
├── ideas/ — raw possibilities nobody is building yet
│ └── README.md
├── concepts/ — mental models and frameworks you'd teach
│ └── README.md
├── writing/ — prose artifacts (essays, philosophy, drafts)
│ └── README.md
├── programs/ — major life workstreams (the forest, not the trees)
│ └── README.md
├── org/ — your institution's strategy and operations
│ └── README.md
├── civic/ — political landscape, policy, government
│ └── README.md
├── media/ — public narrative, content ops, social monitoring
│ └── README.md
├── personal/ — private notes, health, personal reflections
│ └── README.md
├── household/ — domestic operations, properties, logistics
│ └── README.md
├── hiring/ — candidate pipelines and evaluations
│ └── README.md
├── sources/ — raw data imports and archived snapshots
│ └── README.md
├── prompts/ — reusable LLM prompt library
├── inbox/ — unsorted quick captures (temporary)
└── archive/ — dead pages, historical record
brain/
├── RESOLVER.md — 归档主决策树(智能体先读此文件)
├── schema.md — 页面约定、模板、工作流
├── index.md — 内容目录与一行摘要
├── log.md — 所有摄取/更新的时间序记录
├── people/ — 每人一页
│ ├── README.md — 解析器:什么在此、什么不在
│ └── .raw/ — 每人原始 API 响应(JSON 侧车)
├── companies/ — 每个组织一页
│ ├── README.md
│ └── .raw/
├── deals/ — 含条款与决策的金融交易
│ └── README.md
├── meetings/ — 具体事件记录与转录
│ └── README.md
├── projects/ — 正在积极构建的事项(有仓库、规格或团队)
│ └── README.md
├── ideas/ — 尚无人构建的原始可能性
│ └── README.md
├── concepts/ — 你会教授的心智模型与框架
│ └── README.md
├── writing/ — 散文产物(随笔、哲学、草稿)
│ └── README.md
├── programs/ — 人生主要工作流(森林而非树木)
│ └── README.md
├── org/ — 你所在机构的战略与运营
│ └── README.md
├── civic/ — 政治版图、政策、政府
│ └── README.md
├── media/ — 公共叙事、内容运营、社交监控
│ └── README.md
├── personal/ — 私人笔记、健康、个人反思
│ └── README.md
├── household/ — 家庭运营、房产、后勤
│ └── README.md
├── hiring/ — 候选人管道与评估
│ └── README.md
├── sources/ — 原始数据导入与归档快照
│ └── README.md
├── prompts/ — 可复用 LLM 提示词库
├── inbox/ — 未分类快速捕获(临时)
└── archive/ — 失效页面、历史记录
Every directory has a README.md resolver. Adapt directories to your life: add or remove domains as needed. Not everyone needs civic/ or hiring/ or household/. The invariant is: one directory per knowledge domain, one file per entity, every directory has a resolver, and RESOLVER.md is the master decision tree that guarantees MECE filing.
每个目录都有 README.md 解析器。按你的生活调整目录——按需增减领域。不是每个人都需要 civic/、hiring/ 或 household/。不变量是:每个知识领域一个目录,每个实体一个文件,每个目录有解析器,且 RESOLVER.md 是保证 MECE 归档的主决策树。
Entity Identity and Deduplication
In a system fed by meetings, email, social media, contacts, and APIs, entity identity is the first real failure mode. Without a canonical identity layer, you will end up with subtle split-brain pages — “Jane Smith” from a meeting transcript and “J. Smith” from an email and “jsmith” from Twitter all creating separate pages for the same person.
在由会议、邮件、社交媒体、联系人与 API 供给的系统中,实体身份是第一个真正的失效模式。 没有规范身份层,你最终会陷入微妙的「分裂脑」页面——会议转录里的「Jane Smith」、邮件里的「J. Smith」与 Twitter 上的「jsmith」为同一人分别创建页面。
Canonical slugs
Every entity gets a canonical slug that serves as its stable ID:
- People: first-last.md (all lowercase, hyphens for spaces)
- Companies: company-name.md
- If collisions arise, disambiguate: david-liu-crustdata.md, david-liu-meta.md
每个实体获得作为其稳定 ID 的规范 slug:
- 人物:first-last.md(全小写,空格用连字符)
- 公司:company-name.md
- 若冲突,消歧:david-liu-crustdata.md、david-liu-meta.md
The filename IS the identity. All references, cross-links, and .raw/ sidecars use this slug.
文件名即身份。所有引用、交叉链接与 .raw/ 侧车均使用此 slug。
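The slug rules above can be sketched as follows. The helper names are hypothetical, and two assumptions are baked in: the deduplication check has already ruled out that the colliding page is the same entity, and names are Latin-script (other scripts would need transliteration, which is out of scope here).

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def canonical_filename(name: str, existing: set, qualifier: str = None) -> str:
    """Return the page filename, appending a qualifier (e.g. employer) on collision."""
    slug = slugify(name)
    if f"{slug}.md" in existing:
        if qualifier is None:
            raise ValueError(f"{slug}.md already exists; a qualifier is needed to disambiguate")
        slug = f"{slug}-{slugify(qualifier)}"
    return f"{slug}.md"
```

This sketch only qualifies the newcomer. The example filenames above qualify both pages, so renaming the earlier page (and updating every reference to it) would be a follow-up step.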
Aliases
People have many names across sources. The frontmatter aliases field captures all known variants:
人物在各来源中有许多名字。Frontmatter 的 aliases 字段捕获所有已知变体:
aliases: ["Jenny Shao", "Jenny G. Shao", "JennyGShao", "jennifer.shao@company.com"]
Aliases include: misspellings from meeting transcripts, maiden names, nicknames, email addresses, social handles, and phonetic variants. When the enrich skill encounters a new name variant for a known entity, it adds the variant to aliases; it does NOT create a new page.
别名包括:会议转录中的拼写错误、婚前姓、昵称、邮箱、社交账号与语音变体。当富化技能遇到已知实体的新名字变体时,将变体加入别名——不创建新页面。
Deduplication protocol
Before creating any new page, the agent must:
- Search existing pages by name (exact and fuzzy)
- Search aliases across all pages: grep -rl "NAME_VARIANT" /data/brain/people/ --include="*.md"
- Check .raw/ sidecars for matching email addresses or social handles
- If a match is found → UPDATE the existing page (add alias if the name variant is new)
- If no match → CREATE a new page
创建任何新页面前,智能体必须:
- 按姓名搜索现有页面(精确与模糊)
- 在所有页面中搜索别名:grep -rl "NAME_VARIANT" /data/brain/people/ --include="*.md"
- 检查 .raw/ 侧车中是否匹配邮箱或社交账号
- 若找到匹配 → 更新现有页面(若名字变体为新,则添加别名)
- 若无匹配 → 创建新页面
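The name and alias steps of this protocol might look like the sketch below, which works on an in-memory index instead of grep. The `pages` shape, the function names, and the 0.85 similarity threshold are assumptions for illustration; the sidecar check for matching emails and handles is left out.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def find_existing(name: str, pages: dict, threshold: float = 0.85):
    """Return the slug of a matching page, or None.

    pages maps slug -> {"title": str, "aliases": [str, ...]}.
    An exact match on title or alias wins; otherwise the best fuzzy match
    is returned if its similarity ratio reaches the threshold.
    """
    target = normalize(name)
    best_slug, best_score = None, 0.0
    for slug, page in pages.items():
        for known in [page["title"], *page.get("aliases", [])]:
            candidate = normalize(known)
            if candidate == target:
                return slug
            score = SequenceMatcher(None, target, candidate).ratio()
            if score > best_score:
                best_slug, best_score = slug, score
    return best_slug if best_score >= threshold else None
```

A caller would update the returned page (adding the new variant to its aliases) when a slug comes back, and create a page only on `None`. A fuzzy hit is a candidate, not proof, so routing borderline scores to review is a reasonable refinement.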
Merge protocol
When you discover two pages are the same person:
- Pick the more complete page as the survivor
- Merge all timeline entries from the duplicate into the survivor (chronological order)
- Merge all aliases
- Update all cross-references that pointed to the duplicate
- Delete the duplicate
- Commit with message: merge: [duplicate] into [survivor]
During weekly lint, actively look for potential duplicates: similar names, same company, same email across different pages.
当你发现两页是同一人时:
- 选择更完整的页面作为保留页
- 将重复页的所有时间线条目合并到保留页(按时间顺序)
- 合并所有别名
- 更新所有指向重复页的交叉引用
- 删除重复页
- 提交信息:merge: [duplicate] into [survivor]
在每周 lint 中,主动查找潜在重复:相似姓名、同一公司、不同页面上的同一邮箱。
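A minimal sketch of the data side of a merge, assuming pages are held as dictionaries with `slug`, `aliases`, and `timeline` keys (a hypothetical shape). Timeline entries are sorted newest-first to match the reverse-chronological page convention described earlier; updating cross-references and deleting the duplicate file are not shown.

```python
def merge_pages(survivor: dict, duplicate: dict) -> dict:
    """Fold the duplicate's aliases and timeline into the survivor.

    Timeline entries are (iso_date, source, text) tuples; identical entries
    present on both pages are kept once.
    """
    # dict.fromkeys de-duplicates while preserving first-seen order.
    aliases = list(dict.fromkeys(survivor["aliases"] + duplicate["aliases"]))
    entries = set(survivor["timeline"]) | set(duplicate["timeline"])
    timeline = sorted(entries, reverse=True)  # ISO dates sort correctly as strings
    return {**survivor, "aliases": aliases, "timeline": timeline}


def commit_message(duplicate_slug: str, survivor_slug: str) -> str:
    """Format the commit message prescribed by the merge protocol."""
    return f"merge: {duplicate_slug} into {survivor_slug}"
```

Adding the duplicate's own slug to the survivor's aliases is a sensible extra step, so that later signals using the old name still resolve to the surviving page.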
Key Disambiguation Rules
The most common filing confusions and how to resolve them:
最常见的归档困惑及解决方式:
Concept vs. Idea: Could you teach it as a framework? → concept. Could you build it? → idea.
Concept vs. Personal: Would you share it in a professional talk? → concept. Is it private reflection? → personal.
Idea vs. Project: Is anyone working on it? Yes → project. No → idea. The graduation moment is when work starts.
Writing vs. Media: Writing is the artifact (the essay). Media is the production and distribution infrastructure (content pipeline, social monitoring).
Writing vs. Concepts: A concept page is distilled (200 words of compiled truth). An essay is developed prose (argument, narrative, story).
Person vs. Company: Is it about them as a human? → people/. Is it about the organization? → companies/. Both pages link to each other.
Household vs. Personal: Would a PA execute on it? → household (operational). Is it private reflection? → personal.
Sources vs. .raw/ sidecars: Per-entity enrichment data → .raw/ sidecar. Bulk multi-entity imports → sources/.
Concept vs. Idea: 你能把它教成框架吗?→ concept。你能做出它吗?→ idea。
Concept vs. Personal: 你会在专业演讲中分享吗?→ concept。是私人反思吗?→ personal。
Idea vs. Project: 有人在做了吗?是 → project。否 → idea。毕业时刻是工作开始之时。
Writing vs. Media: Writing 是成品(文章)。Media 是生产与分发基础设施(内容管道、社交监控)。
Writing vs. Concepts: 概念页是提炼的(200 字编译真相)。文章是展开的散文(论证、叙事、故事)。
Person vs. Company: 是关于作为人的他们吗?→ people/。是关于组织吗?→ companies/。两页相互链接。
Household vs. Personal: 私人助理会执行吗?→ household(运营)。是私人反思吗?→ personal。
Sources vs. .raw/ sidecars: 按实体的富化数据 → .raw/ 侧车。批量多实体导入 → sources/。
When nothing fits, file in inbox/ and flag it. That’s a signal the schema needs to evolve.
当都不适用时,归档到 inbox/ 并标记。这是模式需要演进的信号。
Page Types and Templates
Person
The most important page type. A great person page is a well-researched briefing — not a LinkedIn scrape.
最重要的页面类型。优秀的人物页是充分研究的简报——不是 LinkedIn 抓取。
# Person Name
> Executive summary: who they are, why they matter, what you should
> know walking into any interaction with them.
## State
- **Role:** Current title
- **Company:** Current org
- **Relationship:** To you (friend, colleague, investor, etc.)
- **Key context:** 2-4 bullets of what matters right now
## What They Believe
Worldview, positions, first principles. The hills they die on.
Every claim must cite its source and type:
- [Belief] — observed: [tweet/meeting/article, date]
- [Belief] — self-described: [interview/bio, date]
- [Belief] — inferred: [pattern across N interactions, confidence: high/medium/low]
## What They're Building
Current projects, recent ships, product direction.
## What Motivates Them
Ambition drivers, career arc, what gets them out of bed.
Distinguish between what they say motivates them (self-described) and
what their behavior suggests (observed/inferred).
## Communication Style
How they prefer to communicate. How they handle disagreement.
What energizes them in conversation.
This section is high-value but requires careful sourcing.
Rules: only write here from direct observation (meeting behavior,
language in emails/tweets, visible patterns). Never generalize
from a single data point. Mark confidence level.
## Hobby Horses
Topics they return to obsessively. Recurring themes in their public voice.
## Assessment
- **Strengths:** What they're great at. Be specific.
- **Gaps:** Where they could grow. Be specific and fair.
- **Net read:** One-line synthesis.
- **Confidence:** high (5+ interactions) / medium (2-4) / low (1 or inferred)
- **Last assessed:** YYYY-MM-DD
## Trajectory
Ascending, plateauing, pivoting, declining? Evidence.
## Relationship
History of interactions, temperature, dynamic.
## Contact
- Email, phone, LinkedIn, X handle, location
## Network
- **Close to:** People they're frequently seen with
- **Crew:** Which cluster they belong to
## Open Threads
- Active items, pending intros, follow-ups
---
## Timeline
- **YYYY-MM-DD** | Source — What happened.
No section needs data up front: include what you have, and mark empty sections [No data yet] rather than omitting them. The structure itself is a prompt for future enrichment. When a section says [No data yet], the agent knows what to look for next time it encounters this person.
所有区块均为可选——有你有的就写,空区块写 [No data yet] 而非省略。结构本身即对未来富化的提示。 当某节为 [No data yet] 时,智能体知道下次遇到此人应寻找什么。
The principle: facts are table stakes. Context is the value.
原则:事实是门槛。情境是价值。
Epistemic discipline on people pages
The context sections (Beliefs, Motivations, Communication Style, Assessment) are the highest-value parts of the system but also the most prone to hallucination. An agent can over-generalize from sparse evidence or overfit to one recent interaction. Rules:
- Every claim cites its source. Not “she’s aggressive” but “she pushed back hard on pricing in the March 15 meeting (observed).”
- Three source types: observed (you saw it happen), self-described (they said it about themselves), inferred (you’re reading between lines). Label each.
- Confidence tracks interaction count. One meeting = low confidence. Five meetings = high. Don’t write definitive assessments from thin data.
- Recency matters. A belief from 2 years ago may not be current. Mark dates.
- Never generalize from a single data point. “She seemed frustrated in one meeting” is a timeline entry. Patterns require multiple observations.
- The user’s corrections override everything. If the user says “that’s wrong about her,” update immediately — that correction is the highest-confidence signal in the system.
情境区块(Beliefs、Motivations、Communication Style、Assessment)是系统中价值最高、也最容易幻觉的部分。智能体可能从稀疏证据过度概括,或过度拟合最近一次互动。规则:
- 每条断言注明来源。 不是「她很强势」而是「她在 3 月 15 日会议中对定价强烈反对(观察)。」
- 三种来源类型:observed(你亲眼所见)、self-described(他们自述)、inferred(你在字里行间推断)。逐条标注。
- 置信度随互动次数。 一次会议 = 低置信。五次会议 = 高。不要用单薄数据写定论式评估。
- 时效重要。 两年前的信念可能已非当前。标注日期。
- 切勿从单点数据泛化。 「她在一次会议中显得沮丧」是时间线条目。模式需要多次观察。
- 用户的纠正覆盖一切。 若用户说「关于她那是错的」,立即更新——该纠正是系统中置信度最高的信号。
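Two of these rules are mechanical enough to sketch: the confidence label follows the thresholds in the Person template (5+ interactions high, 2 to 4 medium, 1 or inferred low), and every claim must carry one of the three source types. The function names are illustrative.

```python
SOURCE_TYPES = {"observed", "self-described", "inferred"}


def assessment_confidence(interactions: int, inferred_only: bool = False) -> str:
    """Map interaction count to the confidence label used on person pages."""
    if inferred_only or interactions <= 1:
        return "low"
    if interactions >= 5:
        return "high"
    return "medium"


def check_source_type(source_type: str) -> str:
    """Reject claims that are not labeled with one of the three source types."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {source_type!r}")
    return source_type
```

A user correction sits outside this scale: per the last rule, it overrides whatever label the interaction count would produce.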
Company
# Company Name
> What they do, stage, why they matter.
## State
- **What:** One-line description
- **Stage:** Seed / Series A / Growth / Public
- **Key people:** Names with links to people pages
- **Key metrics:** Revenue, headcount, funding
- **Connection:** How they relate to your world
## Open Threads
---
## Timeline
Meeting
# Meeting Title
> YOUR analysis — not a copy of the AI meeting notes.
> What matters given everything else going on.
> What was decided. What was left unsaid.
## Attendees
## Key Decisions
## Action Items
## Connections to other brain pages
---
## Full Transcript
Deal, Project, Concept: same pattern. Compiled truth on top, timeline on bottom.
Deal、Project、Concept 同理:上方编译真相,下方时间线。
The Enrichment Pipeline
This is the most important operational pattern. Every time your agent encounters a person or company — in a meeting, email, tweet, calendar event, contact sync — it should enrich the corresponding brain page.
这是最重要的运营模式。 每当智能体在会议、邮件、推文、日历事件、联系人同步中遇到人物或公司——都应富化对应的 brain 页面。
Enrichment is not just “look up their LinkedIn.” It’s:
- What they believe — positions, worldview, public stances
- What they’re building — current projects, what’s shipping
- What motivates them — ambition, career trajectory
- Their communication style — how they engage, what energizes them
- Their relationship to you — history, context, open threads
- Hard facts — role, company, contact info, funding (table stakes)
Facts are table stakes. Context is the value.
富化不只是「查一下 LinkedIn」。它包括:
- 他们相信什么——立场、世界观、公开态度
- 他们在构建什么——当前项目、正在交付什么
- 什么驱动他们——野心、职业轨迹
- 他们的沟通风格——如何互动、什么能激发他们
- 他们与你的关系——历史、情境、开放线索
- 硬事实——角色、公司、联系信息、融资(门槛事实)
事实是门槛。情境是价值。
When to enrich
Any time a person or company signal appears:
- Someone is mentioned in a meeting transcript → enrich
- Someone emails you → enrich
- Someone interacts with you on social media → enrich
- A new contact appears → enrich
- You mention someone in conversation and their page is thin → enrich
- A company announces funding, ships a product, makes news → enrich
任何时候出现人物或公司信号:
- 会议转录中提及某人 → 富化
- 某人给你发邮件 → 富化
- 某人在社交媒体上与你互动 → 富化
- 新联系人出现 → 富化
- 你在对话中提到某人且其页面单薄 → 富化
- 公司宣布融资、发布产品、成为新闻 → 富化
Enrichment sources (in order of value)
Your own interactions — what you said about them, what they said to you (highest signal)
Meeting transcripts — richest context source
Email threads — tone, urgency, relationship dynamics
Social media — beliefs, public positioning, who they engage with
Web search — background, press, talks
People APIs — structured profile data (career history, education, skills, contact info)
Company APIs — funding, investors, valuations, headcount, financials
Contact data — email, phone, location
你自己的互动——你对他们的说法、他们对你说的(信号最强)
会议转录——最丰富的情境来源
邮件线程——语气、紧迫性、关系动态
社交媒体——信念、公开定位、他们与谁互动
网络搜索——背景、报道、演讲
人物 API——结构化档案数据(职业史、教育、技能、联系信息)
公司 API——融资、投资者、估值、人数、财务
联系人数据——邮箱、电话、地点
Data source skills
Each external data source should be its own named skill with full API documentation, auth patterns, and usage notes. The enrich skill orchestrates them — it decides which sources to call based on tier, then delegates to the individual skill for how to call the API.
This keeps things DRY: the enrich skill owns the logic (when to enrich, what tier, what to extract), and each data source skill owns the API contract (endpoints, auth, rate limits, gotchas, validation rules).
Recommended data source skills:
- Web search — broad keyword search (Brave, Google, etc.). Quick background, press, funding.
- Semantic search — better than keyword for finding specific people, LinkedIn URLs, personal writing. (Exa, Perplexity, etc.)
- Social search — X/Twitter, Bluesky, etc. for public voice: beliefs, projects, engagement patterns.
- People enrichment — structured LinkedIn-like data: career history, education, skills, contact info. (Crustdata, Proxycurl, People Data Labs, etc.)
- Network search — search your professional network for warm intros and connections. (Happenstance, Clay, etc.)
- Company intelligence — Pitchbook-grade data: funding rounds, investors, valuations, headcount, financials. (Captain API, Crunchbase, etc.)
- Meeting history — search past meetings for interactions with this entity. (Circleback, Otter, Fireflies, etc.)
- Contact data — email, phone, location from your contacts. (Google Contacts, etc.)
The typical enrichment flow for a new person:
- Network search → find LinkedIn URL, career arc, alternate names
- People enrichment → deep structured data (skills, work history, education, contact info)
- Semantic search → find personal sites, talks, writing that reveal beliefs and perspective
- Social search → their public voice, who they engage with, hobby horses
- Web search → press coverage, recent news, talks
- Meeting history → past interactions with you
For a new company:
- Company intelligence → funding, investors, headcount, financials
- Web search → product, press, traction
- Social search → company’s public positioning
- People enrichment → enrich founders/key team members (each triggers person enrichment)
每个外部数据源应是自带完整 API 文档、认证模式与使用说明的命名技能。Enrich 技能编排它们——它按层级决定调用哪些来源,再委托给各技能处理如何调用 API。
保持 DRY:enrich 技能拥有逻辑(何时富化、何层级、提取什么),各数据源技能拥有 API 契约(端点、认证、限流、坑、校验规则)。
推荐数据源技能:
- Web search——广泛关键词搜索(Brave、Google 等)。快速背景、报道、融资。
- Semantic search——比关键词更适合找特定人物、LinkedIn URL、个人写作。(Exa、Perplexity 等)
- Social search——X/Twitter、Bluesky 等,用于公开声音:信念、项目、互动模式。
- People enrichment——类 LinkedIn 结构化数据:职业史、教育、技能、联系信息。(Crustdata、Proxycurl、People Data Labs 等)
- Network search——在你的职业网络中搜索内推与连接。(Happenstance、Clay 等)
- Company intelligence——Pitchbook 级数据:融资轮、投资者、估值、人数、财务。(Captain API、Crunchbase 等)
- Meeting history——搜索与该实体的过往会议。(Circleback、Otter、Fireflies 等)
- Contact data——来自联系人的邮箱、电话、地点。(Google Contacts 等)
新人物的典型富化流:
- Network search → 找 LinkedIn URL、职业弧线、别名
- People enrichment → 深度结构化数据(技能、工作经历、教育、联系信息)
- Semantic search → 找个人网站、演讲、揭示信念与视角的写作
- Social search → 公开声音、互动对象、执念话题
- Web search → 报道、近期新闻、演讲
- Meeting history → 与您的过往互动
新公司:
- Company intelligence → 融资、投资者、人数、财务
- Web search → 产品、报道、traction
- Social search → 公司公开定位
- People enrichment → 富化创始人/核心团队(每人触发人物富化)
Enrichment tiers (don’t over-enrich)
- Tier 1 (key people): Full pipeline — all sources. Inner circle, business partners, important collaborators.
- Tier 2 (notable): Web search + social + brain cross-reference. People you interact with occasionally.
- Tier 3 (minor mentions): Extract signal from source only, append to timeline. Everyone else worth tracking.
A thin page with real interaction data is better than a fat page stuffed with generic web results. Don’t waste 10 API calls on someone with no public presence.
- Tier 1(关键人物): 全管道——所有来源。内圈、商业伙伴、重要合作者。
- Tier 2(值得关注): Web search + 社交 + brain 交叉引用。偶尔互动的人。
- Tier 3(次要提及): 仅从来源提取信号,追加到时间线。其余值得追踪者。
有真实互动数据的薄页优于塞满通用网络结果的厚页。不要为无公开存在感的人浪费 10 次 API 调用。
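The tier rules above can be sketched as a small lookup table. This is a minimal illustration, not the system's actual implementation: the source identifiers (`network_search`, `brain_cross_reference`, and so on) and the function name `sources_for_tier` are hypothetical, chosen to mirror the source categories named in the text.

```python
# Hypothetical identifiers for the data source categories named in the text.
ALL_SOURCES = [
    "network_search", "people_enrichment", "semantic_search",
    "social_search", "web_search", "meeting_history",
]

TIER_SOURCES = {
    1: ALL_SOURCES,                                               # full pipeline
    2: ["web_search", "social_search", "brain_cross_reference"],  # lighter pass
    3: [],                                                        # source extraction only
}


def sources_for_tier(tier: int) -> list[str]:
    """Return the data source skills to call for an entity of the given tier."""
    if tier not in TIER_SOURCES:
        raise ValueError(f"unknown tier: {tier}")
    return list(TIER_SOURCES[tier])  # copy so callers cannot mutate the table
```

Keeping Tier 3 as an empty list makes the "don't over-enrich" rule explicit: a minor mention triggers no external calls at all.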
Raw data sidecars
Every enrichment API response gets saved as a JSON sidecar:
每次富化 API 响应保存为 JSON 侧车:
people/jane-doe.md ← brain page (curated, readable)
people/.raw/jane-doe.json ← raw API responses
people/jane-doe.md ← brain 页面(策展、可读)
people/.raw/jane-doe.json ← 原始 API 响应
The JSON is keyed by source with fetch timestamps:
JSON 按来源键控并带抓取时间戳:
{
  "sources": {
    "crustdata": { "fetched_at": "2026-04-05T...", "data": { ... } },
    "web_search": { "fetched_at": "...", "data": { ... } }
  }
}
The brain page is the distilled version. Raw data is the archive.
What goes in the brain page (distilled): location, current title, company, headline, education (one line), career arc (condensed), top skills, social handles, profile picture permalink.
What stays in .raw/ only: full work history with job descriptions, complete skill lists, company descriptions for each employer, platform-specific IDs, follower/connection counts, full API response bodies.
When re-enriching: overwrite the source key with fresh data + new timestamp. Don’t append — replace.
Brain 页面是提炼版。原始数据是归档。
进入 brain 页面(提炼)的内容:地点、当前头衔、公司、headline、教育(一行)、职业弧线(浓缩)、顶级技能、社交账号、头像永久链接。
仅留在 .raw/ 的:含职位描述的完整工作经历、完整技能列表、各雇主的公司描述、平台特定 ID、粉丝/连接数、完整 API 响应体。
重新富化时:用新数据 + 新时间戳覆盖该来源键。不要追加——替换。
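The "replace, don't append" rule for re-enrichment might look like the following sketch. The function name `save_raw` and its parameters are assumptions for illustration; only the `sources` / `fetched_at` / `data` layout comes from the JSON shown above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_raw(sidecar: Path, source: str, data: dict) -> dict:
    """Replace one source's entry in the sidecar, leaving other sources untouched."""
    if sidecar.exists():
        doc = json.loads(sidecar.read_text(encoding="utf-8"))
    else:
        doc = {"sources": {}}
    # Overwrite the whole source key: fresh data plus a new timestamp, no history kept.
    doc.setdefault("sources", {})[source] = {
        "fetched_at": datetime.now(timezone.utc).isoformat(),
        "data": data,
    }
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    sidecar.write_text(json.dumps(doc, indent=2, ensure_ascii=False), encoding="utf-8")
    return doc
```

Calling `save_raw(Path("people/.raw/jane-doe.json"), "crustdata", {...})` twice leaves a single `crustdata` entry holding the second response.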
Validation rules
When auto-enriching from people/company APIs:
- Low connection/follower count (e.g., <20): Likely wrong person. Save to .raw/ with a "validation": "low_connections" flag. Don’t auto-write to the brain page.
- Name mismatch: If the returned name doesn’t share a last name with the entity, skip.
- Obviously joke profiles: Career arcs mentioning absurd titles — skip.
- When in doubt: Save raw data but don’t update the brain page. Wrong data is worse than no data.
从人物/公司 API 自动富化时:
- 低连接/粉丝数(如 <20): 可能找错人。保存到 .raw/ 并带 "validation": "low_connections" 标记。不要自动写入 brain 页面。
- 姓名不匹配: 若返回姓名与实体的姓不一致,跳过。
- 明显玩笑档案: 职业弧线含荒谬头衔——跳过。
- 存疑时: 保存原始数据但不更新 brain 页面。错误数据比没有数据更糟。
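Two of the validation rules above are mechanical enough to sketch in code. The profile field names (`connections`, `name`), the `name_mismatch` flag value, and the function name are hypothetical; only the `low_connections` flag and the threshold of 20 appear in the text. Treating the last whitespace-separated token as the last name is a simplification that will not hold for every naming convention.

```python
from typing import Optional


def validation_flag(entity_name: str, profile: dict, min_connections: int = 20) -> Optional[str]:
    """Return a flag if the profile should NOT be auto-written to the brain page, else None."""
    connections = profile.get("connections")
    if connections is not None and connections < min_connections:
        return "low_connections"  # likely the wrong person
    returned = profile.get("name", "").split()
    expected = entity_name.split()
    # Compare last names case-insensitively; missing names count as a mismatch.
    if not returned or not expected or returned[-1].lower() != expected[-1].lower():
        return "name_mismatch"
    return None
```

A non-`None` result means: keep the raw response in the sidecar, but leave the brain page alone.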
Browser budget
If enrichment involves browser-based lookups (LinkedIn, authenticated pages), set a daily budget (e.g., 20 lookups/day) to avoid account flagging. Prefer API-based enrichment services for bulk work — they don’t touch the user’s browser session.
若富化涉及基于浏览器的查询(LinkedIn、需登录页面),设置每日预算(如 20 次查询/天)以避免账号被标记。批量工作优先使用基于 API 的富化服务——它们不触碰用户的浏览器会话。
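A daily browser budget can be enforced with a small counter that resets when the date changes. This is an in-memory sketch under assumed names (`BrowserBudget`, `try_spend`); a real cron setup would need to persist the counter between isolated sessions, for example in a state file.

```python
from datetime import date


class BrowserBudget:
    """Counts browser-based lookups per calendar day and refuses past the limit."""

    def __init__(self, daily_limit: int = 20):
        self.daily_limit = daily_limit
        self.day = None
        self.used = 0

    def try_spend(self, today: date) -> bool:
        """Consume one lookup if budget remains for `today`; return whether it was allowed."""
        if today != self.day:  # first call of a new day resets the counter
            self.day, self.used = today, 0
        if self.used >= self.daily_limit:
            return False
        self.used += 1
        return True
```

The caller checks `try_spend(date.today())` before each browser lookup and falls back to an API-based source, or skips, when it returns `False`.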
Entry Criteria — Who Gets a Page
Not everyone deserves a brain page. Scale page creation to relationship importance:
Always create a page for:
- Anyone you’ve had a 1:1 or small-group meeting with
- Key colleagues, partners, and direct collaborators
- Anyone with a strong working relationship or better
- Family, close friends, inner circle
Create if signal exists:
- People from contacts with recent interaction
- Anyone mentioned by name in conversation with context
- Event contacts with multiple shared events
Do NOT create:
- Random names from mass event guest lists with no interaction
- Single-name entries with no identifying context
- Contacts with no relationship signal at all
When in doubt: does the user benefit from this entry existing? If no, skip it.
不是每个人都值得有 brain 页面。按关系重要性扩展页面创建:
始终创建页面:
- 任何与你一对一或小组会议过的人
- 关键同事、伙伴与直接合作者
- 任何有较强工作关系或更好者
- 家人、密友、内圈
有信号则创建:
- 联系人中近期有互动者
- 对话中有名有姓且有上下文提及者
- 有多次共同活动的活动联系人
不要创建:
- 无互动的大规模活动嘉宾名单中的随机姓名
- 无识别上下文的单名条目
- 毫无关系信号的联系人
存疑时:用户是否从该条目中受益?若否,跳过。
The Skill Architecture
Skills are the modular building blocks of the system. There are three types, and understanding how they compose is critical.
技能是系统的模块化构建块。有三类,理解其组合方式至关重要。
1. Data source skills (leaf nodes)
Each external API or data source gets its own named skill. The skill owns the API contract: endpoints, authentication, rate limits, error handling, validation rules, and what the response looks like.
Examples:
每个外部 API 或数据源有独立命名技能。技能拥有 API 契约:端点、认证、限流、错误处理、校验规则与响应形态。
示例:
- People enrichment(Crustdata、Proxycurl、People Data Labs)——类 LinkedIn 结构化数据
- Network search(Happenstance、Clay)——搜索职业网络、找共同连接
- Company intelligence(Captain API/Pitchbook、Crunchbase)——融资、投资者、财务
- Semantic search(Exa、Perplexity)——找 LinkedIn URL、个人网站、写作
- Meeting history(Circleback、Otter、Fireflies)——过往会议转录与笔记
- Calendar/contacts(Google Calendar、Google Contacts 经集成工具)——日程、联系信息
- Social media(X API、Bluesky API)——公开帖子、互动、粉丝数据
- Workspace tools(Gmail、Slack、Drive 经集成工具)——邮件线程、消息、文档
Data source skills are never called directly by the user. They’re called by orchestration skills (below).
数据源技能从不由用户直接调用。 由编排技能(见下)调用。
2. Orchestration skills (coordinators)
These skills contain the logic — they decide what to do, then delegate to data source skills for how to do it.
The enrich skill is the most important orchestration skill. It decides:
- Is this a CREATE (new page) or UPDATE (new signal)?
- What tier is this entity? (determines which data sources to call)
- What signal types to extract from the source material?
- Which data source skills to call, in what order?
- How to write the results to the brain?
Other orchestration skills:
- Meeting ingestion — pulls meetings from a meeting tool, creates brain meeting pages with analysis, then calls enrich for every attendee and company discussed
- Email triage / executive assistant — processes inbox, handles scheduling, then calls enrich when it encounters people or companies
- Social monitoring — scans public social media for mentions and engagement, then calls enrich for notable accounts
这些技能包含逻辑——决定做什么,再委托数据源技能执行如何做。
Enrich 技能是最重要的编排技能。它决定:
- 这是 CREATE(新页)还是 UPDATE(新信号)?
- 该实体是何层级?(决定调用哪些数据源)
- 从源材料提取哪些信号类型?
- 按何顺序调用哪些数据源技能?
- 如何将结果写入 brain?
其他编排技能:
- Meeting ingestion——从会议工具拉取会议,创建 brain 会议页与分析,再对每位与会者和讨论的公司调用 enrich
- Email triage / executive assistant——处理收件箱、处理日程,遇到人物或公司时调用 enrich
- Social monitoring——扫描公开社交媒体的提及与互动,对值得关注的账号调用 enrich
3. Pipeline skills (end-to-end workflows)
These are the user-facing skills that chain multiple orchestration and data source skills together:
- Morning briefing — reads calendar + tasks + brain state + recent signals → produces a briefing
- Person research — given a name, runs full Tier 1 enrichment and presents the result
- Weekly brain maintenance — runs lint, flags stale pages, suggests enrichment targets
面向用户的技能,将多个编排与数据源技能串成链:
- Morning briefing——读日历 + 任务 + brain 状态 + 近期信号 → 生成简报
- Person research——给定姓名,运行完整 Tier 1 富化并呈现结果
- Weekly brain maintenance——运行 lint、标记陈旧页面、建议富化目标
How they compose
User says "tell me about Jane Doe"
→ Agent searches brain (grep/index)
→ Page is thin → calls enrich skill (orchestration)
→ enrich determines Tier 1
→ calls happenstance skill (data source) → gets LinkedIn URL
→ calls crustdata skill (data source) → gets full profile
→ calls exa skill (data source) → finds personal writing
→ calls web_search (built-in tool) → gets press coverage
→ calls meeting history (data source) → finds past meetings
→ writes brain page, saves .raw/ sidecar, cross-references
→ Agent presents the enriched page to user
用户说「告诉我 Jane Doe 的情况」
→ 智能体搜索 brain(grep/index)
→ 页面单薄 → 调用 enrich 技能(编排)
→ enrich 判定 Tier 1
→ 调用 happenstance 技能(数据源)→ 获得 LinkedIn URL
→ 调用 crustdata 技能(数据源)→ 获得完整档案
→ 调用 exa 技能(数据源)→ 找到个人写作
→ 调用 web_search(内置工具)→ 获得报道覆盖
→ 调用 meeting history(数据源)→ 找到过往会议
→ 写入 brain 页面、保存 .raw/ 侧车、交叉引用
→ 智能体向用户呈现富化后的页面
Cron fires "meeting ingestion" every afternoon
→ meeting-ingestion skill (orchestration) pulls new meetings
→ for each meeting: creates brain meeting page
→ for each attendee: calls enrich skill (orchestration)
→ enrich calls relevant data source skills based on tier
→ for each company discussed: calls enrich skill
→ extracts tasks, commits brain repo
Cron 每天下午触发「会议摄取」
→ meeting-ingestion 技能(编排)拉取新会议
→ 每场会议:创建 brain 会议页
→ 每位与会者:调用 enrich 技能(编排)
→ enrich 按层级调用相关数据源技能
→ 每个讨论到的公司:调用 enrich 技能
→ 提取任务、提交 brain 仓库
The key insight: data source skills are stateless and reusable. The enrich skill can call Crustdata whether the trigger was a meeting, an email, a social mention, or a direct user request. The data source skill doesn’t care where the request came from.
关键洞见:数据源技能无状态且可复用。 无论触发是会议、邮件、社交提及还是直接用户请求,enrich 技能都能调用 Crustdata。数据源技能不关心请求来自何处。
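The statelessness point can be illustrated with a toy registry. The functions below are hypothetical stand-ins that return canned dictionaries; they do not represent the real Happenstance or Crustdata APIs. What the sketch shows is the shape of the dependency: the orchestration layer knows about the trigger, the data sources never see it.

```python
from typing import Callable


# Hypothetical stand-ins for data source skills: plain functions, no memory of the caller.
def happenstance(name: str) -> dict:
    return {"source": "happenstance", "query": name}


def crustdata(name: str) -> dict:
    return {"source": "crustdata", "query": name}


DATA_SOURCES: dict[str, Callable[[str], dict]] = {
    "happenstance": happenstance,
    "crustdata": crustdata,
}


def enrich(entity: str, trigger: str, sources: list[str]) -> dict:
    """Orchestration: choose which sources to call; `trigger` is never passed down."""
    results = {name: DATA_SOURCES[name](entity) for name in sources}
    return {"entity": entity, "trigger": trigger, "results": results}
```

Because the data source functions take only the entity, `enrich("Jane Doe", "meeting", ...)` and `enrich("Jane Doe", "email", ...)` produce identical `results`.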
How Enrich Wires Into Everything
The enrich skill is the central hub. Every ingest pathway converges on it:
Enrich 技能是中心枢纽。每条摄取路径汇聚于此:
Meeting ingestion ───────┬─────────────────────────┬─── people enrichment API
Email triage ────────────┤                         ├─── company intelligence API
Social monitoring ───────┤      ENRICH SKILL       ├─── network search API
Contact sync ────────────┤     (orchestration)     ├─── semantic search API
Manual conversation ─────┤                         ├─── social search API
Calendar events ─────────┤                         ├─── web search
Webhooks ────────────────┴─────────────────────────┴─── meeting history API
                                      │
                                      ▼
                                 BRAIN REPO
                            (people/, companies/,
                             meetings/, deals/)
会议摄取 ────────────────┬─────────────────────────┬─── 人物富化 API
邮件分拣 ────────────────┤ ├─── 公司情报 API
社交监控 ────────────────┤ ENRICH 技能 ├─── 网络搜索 API
联系人同步 ──────────────┤ (编排) ├─── 语义搜索 API
手动对话 ────────────────┤ ├─── 社交搜索 API
日历事件 ────────────────┤ ├─── 网络搜索
Webhooks ────────────────┴─────────────────────────┴─── 会议历史 API
│
▼
BRAIN 仓库
(people/, companies/,
meetings/, deals/)
Every arrow into the enrich skill carries a signal (the raw information from the source) and an entity (the person or company to enrich). The enrich skill:
- Checks brain state — does a page exist? Is it thin?
- Determines tier — Tier 1 (full pipeline), Tier 2 (web + social + cross-ref), Tier 3 (source extraction only)
- Extracts signal from the source material (beliefs, motivations, trajectory, facts)
- Calls data source skills based on tier (each skill is a named, documented module)
- Writes to brain — CREATE (via RESOLVER.md) or UPDATE (append timeline, update compiled truth)
- Cross-references — updates all linked entity pages
- Saves raw data to .raw/ sidecar
- Commits to the brain repo
The critical wiring rule: every ingest skill must call enrich. This is not optional or aspirational. It’s structural. If a new ingest pathway is added (say, a Slack monitoring skill), its implementation must include “for each person/company mentioned, call the enrich skill.” If that line is missing, the brain stops compounding from that source.
每条指向 enrich 技能的箭头携带信号(来自来源的原始信息)与实体(要富化的人物或公司)。Enrich 技能:
- 检查 brain 状态——页面是否存在?是否单薄?
- 判定层级——Tier 1(全管道)、Tier 2(web + 社交 + 交叉引用)、Tier 3(仅来源提取)
- 从源材料提取信号(信念、动机、轨迹、事实)
- 按层级调用数据源技能(每个技能为命名、有文档的模块)
- 写入 brain——CREATE(经 RESOLVER.md)或 UPDATE(追加时间线、更新编译真相)
- 交叉引用——更新所有链接的实体页面
- 将原始数据保存到 .raw/ 侧车
- 提交到 brain 仓库
关键接线规则:每个摄取技能必须调用 enrich。 这不是可选项或理想目标。这是结构性的。若新增摄取路径(例如 Slack 监控技能),其实现必须包含「对每个提及的人物/公司调用 enrich 技能。」若缺这一行,brain 将无法从该来源复利增长。
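The wiring rule reduces to one loop that every ingest pathway needs. The sketch below assumes a hypothetical signal shape (a dict with an `entities` list) and passes `enrich` in as a callable so the loop can be shown without the rest of the system.

```python
from typing import Callable, Iterable


def ingest(signals: Iterable[dict], enrich: Callable[[str, dict], None]) -> int:
    """Process raw signals and call enrich once per (signal, entity) pair.

    Returns the number of enrich calls made.
    """
    calls = 0
    for signal in signals:
        for entity in signal.get("entities", []):
            enrich(entity, signal)  # the line that must not be missing
            calls += 1
    return calls
```

A new pathway such as the Slack monitor mentioned above would differ only in how it produces `signals`; the inner loop stays the same.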
Automated Cron Jobs
The brain doesn’t just grow when you’re actively using it. Cron jobs make the system autonomous — the brain is maintained, the inbox is triaged, meetings are ingested, and mentions are monitored even while you sleep.
Brain 不仅在你主动使用时增长。Cron 任务使系统自主——brain 被维护、收件箱被分拣、会议被摄取、提及被监控,即便你入睡亦然。
The cron architecture
Cron jobs run as isolated agent sessions — they get their own context, read their own skills, and don’t block the main conversation thread. They can post to specific notification channels (Telegram topics, Slack channels, Discord threads) or work silently.
Each cron job is essentially: “wake up, read a skill, do the work, post results (or stay silent if nothing happened), go back to sleep.”
Cron 任务以隔离的智能体会话运行——自有上下文、读自有技能,不阻塞主对话线程。可发布到特定通知渠道(Telegram 话题、Slack 频道、Discord 线程)或静默工作。
每个 cron 任务本质上是:「唤醒,读技能,干活,发布结果(若无事则保持静默),再休眠。」
Recommended cron jobs for a brain-powered system
High frequency (every 10-30 minutes):
- Email monitor — scan inbox, classify by priority, post digest to a notification channel. Handle low-risk items (scheduling, acknowledgments) directly.
- Message monitor — check key communication channels for unreplied messages from important contacts. Surface them with suggested responses.
高频(每 10–30 分钟):
- Email monitor——扫描收件箱、按优先级分类、将摘要发到通知渠道。直接处理低风险项(日程、确认)。
- Message monitor——检查关键沟通渠道中来自重要联系人的未回复消息。附带建议回复呈现。
Medium frequency (every 1-3 hours):
- Social radar — scan public social media for mentions of you or your organization, engagement, emerging narratives. Alert on items that need attention. Call enrich for notable new accounts engaging with you.
- Heartbeat — the omnibus check. Calendar lookahead, task review, email scan, brain state review. Post if something needs attention; stay silent if not.
中频(每 1–3 小时):
- Social radar——扫描公开社交媒体对你或组织的提及、互动、新兴叙事。需要关注时告警。对新互动的重要账号调用 enrich。
- Heartbeat——综合检查。日历前瞻、任务回顾、邮件扫描、brain 状态回顾。有需要关注则发布;否则保持静默。
Daily:
- Morning briefing — calendar + tasks + urgent items + overnight signals → one notification. The “here’s your day” message.
- Task prep — archive yesterday’s completed tasks, build today’s list from calendar + backlog + recurring items.
- Meeting ingestion — pull all new meetings from your meeting tool, run full ingestion (create meeting pages, propagate to entity pages, extract tasks). This is the heaviest cron job — it touches the most brain pages.
- Social media collection — archive your own posts, track engagement velocity, detect deletions. Feed into media/ pages.
每日:
- Morning briefing——日历 + 任务 + 紧急项 + 夜间信号 → 一条通知。「这是你的一天」消息。
- Task prep——归档昨日完成任务,从日历 + 待办 + 周期项构建今日列表。
- Meeting ingestion——从会议工具拉取所有新会议,完整摄取(创建会议页、传播到实体页、提取任务)。这是最重的 cron——触及最多 brain 页面。
- Social media collection——归档你自己的帖子、跟踪互动速度、检测删除。汇入 media/ 页面。
Weekly:
- Brain lint — run the full maintenance pass: contradictions, stale pages, orphans, missing cross-references, MECE filing violations. Post a report.
- Enrichment sweep — find brain pages that haven’t been enriched in 90+ days, or pages with many [No data yet] sections. Queue them for re-enrichment.
- Contact sync — pull recent additions from your contacts, cross-reference with brain. Create pages for significant new contacts.
每周:
- Brain lint——完整维护:矛盾、陈旧页面、孤儿、缺失交叉引用、MECE 归档违规。发布报告。
- Enrichment sweep——查找 90+ 天未富化的 brain 页面,或许多 [No data yet] 节的页面。排队重新富化。
- Contact sync——从联系人拉取近期新增,与 brain 交叉引用。为重要的新联系人创建页面。
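The enrichment sweep's selection rule can be sketched as a single predicate. The 90-day threshold and the `[No data yet]` marker come from the text; the cutoff of three placeholders for "many", the function name, and the idea that the last-enriched date is available as a `date` are assumptions.

```python
from datetime import date, timedelta


def needs_reenrichment(last_enriched: date, body: str, today: date,
                       max_age_days: int = 90, max_placeholders: int = 3) -> bool:
    """True if a page is stale (90+ days) or has many [No data yet] sections."""
    stale = (today - last_enriched) >= timedelta(days=max_age_days)
    sparse = body.count("[No data yet]") >= max_placeholders
    return stale or sparse
```

The weekly cron would apply this to each page and queue the matches, rather than enriching them inline.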
How crons feed the brain
The key insight: cron jobs are the autonomous enrichment engine. Without them, the brain only grows when you’re actively talking to the agent. With them:
- The email monitor encounters a person → calls enrich → brain grows
- The meeting ingestion processes a transcript → calls enrich for every attendee → brain grows
- The social radar detects a new notable account → calls enrich → brain grows
- The contact sync finds a new contact → calls enrich → brain grows
- The enrichment sweep finds stale pages → calls enrich with fresh data → brain stays current
The brain compounds 24/7 because the cron jobs are wired to call enrich. The user sleeps; the brain doesn’t.
关键洞见:cron 任务是自主富化引擎。 没有它们,brain 仅在你与智能体对话时增长。有了它们:
- 邮件监控遇到某人 → 调用 enrich → brain 增长
- 会议摄取处理转录 → 对每位与会者调用 enrich → brain 增长
- 社交雷达检测到新的重要账号 → 调用 enrich → brain 增长
- 联系人同步发现新联系人 → 调用 enrich → brain 增长
- 富化扫描发现陈旧页面 → 用新数据调用 enrich → brain 保持最新
Brain 24/7 复利,因为 cron 任务接线调用 enrich。用户睡眠;brain 不睡。
Cron job design rules
Silent when nothing happens. If a cron finds nothing new, it should produce no output. No “nothing to report” messages. This is critical — noisy crons get disabled.
Post to specific channels. Each cron posts to its designated notification channel (e.g., email cron → Emails topic, social radar → Social Alerts topic). Don’t mix signals.
Spawn sub-agents for heavy work. The cron session should stay lightweight. If meeting ingestion needs to process 5 meetings and update 30 entity pages, spawn sub-agents for the entity propagation.
Idempotent and checkpoint-aware. Every cron should track what it’s already processed (in a state file like meeting-notes-state.json) so it doesn’t redo work on the next run.
Respect quiet hours. Don’t post between 11 PM and 7 AM unless something is genuinely urgent. Crons should check the time before posting.
Every ingest cron must call enrich. This is the structural rule. A cron that processes meetings but doesn’t enrich attendees is a bug, not a feature.
无事则静默。 若 cron 未发现新内容,应无输出。不要发「无可报告」消息。这很关键——嘈杂的 cron 会被关掉。
发布到特定渠道。 每个 cron 发布到指定通知渠道(如邮件 cron → Emails 话题,社交雷达 → Social Alerts 话题)。不要混信号。
重活派生子智能体。 Cron 会话应保持轻量。若会议摄取需处理 5 场会议并更新 30 个实体页,为实体传播派生子智能体。
幂等且可检查点。 每个 cron 应跟踪已处理内容(如 meeting-notes-state.json 等状态文件),以免下次运行重复工作。
遵守安静时段。 除非真正紧急,不要在 23:00–7:00 之间发布。Cron 发布前应检查时间。
每个摄取 cron 必须调用 enrich。 这是结构规则。处理会议但不富化与会者的 cron 是 bug,不是功能。
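The quiet-hours rule is easy to get wrong because the window wraps past midnight. A minimal sketch, assuming a hypothetical helper `may_post` and a quiet window whose start is later in the day than its end:

```python
from datetime import time


def may_post(now: time, urgent: bool = False,
             quiet_start: time = time(23, 0), quiet_end: time = time(7, 0)) -> bool:
    """True if a cron may post at `now`; assumes the quiet window wraps past midnight."""
    in_quiet = now >= quiet_start or now < quiet_end
    return urgent or not in_quiet
```

Note the `or` in the window test: a time is quiet if it is after 11 PM or before 7 AM, which a single range comparison cannot express.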
Example: how it all fits together
A typical afternoon in an autonomous brain system:
自主 brain 系统中一个典型下午:
3:00 PM — Email monitor cron fires. Scans inbox. Finds 3 new emails: a scheduling request, a funding announcement, and a founder asking for advice.
- Handles the scheduling request directly (checks calendar, replies with available times)
- Calls enrich on the company in the funding announcement → updates company page with new round
- Posts the founder’s email to notification channel for the user to handle
15:00——邮件监控 cron 触发。扫描收件箱。发现 3 封新邮件:日程请求、融资公告、创始人求建议。
- 直接处理日程请求(查日历、回复可用时间)
- 对融资公告中的公司调用 enrich → 用新轮次更新公司页
- 将创始人的邮件发到通知渠道供用户处理
3:15 PM — Meeting ingestion cron fires. Finds 2 new meetings from today.
- Creates 2 brain meeting pages with analysis
- Calls enrich for 8 attendees across both meetings → updates 8 people pages
- Calls enrich for 3 companies discussed → updates 3 company pages
- Extracts 4 action items → adds to task list
15:15——会议摄取 cron 触发。发现今日 2 场新会议。
- 创建 2 个带分析的 brain 会议页
- 对两场会议共 8 位与会者调用 enrich → 更新 8 个人物页
- 对讨论的 3 家公司调用 enrich → 更新 3 个公司页
- 提取 4 条行动项 → 加入任务列表
3:30 PM — Social radar cron fires. Detects a journalist writing a thread about the user’s organization.
- Posts alert to Social Alerts channel
- Calls enrich on the journalist → creates/updates their people page with recent activity
15:30——社交雷达 cron 触发。检测到记者撰写关于用户组织的 thread。
- 向 Social Alerts 频道发告警
- 对记者调用 enrich → 用近期活动创建/更新其人物页
4:00 PM — Heartbeat fires. Calendar shows a meeting in 1 hour. Brain page for the attendee was last enriched 3 months ago.
- Triggers a fresh enrichment pass on the attendee
- Posts a prep note: “Meeting with X in 1 hour. Here’s what’s changed since you last met.”
16:00——Heartbeat 触发。日历显示 1 小时后有会。与会者的 brain 页上次富化是 3 个月前。
- 触发对该与会者的新一轮富化
- 发布准备笔记:「1 小时后与 X 会面。自上次见面以来的变化如下。」
The user didn’t ask for any of this. The brain grew by 12 pages and the user walked into their 4:00 PM meeting fully prepared — because the plumbing is wired correctly.
用户未要求任何一项。Brain 增长了 12 页,用户走进 16:00 的会议时已充分准备——因为管线接线正确。
Worked Examples From a Production System
These examples show how the architecture operates end-to-end. Names and specifics are genericized, but the skill chains are exact — every skill call, every file write, every cron trigger is how it actually works.
以下示例展示架构端到端如何运作。姓名与细节已泛化,但技能链是精确的——每次技能调用、每次文件写入、每次 cron 触发均与实际一致。
Example 1: Meeting Ingestion — The Full Chain
A cron job fires at 3:00 PM daily with the prompt: “Read skills/meeting-ingestion/SKILL.md and process today’s meetings.”
每日 15:00 触发的 cron 任务,提示为:「阅读 skills/meeting-ingestion/SKILL.md 并处理今日会议。」
Step 1: Skill chain loads. The meeting-ingestion skill’s preamble says “Read skills/enrich/SKILL.md” — so the agent loads the enrichment protocol before touching any data. This is critical: it means the agent knows how to handle every person and company it encounters.
Step 1: Skill chain loads. Meeting-ingestion 技能的前言写「阅读 skills/enrich/SKILL.md」——故智能体在碰任何数据前加载富化协议。这很关键:意味着智能体知道如何处理遇到的每个人与公司。
Step 2: Pull new meetings. The agent calls the meeting history data source skill (in this system, Circleback). It checks a state file (memory/meeting-notes-state.json) that tracks the last processed meeting ID. Finds 2 new meetings since last run.
Step 2: Pull new meetings. 智能体调用会议历史数据源技能(本系统中为 Circleback)。检查状态文件(memory/meeting-notes-state.json)跟踪上次处理的会议 ID。自上次运行以来发现 2 场新会议。
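The checkpoint in Step 2 could be sketched as follows. One deliberate difference from the text: the system described tracks the last processed meeting ID, while this sketch stores the set of all processed IDs, because the ordering of meeting IDs is not stated. The `processed_ids` key, the `id` field, and the function name are hypothetical.

```python
import json
from pathlib import Path


def new_meetings(meetings: list[dict], state_file: Path) -> list[dict]:
    """Return meetings not yet recorded in the state file, then record them."""
    if state_file.exists():
        state = json.loads(state_file.read_text(encoding="utf-8"))
    else:
        state = {}
    seen = set(state.get("processed_ids", []))
    fresh = [m for m in meetings if m["id"] not in seen]
    state["processed_ids"] = sorted(seen | {m["id"] for m in fresh})
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return fresh
```

For brevity this records IDs before any processing happens; a sturdier version would write the checkpoint only after each meeting is fully ingested, so that a crash mid-run does not skip work.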
Step 3: Process Meeting 1 — “Product Review with Sarah Chen and Mike Torres.”
The agent creates brain/meetings/2026-04-07-product-review.md with:
- Its own analysis above the line (not a copy of the AI summary — reframed through what the brain already knows about the attendees and the project)
- Key decisions, action items, and connections to other brain pages
- Full transcript below the line
智能体创建 brain/meetings/2026-04-07-product-review.md,包含:
- 线上方自有分析(非复制 AI 摘要——结合 brain 已知与会者与项目重新框定)
- 关键决策、行动项、与其他 brain 页的连接
- 线下方完整转录
Step 4: Enrich attendees.
For Sarah Chen — the agent searches the brain: grep -rl "Sarah Chen" /data/brain/people/. Finds people/sarah-chen.md. Reads it. Page was last enriched 2 weeks ago and has good coverage. → Tier 3: extract signal from this meeting only. Appends to her timeline: “2026-04-07 | Meeting — Pushed back on timeline for launch, wants more QA. Concerned about API stability.” Updates her Open Threads with the new follow-up item.
For Mike Torres — brain search finds people/mike-torres.md. Page exists but is thin: just a name, title, and one previous meeting entry. → Tier 2: web search + social + brain cross-reference. Agent finds his recent blog posts (feeds into What They Believe), his X activity (feeds into Hobby Horses), and cross-references him with two other brain pages that mention him. Updates compiled truth above the line.
For “Alex from Meridian Labs” (mentioned in the meeting but not an attendee) — brain search finds nothing. → CREATE path:
- Reads RESOLVER.md: “a specific named person” → people/
- Creates people/alex-rivera.md using the person template from schema.md
- Runs Tier 1 enrichment (full pipeline): network search → finds LinkedIn URL. People enrichment API → full structured profile. Semantic search → finds a conference talk. Web search → finds press coverage of Meridian Labs’ recent funding.
- Saves raw API responses to people/.raw/alex-rivera.json
- Cross-references: updates companies/meridian-labs.md to link to Alex’s page
对 Sarah Chen——智能体搜索 brain:grep -rl "Sarah Chen" /data/brain/people/。找到 people/sarah-chen.md。阅读。页面上次富化为 2 周前且覆盖良好。→ Tier 3:仅从本场会议提取信号。追加到其时间线:「2026-04-07 | Meeting — 对上线时间线提出异议,希望更多 QA。担心 API 稳定性。」用新跟进项更新其 Open Threads。
对 Mike Torres——brain 搜索找到 people/mike-torres.md。页面存在但单薄:仅姓名、头衔与一条先前会议记录。→ Tier 2:web search + 社交 + brain 交叉引用。智能体找到其近期博文(汇入 What They Believe)、X 动态(汇入 Hobby Horses),并与另两处提及他的 brain 页交叉引用。更新线上方编译真相。
对 「Meridian Labs 的 Alex」(会议提及但非与会者)——brain 搜索无结果。→ CREATE 路径:
- 读 RESOLVER.md:「具名具体人物」→ people/
- 用 schema.md 中人物模板创建 people/alex-rivera.md
- 运行 Tier 1 富化(全管道):network search → 找到 LinkedIn URL。人物富化 API → 完整结构化档案。Semantic search → 找到会议演讲。Web search → 找到 Meridian Labs 近期融资的报道。
- 将原始 API 响应保存到 people/.raw/alex-rivera.json
- 交叉引用:更新 companies/meridian-labs.md 链接到 Alex 的页面
Step 5: Enrich companies discussed. Meridian Labs was discussed extensively. Agent checks companies/meridian-labs.md — exists but funding data is 4 months stale. Calls company intelligence API → gets fresh round data. Updates the page.
Meridian Labs 被大量讨论。智能体检查 companies/meridian-labs.md——存在但融资数据已陈旧 4 个月。调用公司情报 API → 获得新轮次数据。更新页面。
Step 6: Extract action items. Finds 3 action items in the transcript → appends to ops/tasks.md.
在转录中找到 3 条行动项 → 追加到 ops/tasks.md。
Step 7: Repeat for Meeting 2. Same flow.
流程相同。
Step 8: Commit and notify.
cd /data/brain && git add -A && git commit -m "meetings: 2026-04-07 product review, investor sync" && git push
Posts summary to the Meetings notification channel: “Processed 2 meetings. Created 1 new person page (Alex Rivera). Updated 4 entity pages. 5 action items extracted.”
向 Meetings 通知渠道发布摘要:「已处理 2 场会议。新建 1 个人物页(Alex Rivera)。更新 4 个实体页。提取 5 条行动项。」
Files touched in this run:
本轮触及的文件:
brain/
├── meetings/
│ ├── 2026-04-07-product-review.md (CREATED)
│ └── 2026-04-07-investor-sync.md (CREATED)
├── people/
│ ├── sarah-chen.md (UPDATED — timeline + open threads)
│ ├── mike-torres.md (UPDATED — Tier 2 enrichment)
│ ├── alex-rivera.md (CREATED — Tier 1 enrichment)
│ └── .raw/
│ └── alex-rivera.json (CREATED — raw API responses)
├── companies/
│ └── meridian-labs.md (UPDATED — fresh funding data)
ops/
└── tasks.md (UPDATED — 5 new action items)
memory/
└── meeting-notes-state.json (UPDATED — checkpoint)
brain/
├── meetings/
│ ├── 2026-04-07-product-review.md (已创建)
│ └── 2026-04-07-investor-sync.md (已创建)
├── people/
│ ├── sarah-chen.md (已更新 — 时间线 + 开放线索)
│ ├── mike-torres.md (已更新 — Tier 2 富化)
│ ├── alex-rivera.md (已创建 — Tier 1 富化)
│ └── .raw/
│ └── alex-rivera.json (已创建 — 原始 API 响应)
├── companies/
│ └── meridian-labs.md (已更新 — 新融资数据)
ops/
└── tasks.md (已更新 — 5 条新行动项)
memory/
└── meeting-notes-state.json (已更新 — 检查点)
Example 2: Email Triage — Resolver + Enrichment in Action
An email monitor cron fires at 12:00 PM. Its prompt: “Read skills/executive-assistant/SKILL.md and skills/gmail/SKILL.md. Triage the inbox.”
邮件监控 cron 在 12:00 触发。提示:「阅读 skills/executive-assistant/SKILL.md 与 skills/gmail/SKILL.md。分拣收件箱。」
Step 1: Pull inbox. The agent calls the Gmail data source skill via its workspace integration. Gets 8 new emails since last check.
智能体通过工作区集成调用 Gmail 数据源技能。自上次检查以来有 8 封新邮件。
Step 2: Classify and handle. Most emails are routine: 2 scheduling confirmations (handled directly — checks calendar, sends confirmations), 3 newsletters (archived), 1 internal FYI (noted). But one stands out:
An email from “David Park, GP at Ridgeline Ventures” — subject: “Series A for NovaTech — co-invest opportunity.” The agent has never seen this person before.
多数邮件为常规:2 封日程确认(直接处理——查日历、发送确认),3 封通讯(归档),1 封内部知会(已记录)。但有一封突出:
来自「Ridgeline Ventures 的 GP David Park」的邮件——主题:「NovaTech 的 A 轮——联合投资机会。」智能体从未见过此人。
Step 3: Enrich the unknown sender.
The agent calls the enrich skill. Enrich searches the brain:
智能体调用 enrich 技能。Enrich 搜索 brain:
grep -rl "David Park" /data/brain/people/ --include="*.md" # no results
grep -rl "Ridgeline" /data/brain/companies/ --include="*.md" # no results
grep -rl "david.park@ridgeline" /data/brain/people/ --include="*.md" # no results (alias search)
No match. → CREATE path.
- Reads RESOLVER.md: “a specific named person” → people/
- Runs Tier 2 enrichment (this is an unsolicited email, not a key relationship yet):
- Web search: finds David Park’s profile on Ridgeline’s website. GP, focuses on enterprise SaaS. Previously at two other funds.
- Social search: finds his X account. Recent posts about AI infrastructure, developer tools. Reposted an article about NovaTech last week.
- Brain cross-reference: searches for NovaTech → finds companies/novatech.md exists (from a meeting 2 months ago). Cross-links.
- Creates people/david-park.md with what it found — role, fund, investment focus, public voice, connection to NovaTech.
- Also checks companies/ridgeline-ventures.md — doesn’t exist. Creates a thin page with what’s known from the web search.
无匹配。→ CREATE 路径。
- 读 RESOLVER.md:「具名具体人物」→ people/
- 运行 Tier 2 富化(此为不请自来的邮件,尚非关键关系):
- Web search:在 Ridgeline 网站找到 David Park 简介。GP,专注企业 SaaS。此前在另两家基金。
- Social search:找到其 X 账号。近期帖文关于 AI 基础设施、开发者工具。上周转发了关于 NovaTech 的文章。
- Brain 交叉引用:搜索 NovaTech → 找到 companies/novatech.md 已存在(来自 2 个月前的会议)。交叉链接。
- 创建 people/david-park.md,写入所发现内容——角色、基金、投资重点、公开声音、与 NovaTech 的连接。
- 同时检查 companies/ridgeline-ventures.md——不存在。用 web search 已知信息创建薄页。
Step 4: Back in the EA skill. Now the agent has context. It classifies the email:
- Priority: Medium (co-invest opportunity, not urgent)
- Context: David Park is a GP at a fund that focuses on enterprise SaaS. NovaTech is already in the brain from a previous meeting.
- Action needed: User should review
Posts to the Emails notification channel:
现在智能体有上下文。对邮件分类:
- 优先级:中(联合投资机会,非紧急)
- 上下文:David Park 是专注企业 SaaS 的基金的 GP。NovaTech 已在 brain 中,来自先前会议。
- 所需行动:用户应审阅
发布到 Emails 通知渠道:
Co-invest opportunity — NovaTech Series A
From: David Park, GP at Ridgeline Ventures
He’s reaching out about co-investing in NovaTech’s Series A. Ridgeline focuses on enterprise SaaS. NovaTech is already in the brain — you met their founder in February.
Open in Gmail
联合投资机会 — NovaTech A 轮
来自:Ridgeline Ventures 的 GP David Park
他联系联合投资 NovaTech 的 A 轮。Ridgeline 专注企业 SaaS。NovaTech 已在 brain 中——你 2 月见过其创始人。
在 Gmail 中打开
The email monitor didn’t just triage — it grew the brain by two pages (one person, one company) and cross-linked them to an existing entity.
邮件监控不只是分拣——它通过两页(一个人物、一个公司)扩展了 brain 并与已有实体交叉链接。
Example 3: The Compound Effect — How Context Builds Before a Meeting
This example shows how a completely unknown person becomes a rich brain page across 4 autonomous cron runs over 48 hours, with zero manual intervention. The result: you walk into a meeting fully prepared.
本例展示一个完全陌生的人如何在 48 小时内经 4 次自主 cron 运行变为丰富的 brain 页,零手动干预。结果:你走进会议时已充分准备。
Hour 0 — Social radar cron (Tuesday, 3:00 PM)
The social radar cron scans for mentions and engagement on X. It detects a reply to one of the user’s posts from an account named @lena_builds — a thoughtful, technical response about developer tooling that got 50+ likes.
The agent calls enrich. Brain search: no match for “Lena” or “lena_builds.” → CREATE, Tier 3 (minor mention — just a social interaction, not a relationship yet).
Creates people/lena-kovac.md with minimal data: X handle, display name, the reply text, and a note that she seems technical. No API calls — Tier 3 is source-extraction only.
社交雷达 cron 扫描 X 上的提及与互动。检测到用户某条帖子下的回复,账号 @lena_builds——关于开发者工具的有技术深度的回复,获赞 50+。
智能体调用 enrich。Brain 搜索:「Lena」或「lena_builds」无匹配。→ CREATE,Tier 3(次要提及——仅社交互动,尚非关系)。
创建 people/lena-kovac.md,最少数据:X 账号、显示名、回复文本、备注其看似技术向。无 API 调用——Tier 3 仅来源提取。
# Lena Kovac
> Technical builder. Engaged with a post about developer tooling on X.
## State
- **X:** @lena_builds
- **Relationship:** None yet — social interaction only
- **Confidence:** low (1 interaction)
---
## Timeline
- **2026-04-07** | X reply — Replied to post about developer tools.
Thoughtful technical take on compiler-driven UX. 50+ likes.
Hour 18 — Email monitor cron (Wednesday, 9:00 AM)
The morning email sweep finds an email from lena@kovac.dev — subject: “Loved your talk at the devtools summit — would love to chat about what we’re building.”
The agent calls enrich. Searches the brain:
早间邮件扫描发现来自 lena@kovac.dev 的邮件——主题:「很喜欢你在 devtools 峰会的演讲——想聊聊我们在做的东西。」
智能体调用 enrich。搜索 brain:
grep -rl "lena" /data/brain/people/ --include="*.md" # finds people/lena-kovac.md
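grep -rl "kovac.dev" /data/brain/people/ --include="*.md" # no alias match yet
Finds the existing page. Reads it — it’s thin (Tier 3, just the X reply). The email adds a new signal AND an email address. → Upgrade to Tier 2.
- Adds lena@kovac.dev to aliases in frontmatter
- Web search: finds her personal site (kovac.dev). She’s building a developer tools startup called Lattice. Previously on the compiler team at a major tech company.
- Social search: digs deeper into X. She posts regularly about developer experience, compilers, and Rust. About 3K followers.
- Brain cross-reference: searches for “Lattice” and “compiler”; finds a concept page on developer tools that links to 2 companies in the same space.
- Updates people/lena-kovac.md with substantive content: career history, what she’s building, her beliefs about developer tools, public voice.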
找到现有页面。阅读——单薄(Tier 3,仅 X 回复)。邮件带来新信号与邮箱地址。→ 升级为 Tier 2。
- 将 lena@kovac.dev 加入 frontmatter 的 aliases
- Web search:找到个人站(kovac.dev)——她在做名为 Lattice 的开发者工具初创。此前在某大厂编译器团队。
- Social search:更深挖 X。她定期发开发者体验、编译器、Rust。约 3K 粉丝。
- Brain 交叉引用:搜索「Lattice」与「compiler」——找到关于开发者工具的概念页,链接到同领域 2 家公司。
- 用实质内容更新 people/lena-kovac.md:职业史、在构建什么、对开发者工具的信念、公开声音。
Hour 26 — Executive assistant cron (Wednesday, 5:00 PM)
The afternoon EA sweep processes scheduling requests. One of the emails it triages is Lena’s — she asked to chat. The user’s calendar is open Thursday at 2 PM.
But the EA skill also checks: is there a calendar event already scheduled with this person? It searches the calendar — finds that Lena’s email (lena@kovac.dev) appears in a calendar event for Thursday at 2 PM (she booked through the user’s public booking link).
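The EA skill sees the meeting is tomorrow. Calls enrich again. Page exists and is now Tier 2 with decent coverage, but there’s a meeting tomorrow. → Upgrade to Tier 1.
- Network search: finds her LinkedIn URL. 2 mutual connections with the user.
- People enrichment API: full structured profile. Stanford CS, 4 years at a major tech company, founded Lattice 8 months ago.
- Semantic search: finds a conference talk of hers arguing that developer tools are stuck in 2015.
- Saves everything to people/.raw/lena-kovac.json
- Updates the brain page with full Tier 1 depth: beliefs, trajectory, what she’s building, assessment, network connections.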
下午 EA 扫描处理日程请求。其中一封是 Lena 的——她想聊天。用户日历周四 14:00 有空。
但 EA 技能也检查:是否已有与此人的日历事件?搜索日历——发现 Lena 的邮箱(lena@kovac.dev)出现在周四 14:00 的日历事件中(她通过用户的公开预约链接预订)。
EA 技能发现会议在明天。再次调用 enrich。页面存在且为 Tier 2、覆盖尚可,但明天有会。→ 升级为 Tier 1。
- Network search:找到 LinkedIn URL。与用户有 2 位共同好友。
- People enrichment API:完整结构化档案——斯坦福 CS、大厂 4 年、8 个月前创办 Lattice。
- Semantic search:找到她以「为何开发者工具卡在 2015」为题的会议演讲。
- 全部保存到 people/.raw/lena-kovac.json
- 用完整 Tier 1 深度更新 brain 页:信念、轨迹、在构建什么、评估、网络连接。
Hour 40 — Morning briefing cron (Thursday, 7:30 AM)
The morning briefing cron builds the daily prep. It reads the calendar: meeting with Lena Kovac at 2 PM. It reads people/lena-kovac.md — which is now a rich page.
Produces a prep note in the daily briefing:
早间简报 cron 构建当日准备。读日历:14:00 与 Lena Kovac 会面。读 people/lena-kovac.md——已是丰富页面。
在每日简报中生成准备笔记:
2:00 PM — Lena Kovac (Lattice)
Building a developer tools startup focused on compiler-driven UX. Stanford CS, 4 years on compilers at [major tech co]. Founded Lattice 8 months ago.
She replied to your devtools post on X last Tuesday (the technical one about compiler-driven UX that got traction). Then emailed the next morning — “loved your talk, want to chat about what we’re building.”
Her public writing argues that developer tools are stuck in a 2015 paradigm and that compiler intelligence should drive the entire editing experience. She gave a talk on this at DevTools Summit.
2 mutual connections. She’s technical, has founder energy, and is building in a space you care about.
14:00 — Lena Kovac(Lattice)
做专注编译器驱动 UX 的开发者工具初创。斯坦福 CS,[大厂] 编译器团队 4 年。8 个月前创办 Lattice。
她上周二回复了你在 X 上关于开发者工具的帖子(关于编译器驱动 UX、有传播的那条)。次日早晨又发邮件——「很喜欢你的演讲,想聊聊我们在做的东西。」
她的公开写作认为开发者工具卡在 2015 范式,编译器智能应驱动整个编辑体验。她在 DevTools Summit 就此做过演讲。
2 位共同好友。她技术强、有创始人能量,且在你关心的领域创业。
The compound effect: Lena went from unknown → thin Tier 3 page → substantive Tier 2 page → rich Tier 1 page → meeting prep note. Four cron runs over 48 hours. Zero manual enrichment requests. The user walks into the meeting knowing exactly who Lena is, what she cares about, and why she reached out — because every pipeline is wired to call enrich, and enrich knows how to escalate tier based on relationship signals.
This is the core insight of the brain system: knowledge compounds autonomously when the plumbing is wired correctly. Each cron job doesn’t just do its own job — it feeds the enrichment pipeline, which feeds every future cron job. The meeting ingestion cron creates pages that the morning briefing cron reads. The email monitor enriches people that the social radar first detected. The whole system is a flywheel.
复利效应: Lena 从未知 → 单薄 Tier 3 页 → 实质 Tier 2 页 → 丰富 Tier 1 页 → 会议准备笔记。48 小时内四次 cron。零手动富化请求。用户走进会议时清楚 Lena 是谁、她关心什么、为何联系——因为每条管道都接线调用 enrich,且 enrich 知道如何根据关系信号升级层级。
这是 brain 系统的核心洞见:接线正确时,知识可自主复利。 每个 cron 不只完成本职工作——还喂养富化流水线,而富化又喂养未来的每个 cron。会议摄取 cron 创建的页面供早间简报 cron 阅读。邮件监控富化社交雷达首先发现的人。整个系统是飞轮。
Ingest Workflows
These are the specific ingest patterns. Each one calls enrich as its terminal step.
以下为具体摄取模式。每一种都以 enrich 为终端步骤。
Meeting ingestion
After every meeting (via Circleback, Otter, Fireflies, or manual notes):
- Pull meeting notes + full transcript
- Create a brain meeting page with your own analysis (not just regurgitated AI summary) — reframe through what you know about the attendees’ world
- Propagate to entity pages — call enrich for every person and company discussed. A meeting is NOT fully ingested until entity pages are updated.
- Extract action items to task list
- Commit
每场会议后(经 Circleback、Otter、Fireflies 或手动笔记):
- 拉取会议笔记 + 完整转录
- 创建 brain 会议页,含你自己的分析(非仅复述 AI 摘要)——结合你对与会者世界的了解重新框定
- 传播到实体页——对讨论的每个人与公司调用 enrich。会议在实体页更新前不算摄取完成。
- 将行动项提取到任务列表
- 提交
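The meeting ingestion steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Meeting` dataclass, the `ingest_meeting` function, and the four injected callables (`write_page`, `enrich`, `add_task`, `commit`) are hypothetical names, and the slugs are made up. The one property taken from the text is the ordering: entity pages are enriched before the commit.

```python
from dataclasses import dataclass, field


@dataclass
class Meeting:
    slug: str
    analysis: str          # your own analysis, not just the AI summary
    transcript: str
    entities: list         # people and companies discussed, as page slugs
    action_items: list = field(default_factory=list)


def ingest_meeting(meeting, write_page, enrich, add_task, commit):
    """Run the ingestion steps in order; the commit only happens after enrichment."""
    write_page(f"meetings/{meeting.slug}.md", meeting.analysis)
    for entity in meeting.entities:
        enrich(entity, context=meeting.transcript)
    for item in meeting.action_items:
        add_task(item)
    commit(f"ingest meeting {meeting.slug}")


# Record the calls instead of touching a real repo.
calls = []
ingest_meeting(
    Meeting(
        slug="2025-01-09-lena-kovac",
        analysis="analysis...",
        transcript="transcript...",
        entities=["people/lena-kovac", "companies/lattice"],
        action_items=["Send intro"],
    ),
    write_page=lambda path, body: calls.append(("write", path)),
    enrich=lambda entity, context: calls.append(("enrich", entity)),
    add_task=lambda item: calls.append(("task", item)),
    commit=lambda message: calls.append(("commit", message)),
)
```

Passing the side effects in as callables keeps the ordering rule testable without a real transcript tool or git repository.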
Email ingestion
When processing email:
- Extract people and companies mentioned
- Call enrich with email context (tone, requests, relationship signals)
- Note scheduling, commitments, follow-ups
处理邮件时:
- 提取提及的人物与公司
- 带邮件上下文(语气、请求、关系信号)调用 enrich
- 记录日程、承诺、跟进
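As a starting point for the first step, the participants of an email can be pulled from its headers with Python's standard library. This sketch only covers header participants (From, To, Cc); finding people and companies mentioned in the body would need the agent itself. The function name `extract_contacts` and the sample message are assumptions for illustration.

```python
from email import message_from_string
from email.utils import getaddresses


def extract_contacts(raw_email):
    """Return (display name, lowercased address) pairs from From, To and Cc."""
    msg = message_from_string(raw_email)
    headers = []
    for name in ("From", "To", "Cc"):
        headers.extend(msg.get_all(name, []))
    return [(display, addr.lower()) for display, addr in getaddresses(headers) if addr]


raw = (
    "From: Lena Kovac <lena@kovac.dev>\n"
    "To: user@example.com\n"
    "Subject: Chat?\n"
    "\n"
    "Loved your talk, want to chat about what we're building.\n"
)
contacts = extract_contacts(raw)
```

Each returned address can then be matched against existing pages before calling enrich with the email context.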
Social media ingestion
When monitoring social media:
- Capture what people you track are saying publicly (beliefs, projects, opinions)
- Detect engagement patterns (who’s replying to you, who’s amplifying you)
- Call enrich for notable accounts → feed into “What They Believe” and “Hobby Horses” sections
监控社交媒体时:
- 捕获你跟踪的人公开说什么(信念、项目、观点)
- 检测互动模式(谁回复你、谁放大你)
- 对值得关注的账号调用 enrich → 汇入「What They Believe」与「Hobby Horses」节
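One simple way to detect engagement patterns is to count replies and reposts per account and treat repeat engagers as notable. This is a rough sketch under assumptions: the interaction records (dicts with `handle` and `kind`), the function name `notable_accounts`, and the threshold of two interactions are all invented for the example.

```python
from collections import Counter


def notable_accounts(interactions, min_interactions=2):
    """Return handles that replied to or reposted you at least `min_interactions` times."""
    counts = Counter(
        i["handle"] for i in interactions if i["kind"] in {"reply", "repost"}
    )
    return [handle for handle, n in counts.most_common() if n >= min_interactions]


interactions = [
    {"handle": "@lena", "kind": "reply"},
    {"handle": "@lena", "kind": "repost"},
    {"handle": "@bob", "kind": "reply"},
    {"handle": "@bob", "kind": "like"},
]
to_enrich = notable_accounts(interactions)
```

The handles in `to_enrich` are the candidates for an enrich call; what counts as notable is a judgment the real workflow would tune.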
Manual ingestion
When you mention someone or something in conversation:
- Your own comments are the highest-value signal — always capture these
- “Really sharp on the technical side, could be a good advisor for the infra project” → that goes in the person’s page immediately
- If the brain page is thin, trigger a full enrichment
当你在对话中提到某人或某事时:
- 你自己的评论是最高价值信号——始终捕获
- 「技术非常 sharp,可做 infra 项目的顾问」——立即写入该人物页
- 若 brain 页单薄,触发完整富化
Navigation and Concurrency
index.md — content catalog. Every page listed with a one-line summary. Useful for navigation and query routing.
log.md — chronological record of ingests and updates. Append-only.
At scale (500+ pages), add search tooling (embeddings, BM25, or tools like gbrain). At moderate scale, grep works well.
index.md——内容目录。每页一行摘要。用于导航与查询路由。
log.md——摄取与更新的时间序记录。仅追加。
规模达 500+ 页时,增加搜索工具(嵌入、BM25 或 gbrain 等工具)。中等规模下 grep 很好用。
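At moderate scale, the grep approach can be as small as the sketch below: a case-insensitive regex scan over every markdown file under the brain root. The function name `grep_pages` and the sample page are assumptions; the temporary directory only stands in for a real brain repo.

```python
import re
import tempfile
from pathlib import Path


def grep_pages(root, pattern):
    """Yield (relative path, line number, line) for each matching line in *.md files."""
    regex = re.compile(pattern, re.IGNORECASE)
    for path in sorted(Path(root).rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if regex.search(line):
                yield path.relative_to(root).as_posix(), lineno, line.strip()


# Demo against a throwaway directory that stands in for the brain repo.
with tempfile.TemporaryDirectory() as tmp:
    people = Path(tmp) / "people"
    people.mkdir()
    (people / "lena-kovac.md").write_text(
        "# Lena Kovac\nFounder of Lattice.\n", encoding="utf-8"
    )
    hits = list(grep_pages(tmp, r"lattice"))
```

Once the page count makes a full scan too slow or too noisy, that is the signal to move to the search tooling mentioned above.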
Write hotspots and concurrency
Once you have cron jobs, ingest jobs, and sub-agents all touching the brain repo, index.md and log.md become merge-conflict magnets. Every workflow wants to append to log.md and update index.md on every commit.
Practical mitigations:
- Treat index.md as derived, not hand-maintained. Instead of updating it in every ingest workflow, rebuild it periodically (daily or on-demand) by scanning the directory tree. This eliminates it as a write hotspot.
- Make log.md append-safe. Each entry is a self-contained line with a timestamp prefix. Concurrent appends to the end of the file rarely conflict. If they do, both sides are correct — just keep both lines.
- Commit in batches, not per-page. When an ingest job updates 10 entity pages, commit once at the end, not 10 times. This reduces conflict surface.
- Pull before push. Every workflow should git pull --rebase before pushing. With append-only log and independent entity pages, rebases almost always auto-resolve.
- Entity pages rarely conflict. Two workflows updating people/jane-doe.md at the same time is rare because they’re triggered by different signals about different people. The real conflict hotspots are the shared files (index.md, log.md), which is why those should be append-only or derived.
一旦你有 cron、摄取任务与子智能体同时触及 brain 仓库,index.md 与 log.md 会成为合并冲突热点。 每个工作流都想在每次提交时追加 log.md 并更新 index.md。
实用缓解:
- 将 index.md 视为派生,非手维护。 不要在每次摄取工作流中更新它,而是定期(每日或按需)通过扫描目录树重建。消除其作为写热点。
- 使 log.md 可安全追加。 每条目为带时间戳前缀的独立一行。对文件末尾的并发追加很少冲突。若冲突,两边都对——保留两行。
- 批量提交,非每页一交。 当摄取任务更新 10 个实体页时,最后提交一次,而非 10 次。减少冲突面。
- 推送前先 pull。 每个工作流应在推送前 git pull --rebase。在仅追加的 log 与独立实体页上,rebase 几乎总能自动解决。
- 实体页很少冲突。 两个工作流同时更新 people/jane-doe.md 罕见,因由不同人物的不同信号触发。真正冲突热点是共享文件(index.md、log.md),故这些应仅追加或派生。
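The first two mitigations can be sketched as two small helpers: one derives the index from the directory tree, the other formats a self-contained, timestamp-prefixed log line. The names `build_index` and `log_line` are hypothetical, and using the first non-empty line of each page as its summary is an assumption; a real brain would more likely read a dedicated summary field.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def build_index(root):
    """Derive index.md content from the tree: one line per page, never hand-edited."""
    lines = []
    for path in sorted(Path(root).rglob("*.md")):
        rel = path.relative_to(root).as_posix()
        if rel in {"index.md", "log.md"}:
            continue  # shared files are not content pages
        text = path.read_text(encoding="utf-8")
        # Assumption: the first non-empty line (usually the title) is the summary.
        summary = next((l.lstrip("# ").strip() for l in text.splitlines() if l.strip()), "")
        lines.append(f"- {rel}: {summary}")
    return "\n".join(lines) + "\n"


def log_line(message, when=None):
    """Format one self-contained log entry with a UTC timestamp prefix."""
    when = when or datetime.now(timezone.utc)
    return f"{when.isoformat(timespec='seconds')} {message}\n"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "people").mkdir()
    (root / "people" / "jane-doe.md").write_text("# Jane Doe\n", encoding="utf-8")
    (root / "log.md").write_text("", encoding="utf-8")
    index_text = build_index(root)

entry = log_line("enrich people/jane-doe.md", datetime(2025, 1, 9, 7, 30, tzinfo=timezone.utc))
```

Because every log entry is a single line, two workflows that append at the same time produce a conflict that can be resolved mechanically by keeping both lines.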
Maintenance (Lint)
Periodically (weekly), the agent should:
- Deduplication scan: Look for potential duplicate pages — similar names, same company, same email across different pages. Merge when confirmed.
- Contradictions: Check for conflicting facts between pages (e.g., two pages listing different roles for the same person at the same company).
- Staleness: Flag State sections superseded by newer Timeline entries.
- Orphans: Find pages with no inbound links.
- Open Threads: Check for items that seem resolved but weren’t moved to Timeline.
- Missing cross-references: Entity A mentions Entity B but doesn’t link to their page.
- Missing pages: Entities mentioned frequently but lacking their own page.
- MECE filing: Flag any pages that seem to be in the wrong directory.
- Source audit: Check people pages for unsourced claims in high-value sections (Beliefs, Motivations, Assessment). Flag claims without source type or date.
- Alias coverage: Check if recent meeting transcripts or emails contain name variants not yet in any page’s aliases field.
定期(每周),智能体应:
- 去重扫描: 查找潜在重复页——相似姓名、同一公司、不同页面上的同一邮箱。确认后合并。
- 矛盾: 检查页面间冲突事实(如同一人在同一公司的不同角色)。
- 陈旧: 标记 State 节被较新时间线条目取代的情况。
- 孤儿: 查找无入链的页面。
- Open Threads: 检查看似已解决但未移到时间线的项。
- 缺失交叉引用: 实体 A 提及实体 B 但未链接其页面。
- 缺失页面: 频繁提及但无独立页面的实体。
- MECE 归档: 标记看似放错目录的页面。
- 来源审计: 检查人物页高价值节(Beliefs、Motivations、Assessment)中无来源的断言。标记无来源类型或日期的断言。
- 别名覆盖: 检查近期会议转录或邮件是否含尚未写入任何页 aliases 字段的名字变体。
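Two of the lint checks, orphans and missing pages, can be approximated from the link graph alone. The sketch below assumes pages link to each other with repo-relative markdown links such as `[Lattice](companies/lattice.md)`; the link style, the function name `link_report`, and the sample pages are assumptions. Dangling links are used here as a proxy for missing pages, which is narrower than "mentioned frequently but lacking a page".

```python
import re

# Matches the target of a markdown link that points at a .md file.
LINK = re.compile(r"\]\(([^)#\s]+\.md)\)")


def link_report(pages):
    """pages maps repo-relative path -> markdown text. Returns (orphans, missing)."""
    inbound = {path: set() for path in pages}
    missing = set()
    for source, text in pages.items():
        for target in LINK.findall(text):
            if target in inbound:
                if target != source:  # self-links do not count as inbound
                    inbound[target].add(source)
            else:
                missing.add(target)
    orphans = sorted(path for path, sources in inbound.items() if not sources)
    return orphans, sorted(missing)


pages = {
    "people/lena-kovac.md": "Founder of [Lattice](companies/lattice.md).",
    "companies/lattice.md": "Founded by [Lena](people/lena-kovac.md). "
                            "Investor: [Acme](companies/acme.md).",
    "ideas/compiler-ux.md": "No links yet.",
}
orphans, missing = link_report(pages)
```

The other checks (contradictions, staleness, source audit) need the agent to read and compare content, so they do not reduce to a graph walk like this.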
What makes this different from RAG
RAG re-derives knowledge from scratch on every query. The brain pre-computes synthesis and keeps it current. Specifically:
- Cross-references are pre-built. You don’t need the LLM to discover that Person A works at Company B and was in Meeting C — that’s already linked.
- Contradictions are pre-flagged. When new data conflicts with old data, the agent resolves or flags it during ingest, not at query time.
- The compilation is persistent. Each source ingested makes the brain richer. Nothing is thrown away or re-derived.
- The structure itself is a prompt. Empty sections (“What They Believe: [No data yet]”) tell the agent what to look for next.
RAG 每次查询从零重新推导知识。Brain 预先计算综合并保持最新。具体而言:
- 交叉引用预先建立。 无需 LLM 去发现人物 A 在公司 B 且在场会议 C——已经链接。
- 矛盾预先标出。 新数据与旧数据冲突时,智能体在摄取时解决或标记,而非查询时。
- 编译是持久的。 每次摄取的来源都使 brain 更丰富。没有丢弃或重新推导。
- 结构本身即提示。 空节(「What They Believe: [No data yet]」)告诉智能体下次寻找什么。
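The last point, structure as a prompt, can be made concrete with a small scan for placeholder sections. This sketch assumes pages use markdown headings with the placeholder on a line below them; the function name `empty_sections` and the sample page are illustrative only.

```python
def empty_sections(markdown, placeholder="[No data yet]"):
    """Return the headings whose body still contains the placeholder text."""
    gaps, heading = [], None
    for line in markdown.splitlines():
        if line.startswith("#"):
            heading = line.lstrip("#").strip()
        elif placeholder in line and heading is not None:
            gaps.append(heading)
    return gaps


page = (
    "# Lena Kovac\n"
    "\n"
    "## What They Believe\n"
    "[No data yet]\n"
    "\n"
    "## Trajectory\n"
    "Stanford CS.\n"
)
gaps = empty_sections(page)
```

The returned headings are effectively a to-do list for the next enrichment pass on that page.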
Page Lifecycle
Brain pages can have implicit lifecycle states:
- Active: Current, recently updated, ongoing relationship or relevance
- Dormant: Not updated in 6+ months, relationship cooled, but still potentially relevant
- Archived: Moved to archive/ (dead companies, ended relationships, resolved deals). Historical record only.
- Graduated: For ideas that became projects, or projects that became programs. The old page links to the new one
During lint passes, flag pages that haven’t been updated in 6+ months for review. Some should be archived; others just need a fresh enrichment pass.
Brain 页面可有隐式生命周期状态:
- Active: 当前、近期更新、持续的关系或相关性
- Dormant: 6+ 个月未更新,关系降温,但仍可能相关
- Archived: 移至 archive/(倒闭公司、结束的关系、已结案交易)。仅历史记录。
- Graduated: 想法变为项目,或项目变为 program。旧页链接到新页
在 lint 中,标记 6+ 个月未更新的页面供审阅。部分应归档;部分只需新一轮富化。
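The six-month review flag can be sketched as a date comparison. The sketch takes last-updated dates as input rather than reading them from git or front matter, since the text does not say where they come from; the function name `pages_due_for_review`, the 183-day threshold, and the sample pages are assumptions.

```python
from datetime import date, timedelta


def pages_due_for_review(last_updated, today, dormant_after=timedelta(days=183)):
    """last_updated maps page path -> date of last update. Archived pages are skipped."""
    return sorted(
        path
        for path, updated in last_updated.items()
        if not path.startswith("archive/") and today - updated >= dormant_after
    )


last_updated = {
    "people/lena-kovac.md": date(2025, 1, 9),
    "companies/oldco.md": date(2024, 3, 1),
    "archive/deadco.md": date(2020, 1, 1),
}
due = pages_due_for_review(last_updated, today=date(2025, 1, 10))
```

Whether a flagged page gets archived or re-enriched remains a review decision, as described above.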
What makes a great brain
A great brain lets you walk into any meeting, call, or decision already knowing:
- Who this person is and what they care about (30 seconds of reading)
- What the company’s actual state is (not what they said 6 months ago)
- What open threads exist between you (promises, follow-ups, deals)
- What changed recently (latest timeline entries)
- What to watch for (patterns, concerns, opportunities)
A bad brain is a pile of LinkedIn scrapes and meeting transcripts nobody reads. A good brain is compiled context that makes you more effective in every interaction.
优秀的 brain 让你走进任何会议、通话或决策时已知晓:
- 此人是谁、关心什么(30 秒阅读)
- 公司实际状态(非 6 个月前他们说的)
- 我们之间有哪些开放线索(承诺、跟进、交易)
- 近期变化(最新时间线条目)
- 需注意什么(模式、关切、机会)
糟糕的 brain 是一堆无人阅读的 LinkedIn 抓取与会议转录。好的 brain 是让你每次互动更有效的编译情境。
The Resolver
When creating or filing a new page, walk this decision tree. Every piece of knowledge has exactly one home.
创建或归档新页面时,按此决策树行走。每条知识有且仅有一个归属。
Decision Tree
Start here: what is the primary subject?
- A specific named person → people/
- A specific organization (company, fund, nonprofit, government body) → companies/
- A financial transaction with terms and a decision to make → deals/
- A record of a specific meeting/call that happened at a specific time → meetings/
- Something being actively built (has a repo, spec, team, or active work) → projects/
- A raw possibility that nobody is building yet → ideas/
- A reusable mental model or thesis about how the world works → concepts/
- A piece of prose that could be published as a standalone work → writing/
- Your institution’s strategy, org, processes, internal dynamics → org/
- Political or civic landscape: policy, legislation, elections, government → civic/
- Public narrative or content operations: social monitoring, content pipeline, published posts → media/
- A major life program: an enduring domain of commitment containing multiple projects → programs/
- Domestic operations: properties, logistics, household management → household/
- Private notes: health, personal reflections, inner life → personal/
- A hiring pipeline: candidate evaluations, role specs, interview notes → hiring/
- A reusable LLM prompt: templates for getting specific outputs from models → prompts/
- A raw data import or snapshot: bulk exports, API dumps, periodic captures → sources/
- Agent deliverables: briefings, digests, and research produced by your agent → agent/
- Unsorted / quick capture: you don’t know where it goes yet → inbox/
- Dead / no longer relevant: historical pages with no active references → archive/
从这里开始:主要主题是什么?
- 具名具体人物 → people/
- 具体组织(公司、基金、非营利、政府机构)→ companies/
- 含条款与待决策的金融交易 → deals/
- 在特定时间发生的特定会议/通话记录 → meetings/
- 正在积极构建的事项(有仓库、规格、团队或活跃工作)→ projects/
- 尚无人构建的原始可能性 → ideas/
- 关于世界如何运作的可复用心智模型或论点 → concepts/
- 可作为独立作品发表的散文 → writing/
- 你所在机构的战略、组织、流程、内部动态 → org/
- 政治或公民版图:政策、立法、选举、政府 → civic/
- 公共叙事或内容运营:社交监控、内容管道、已发帖子 → media/
- 人生主要 program:包含多个项目的长期承诺领域 → programs/
- 家庭运营:房产、后勤、家庭管理 → household/
- 私人笔记:健康、个人反思、内在生活 → personal/
- 招聘管道:候选人评估、岗位说明、面试笔记 → hiring/
- 可复用 LLM 提示词:从模型获得特定输出的模板 → prompts/
- 原始数据导入或快照:批量导出、API dump、定期抓取 → sources/
- 智能体产出:简报、摘要、研究 → agent/
- 未分类 / 快速捕获:尚不知放哪 → inbox/
- 失效 / 不再相关:无活跃引用的历史页 → archive/
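The decision tree above reduces to a lookup from the primary subject to a single directory, with inbox/ as the fallback. In this sketch the directory names come from the tree, while the subject labels (`"person"`, `"organization"`, and so on) and the function name `resolve` are invented for illustration; deciding which label applies is still the agent's job.

```python
# Subject labels are hypothetical; directory names follow the decision tree.
DIRECTORY_FOR = {
    "person": "people/",
    "organization": "companies/",
    "deal": "deals/",
    "meeting": "meetings/",
    "project": "projects/",
    "idea": "ideas/",
    "concept": "concepts/",
    "writing": "writing/",
    "org": "org/",
    "civic": "civic/",
    "media": "media/",
    "program": "programs/",
    "household": "household/",
    "personal": "personal/",
    "hiring": "hiring/",
    "prompt": "prompts/",
    "source": "sources/",
    "agent": "agent/",
    "archive": "archive/",
}


def resolve(subject, slug):
    """Return the single home for a page; unknown subjects land in inbox/."""
    directory = DIRECTORY_FOR.get(subject, "inbox/")
    return f"{directory}{slug}.md"


lena_path = resolve("person", "lena-kovac")
unsorted_path = resolve("something-new", "odd-note")
```

Because each subject maps to exactly one directory, the lookup cannot file the same page in two places, which is the MECE property the resolver exists to protect.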
Disambiguation Rules
When two directories seem to fit, apply these tiebreakers:
当两个目录看似都适用时,使用这些决胜规则:
Person vs. Company: If the page is about them as a human (beliefs, relationship, trajectory), it’s people/. If it’s about the organization they run, it’s companies/. Both pages link to each other.
Concept vs. Idea: Could you teach it to someone as a framework? Concept. Could you build it? Idea.
Concept vs. Personal: Would you share it in a professional talk? Concept. Is it private reflection? Personal.
Idea vs. Project: Is anyone working on it? If yes, project. If no, idea. The graduation moment is when work starts.
Writing vs. Concepts: Concepts are distilled (200 words of compiled truth). Writing is developed prose (argument, narrative, story).
Writing vs. Media: Writing is the artifact. Media is the production and distribution infrastructure.
Org vs. Programs: org/ is institutional knowledge about your organization. programs/ is about your personal role and priorities within it.
Civic vs. People: Political figures get people/ pages. Their legislative agenda and political positioning as civic actors goes in civic/.
Household vs. Personal: If a PA would execute on it, it’s household (operational). If it’s private reflection, it’s personal (inner life).
Sources vs. .raw/ sidecars: Per-entity enrichment data → .raw/ sidecar next to the entity. Bulk multi-entity imports → sources/.
Agent vs. Sources: Sources feed into the brain. Agent deliverables are synthesized output that feeds into your reading.
Person vs. Company: 若页面关于作为人的他们(信念、关系、轨迹),为 people/。若关于他们运营的组织,为 companies/。两页相互链接。
Concept vs. Idea: 你能把它教成框架吗?Concept。能做出它吗?Idea。
Concept vs. Personal: 你会在专业演讲中分享吗?Concept。私人反思吗?Personal。
Idea vs. Project: 有人在做了吗?是则 project,否则 idea。毕业时刻是工作开始。
Writing vs. Concepts: Concepts 是提炼的(200 字编译真相)。Writing 是展开的散文(论证、叙事、故事)。
Writing vs. Media: Writing 是成品。Media 是生产与分发基础设施。
Org vs. Programs: org/ 是关于你机构的制度知识。programs/ 关于你在其中的个人角色与优先级。
Civic vs. People: 政治人物有 people/ 页。其立法议程与作为公民行动者的政治定位入 civic/。
Household vs. Personal: 若私人助理会执行,为 household(运营)。私人反思则为 personal(内在生活)。
Sources vs. .raw/ sidecars: 按实体的富化数据 → 实体旁的 .raw/ 侧车。批量多实体导入 → sources/。
Agent vs. Sources: Sources 喂入 brain。智能体产出是综合输出,喂入你的阅读。
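The Sources vs. .raw/ tiebreaker is mechanical enough to sketch. The sidecar layout follows the people/.raw/lena-kovac.json example used earlier in this document; the function names `raw_sidecar_path` and `import_destination`, and the `.json` extension for bulk imports under sources/, are assumptions.

```python
from pathlib import PurePosixPath


def raw_sidecar_path(page_path):
    """Map an entity page to its sidecar, e.g. people/x.md -> people/.raw/x.json."""
    page = PurePosixPath(page_path)
    return (page.parent / ".raw" / f"{page.stem}.json").as_posix()


def import_destination(entity_pages, import_name):
    """Per-entity data sits next to the entity; bulk multi-entity imports go to sources/."""
    if len(entity_pages) == 1:
        return raw_sidecar_path(entity_pages[0])
    return f"sources/{import_name}.json"


single = import_destination(["people/lena-kovac.md"], "enrichment")
bulk = import_destination(["people/a.md", "people/b.md"], "contacts-export")
```

The other tiebreakers (Concept vs. Idea, Household vs. Personal, and so on) depend on judgment about the content and are better left to the agent than to a rule like this.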
Special directories (not knowledge)
These exist in the brain repo but aren’t knowledge directories:
- templates/ — page templates for each type (structural, not content)
- attachments/ — binary attachments (images, PDFs). Managed by your editor, not by the agent.
这些存在于 brain 仓库中,但不是知识目录:
- templates/——各类页面模板(结构性,非内容)
- attachments/——二进制附件(图片、PDF)。由编辑器管理,非智能体。
MECE Check
Every piece of knowledge should pass through the decision tree above and land in exactly one directory. If you find something that genuinely doesn’t fit any category, file it in inbox/ and flag it — that’s a signal the schema needs to evolve.
每条知识应经上述决策树落在唯一目录。若发现真正无法归类的内容,归档到 inbox/ 并标记——这是模式需要演进的信号。
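One cheap heuristic for the MECE check is to look for the same page slug under more than one top-level directory. This is only a sketch: the function name `mece_violations` is hypothetical, and a shared slug is a hint rather than proof of duplication, so the output is a list to review and not a list to auto-merge.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def mece_violations(paths):
    """Return {slug: [directories]} for slugs filed under more than one directory."""
    homes = defaultdict(set)
    for raw_path in paths:
        page = PurePosixPath(raw_path)
        if len(page.parts) < 2:
            continue  # top-level files such as index.md have no directory
        homes[page.stem].add(page.parts[0])
    return {slug: sorted(dirs) for slug, dirs in homes.items() if len(dirs) > 1}


paths = [
    "ideas/compiler-ux.md",
    "projects/compiler-ux.md",
    "people/lena-kovac.md",
    "index.md",
]
violations = mece_violations(paths)
```

In this example the idea has probably graduated to a project, so the old page should link to the new one instead of living on as a second copy.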
Getting started
- Create the directory structure above (or let your agent create it)
- Write a RESOLVER.md decision tree and a README.md resolver for each directory
- Write a schema.md with your page conventions and templates
- Add the brain rules to your agent’s config (AGENTS.md or equivalent) as hard rules
- Start with one meeting transcript or one person you want to track
- Let the agent build the first few pages, review them, and iterate on the schema
- Wire up your meeting tool to trigger ingestion
- Wire up enrichment to fire on every new person/company signal
- The brain compounds from there
The human’s job: curate sources, direct analysis, ask good questions, and think about what it all means. The agent’s job: everything else.
- 按上文创建目录结构(或让智能体创建)
- 编写 RESOLVER.md 决策树及每个目录的 README.md 解析器
- 编写含页面约定与模板的 schema.md
- 将 brain 规则以硬规则加入智能体配置(AGENTS.md 或同等文件)
- 从一场会议转录或一个你想跟踪的人开始
- 让智能体构建前几页,你审阅并迭代 schema
- 将会议工具接线以触发摄取
- 将富化接线为每条新人物/公司信号触发
- Brain 由此复利
人的工作:策展来源、引导分析、提出好问题、思考这一切意味着什么。智能体的工作:其余一切。
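The first setup step, creating the directory structure, can be scripted in a few lines. The directory names come from the resolver and the special directories listed above; the function name `scaffold` and the placeholder README text are assumptions, and the demo writes into a temporary directory instead of a real repo.

```python
import tempfile
from pathlib import Path

KNOWLEDGE_DIRS = [
    "people", "companies", "deals", "meetings", "projects", "ideas", "concepts",
    "writing", "org", "civic", "media", "programs", "household", "personal",
    "hiring", "prompts", "sources", "agent", "inbox", "archive",
]
SPECIAL_DIRS = ["templates", "attachments"]


def scaffold(root):
    """Create every directory plus a stub README.md resolver in each knowledge directory."""
    root = Path(root)
    for name in KNOWLEDGE_DIRS + SPECIAL_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in KNOWLEDGE_DIRS:
        readme = root / name / "README.md"
        if not readme.exists():  # never overwrite a resolver someone already wrote
            readme.write_text(
                f"# {name}/ resolver\n\nTODO: describe what belongs here.\n",
                encoding="utf-8",
            )
    return sorted(p.name for p in root.iterdir() if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(tmp)
```

The function is safe to rerun: existing directories and README files are left untouched, so it can also be used to add a directory after the schema evolves.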