---
name: Memory Builder
description: >
  Extracts structured repository knowledge from source code, git history, ADRs, and
  conversations, then writes it into the layered memory system under docs/agent/.
  Use this skill when asked to "build memory", "update memory", "extract knowledge",
  "refresh context files", "rebuild repository docs", "generate memory", "update context",
  or any variation of rebuilding or refreshing the project's documentation context.
  Also use when a significant code change has landed and the docs/agent/ files may be stale.
version: 1.0.0
---

# Memory Builder

Build, update, and audit the project's structured memory system. The output files
are the primary context for all future agent sessions, so they must be accurate enough
that an agent with zero prior context can understand the project and make correct decisions.

## Memory Layout

```
docs/agent/
├── CURRENT_STATE.md        # Layer 1: Live State (changes every session)
├── context/                # Layer 2: Structured Knowledge (per milestone)
│   ├── ARCHITECTURE.md     # System design, workflows, data flow, pipeline
│   ├── CONVENTIONS.md      # Coding patterns, rules, gotchas
│   ├── COMPONENTS.md       # File map, extension points, test organization
│   ├── TECH_STACK.md       # Dependencies, APIs, providers, versions
│   └── GLOSSARY.md         # Term definitions, acronyms, data classes
├── decisions/              # Layer 3: Decisions (append-only ADRs)
│   └── NNN-short-name.md
└── plans/                  # Layer 4: Plans (implementation checklists)
    └── NNN-plan-name.md
```

Layers 1-2 are what this skill builds and maintains. Layers 3-4 are written by
other workflows (architecture-coordinator, planning sessions) and are read-only
inputs for this skill.

## Operational Modes

Pick the mode that fits the request:

### Mode 1: Full Build

Use when: no context files exist, or the user says "rebuild" / "build memory from scratch".

1. **Audit existing**: read any current `docs/agent/context/*.md` files. Note what exists.
2. **Discover sources**: dynamically scan the codebase (see Discovery below).
3. **Extract**: populate each context file (see Extraction Rules below).
4. **Cross-reference**: verify no contradictions between files.
5. **Quality gate**: run every check (see Quality Gate below).
6. **Report**: output the build report.

### Mode 2: Targeted Update

Use when: specific code changed and the user says "update memory" or similar.

1. Identify which source files changed (check `git diff`, user description, or recent commits).
2. Map changed files to affected context files using the tables under Extraction Rules below.
3. Read only the affected sections, update them, leave the rest untouched.
4. Update the freshness marker on modified files only.
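
Step 2 of this mode can be sketched as a prefix lookup. The mapping below is a hypothetical illustration, loosely derived from the Source columns in this document; it is not a definitive table and should be adjusted to the real repository layout.

```python
from pathlib import PurePosixPath

# Hypothetical prefix -> context-file mapping (an assumption, not project fact).
PREFIX_TO_CONTEXT = {
    "tradingagents/graph/": {"ARCHITECTURE.md", "COMPONENTS.md"},
    "tradingagents/default_config.py": {"ARCHITECTURE.md", "CONVENTIONS.md", "GLOSSARY.md"},
    "cli/": {"ARCHITECTURE.md", "COMPONENTS.md"},
    "pyproject.toml": {"TECH_STACK.md"},
}


def affected_context_files(changed_paths):
    """Return the sorted context files touched by a list of changed paths."""
    affected = set()
    for raw in changed_paths:
        # Normalise "./cli/main.py" and similar spellings to "cli/main.py".
        path = PurePosixPath(raw).as_posix()
        for prefix, targets in PREFIX_TO_CONTEXT.items():
            if path == prefix or path.startswith(prefix):
                affected |= targets
    return sorted(affected)
```

The input would typically be the output of `git diff --name-only`, one path per line. Paths that match no prefix produce no update, so an empty result is a prompt to check the mapping, not proof that nothing is stale.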

### Mode 3: Audit

Use when: the user says "audit memory" or "verify context files".

1. For each of the 5 context files, pick 5 factual claims at random.
2. Verify each claim against the actual source code.
3. Report: claim, source file checked, pass/fail, correction if needed.
4. Fix any failures found.
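
The random pick in step 1 can be sketched as follows. Treating every bullet, table row, and prose line as one candidate claim is an assumption made for illustration; the function name `sample_claims` is hypothetical.

```python
import random


def sample_claims(markdown_text, k=5, seed=None):
    """Pick up to k candidate claims (non-heading content lines) at random."""
    candidates = []
    for line in markdown_text.splitlines():
        stripped = line.strip()
        # Skip blanks, headings, and HTML comments such as the freshness marker.
        if not stripped or stripped.startswith(("#", "<!--")):
            continue
        # Skip table separator rows like |------|-----|.
        if set(stripped) <= set("|-: "):
            continue
        candidates.append(stripped)
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))
```

Passing a fixed `seed` makes an audit reproducible, which helps when a failure needs to be re-checked after the fix.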

---

## Discovery (Step 1)

Never hardcode file lists. Discover the codebase dynamically:

```bash
# Find all Python source files
find tradingagents/ cli/ -name "*.py" -type f

# Find config and metadata
ls pyproject.toml .env.example tradingagents/default_config.py

# Find all class definitions
grep -rn "^class " tradingagents/ cli/

# Find all @tool decorated functions
grep -rn "@tool" tradingagents/

# Find state/data classes
grep -rn "@dataclass" tradingagents/ cli/
```
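
Where shell tools are unavailable, the class scan can be approximated in Python. This is a minimal sketch that mirrors `grep "^class "` (top-level classes only); the function name `discover_classes` is hypothetical.

```python
import re
from pathlib import Path

# Matches "class Name" at the start of a line, like grep "^class ".
CLASS_RE = re.compile(r"^class\s+(\w+)", re.MULTILINE)


def discover_classes(roots):
    """Map each top-level class name to the first file that defines it."""
    found = {}
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # tolerate roots that do not exist in this checkout
        for path in sorted(root_path.rglob("*.py")):
            text = path.read_text(encoding="utf-8", errors="replace")
            for name in CLASS_RE.findall(text):
                found.setdefault(name, path.as_posix())
    return found
```

A call such as `discover_classes(["tradingagents", "cli"])` returns a dictionary that can seed the Class Inventory table. Nested classes are skipped on purpose, matching the anchored grep pattern.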

### High-Signal Files (read these first)

| File | Why |
|------|-----|
| `tradingagents/default_config.py` | All config keys, defaults, env var pattern |
| `tradingagents/graph/trading_graph.py` | Trading workflow, agent wiring |
| `tradingagents/graph/scanner_graph.py` | Scanner workflow, parallel execution |
| `tradingagents/graph/setup.py` | Agent factory creation, LLM tiers |
| `tradingagents/agents/utils/tool_runner.py` | Inline tool execution loop |
| `cli/main.py` | CLI commands, entry points |
| `pyproject.toml` | Dependencies, versions, Python constraint |
| `docs/agent/decisions/*.md` | Architectural constraints (binding rules) |

### Source Priority

When information conflicts, trust in this order:
1. Source code (always wins)
2. ADRs in `docs/agent/decisions/`
3. Test files
4. Git history
5. Config files
6. README / other docs
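
The ordering above amounts to a first-match lookup. A minimal sketch, assuming each conflicting value is tagged with the kind of source it came from; the tag names are hypothetical labels for the six entries in the list.

```python
# Most trusted first; order copied from the Source Priority list.
SOURCE_PRIORITY = ["source_code", "adr", "tests", "git_history", "config", "readme"]


def resolve_conflict(claims):
    """Given {source_kind: value}, return (kind, value) from the most trusted source."""
    for kind in SOURCE_PRIORITY:
        if kind in claims:
            return kind, claims[kind]
    raise ValueError("no recognised source kind in claims")
```

For example, `resolve_conflict({"readme": ">=3.9", "config": ">=3.10"})` returns `("config", ">=3.10")`, because config files outrank the README.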

---

## Extraction Rules (Step 2)

### CURRENT_STATE.md

Three sections only. Max 30 lines total.

```markdown
# Current Milestone
[One paragraph: what's the active focus and next deliverable]

# Recent Progress
[Bullet list: what shipped recently, merged PRs, key fixes]

# Active Blockers
[Bullet list: what's stuck or fragile, with brief context]
```

Source: `git log --oneline -20`, open PRs, known issues.
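
The two constraints on this file (exactly three sections, at most 30 lines) are easy to check mechanically. A minimal sketch; counting every line, including blanks and any leading comment, toward the limit is an assumption.

```python
REQUIRED_HEADINGS = ["# Current Milestone", "# Recent Progress", "# Active Blockers"]
MAX_LINES = 30


def check_current_state(text):
    """Return a list of problems; an empty list means the file fits the template."""
    problems = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        problems.append(f"too long: {len(lines)} lines (max {MAX_LINES})")
    # Only top-level headings count as sections.
    headings = [line.strip() for line in lines if line.startswith("# ")]
    if headings != REQUIRED_HEADINGS:
        problems.append(f"unexpected sections: {headings}")
    return problems
```

The same line-count idea extends to the 200-line cap on context files by changing `MAX_LINES`.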

### ARCHITECTURE.md

System-level design. Someone reading only this file should understand how the
system works end-to-end.

| Section | What to extract | Source |
|---------|----------------|--------|
| System Description | One paragraph: what the project is, agent count, vendor count, provider count | `setup.py`, `default_config.py` |
| 3-Tier LLM System | Table: tier name, config key, default model, purpose | `default_config.py` |
| LLM Provider Factory | Table: provider, config value, client class | `setup.py` or wherever `get_llm_client` lives |
| Data Vendor Routing | Table: vendor, capabilities, role (primary/fallback) | `dataflows/`, vendor modules |
| Trading Pipeline | ASCII diagram: analysts → debate → trader → risk → judge | `trading_graph.py` |
| Scanner Pipeline | ASCII diagram: parallel scanners → deep dive → synthesis | `scanner_graph.py` |
| Pipeline Bridge | How scanner output feeds into per-ticker analysis | `macro_bridge.py` or pipeline module |
| CLI Architecture | Commands, UI components (Rich, MessageBuffer) | `cli/main.py` |
| Key Source Files | Table: file path, purpose (10-15 files max) | Discovery step |

### CONVENTIONS.md

Rules and patterns. Written as imperatives: "Always...", "Never...", "Use...".
Every convention must cite its source file.

| Section | What to extract |
|---------|----------------|
| Configuration | Env var override pattern, per-tier overrides, `.env` loading |
| Agent Creation | Factory closure pattern `create_X(llm)`, tool binding rules |
| Tool Execution | Trading: `ToolNode` in graph. Scanners: `run_tool_loop()`. Constants: `MAX_TOOL_ROUNDS`, `MIN_REPORT_LENGTH` |
| Vendor Routing | Fail-fast default (ADR 011), `FALLBACK_ALLOWED` whitelist (list all methods), exception types to catch |
| yfinance Gotchas | `top_companies` has ticker as INDEX not column, `Sector.overview` has no perf data, ETF proxies |
| LangGraph State | Parallel writes need reducers, `_last_value` reducer, list all state classes |
| Threading | Rate limiter: never hold lock during sleep/IO, rate limits per vendor |
| Ollama | Never hardcode `localhost:11434`, use configured `base_url` |
| CLI Patterns | Typer commands, Rich UI, `MessageBuffer`, `StatsCallbackHandler` |
| Pipeline Patterns | `MacroBridge`, `ConvictionLevel`, `extract_json()` |
| Testing | pytest commands, markers, mocking patterns (`VENDOR_METHODS` dict patching), env isolation |
| Error Handling | Fail-fast by default, exception hierarchies per vendor, `raise from` chaining |

### COMPONENTS.md

Concrete inventory. The reader should be able to find any file or class quickly.

| Section | What to extract |
|---------|----------------|
| Directory Tree | Run `find` and format as tree. Verify against actual filesystem. |
| Class Inventory | Table: class name, file, purpose. Use `grep "^class "` to discover. |
| Extension Guides | Step-by-step: how to add a new analyst, scanner, vendor, config key, LLM provider |
| CLI Commands | Table: command name, description, entry point |
| Test Organization | Table: test file, type (unit/integration), what it covers, markers |

### TECH_STACK.md

Dependencies and external services. All version constraints come from `pyproject.toml`.

| Section | What to extract |
|---------|----------------|
| Core Dependencies | Table: package, version constraint (from pyproject.toml), purpose |
| External APIs | Table: service, auth env var, rate limit, primary use |
| LLM Providers | Table: provider, config value, client class, example models |
| Python Version | From pyproject.toml `requires-python` |
| Dev Tools | pytest version, conda, etc. |

Do NOT include packages that aren't in `pyproject.toml`. Do NOT list aspirational
or unused dependencies.
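
The "declared in `pyproject.toml`" rule can be checked against a parsed file. A minimal sketch, assuming the file follows the standard `[project]` metadata layout and has already been parsed into a dictionary; the function names are hypothetical.

```python
import re


def _normalise(name):
    """Normalise a distribution name so that A_b.c and a-b-c compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_names(pyproject):
    """Collect normalised package names from a parsed pyproject.toml dict."""
    project = pyproject.get("project", {})
    specs = list(project.get("dependencies", []))
    for group in project.get("optional-dependencies", {}).values():
        specs.extend(group)
    names = set()
    for spec in specs:
        # A requirement string starts with the distribution name.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
        if match:
            names.add(_normalise(match.group(1)))
    return names


def undeclared(candidates, pyproject):
    """Return candidate packages that pyproject.toml does not declare."""
    declared = declared_names(pyproject)
    return sorted(name for name in candidates if _normalise(name) not in declared)
```

On Python 3.11 or later the dictionary can come from the standard-library `tomllib.load`, opened in binary mode. Candidates should be distribution names, since an import name can differ from the package that provides it.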

### GLOSSARY.md

Every project-specific term, organized by domain. Each term cites its source file.

| Domain Section | Terms to include |
|---------------|-----------------|
| Agents & Workflows | Trading Graph, Scanner Graph, Agent Factory, ToolNode, run_tool_loop, Nudge |
| Data Layer | route_to_vendor, VENDOR_METHODS, FALLBACK_ALLOWED, ETF Proxy, etc. |
| Configuration | quick_think, mid_think, deep_think, _env(), _env_int() |
| Vendor-Specific | Exception types (AlphaVantageError, FinnhubError, etc.) |
| State & Data Classes | All @dataclass classes, state types |
| Pipeline | MacroBridge, MacroContext, StockCandidate, TickerResult, ConvictionLevel |
| CLI | MessageBuffer, StatsCallbackHandler, AnalystType, FIXED_AGENTS |
| Constants | All significant constants with actual values and source line |

---

## Quality Gate (Step 3)

Every context file must pass ALL of these before you're done:

1. **Accurate**: Every statement is verifiable in the current source code. If you wrote
"17 agents", count them. If you wrote ">=3.10", check pyproject.toml.
2. **Current**: Reflects the latest code on the working branch, not an old snapshot.
3. **Complete**: All 8 subsystems are covered (agents, dataflows, graphs, pipeline, CLI,
LLM clients, config, tests). If a subsystem is missing, the gate fails.
4. **Concise**: No information duplicated across context files. Each fact lives in
exactly one file.
5. **Navigable**: A reader can find any specific fact within 2 scrolls or searches.
6. **Quantified**: Constants use actual values from source code (e.g., `MAX_TOOL_ROUNDS=5`),
never vague descriptions ("a maximum number of rounds").
7. **Cross-referenced**: Every convention cites a source file. Every glossary term
links to where it's defined.

### Mandatory Verification Steps

These checks catch the most common errors. Run them before declaring the gate passed.

1. **Dependency verification**: Parse `pyproject.toml`: the `dependencies` array under
`[project]` and the `[project.optional-dependencies]` table. Only list packages that actually appear there.
If a package exists in source imports but not in pyproject.toml, flag it as
"undeclared dependency"; do not silently add it to TECH_STACK.

2. **Model name verification**: Read `default_config.py` and extract the actual model
identifiers (e.g., `"gpt-4o-mini"`, not guessed names). Cross-check any model names
in ARCHITECTURE.md against what's actually in the config.

3. **Agent count verification**: Run `grep -rn "def create_" tradingagents/agents/` and
count unique agent factory functions. Use the real count, not an estimate.

4. **ADR cross-reference verification**: Every ADR cited in context files (e.g., "ADR 011")
must exist in `docs/agent/decisions/`. Run `ls docs/agent/decisions/` and confirm.

5. **Class existence verification**: Every class listed in COMPONENTS.md must exist in
the codebase. Run `grep -rn "^class ClassName" tradingagents/ cli/` for each one.
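
Checks 3 and 4 lend themselves to small scripts. The sketch below assumes ADR files follow the `NNN-short-name.md` pattern from the Memory Layout and that citations look like "ADR 011" or "ADR-011"; both function names are hypothetical.

```python
import re
from pathlib import Path

ADR_REF = re.compile(r"\bADR[ -]?(\d{1,3})\b")
FACTORY_DEF = re.compile(r"^\s*def\s+(create_\w+)\s*\(", re.MULTILINE)


def missing_adrs(context_text, decisions_dir):
    """Return ADR numbers cited in the text that have no NNN-*.md file."""
    existing = set()
    for path in Path(decisions_dir).glob("*.md"):
        match = re.match(r"(\d+)-", path.name)
        if match:
            existing.add(int(match.group(1)))
    cited = {int(number) for number in ADR_REF.findall(context_text)}
    return sorted(cited - existing)


def count_agent_factories(agents_dir):
    """Count unique create_* function names under the agents directory."""
    names = set()
    for path in Path(agents_dir).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        names.update(FACTORY_DEF.findall(text))
    return len(names)
```

An empty list from `missing_adrs` means every cited ADR exists. The factory count is only as good as the naming convention: a helper that happens to start with `create_` would be counted too, so spot-check the names before writing the number down.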

### What NOT to do

- Do not copy code blocks into docs; reference the file and line instead
- Do not describe aspirational or planned features as current facts
- Do not use stale information from old branches or outdated READMEs
- Do not round numbers; use the exact values from source
- Do not skip CLI or pipeline subsystems (common oversight)
- Do not list dependencies without version constraints from pyproject.toml
- Do not list model names you haven't verified in default_config.py
- Do not include packages from imports that aren't declared in pyproject.toml
- Do not exceed 200 lines per context file; if you're over, split or trim

## Freshness Markers (Step 4)

Every context file starts with:
```
<!-- Last verified: YYYY-MM-DD -->
```
Set to today's date when creating or updating the file.
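
Stamping can be made idempotent so that repeated updates never stack markers. A minimal sketch, assuming the marker always sits on the first line in exactly the format shown above; the function name `stamp` is hypothetical.

```python
import re
from datetime import date

# Matches an existing marker only at the very start of the file.
MARKER_RE = re.compile(r"\A<!-- Last verified: \d{4}-\d{2}-\d{2} -->\n?")


def stamp(text, today=None):
    """Return text with the first-line freshness marker set to the given date."""
    today = today or date.today()
    marker = f"<!-- Last verified: {today.isoformat()} -->\n"
    # Drop any existing marker, then prepend the new one.
    body = MARKER_RE.sub("", text, count=1)
    return marker + body
```

In Targeted Update mode, call it only for the files that were actually modified, so untouched files keep their older verification date.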

## Build Report (Step 5)

After completing any mode, output:

```markdown
## Memory Build Report

**Mode**: Build | Update | Audit
**Date**: YYYY-MM-DD

### Files
| File | Status | Lines |
|------|--------|-------|
| CURRENT_STATE.md | created/updated/unchanged | N |
| context/ARCHITECTURE.md | created/updated/unchanged | N |
| ... | ... | ... |

### Sources Consulted
- [list of key files read]

### Quality Gate
| Criterion | Status |
|-----------|--------|
| Accurate | pass/fail |
| Current | pass/fail |
| Complete | pass/fail |
| Concise | pass/fail |
| Navigable | pass/fail |
| Quantified | pass/fail |
| Cross-referenced | pass/fail |

### Subsystem Coverage
- [x] Agents
- [x] Dataflows
- [x] Graphs
- [x] Pipeline
- [x] CLI
- [x] LLM Clients
- [x] Config
- [x] Tests
```
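
The Files table of the report can be generated from collected results, which keeps the status values consistent. A minimal sketch; the function name and the tuple layout are assumptions.

```python
ALLOWED_STATUSES = {"created", "updated", "unchanged"}


def render_files_table(rows):
    """Render (file, status, line_count) rows as the report's Files table."""
    out = ["| File | Status | Lines |", "|------|--------|-------|"]
    for name, status, count in rows:
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {status}")
        out.append(f"| {name} | {status} | {count} |")
    return "\n".join(out)
```

Rejecting unknown statuses early avoids reports that mix spellings such as "modified" and "updated".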