
name	description	version
Memory Builder	Extracts structured repository knowledge from source code, git history, ADRs, and conversations, then writes it into the layered memory system under docs/agent/. Use this skill when asked to "build memory", "update memory", "extract knowledge", "refresh context files", "rebuild repository docs", "generate memory", "update context", or any variation of rebuilding or refreshing the project's documentation context. Also use when a significant code change has landed and the docs/agent/ files may be stale.	1.0.0

Memory Builder

Build, update, and audit the project's structured memory system. The output files are the primary context for all future agent sessions — they must be accurate enough that an agent with zero prior context can understand the project and make correct decisions.

Memory Layout

docs/agent/
├── CURRENT_STATE.md              # Layer 1: Live State (changes every session)
├── context/                      # Layer 2: Structured Knowledge (per milestone)
│   ├── ARCHITECTURE.md           #   System design, workflows, data flow, pipeline
│   ├── CONVENTIONS.md            #   Coding patterns, rules, gotchas
│   ├── COMPONENTS.md             #   File map, extension points, test organization
│   ├── TECH_STACK.md             #   Dependencies, APIs, providers, versions
│   └── GLOSSARY.md               #   Term definitions, acronyms, data classes
├── decisions/                    # Layer 3: Decisions (append-only ADRs)
│   └── NNN-short-name.md
└── plans/                        # Layer 4: Plans (implementation checklists)
    └── NNN-plan-name.md

Layers 1-2 are what this skill builds and maintains. Layers 3-4 are written by other workflows (architecture-coordinator, planning sessions) and are read-only inputs for this skill.

Operational Modes

Pick the mode that fits the request:

Mode 1: Full Build

Use when: no context files exist, or the user says "rebuild" / "build memory from scratch".

1. Audit existing: read any current docs/agent/context/*.md files. Note what exists.
2. Discover sources: dynamically scan the codebase (see Discovery below).
3. Extract: populate each context file (see Extraction Rules below).
4. Cross-reference: verify no contradictions between files.
5. Quality gate: run every check (see Quality Gate below).
6. Report: output the build report.

Mode 2: Targeted Update

Use when: specific code changed and the user says "update memory" or similar.

1. Identify which source files changed (check git diff, user description, or recent commits).
2. Map changed files to affected context files using the extraction table below.
3. Read only the affected sections, update them, leave the rest untouched.
4. Update the freshness marker on modified files only.
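
The mapping step above can be sketched as a small lookup. This is a minimal illustration, not part of the skill: the `AFFECTED_BY` table and the `affected_context_files` helper are hypothetical, and the patterns are guesses based on the extraction tables. The changed paths are assumed to come from something like `git diff --name-only`.

```python
from fnmatch import fnmatch

# Hypothetical mapping from source-path patterns to the context files they feed.
# Illustrative only: derive the real mapping from the extraction tables.
AFFECTED_BY = {
    "tradingagents/default_config.py": {"ARCHITECTURE.md", "CONVENTIONS.md", "GLOSSARY.md"},
    "tradingagents/graph/*.py": {"ARCHITECTURE.md", "COMPONENTS.md"},
    "cli/*.py": {"ARCHITECTURE.md", "COMPONENTS.md"},
    "pyproject.toml": {"TECH_STACK.md"},
}


def affected_context_files(changed_paths):
    """Return the sorted context files touched by the given changed paths."""
    hits = set()
    for path in changed_paths:
        for pattern, targets in AFFECTED_BY.items():
            # fnmatch's "*" also matches "/", so patterns cover subdirectories.
            if fnmatch(path, pattern):
                hits |= targets
    return sorted(hits)
```

A path that matches no pattern yields an empty list, which is a useful signal that the mapping itself may need a new row.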

Mode 3: Audit

Use when: the user says "audit memory" or "verify context files".

1. For each of the 5 context files, pick 5 factual claims at random.
2. Verify each claim against the actual source code.
3. Report: claim, source file checked, pass/fail, correction if needed.
4. Fix any failures found.
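
One way to make the random pick reproducible is sketched below. It assumes, purely for illustration, that each non-empty line that is not a heading, comment, table rule, or fence counts as one candidate claim. The `sample_claims` helper is hypothetical.

```python
import random


def sample_claims(text, k=5, seed=None):
    """Pick up to k candidate claim lines from a context file's text."""
    candidates = [
        line.strip()
        for line in text.splitlines()
        if line.strip()
        # Skip headings, HTML comments, table separator rows, and code fences.
        and not line.lstrip().startswith(("#", "<!--", "|--", "```"))
    ]
    rng = random.Random(seed)  # a fixed seed makes the audit repeatable
    return rng.sample(candidates, min(k, len(candidates)))
```

Recording the seed in the audit report would let a later session re-check the same claims.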

Discovery (Step 1)

Never hardcode file lists. Discover the codebase dynamically:

# Find all Python source files
find tradingagents/ cli/ -name "*.py" -type f

# Find config and metadata
ls pyproject.toml .env.example tradingagents/default_config.py

# Find all class definitions
grep -rn "^class " tradingagents/ cli/

# Find all @tool decorated functions
grep -rn "@tool" tradingagents/

# Find state/data classes
grep -rn "@dataclass" tradingagents/ cli/
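
The same discovery can be sketched in Python with the standard `ast` module, which avoids false matches inside comments or strings. This is an assumption-laden alternative to the commands above, not a replacement for them. The `discover` helper is hypothetical, and unlike `grep "^class "` it also reports nested classes.

```python
import ast
from pathlib import Path


def _decorator_name(dec):
    # Handles @tool, @tool(...), @module.tool and @module.tool(...).
    if isinstance(dec, ast.Call):
        dec = dec.func
    if isinstance(dec, ast.Attribute):
        return dec.attr
    if isinstance(dec, ast.Name):
        return dec.id
    return ""


def discover(roots):
    """Return (classes, tools, dataclasses) as lists of (path, name) pairs."""
    classes, tools, dataclasses = [], [], []
    for root in roots:
        for path in sorted(Path(root).rglob("*.py")):
            try:
                tree = ast.parse(path.read_text(encoding="utf-8"))
            except (SyntaxError, UnicodeDecodeError):
                continue  # skip files that cannot be parsed
            for node in ast.walk(tree):
                if not isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                    continue
                names = {_decorator_name(d) for d in node.decorator_list}
                if isinstance(node, ast.ClassDef):
                    classes.append((str(path), node.name))
                    if "dataclass" in names:
                        dataclasses.append((str(path), node.name))
                elif "tool" in names:
                    tools.append((str(path), node.name))
    return classes, tools, dataclasses
```

For this repository the call would presumably be `discover(["tradingagents", "cli"])`, mirroring the directories used in the commands above.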

High-Signal Files (read these first)

File	Why
`tradingagents/default_config.py`	All config keys, defaults, env var pattern
`tradingagents/graph/trading_graph.py`	Trading workflow, agent wiring
`tradingagents/graph/scanner_graph.py`	Scanner workflow, parallel execution
`tradingagents/graph/setup.py`	Agent factory creation, LLM tiers
`tradingagents/agents/utils/tool_runner.py`	Inline tool execution loop
`cli/main.py`	CLI commands, entry points
`pyproject.toml`	Dependencies, versions, Python constraint
`docs/agent/decisions/*.md`	Architectural constraints (binding rules)

Source Priority

When information conflicts, trust in this order:

1. Source code (always wins)
2. ADRs in docs/agent/decisions/
3. Test files
4. Git history
5. Config files
6. README / other docs
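
The priority order above amounts to "first match wins". A minimal sketch, assuming conflicting values have already been collected per source kind; the labels in `SOURCE_PRIORITY` and the `resolve` helper are hypothetical names chosen for this example.

```python
# Hypothetical labels for the six source kinds, most trusted first.
SOURCE_PRIORITY = ["source_code", "adr", "tests", "git_history", "config", "readme"]


def resolve(claims):
    """Given {source_kind: value}, return (kind, value) from the most trusted source."""
    for kind in SOURCE_PRIORITY:
        if kind in claims:
            return kind, claims[kind]
    raise ValueError("no recognised source in claims")
```

For example, `resolve({"readme": ">=3.9", "config": ">=3.10"})` picks the config value because config files outrank the README.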

Extraction Rules (Step 2)

CURRENT_STATE.md

Three sections only. Max 30 lines total.

# Current Milestone
[One paragraph: what's the active focus and next deliverable]

# Recent Progress
[Bullet list: what shipped recently, merged PRs, key fixes]

# Active Blockers
[Bullet list: what's stuck or fragile, with brief context]

Source: git log --oneline -20, open PRs, known issues.
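
The template constraints (three sections, 30 lines) are easy to check mechanically. The sketch below assumes the limit counts every line in the file, including blank ones, and that the section headings appear exactly as shown above. The `check_current_state` helper is hypothetical.

```python
REQUIRED_SECTIONS = ["# Current Milestone", "# Recent Progress", "# Active Blockers"]
MAX_LINES = 30


def check_current_state(text):
    """Return a list of problems; an empty list means the file fits the template."""
    lines = text.splitlines()
    problems = []
    # Only top-level headings count as sections.
    headings = [ln.strip() for ln in lines if ln.startswith("# ")]
    if headings != REQUIRED_SECTIONS:
        problems.append(f"expected sections {REQUIRED_SECTIONS}, found {headings}")
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines exceeds the {MAX_LINES}-line limit")
    return problems
```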

ARCHITECTURE.md

System-level design. Someone reading only this file should understand how the system works end-to-end.

Section	What to extract	Source
System Description	One paragraph: what the project is, agent count, vendor count, provider count	`setup.py`, `default_config.py`
3-Tier LLM System	Table: tier name, config key, default model, purpose	`default_config.py`
LLM Provider Factory	Table: provider, config value, client class	`setup.py` or wherever `get_llm_client` lives
Data Vendor Routing	Table: vendor, capabilities, role (primary/fallback)	`dataflows/`, vendor modules
Trading Pipeline	ASCII diagram: analysts → debate → trader → risk → judge	`trading_graph.py`
Scanner Pipeline	ASCII diagram: parallel scanners → deep dive → synthesis	`scanner_graph.py`
Pipeline Bridge	How scanner output feeds into per-ticker analysis	`macro_bridge.py` or pipeline module
CLI Architecture	Commands, UI components (Rich, MessageBuffer)	`cli/main.py`
Key Source Files	Table: file path, purpose (10-15 files max)	Discovery step

CONVENTIONS.md

Rules and patterns. Written as imperatives — "Always...", "Never...", "Use...". Every convention must cite its source file.

Section	What to extract
Configuration	Env var override pattern, per-tier overrides, `.env` loading
Agent Creation	Factory closure pattern `create_X(llm)`, tool binding rules
Tool Execution	Trading: `ToolNode` in graph. Scanners: `run_tool_loop()`. Constants: `MAX_TOOL_ROUNDS`, `MIN_REPORT_LENGTH`
Vendor Routing	Fail-fast default (ADR 011), `FALLBACK_ALLOWED` whitelist (list all methods), exception types to catch
yfinance Gotchas	`top_companies` has ticker as INDEX not column, `Sector.overview` has no perf data, ETF proxies
LangGraph State	Parallel writes need reducers, `_last_value` reducer, list all state classes
Threading	Rate limiter: never hold lock during sleep/IO, rate limits per vendor
Ollama	Never hardcode `localhost:11434`, use configured `base_url`
CLI Patterns	Typer commands, Rich UI, `MessageBuffer`, `StatsCallbackHandler`
Pipeline Patterns	`MacroBridge`, `ConvictionLevel`, `extract_json()`
Testing	pytest commands, markers, mocking patterns (`VENDOR_METHODS` dict patching), env isolation
Error Handling	Fail-fast by default, exception hierarchies per vendor, `raise from` chaining

COMPONENTS.md

Concrete inventory. The reader should be able to find any file or class quickly.

Section	What to extract
Directory Tree	Run `find` and format as tree. Verify against actual filesystem.
Class Inventory	Table: class name, file, purpose. Use `grep "^class "` to discover.
Extension Guides	Step-by-step: how to add a new analyst, scanner, vendor, config key, LLM provider
CLI Commands	Table: command name, description, entry point
Test Organization	Table: test file, type (unit/integration), what it covers, markers

TECH_STACK.md

Dependencies and external services. All version constraints come from pyproject.toml.

Section	What to extract
Core Dependencies	Table: package, version constraint (from pyproject.toml), purpose
External APIs	Table: service, auth env var, rate limit, primary use
LLM Providers	Table: provider, config value, client class, example models
Python Version	From pyproject.toml `requires-python`
Dev Tools	pytest version, conda, etc.

Do NOT include packages that aren't in pyproject.toml. Do NOT list aspirational or unused dependencies.
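
A minimal sketch of checking documented packages against pyproject.toml follows. It assumes the file has already been parsed into a dict, for example with `tomllib.load` on Python 3.11+, and that names are compared after PEP 503 normalisation. The `declared_packages` and `undeclared` helpers are hypothetical.

```python
import re


def _normalize(name):
    # PEP 503 normalisation: lowercase, runs of "-", "_", "." become "-".
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_packages(pyproject):
    """Collect normalised package names from a parsed pyproject.toml mapping."""
    project = pyproject.get("project", {})
    requirements = list(project.get("dependencies", []))
    for extra in project.get("optional-dependencies", {}).values():
        requirements.extend(extra)
    names = set()
    for req in requirements:
        # The distribution name is the leading run before extras, versions, or markers.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match:
            names.add(_normalize(match.group(1)))
    return names


def undeclared(documented, pyproject):
    """Names listed in TECH_STACK.md that pyproject.toml does not declare."""
    declared = declared_packages(pyproject)
    return sorted(n for n in documented if _normalize(n) not in declared)
```

Anything returned by `undeclared` should be removed from TECH_STACK.md or flagged, never silently kept.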

GLOSSARY.md

Every project-specific term, organized by domain. Each term cites its source file.

Domain Section	Terms to include
Agents & Workflows	Trading Graph, Scanner Graph, Agent Factory, ToolNode, run_tool_loop, Nudge
Data Layer	route_to_vendor, VENDOR_METHODS, FALLBACK_ALLOWED, ETF Proxy, etc.
Configuration	quick_think, mid_think, deep_think, _env(), _env_int()
Vendor-Specific	Exception types (AlphaVantageError, FinnhubError, etc.)
State & Data Classes	All @dataclass classes, state types
Pipeline	MacroBridge, MacroContext, StockCandidate, TickerResult, ConvictionLevel
CLI	MessageBuffer, StatsCallbackHandler, AnalystType, FIXED_AGENTS
Constants	All significant constants with actual values and source line

Quality Gate (Step 3)

Every context file must pass ALL of these before you're done:

Accurate — Every statement is verifiable in the current source code. If you wrote "17 agents", count them. If you wrote ">=3.10", check pyproject.toml.
Current — Reflects the latest code on the working branch, not an old snapshot.
Complete — All 8 subsystems covered: agents, dataflows, graphs, pipeline, CLI, LLM clients, config, tests. If a subsystem is missing, the gate fails.
Concise — No information duplicated across context files. Each fact lives in exactly one file.
Navigable — A reader can find any specific fact within 2 scrolls or searches.
Quantified — Constants use actual values from source code (e.g., MAX_TOOL_ROUNDS=5), never vague descriptions ("a maximum number of rounds").
Cross-referenced — Every convention cites a source file. Every glossary term links to where it's defined.
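
The Concise criterion can be partly approximated by looking for identical lines that appear in more than one context file. This is a rough heuristic under stated assumptions: it only catches verbatim repeats (after whitespace and case folding), and the `duplicated_lines` helper and its `min_length` threshold are hypothetical.

```python
from collections import defaultdict


def duplicated_lines(files, min_length=20):
    """files: {filename: text}. Return {line: [filenames]} for lines in 2+ files."""
    seen = defaultdict(set)
    for name, text in files.items():
        for line in text.splitlines():
            key = " ".join(line.split()).lower()  # fold whitespace and case
            # Ignore short lines, headings, freshness markers, and table rules.
            if len(key) >= min_length and not key.startswith(("#", "<!--", "|--")):
                seen[key].add(name)
    return {line: sorted(names) for line, names in seen.items() if len(names) > 1}
```

Paraphrased duplicates still need a human or agent read-through; an empty result does not prove the criterion passes.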

Mandatory Verification Steps

These checks catch the most common errors. Run them before declaring the gate passed.

Dependency verification: Parse the dependencies array in the [project] table of pyproject.toml, plus the [project.optional-dependencies] table. Only list packages that actually appear there. If a package exists in source imports but not in pyproject.toml, flag it as "undeclared dependency"; do not silently add it to TECH_STACK.
Model name verification: Read default_config.py and extract the actual model identifiers (e.g., "gpt-4o-mini", not guessed names). Cross-check any model names in ARCHITECTURE.md against what's actually in the config.
Agent count verification: Run grep -rn "def create_" tradingagents/agents/ and count unique agent factory functions. Use the real count, not an estimate.
ADR cross-reference verification: Every ADR cited in context files (e.g., "ADR 011") must exist in docs/agent/decisions/. Run ls docs/agent/decisions/ and confirm.
Class existence verification: Every class listed in COMPONENTS.md must exist in the codebase. Run grep -rn "^class ClassName" tradingagents/ cli/ for each one.
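
The ADR cross-reference check can be sketched as follows. It assumes citations look like "ADR 011" (as in the example above) and that decision files follow the NNN-short-name.md pattern from the memory layout. The `missing_adrs` helper is hypothetical.

```python
import re
from pathlib import Path

# Matches "ADR 011", "ADR-011" and "ADR011"; the exact citation style is an assumption.
ADR_REF = re.compile(r"\bADR[ -]?(\d{1,3})\b")


def missing_adrs(context_text, decisions_dir):
    """Return ADR numbers cited in the text with no NNN-*.md file in decisions_dir."""
    cited = {int(n) for n in ADR_REF.findall(context_text)}
    present = set()
    for path in Path(decisions_dir).glob("*.md"):
        match = re.match(r"(\d+)-", path.name)
        if match:
            present.add(int(match.group(1)))
    return sorted(cited - present)
```

An empty list means every cited ADR has a matching file; it does not check that the ADR actually says what the context file claims.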

What NOT to do

Do not copy code blocks into docs — reference the file and line instead
Do not describe aspirational or planned features as current facts
Do not use stale information from old branches or outdated READMEs
Do not round numbers — use the exact values from source
Do not skip CLI or pipeline subsystems (common oversight)
Do not list dependencies without version constraints from pyproject.toml
Do not list model names you haven't verified in default_config.py
Do not include packages from imports that aren't declared in pyproject.toml
Do not exceed 200 lines per context file — if you're over, split or trim

Freshness Markers (Step 4)

Every context file starts with:

<!-- Last verified: YYYY-MM-DD -->

Set to today's date when creating or updating the file.
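
A small sketch of stamping a file with the marker, assuming the marker always occupies the first line in exactly the format shown above. The `stamp` helper is hypothetical; it replaces an existing marker rather than stacking a second one.

```python
import re
from datetime import date

# An existing marker is only recognised at the very start of the file.
MARKER = re.compile(r"\A<!-- Last verified: \d{4}-\d{2}-\d{2} -->\n?")


def stamp(text, today=None):
    """Return text whose first line is a freshness marker for the given date."""
    today = today or date.today()
    marker = f"<!-- Last verified: {today.isoformat()} -->\n"
    body = MARKER.sub("", text, count=1)  # drop any old marker first
    return marker + body
```

Because the old marker is stripped before the new one is added, calling `stamp` twice with the same date leaves the file unchanged.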

Build Report (Step 5)

After completing any mode, output:

## Memory Build Report

**Mode**: Build | Update | Audit
**Date**: YYYY-MM-DD

### Files
| File | Status | Lines |
|------|--------|-------|
| CURRENT_STATE.md | created/updated/unchanged | N |
| context/ARCHITECTURE.md | created/updated/unchanged | N |
| ... | ... | ... |

### Sources Consulted
- [list of key files read]

### Quality Gate
| Criterion | Status |
|-----------|--------|
| Accurate | pass/fail |
| Current | pass/fail |
| Complete | pass/fail |
| Concise | pass/fail |
| Navigable | pass/fail |
| Quantified | pass/fail |
| Cross-referenced | pass/fail |

### Subsystem Coverage
- [x] Agents
- [x] Dataflows
- [x] Graphs
- [x] Pipeline
- [x] CLI
- [x] LLM Clients
- [x] Config
- [x] Tests
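
The report template above can be filled from structured results. This is a minimal sketch: the `render_report` function and its parameter shapes are hypothetical, while the headings, criteria, and subsystem names are taken from the template.

```python
from datetime import date

CRITERIA = ["Accurate", "Current", "Complete", "Concise",
            "Navigable", "Quantified", "Cross-referenced"]
SUBSYSTEMS = ["Agents", "Dataflows", "Graphs", "Pipeline",
              "CLI", "LLM Clients", "Config", "Tests"]


def render_report(mode, files, gate, sources, covered, today=None):
    """files: [(name, status, line_count)]; gate: {criterion: bool}; covered: set of subsystems."""
    today = today or date.today()
    out = ["## Memory Build Report", "",
           f"**Mode**: {mode}", f"**Date**: {today.isoformat()}", "",
           "### Files", "| File | Status | Lines |", "|------|--------|-------|"]
    out += [f"| {name} | {status} | {n} |" for name, status, n in files]
    out += ["", "### Sources Consulted"]
    out += [f"- {s}" for s in sources]
    out += ["", "### Quality Gate", "| Criterion | Status |", "|-----------|--------|"]
    # A criterion missing from the gate dict is reported as a failure.
    out += [f"| {c} | {'pass' if gate.get(c) else 'fail'} |" for c in CRITERIA]
    out += ["", "### Subsystem Coverage"]
    out += [f"- [{'x' if s in covered else ' '}] {s}" for s in SUBSYSTEMS]
    return "\n".join(out) + "\n"
```

Treating a missing criterion as "fail" keeps the report honest when a check was skipped rather than run.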
