TradingAgents/docs/STRUCTURED_OUTPUT_GUIDE.md

142 lines
4.1 KiB
Markdown

# Structured Output Implementation Guide
This guide shows how to use structured outputs in TradingAgents to eliminate manual parsing and improve reliability.
## Overview
Structured outputs use Pydantic schemas to ensure LLM responses match expected formats. This eliminates:
- Manual JSON parsing
- String manipulation errors
- Type validation issues
- Response format inconsistencies
## Available Schemas
Located in `tradingagents/schemas/llm_outputs.py`:
- **TradeDecision**: Trading decisions (BUY/SELL/HOLD) with rationale
- **TickerList**: List of validated ticker symbols
- **MarketMovers**: Market gainers and losers
- **InvestmentOpportunity**: Ranked investment opportunities
- **RankedOpportunities**: Multiple opportunities with market context
- **DebateDecision**: Research manager debate decisions
- **RiskAssessment**: Risk management decisions
## Usage Examples
### Basic Usage
```python
from tradingagents.schemas import TickerList
from tradingagents.utils.structured_output import get_structured_llm
# Configure LLM for structured output
structured_llm = get_structured_llm(llm, TickerList)
# Get structured response
response = structured_llm.invoke([HumanMessage(content=prompt)])
# Response is a dict matching the schema
tickers = response.get("tickers", [])
```
### Discovery Graph Example
```python
# Before (manual parsing):
response = llm.invoke([HumanMessage(content=prompt)])
content = response.content.replace("```json", "").replace("```", "").strip()
movers = json.loads(content) # Can fail!
# After (structured output):
from tradingagents.schemas import MarketMovers
structured_llm = llm.with_structured_output(
schema=MarketMovers.model_json_schema(),
method="json_schema"
)
response = structured_llm.invoke([HumanMessage(content=prompt)])
movers = response.get("movers", []) # Always valid!
```
### Trade Decision Example
```python
from tradingagents.schemas import TradeDecision
structured_llm = get_structured_llm(llm, TradeDecision)
prompt = "Based on this analysis, should I buy AAPL?"
response = structured_llm.invoke(prompt)
# Guaranteed structure:
decision = response["decision"] # "BUY", "SELL", or "HOLD"
rationale = response["rationale"] # string
confidence = response["confidence"] # "high", "medium", or "low"
key_factors = response["key_factors"] # list of strings
```
## Implementation Checklist
When adding structured outputs to a new area:
1. **Define Schema**: Create or use existing Pydantic model in `schemas/llm_outputs.py`
2. **Update Prompt**: Modify prompt to request JSON output matching schema
3. **Configure LLM**: Use `with_structured_output()` or `get_structured_llm()`
4. **Access Response**: Use dict access instead of parsing
5. **Remove Parsing**: Delete old JSON parsing, regex, or string manipulation code
## Current Implementation Status
**Implemented**:
- Discovery Graph ticker extraction (Reddit, Twitter)
- Discovery Graph market movers parsing
🔄 **Recommended Next**:
- Trader final decision extraction
- Research manager debate decisions
- Risk manager assessments
- Discovery ranker output
## Benefits
- **Type Safety**: Pydantic validates all fields
- **No Parsing Errors**: No more `json.loads()` failures
- **Better Prompts**: Schema defines exact output format
- **Easier Testing**: Mock responses match schema
- **Self-Documenting**: Schema shows expected structure
## Adding New Schemas
1. Define in `tradingagents/schemas/llm_outputs.py`:
```python
class MySchema(BaseModel):
field1: str = Field(description="What this field contains")
field2: Literal["option1", "option2"] = Field(description="Limited choices")
field3: List[str] = Field(description="List of items")
```
2. Export in `tradingagents/schemas/__init__.py`
3. Use in your code:
```python
from tradingagents.schemas import MySchema
structured_llm = llm.with_structured_output(
schema=MySchema.model_json_schema(),
method="json_schema"
)
```
## Provider Support
Structured outputs work with:
- ✅ OpenAI (GPT-4, GPT-3.5)
- ✅ Google (Gemini models)
- ✅ Anthropic (Claude models)
- ✅ Local models via Ollama/OpenRouter
All use the same `with_structured_output()` API.