Product Requirements Document: FundamentalDataService Completion
Overview
Complete the FundamentalDataService to provide strongly-typed fundamental financial data to trading agents using a local-first data strategy with gap detection and intelligent caching.
Current State Analysis
Issues to Fix
- CRITICAL: Service calls `FinnhubClient` methods with string dates, but the client expects `date` objects
- CRITICAL: References non-existent `self.simfin_client` instead of `self.finnhub_client`
- Missing strongly-typed interfaces between components
- Incomplete local-first strategy implementation
- No concrete gap detection logic
- Missing error recovery for partial data
What Works
- ✅ `FinnhubClient` fully implemented with strict `date` object interface
- ✅ `FundamentalDataRepository` with dataclass-based storage
- ✅ `FundamentalContext` Pydantic model for agent consumption
- ✅ Basic service structure and error handling
Technical Requirements
1. Strongly-Typed Interfaces
Client → Service Interface
# FinnhubClient methods (already implemented)
def get_balance_sheet(symbol: str, frequency: str, report_date: date) -> dict[str, Any]
def get_income_statement(symbol: str, frequency: str, report_date: date) -> dict[str, Any]
def get_cash_flow(symbol: str, frequency: str, report_date: date) -> dict[str, Any]
Service → Repository Interface
# Repository methods (already implemented)
def has_data_for_period(symbol: str, start_date: str, end_date: str, frequency: str) -> bool
def get_data(symbol: str, start_date: str, end_date: str, frequency: str) -> dict[str, Any]
def store_data(symbol: str, cache_data: dict, frequency: str, overwrite: bool) -> bool
def clear_data(symbol: str, start_date: str, end_date: str, frequency: str) -> bool
Service → Agent Interface
# Service output (already defined)
def get_context(symbol: str, start_date: str, end_date: str, frequency: str, force_refresh: bool) -> FundamentalContext
2. Local-First Data Strategy
Flow
- Repository Lookup: Check `FundamentalDataRepository.has_data_for_period()`
- Gap Detection: Identify missing data periods using `detect_fundamental_gaps()`
- Selective Fetching: Fetch only missing data from `FinnhubClient`
- Cache Updates: Store new data via `repository.store_data()`
- Context Assembly: Return validated `FundamentalContext`
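The five steps above can be sketched in isolation. This is a minimal illustration only: `fetch_local_first` is a hypothetical helper, a plain dict stands in for `FundamentalDataRepository`, and a stub callable stands in for `FinnhubClient`.

```python
from datetime import date
from typing import Any, Callable


def fetch_local_first(
    expected: list[date],
    cache: dict[date, dict[str, Any]],
    fetch: Callable[[date], dict[str, Any]],
) -> dict[date, dict[str, Any]]:
    """Return one report per expected date, calling `fetch` only for gaps."""
    # Steps 1-2: look up the cache and work out which report dates are missing.
    missing = [d for d in expected if d not in cache]
    # Steps 3-4: fetch only the missing reports and store them in the cache.
    for report_date in missing:
        cache[report_date] = fetch(report_date)
    # Step 5: assemble the result from the now-complete cache.
    return {d: cache[d] for d in expected}


# Usage: a stub fetcher (hypothetical) records which dates hit the "API".
calls: list[date] = []


def fake_fetch(report_date: date) -> dict[str, Any]:
    calls.append(report_date)
    return {"report_date": report_date.isoformat()}


cache = {date(2024, 3, 31): {"report_date": "2024-03-31"}}
expected = [date(2024, 3, 31), date(2024, 6, 30)]
reports = fetch_local_first(expected, cache, fake_fetch)
```

Only the Q2 date reaches the stub fetcher; a second call with the same arguments would trigger no fetches at all, which is the behaviour the "minimal API calls" requirement asks for.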
Gap Detection Implementation
def detect_fundamental_gaps(self, symbol: str, start_date: str, end_date: str, frequency: str) -> list[str]:
    """
    Returns list of report dates that need fetching.

    Example: If requesting quarterly from 2024-01-01 to 2024-12-31
    and cache has Q1 and Q3, returns ["2024-06-30", "2024-12-31"]

    For quarterly: Check for Q1 (Mar 31), Q2 (Jun 30), Q3 (Sep 30), Q4 (Dec 31)
    For annual: Check for fiscal year ends
    """
    # Implementation should:
    # 1. Get existing report dates from repository
    # 2. Calculate expected report dates in requested period
    # 3. Return difference between expected and existing
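One way the quarterly case could work is sketched below, assuming calendar quarters only (fiscal-year variations are not handled). The helper names `expected_quarter_ends` and `find_gaps` are hypothetical, and the cached report dates are passed in as a set of ISO strings instead of being read from the repository.

```python
from datetime import date, timedelta


def expected_quarter_ends(start: date, end: date) -> list[date]:
    """Calendar quarter-end dates that fall within [start, end]."""
    result: list[date] = []
    for year in range(start.year, end.year + 1):
        for month in (3, 6, 9, 12):
            # Last day of a month = first day of the next month minus one day.
            first_of_next = date(year + month // 12, month % 12 + 1, 1)
            quarter_end = first_of_next - timedelta(days=1)
            if start <= quarter_end <= end:
                result.append(quarter_end)
    return result


def find_gaps(start_date: str, end_date: str, cached: set[str]) -> list[str]:
    """Expected quarterly report dates that are not in the cached set."""
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    expected = [d.isoformat() for d in expected_quarter_ends(start, end)]
    return [d for d in expected if d not in cached]


# Cache holds Q1 and Q3 of 2024, so Q2 and Q4 are the gaps.
gaps = find_gaps("2024-01-01", "2024-12-31", {"2024-03-31", "2024-09-30"})
```

With an empty cached set the function returns all four quarter ends, and with all four cached it returns an empty list, which matches the empty, partial, and complete cache scenarios listed under Testing Scenarios.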
Force Refresh Support
- `force_refresh=True` bypasses local data completely
- Clears existing cache before fetching fresh data
- Stores refreshed data with metadata indicating refresh
Cache Invalidation Strategy
- Fundamental data is immutable: Once a report is filed, it doesn't change
- No staleness checks needed: Reports are valid indefinitely
- Only fetch if missing: Never re-fetch existing reports
3. Date Object Conversion
Service Boundary Conversion
from datetime import date

# Service receives string dates from agents
def get_context(self, symbol: str, start_date: str, end_date: str, ...) -> FundamentalContext:
    # Validate date strings
    try:
        start_dt = date.fromisoformat(start_date)
        end_dt = date.fromisoformat(end_date)
    except ValueError as e:
        # Chain the original error so the traceback shows the root cause
        raise ValueError(f"Invalid date format: {e}") from e

    # Check date order
    if end_dt < start_dt:
        raise ValueError(f"End date {end_date} is before start date {start_date}")

    # Use date objects when calling FinnhubClient
    data = self.finnhub_client.get_balance_sheet(symbol, frequency, end_dt)
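The validation rules above can be exercised on their own by pulling them into a small helper. The sketch below assumes a hypothetical `parse_date_range` function; it is not part of the existing service.

```python
from datetime import date


def parse_date_range(start_date: str, end_date: str) -> tuple[date, date]:
    """Convert two ISO date strings to date objects, checking format and order."""
    try:
        start_dt = date.fromisoformat(start_date)
        end_dt = date.fromisoformat(end_date)
    except ValueError as e:
        raise ValueError(f"Invalid date format: {e}") from e
    if end_dt < start_dt:
        raise ValueError(f"End date {end_date} is before start date {start_date}")
    return start_dt, end_dt


def is_valid_range(start_date: str, end_date: str) -> bool:
    """Convenience wrapper for tests: True if the range parses, else False."""
    try:
        parse_date_range(start_date, end_date)
    except ValueError:
        return False
    return True


start_dt, end_dt = parse_date_range("2024-01-01", "2024-12-31")
```

Note that `date.fromisoformat()` rejects impossible calendar dates such as `2024-02-30` as well as non-ISO layouts such as `01/31/2024`, and that a range whose start and end are the same day passes the order check.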
4. Error Recovery and Partial Data
def handle_partial_statements(
    self,
    balance_sheet: dict | None,
    income_statement: dict | None,
    cash_flow: dict | None
) -> FundamentalContext:
    """
    Create context even if some statements are missing.

    - If all statements fail: Raise exception
    - If some statements succeed: Return partial context
    - Mark missing statements in metadata
    """
    metadata = {
        "has_balance_sheet": balance_sheet is not None,
        "has_income_statement": income_statement is not None,
        "has_cash_flow": cash_flow is not None,
        "partial_data": any(s is None for s in [balance_sheet, income_statement, cash_flow])
    }

    # Convert available statements to FinancialStatement objects
    # Return FundamentalContext with available data
5. Pydantic Validation
Context Structure
class FundamentalContext(BaseModel):
    symbol: str
    period: dict[str, str]  # {"start": "2024-01-01", "end": "2024-01-31"}
    balance_sheet: FinancialStatement | None
    income_statement: FinancialStatement | None
    cash_flow: FinancialStatement | None
    key_ratios: dict[str, float]
    metadata: dict[str, Any]

    @validator('period')
    def validate_period(cls, v):
        # Ensure start and end dates are present and valid
        return v
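The body of `validate_period` is left open above. One possible shape for the check is sketched below as a plain function, so it runs without Pydantic installed; `check_period` is a hypothetical name, and the validator could delegate to it, since a `ValueError` raised inside a Pydantic validator is reported as a validation error.

```python
from datetime import date


def check_period(period: dict[str, str]) -> dict[str, str]:
    """Ensure 'start' and 'end' are present, are ISO dates, and are ordered."""
    missing = [key for key in ("start", "end") if key not in period]
    if missing:
        raise ValueError(f"period is missing keys: {missing}")
    # date.fromisoformat raises ValueError on malformed or impossible dates.
    start = date.fromisoformat(period["start"])
    end = date.fromisoformat(period["end"])
    if end < start:
        raise ValueError("period end is before period start")
    return period


def period_is_valid(period: dict[str, str]) -> bool:
    """Convenience wrapper for tests: True if the period passes the checks."""
    try:
        check_period(period)
    except ValueError:
        return False
    return True


checked = check_period({"start": "2024-01-01", "end": "2024-01-31"})
```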
Implementation Tasks
Phase 1: Fix Critical Issues
- Date Conversion Fix
  - Add `date.fromisoformat()` conversion in service methods
  - Add date validation (format, order)
  - Update all `FinnhubClient` method calls to use `date` objects
  - File: `tradingagents/services/fundamental_data_service.py:153, 164, 175`
- Client Reference Fix
  - Replace `self.simfin_client` with `self.finnhub_client`
  - File: `tradingagents/services/fundamental_data_service.py:375`
Phase 2: Enhanced Local-First Strategy
- Gap Detection Logic
  - Implement `detect_fundamental_gaps()` method
  - Calculate expected report dates based on frequency
  - Compare with cached data to find gaps
  - Handle fiscal year variations
- Partial Data Handling
  - Implement `handle_partial_statements()` method
  - Continue processing if some statements succeed
  - Mark missing data in metadata
  - Only fail if all statements fail
Phase 3: Type Safety & Validation
- Comprehensive Type Checking
  - Run `mise run typecheck` (must pass with 0 errors)
  - Validate all `date` object conversions
  - Ensure Pydantic model compliance
- Enhanced Testing
  - Update existing tests for new date handling
  - Add gap detection test scenarios
  - Test partial data scenarios
  - Test force refresh behavior
  - Test date validation edge cases
Testing Scenarios
Integration Tests
- Gap Detection
  - Test with empty cache (should fetch all)
  - Test with partial cache (should fetch only missing)
  - Test with complete cache (should fetch none)
- Partial Data Recovery
  - Test when balance sheet API fails but others succeed
  - Test when only one statement type is available
  - Test when all APIs fail (should raise exception)
- Date Handling
  - Test invalid date formats
  - Test `end_date < start_date`
  - Test boundary conditions (year start/end)
- Force Refresh
  - Test that `force_refresh=True` clears cache
  - Test that new data is fetched and stored
Success Criteria
Functional Requirements
- ✅ Service successfully calls `FinnhubClient` with `date` objects
- ✅ Gap detection correctly identifies missing reports
- ✅ Partial data scenarios handled gracefully
- ✅ Local-first strategy works: checks cache → identifies gaps → fetches missing → stores updates
- ✅ Returns properly validated `FundamentalContext` to agents
- ✅ Force refresh bypasses cache and refreshes data
Technical Requirements
- ✅ Zero type checking errors: `mise run typecheck`
- ✅ Zero linting errors: `mise run lint`
- ✅ All existing tests pass
- ✅ No runtime errors with date conversions
- ✅ Proper error messages for validation failures
Quality Requirements
- ✅ Strongly-typed interfaces between all components
- ✅ Comprehensive error handling and logging
- ✅ Efficient caching with minimal API calls
- ✅ Clear separation of concerns between service, client, and repository
Dependencies
Completed
- ✅ `FinnhubClient` with `date` object interface
- ✅ `FundamentalDataRepository` with dataclass storage
- ✅ `FundamentalContext` Pydantic model
Required
- Working `FinnhubClient` instance with valid API key
- Writable data directory for repository storage
Timeline
Immediate (Today)
- Fix critical date conversion and reference issues
- Implement basic gap detection
- Add date validation
Next Steps
- Implement partial data handling
- Comprehensive testing
- Integration with agent workflows
Acceptance Criteria
Must Have
- Type Safety: Service passes `mise run typecheck` with zero errors
- Client Integration: All `FinnhubClient` calls use `date` objects correctly
- Gap Detection: Correctly identifies missing report periods
- Partial Data: Service returns partial context when some statements fail
- Local-First: Service checks repository before API calls
- Context Validation: Returns valid `FundamentalContext` with Pydantic validation
- Error Handling: Graceful handling of API failures and missing data
Should Have
- Cache Efficiency: Minimal redundant API calls
- Force Refresh: Complete cache bypass when requested
- Data Quality: Metadata indicating data completeness
- Clear Error Messages: Informative errors for date validation failures
Nice to Have
- Performance Metrics: Timing and cache hit rate logging
- Fiscal Year Handling: Support for non-calendar fiscal years
- Bulk Operations: Fetch multiple symbols efficiently
This PRD focuses on completing the FundamentalDataService as a strongly-typed, local-first data service that seamlessly integrates with the existing FinnhubClient and FundamentalDataRepository components while providing robust gap detection and partial data handling.