98 lines
4.0 KiB
Markdown
98 lines
4.0 KiB
Markdown
# MarketData Domain - PostgreSQL Migration (Lite Spec)
|
|
|
|
## Migration Overview
|
|
|
|
**Project**: 85% complete MarketData domain → PostgreSQL + TimescaleDB + pgvectorscale
|
|
**Objective**: 10x performance + RAG capabilities while preserving 100% API compatibility
|
|
**Pattern**: Follow news domain PostgreSQL implementation for architectural consistency
|
|
|
|
## Key Requirements
|
|
|
|
### Performance Targets
|
|
- Sub-100ms market data queries (10x improvement from CSV)
|
|
- Sub-200ms RAG vector similarity search
|
|
- Support 500+ tickers with concurrent agent access
|
|
|
|
### API Preservation (Critical)
|
|
- **MarketDataService**: All existing methods preserved
|
|
- **FundamentalDataService**: Complete compatibility maintained
|
|
- **InsiderDataService**: Zero breaking changes
|
|
- **20 TA-Lib indicators**: Full functionality preserved
|
|
|
|
### Data Sources & Collection
|
|
- **yfinance**: Daily OHLC data via Dagster pipelines
|
|
- **FinnHub**: Insider transactions + fundamental data
|
|
- **TimescaleDB hypertables**: market_data, fundamental_data, insider_data
|
|
- **Vector storage**: pgvectorscale for RAG pattern matching
|
|
|
|
## Technical Implementation
|
|
|
|
### Database Schema (TimescaleDB)
|
|
```sql
|
|
-- Hypertables for time-series optimization
|
|
market_data (symbol, date, ohlc, volume) - 10 year retention
|
|
fundamental_data (symbol, report_date, metrics) - 5 year retention
|
|
insider_data (symbol, transaction_date, person, shares) - 3 year retention
|
|
technical_indicators (symbol, date, values, pattern_embedding) - RAG support
|
|
```
|
|
|
|
### Entity Models
|
|
- **MarketDataEntity**: OHLC + validation + database conversion
|
|
- **FundamentalDataEntity**: Financial statement data
|
|
- **InsiderDataEntity**: SEC transaction records
|
|
- **TechnicalIndicatorEntity**: Calculated values + vector embeddings
|
|
|
|
### Repository Pattern (Async PostgreSQL)
|
|
```python
|
|
class MarketDataRepository:
|
|
async def get_ohlc_data(symbol, start, end) -> List[MarketDataEntity]
|
|
async def bulk_upsert_market_data(entities) -> int # Dagster ingestion
|
|
async def find_similar_patterns(embedding, limit) -> List[Dict] # RAG
|
|
```
|
|
|
|
### Service Layer (100% Compatible)
|
|
```python
|
|
class MarketDataService:
|
|
async def get_stock_data(symbol, period) -> pd.DataFrame # Preserved API
|
|
async def calculate_technical_indicators(symbol, indicators) -> Dict # 20 TA-Lib
|
|
async def get_trading_style_preset(style) -> Dict # Existing presets
|
|
```
|
|
|
|
## Migration Strategy
|
|
|
|
### Phase 1: Entities & Schema
|
|
1. Create SQLAlchemy entities following news domain patterns
|
|
2. Setup TimescaleDB hypertables with proper indexing
|
|
3. Configure pgvectorscale for vector embeddings
|
|
|
|
### Phase 2: Repository Migration
|
|
1. Implement async PostgreSQL repositories (mirror NewsRepository pattern)
|
|
2. Create data migration scripts (CSV → PostgreSQL)
|
|
3. Add vector embedding generation for RAG
|
|
|
|
### Phase 3: Service Preservation
|
|
1. Update services to use PostgreSQL repositories
|
|
2. Maintain exact API signatures and return types
|
|
3. Add RAG-enhanced pattern analysis capabilities
|
|
|
|
### Phase 4: Integration & Testing
|
|
1. Real PostgreSQL tests for repositories
|
|
2. Preserve pytest-vcr for YFinanceClient/FinnhubClient
|
|
3. Validate 100% API compatibility with existing agents
|
|
|
|
## Ready Dependencies
|
|
- YFinanceClient + FinnhubClient (fully implemented)
|
|
- PostgreSQL + TimescaleDB + pgvectorscale (established)
|
|
- DatabaseManager async operations (available)
|
|
- News domain patterns for consistency (reference implementation)
|
|
|
|
## Success Metrics
|
|
- **Performance**: 10x query improvement, sub-100ms operations
|
|
- **Compatibility**: Zero API breaking changes, seamless agent migration
|
|
- **Scalability**: 500+ concurrent tickers, efficient bulk ingestion
|
|
- **Quality**: 85%+ test coverage, comprehensive validation
|
|
|
|
## Implementation Approach
|
|
**Follow news domain patterns** → Create entities → Migrate repositories → Preserve service APIs → Enhance with vector RAG → Integrate Dagster pipelines
|
|
|
|
This migration provides the high-performance, RAG-enabled market data foundation essential for sophisticated multi-agent trading analysis while maintaining complete backward compatibility. |