TradingAgents Product Definition
Product Overview
TradingAgents is a personal fork of the multi-agent LLM financial trading framework, designed for individual trading research and data infrastructure development. This fork focuses on a PostgreSQL + TimescaleDB + pgvectorscale architecture with RAG-powered agents that draw on historical context and pattern recognition for enhanced decision making.
Target User
Primary User
- Single Developer/Researcher: Individual focused on personal trading research, strategy development, and building robust data infrastructure for financial analysis
Use Cases
- Personal Trading Research: Developing and testing proprietary trading strategies with AI-powered analysis
- Data Infrastructure Development: Building scalable time-series and vector search capabilities for financial data
- RAG Implementation: Experimenting with retrieval-augmented generation for context-aware trading decisions
- Academic Research: Individual research projects exploring AI applications in financial markets
Core Value Proposition
This personal fork transforms the original TradingAgents framework into a focused research and development platform that:
- Enables Personal Research: Provides a complete data infrastructure for individual trading research and strategy development
- Implements Modern Architecture: PostgreSQL + TimescaleDB + pgvectorscale stack for efficient time-series and vector operations
- Supports RAG-Powered Decisions: Agents leverage historical context through vector similarity search for informed decisions
- Streamlines Data Collection: Automated daily/twice-daily data pipelines with Dagster orchestration
- Unifies LLM Access: Single OpenRouter integration for consistent model access across all agents
Key Features
Enhanced Data Architecture
- PostgreSQL Foundation: Robust relational database for structured financial data
- TimescaleDB Integration: Optimized time-series storage and querying for market data
- pgvectorscale Extension: High-performance vector search for RAG and similarity matching
- Automated Migrations: Database schema versioning and management
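The stack above can be sketched as a minimal migration. This is an illustrative sketch, not the project's actual schema: the table names, column layout, and embedding dimension are assumptions; the DDL targets TimescaleDB's `create_hypertable()` function and pgvectorscale's `diskann` index access method.

```python
# Hypothetical DDL for the PostgreSQL + TimescaleDB + pgvectorscale stack.
# Table/column names and VECTOR(1536) are illustrative assumptions.

PRICE_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS market_prices (
    ts     TIMESTAMPTZ NOT NULL,
    symbol TEXT        NOT NULL,
    close  NUMERIC     NOT NULL
);
"""

# Convert the plain table into a TimescaleDB hypertable partitioned on ts.
HYPERTABLE_DDL = (
    "SELECT create_hypertable('market_prices', 'ts', if_not_exists => TRUE);"
)

# Vector store for RAG: a pgvector column plus a pgvectorscale DiskANN index.
EMBEDDING_DDL = """
CREATE TABLE IF NOT EXISTS news_embeddings (
    id        BIGSERIAL PRIMARY KEY,
    embedding VECTOR(1536) NOT NULL
);
CREATE INDEX IF NOT EXISTS news_embeddings_ann
    ON news_embeddings USING diskann (embedding vector_cosine_ops);
"""

def migration_statements() -> list[str]:
    """Return the ordered DDL a schema-versioning migration would execute."""
    return [PRICE_TABLE_DDL, HYPERTABLE_DDL, EMBEDDING_DDL]
```

A migration runner would execute these statements in order against a database with the `timescaledb`, `vector`, and `vectorscale` extensions enabled.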
RAG-Powered Multi-Agent System
- Context-Aware Analysis: Agents use vector similarity search to find relevant historical patterns
- Enhanced Decision Making: Retrieval-augmented generation provides historical context for trading decisions
- Pattern Recognition: Semantic similarity matching for comparable market conditions
- Learning from History: Agents reference past decisions and outcomes for improved analysis
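The retrieval step behind these features can be reduced to ranking stored embeddings by similarity to a query vector. The sketch below uses an in-memory store and plain cosine similarity; in the actual architecture this lookup would be served by pgvectorscale, and the two-dimensional vectors are toy stand-ins for real embeddings.

```python
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_context(
    query_vec: list[float],
    store: list[tuple[str, list[float]]],
    k: int = 2,
) -> list[str]:
    """Return the k stored texts most similar to the query embedding."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy store: (historical note, embedding) pairs.
store = [
    ("Fed raises rates", [1.0, 0.0]),
    ("Earnings beat",    [0.0, 1.0]),
    ("Rate cut rumor",   [0.9, 0.1]),
]
print(retrieve_context([1.0, 0.0], store))
# → ['Fed raises rates', 'Rate cut rumor']
```

The retrieved texts would then be prepended to an agent's prompt, giving it comparable historical situations to reason over.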
Automated Data Collection
- Dagster Orchestration: Daily/twice-daily data collection pipelines with monitoring and alerting
- Quality Assurance: Automated data validation, gap detection, and backfill capabilities
- Domain Coverage: Comprehensive collection across the news (95% complete), market data, and social media domains
- Scalable Processing: Efficient batch processing with dependency management
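The gap-detection step above amounts to diffing the dates actually collected against an expected calendar. A minimal sketch, assuming a weekdays-only calendar (a real pipeline would also consult an exchange holiday calendar before scheduling backfills):

```python
from datetime import date, timedelta

def detect_gaps(observed: set[date], start: date, end: date) -> list[date]:
    """Return weekdays in [start, end] that have no collected data point.

    Weekdays-only is a simplifying assumption; exchange holidays are
    ignored here and would be filtered out in a real pipeline.
    """
    gaps: list[date] = []
    day = start
    while day <= end:
        if day.weekday() < 5 and day not in observed:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps

# Mon 2024-01-01 .. Fri 2024-01-05, with Tue and Thu collected.
observed = {date(2024, 1, 2), date(2024, 1, 4)}
print(detect_gaps(observed, date(2024, 1, 1), date(2024, 1, 5)))
# → [datetime.date(2024, 1, 1), datetime.date(2024, 1, 3), datetime.date(2024, 1, 5)]
```

Detected gaps would feed directly into the backfill jobs that the orchestrator schedules.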
Unified LLM Provider
- OpenRouter Integration: Single provider for all model access, reducing API complexity
- Cost Optimization: Strategic model selection with clear separation between analysis and data processing models
- Model Flexibility: Easy switching between different models through OpenRouter's unified interface
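The "strategic model selection" above can be expressed as a small routing function: heavyweight analysis goes to a stronger model, bulk data processing to a cheaper one. The model slugs and task categories below are illustrative assumptions, not ones this document prescribes; the payload shape matches OpenRouter's OpenAI-compatible chat completions API.

```python
# Illustrative model routing. Slugs are assumptions for the sketch.
ANALYSIS_MODEL = "anthropic/claude-3.5-sonnet"  # assumed: deep agent reasoning
PROCESSING_MODEL = "openai/gpt-4o-mini"         # assumed: cheap bulk extraction

def select_model(task: str) -> str:
    """Map a task category to an OpenRouter model slug."""
    return ANALYSIS_MODEL if task in {"analysis", "debate", "decision"} else PROCESSING_MODEL

def chat_payload(task: str, prompt: str) -> dict:
    """Build an OpenAI-compatible request body for OpenRouter's
    /api/v1/chat/completions endpoint (no network call made here)."""
    return {
        "model": select_model(task),
        "messages": [{"role": "user", "content": prompt}],
    }

print(chat_payload("analysis", "Summarize today's Fed minutes.")["model"])
# → anthropic/claude-3.5-sonnet
```

Because every agent goes through the same routing function, swapping models is a one-line change rather than a per-agent migration.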
Business Context
Research Focus Areas
- Individual Strategy Development: Personal trading algorithm research and backtesting
- Data Infrastructure: Building scalable financial data storage and retrieval systems
- AI/ML in Finance: Experimenting with RAG, vector search, and multi-agent systems
- Time-Series Analysis: Advanced market data analysis with TimescaleDB optimization
Technical Advantages
- Modern Data Stack: PostgreSQL + TimescaleDB + pgvectorscale provides production-grade data infrastructure
- RAG Implementation: Real-world application of retrieval-augmented generation in financial decision making
- Comprehensive Testing: Maintains 85%+ test coverage with pragmatic TDD approach
- Scalable Architecture: Domain-driven design supports extensibility and maintainability
Development Metrics
- Code Quality: 85%+ test coverage, comprehensive type checking, automated formatting
- Data Pipeline Health: Automated monitoring and alerting for data collection processes
- Performance: Optimized queries with TimescaleDB, fast vector search with pgvectorscale
- Maintainability: Clean architecture patterns, comprehensive documentation
Technical Constraints
Requirements
- Database: PostgreSQL with TimescaleDB and pgvectorscale extensions
- Python Environment: Python 3.13+ with comprehensive dependency management
- API Access: OpenRouter API key for LLM access, optional FinnHub for real-time data
- Infrastructure: Docker Compose for local development, Dagster for data orchestration
Architectural Decisions
- Single Developer Focus: Optimized for individual use rather than multi-user collaboration
- PostgreSQL-First: All data persistence through PostgreSQL with appropriate extensions
- OpenRouter Exclusive: Unified LLM provider reduces complexity and improves consistency
- Domain Completion: Sequential domain development (news 95% → marketdata → socialmedia)
Project Scope
Current Implementation Status
- News Domain: 95% complete with comprehensive article scraping and sentiment analysis
- Core Infrastructure: PostgreSQL + TimescaleDB + pgvectorscale foundation established
- Agent Framework: RAG-powered agents with vector search capabilities
- Data Pipelines: Dagster orchestration for automated data collection
Included Features
- Complete PostgreSQL-based data architecture with time-series and vector extensions
- RAG-enhanced multi-agent analysis framework with historical context
- Automated data collection pipelines with Dagster orchestration
- OpenRouter integration for unified LLM access
- Comprehensive test suite with domain-specific testing strategies
- CLI interface for interactive analysis and debugging
Excluded Features
- Multi-user collaboration features
- Real money trading capabilities
- Production-grade risk management for live trading
- Multiple database backend support
- Legacy LLM provider integrations (focus on OpenRouter only)
Development Phases
Phase 1: News Domain Completion (Current - 95% Complete)
- Finalize news article scraping and processing
- Complete sentiment analysis pipeline
- Optimize news data storage and retrieval
- Implement comprehensive testing for news domain
Phase 2: Market Data Domain + PostgreSQL Migration
- Complete market data collection and processing
- Implement TimescaleDB optimizations for price data
- Add technical analysis calculations
- Migrate all data persistence to PostgreSQL
Phase 3: Social Media Domain
- Implement Reddit and Twitter data collection
- Add social sentiment analysis
- Complete the three-domain architecture
- Optimize cross-domain data relationships
Phase 4: Dagster Pipeline Implementation
- Daily/twice-daily data collection automation
- Comprehensive monitoring and alerting
- Data quality validation and gap detection
- Performance optimization and scaling
Phase 5: RAG Enhancement and OpenRouter Migration
- Complete RAG implementation for all agents
- Migrate to OpenRouter as sole LLM provider
- Optimize vector search performance
- Implement advanced pattern recognition
Success Criteria
This personal fork is successful when it provides:
- Robust Data Infrastructure: PostgreSQL + TimescaleDB + pgvectorscale handling all financial data efficiently
- Intelligent Decision Making: RAG-powered agents making context-aware trading recommendations
- Reliable Data Collection: Automated pipelines collecting high-quality data consistently
- Research Capability: Complete platform for individual trading strategy research and development
- Maintainable Codebase: 85%+ test coverage with clear architecture and comprehensive documentation
The fork serves as both a practical trading research platform and a demonstration of modern data architecture patterns applied to financial AI systems.