TradingAgents Product Definition

Product Overview

TradingAgents is a personal fork of the multi-agent LLM financial trading framework, adapted for individual trading research and data infrastructure development. The fork centers on a PostgreSQL + TimescaleDB + pgvectorscale architecture with RAG-powered agents that draw on historical context and pattern recognition to improve decision making.

Target User

Primary User

  • Single Developer/Researcher: Individual focused on personal trading research, strategy development, and building robust data infrastructure for financial analysis

Use Cases

  • Personal Trading Research: Developing and testing proprietary trading strategies with AI-powered analysis
  • Data Infrastructure Development: Building scalable time-series and vector search capabilities for financial data
  • RAG Implementation: Experimenting with retrieval-augmented generation for context-aware trading decisions
  • Academic Research: Individual research projects exploring AI applications in financial markets

Core Value Proposition

This personal fork transforms the original TradingAgents framework into a focused research and development platform that:

  • Enables Personal Research: Provides a complete data infrastructure for individual trading research and strategy development
  • Implements Modern Architecture: PostgreSQL + TimescaleDB + pgvectorscale stack for efficient time-series and vector operations
  • Supports RAG-Powered Decisions: Agents leverage historical context through vector similarity search for informed decisions
  • Streamlines Data Collection: Automated daily/twice-daily data pipelines with Dagster orchestration
  • Unifies LLM Access: Single OpenRouter integration for consistent model access across all agents

Key Features

Enhanced Data Architecture

  • PostgreSQL Foundation: Robust relational database for structured financial data
  • TimescaleDB Integration: Optimized time-series storage and querying for market data
  • pgvectorscale Extension: High-performance vector search for RAG and similarity matching
  • Automated Migrations: Database schema versioning and management
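To make the stack concrete, the sketch below shows what a migration for this kind of schema might look like. The table and column names are illustrative, not the project's actual schema; `create_hypertable` is TimescaleDB's standard partitioning function, and `diskann` is the index access method provided by pgvectorscale.

```python
# Hypothetical migration DDL for the time-series + vector foundation.
# Table/column names are illustrative, not the project's real schema.

PRICE_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS prices (
    ts      TIMESTAMPTZ NOT NULL,
    symbol  TEXT        NOT NULL,
    close   NUMERIC     NOT NULL
);
-- TimescaleDB: convert to a hypertable partitioned on time
SELECT create_hypertable('prices', 'ts', if_not_exists => TRUE);
"""

EMBEDDING_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS news_embeddings (
    id         BIGSERIAL PRIMARY KEY,
    article_id BIGINT       NOT NULL,
    embedding  VECTOR(1536) NOT NULL
);
-- pgvectorscale: StreamingDiskANN index for fast similarity search
CREATE INDEX IF NOT EXISTS news_embeddings_idx
    ON news_embeddings USING diskann (embedding);
"""

def migration_statements() -> list[str]:
    """Return the DDL blocks in the order a migration would apply them."""
    return [PRICE_TABLE_DDL, EMBEDDING_TABLE_DDL]
```

A versioned migration tool would apply these statements in order and record the applied version, which is the "automated migrations" capability listed above.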

RAG-Powered Multi-Agent System

  • Context-Aware Analysis: Agents use vector similarity search to find relevant historical patterns
  • Enhanced Decision Making: Retrieval-augmented generation provides historical context for trading decisions
  • Pattern Recognition: Semantic similarity matching for comparable market conditions
  • Learning from History: Agents reference past decisions and outcomes for improved analysis
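In production the similarity search would run inside pgvectorscale, but the retrieval step itself reduces to ranking stored embeddings by cosine similarity to a query embedding. A minimal pure-Python sketch of that step (document IDs and vectors are toy values):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_similar(query: list[float],
                  corpus: list[tuple[str, list[float]]],
                  k: int = 2) -> list[str]:
    """Return the IDs of the k corpus items most similar to the query.

    corpus is a list of (doc_id, embedding) pairs, standing in for
    rows fetched from the vector store.
    """
    ranked = sorted(corpus,
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

The retrieved document IDs would then be resolved to their source text (past analyses, comparable market conditions) and injected into the agent's prompt as historical context.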

Automated Data Collection

  • Dagster Orchestration: Daily/twice-daily data collection pipelines with monitoring and alerting
  • Quality Assurance: Automated data validation, gap detection, and backfill capabilities
  • Domain Coverage: Comprehensive collection across the news (95% complete), market data, and social media domains
  • Scalable Processing: Efficient batch processing with dependency management
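The gap-detection check above can be sketched as a simple date scan. This is an illustrative helper, not the project's actual implementation; it assumes daily granularity and treats every calendar day as expected (reasonable for news and social data, though a market-data variant would skip non-trading days).

```python
from datetime import date, timedelta

def find_gaps(collected: set[date], start: date, end: date) -> list[date]:
    """Return the dates in [start, end] with no collected data.

    `collected` stands in for the set of dates a pipeline run has
    successfully ingested; the returned dates are backfill candidates.
    """
    gaps: list[date] = []
    day = start
    while day <= end:
        if day not in collected:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps
```

An orchestrated job would run this check after each collection cycle and trigger backfill runs (or alerts) for any dates returned.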

Unified LLM Provider

  • OpenRouter Integration: Single provider for all model access, reducing API complexity
  • Cost Optimization: Strategic model selection with clear separation between analysis and data processing models
  • Model Flexibility: Easy switching between different models through OpenRouter's unified interface
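OpenRouter exposes an OpenAI-compatible chat completions endpoint, so the unified-provider idea amounts to every agent building the same request shape and varying only the model string. A minimal sketch (the model slugs below are examples, not the project's configured choices):

```python
# Sketch of the unified-provider pattern: one payload builder, one endpoint,
# model selection is the only per-task difference. Model slugs are examples.

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def chat_payload(model: str, system: str, user: str) -> dict:
    """Build an OpenAI-compatible chat completion request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }

# Cheap model for bulk data processing, stronger model for analysis:
bulk = chat_payload("openai/gpt-4o-mini",
                    "Summarize these headlines.", "<headlines>")
analysis = chat_payload("anthropic/claude-sonnet-4",
                        "Act as a market analyst.", "<context>")
```

Because the payload shape never changes, switching a task to a different model is a one-line configuration change, which is the cost-optimization lever described above.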

Business Context

Research Focus Areas

  • Individual Strategy Development: Personal trading algorithm research and backtesting
  • Data Infrastructure: Building scalable financial data storage and retrieval systems
  • AI/ML in Finance: Experimenting with RAG, vector search, and multi-agent systems
  • Time-Series Analysis: Advanced market data analysis with TimescaleDB optimization

Technical Advantages

  • Modern Data Stack: PostgreSQL + TimescaleDB + pgvectorscale provides production-grade data infrastructure
  • RAG Implementation: Real-world application of retrieval-augmented generation in financial decision making
  • Comprehensive Testing: Maintains 85%+ test coverage with pragmatic TDD approach
  • Scalable Architecture: Domain-driven design supports extensibility and maintainability

Development Metrics

  • Code Quality: 85%+ test coverage, comprehensive type checking, automated formatting
  • Data Pipeline Health: Automated monitoring and alerting for data collection processes
  • Performance: Optimized queries with TimescaleDB, fast vector search with pgvectorscale
  • Maintainability: Clean architecture patterns, comprehensive documentation

Technical Constraints

Requirements

  • Database: PostgreSQL with TimescaleDB and pgvectorscale extensions
  • Python Environment: Python 3.13+ with comprehensive dependency management
  • API Access: OpenRouter API key for LLM access, optional FinnHub for real-time data
  • Infrastructure: Docker Compose for local development, Dagster for data orchestration
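For local development, the database requirement can be met with a single Compose service. This fragment is illustrative only and may differ from the repo's actual compose file; the `timescale/timescaledb-ha` image bundles TimescaleDB, and pgvectorscale availability depends on the image tag chosen.

```yaml
# Illustrative local-dev database service; not the project's actual file.
services:
  db:
    image: timescale/timescaledb-ha:pg16
    environment:
      POSTGRES_PASSWORD: devonly   # local development only
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```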

Architectural Decisions

  • Single Developer Focus: Optimized for individual use rather than multi-user collaboration
  • PostgreSQL-First: All data persistence through PostgreSQL with appropriate extensions
  • OpenRouter Exclusive: Unified LLM provider reduces complexity and improves consistency
  • Domain Completion: Sequential domain development (news 95% → marketdata → socialmedia)

Project Scope

Current Implementation Status

  • News Domain: 95% complete with comprehensive article scraping and sentiment analysis
  • Core Infrastructure: PostgreSQL + TimescaleDB + pgvectorscale foundation established
  • Agent Framework: RAG-powered agents with vector search capabilities
  • Data Pipelines: Dagster orchestration for automated data collection

Included Features

  • Complete PostgreSQL-based data architecture with time-series and vector extensions
  • RAG-enhanced multi-agent analysis framework with historical context
  • Automated data collection pipelines with Dagster orchestration
  • OpenRouter integration for unified LLM access
  • Comprehensive test suite with domain-specific testing strategies
  • CLI interface for interactive analysis and debugging

Excluded Features

  • Multi-user collaboration features
  • Real money trading capabilities
  • Production-grade risk management for live trading
  • Multiple database backend support
  • Legacy LLM provider integrations (focus on OpenRouter only)

Development Phases

Phase 1: News Domain Completion (Current - 95% Complete)

  • Finalize news article scraping and processing
  • Complete sentiment analysis pipeline
  • Optimize news data storage and retrieval
  • Implement comprehensive testing for news domain

Phase 2: Market Data Domain + PostgreSQL Migration

  • Complete market data collection and processing
  • Implement TimescaleDB optimizations for price data
  • Add technical analysis calculations
  • Migrate all data persistence to PostgreSQL

Phase 3: Social Media Domain

  • Implement Reddit and Twitter data collection
  • Add social sentiment analysis
  • Complete the three-domain architecture
  • Optimize cross-domain data relationships

Phase 4: Dagster Pipeline Implementation

  • Daily/twice-daily data collection automation
  • Comprehensive monitoring and alerting
  • Data quality validation and gap detection
  • Performance optimization and scaling

Phase 5: RAG Enhancement and OpenRouter Migration

  • Complete RAG implementation for all agents
  • Migrate to OpenRouter as sole LLM provider
  • Optimize vector search performance
  • Implement advanced pattern recognition

Success Criteria

This personal fork is successful when it provides:

  • Robust Data Infrastructure: PostgreSQL + TimescaleDB + pgvectorscale handling all financial data efficiently
  • Intelligent Decision Making: RAG-powered agents making context-aware trading recommendations
  • Reliable Data Collection: Automated pipelines collecting high-quality data consistently
  • Research Capability: Complete platform for individual trading strategy research and development
  • Maintainable Codebase: 85%+ test coverage with clear architecture and comprehensive documentation

The fork serves as both a practical trading research platform and a demonstration of modern data architecture patterns applied to financial AI systems.