TradingAgents/spec.md at c20771bf2050533df22570afd6492721e8afa221

28 KiB

Raw Blame History

Feature Overview

Complete implementation of social media data collection and analysis - Transform the current stub implementation into a production-ready social media domain that provides comprehensive Reddit sentiment analysis for trading agents.

User Story

As a Dagster pipeline, I want to collect Reddit posts from financial subreddits with LLM sentiment analysis and vector embeddings, so that AI Agents can access comprehensive social media context for ticker-specific trading decisions through RAG-powered queries.

Acceptance Criteria

Daily Data Collection

GIVEN a scheduled Dagster pipeline WHEN it executes daily THEN it collects Reddit posts from configured financial subreddits without manual intervention
GIVEN Reddit posts are collected WHEN processed THEN they are stored in PostgreSQL with TimescaleDB optimization and vector embeddings for semantic search

LLM Sentiment Analysis

GIVEN social media posts WHEN processed THEN each post receives OpenRouter LLM sentiment analysis with structured scores (positive/negative/neutral with confidence)

Agent Integration

GIVEN a ticker symbol WHEN AI agents request social context THEN they receive relevant Reddit posts with sentiment scores and vector similarity ranking within 2 seconds
GIVEN social media data WHEN agents query THEN AgentToolkit provides RAG-enhanced context including post content, sentiment trends, and engagement metrics

Business Rules and Constraints

Data Collection Rules

Daily automated collection from configured financial subreddits (wallstreetbets, investing, stocks, SecurityAnalysis)
OpenRouter LLM sentiment analysis for all posts with confidence scoring
Vector embeddings generation for semantic similarity search
Post deduplication by Reddit post ID to prevent duplicates
Rate limiting compliance with Reddit API terms of service

Data Management

Data retention policy: 90 days for social media posts
Best effort processing: API failures or rate limits don't block other posts

Scope Definition

Included Features ✅

Complete socialmedia domain implementation from stub to production
PostgreSQL migration from current file-based storage
Reddit API integration using PRAW or Reddit API client
OpenRouter LLM sentiment analysis integration
Vector embeddings generation and similarity search
AgentToolkit integration with get_reddit_news and get_reddit_stock_info methods
Dagster pipeline for scheduled daily collection
SQLAlchemy entities with TimescaleDB and pgvectorscale support
Comprehensive test coverage with pytest-vcr for API mocking

Excluded Features ❌

Other social media platforms beyond Reddit (Twitter, LinkedIn, etc.)
Real-time social media streaming (batch processing only)
Custom sentiment models (use OpenRouter LLMs only)
Social media influence scoring or user reputation tracking
Multi-language post support (English only)
Historical Reddit data backfilling beyond 30 days

Technical Implementation Details

Architecture Pattern

Router → Service → Repository → Entity → Database (matching news domain)

Current Implementation Status

Basic stub implementation - requires complete rebuild

Missing Components

PostgreSQL database migration from file storage
Reddit API client implementation (RedditClient is empty stub)
SQLAlchemy entity models for social posts with vector fields
LLM sentiment analysis integration via OpenRouter
Vector embedding generation and similarity search
AgentToolkit RAG methods (get_reddit_news, get_reddit_stock_info)
Dagster pipeline for scheduled data collection
Comprehensive test suite with domain-specific patterns

Existing Stub Components

SocialMediaService with empty method stubs
SocialRepository with file-based JSON storage
Basic data models: SocialPost, PostData, SocialContext
Empty RedditClient class requiring full implementation
Agent references to social methods (not yet implemented)

Database Integration

PostgreSQL Schema Design

-- Social media posts table with TimescaleDB optimization
CREATE TABLE social_media_posts (
    id SERIAL PRIMARY KEY,
    post_id VARCHAR(50) UNIQUE NOT NULL,           -- Reddit post ID
    ticker VARCHAR(10),                            -- Associated ticker
    subreddit VARCHAR(50) NOT NULL,                -- Source subreddit
    title TEXT NOT NULL,                           -- Post title
    content TEXT,                                  -- Post content
    author VARCHAR(50),                            -- Reddit username
    created_at TIMESTAMPTZ NOT NULL,               -- Post creation time
    collected_at TIMESTAMPTZ DEFAULT NOW(),        -- Data collection time
    upvotes INTEGER DEFAULT 0,                     -- Reddit upvotes
    downvotes INTEGER DEFAULT 0,                   -- Reddit downvotes
    comment_count INTEGER DEFAULT 0,               -- Number of comments
    url TEXT,                                      -- Reddit URL
    permalink TEXT,                                -- Reddit permalink
    
    -- Sentiment analysis fields
    sentiment_score DECIMAL(3,2),                  -- -1.0 to +1.0
    sentiment_label VARCHAR(20),                   -- positive/negative/neutral
    sentiment_confidence DECIMAL(3,2),             -- 0.0 to 1.0
    
    -- Vector embeddings
    embedding vector(1536),                        -- pgvectorscale embedding
    
    -- Metadata
    data_quality_score DECIMAL(3,2) DEFAULT 1.0,
    processing_status VARCHAR(20) DEFAULT 'pending',
    error_message TEXT
);

-- TimescaleDB hypertable for time-series optimization
SELECT create_hypertable('social_media_posts', 'created_at');

-- Vector similarity index
CREATE INDEX idx_social_posts_embedding ON social_media_posts USING vectors (embedding vector_cosine_ops);

-- Performance indexes
CREATE INDEX idx_social_posts_ticker ON social_media_posts (ticker, created_at DESC);
CREATE INDEX idx_social_posts_subreddit ON social_media_posts (subreddit, created_at DESC);
CREATE INDEX idx_social_posts_sentiment ON social_media_posts (sentiment_label, sentiment_score);

Entity Model

# tradingagents/domains/socialmedia/entities.py
from sqlalchemy import Column, Integer, String, Text, DECIMAL, TIMESTAMP, Index
from sqlalchemy.dialects.postgresql import VECTOR
from tradingagents.database import Base
from typing import Optional, Dict, Any
import json

class SocialMediaPostEntity(Base):
    __tablename__ = 'social_media_posts'
    
    id = Column(Integer, primary_key=True)
    post_id = Column(String(50), unique=True, nullable=False)
    ticker = Column(String(10), index=True)
    subreddit = Column(String(50), nullable=False, index=True)
    title = Column(Text, nullable=False)
    content = Column(Text)
    author = Column(String(50))
    created_at = Column(TIMESTAMP(timezone=True), nullable=False, index=True)
    collected_at = Column(TIMESTAMP(timezone=True), server_default='NOW()')
    upvotes = Column(Integer, default=0)
    downvotes = Column(Integer, default=0)
    comment_count = Column(Integer, default=0)
    url = Column(Text)
    permalink = Column(Text)
    
    # Sentiment analysis
    sentiment_score = Column(DECIMAL(3,2))
    sentiment_label = Column(String(20))
    sentiment_confidence = Column(DECIMAL(3,2))
    
    # Vector embeddings
    embedding = Column(VECTOR(1536))
    
    # Metadata
    data_quality_score = Column(DECIMAL(3,2), default=1.0)
    processing_status = Column(String(20), default='pending')
    error_message = Column(Text)
    
    def to_domain(self) -> 'SocialPost':
        """Convert entity to domain model"""
        return SocialPost(
            post_id=self.post_id,
            ticker=self.ticker,
            subreddit=self.subreddit,
            title=self.title,
            content=self.content,
            author=self.author,
            created_at=self.created_at,
            upvotes=self.upvotes,
            downvotes=self.downvotes,
            comment_count=self.comment_count,
            url=self.url,
            sentiment_score=float(self.sentiment_score) if self.sentiment_score else None,
            sentiment_label=self.sentiment_label,
            sentiment_confidence=float(self.sentiment_confidence) if self.sentiment_confidence else None
        )
    
    @classmethod
    def from_domain(cls, post: 'SocialPost', embedding: Optional[list] = None) -> 'SocialMediaPostEntity':
        """Create entity from domain model"""
        return cls(
            post_id=post.post_id,
            ticker=post.ticker,
            subreddit=post.subreddit,
            title=post.title,
            content=post.content,
            author=post.author,
            created_at=post.created_at,
            upvotes=post.upvotes,
            downvotes=post.downvotes,
            comment_count=post.comment_count,
            url=post.url,
            sentiment_score=post.sentiment_score,
            sentiment_label=post.sentiment_label,
            sentiment_confidence=post.sentiment_confidence,
            embedding=embedding
        )

Reddit API Integration

RedditClient Implementation

# tradingagents/domains/socialmedia/clients.py
import praw
from typing import List, Optional, Dict, Any
from datetime import datetime, timedelta
import asyncio
import aiohttp
from tradingagents.config import TradingAgentsConfig

class RedditClient:
    def __init__(self, config: TradingAgentsConfig):
        self.config = config
        self.reddit = praw.Reddit(
            client_id=config.reddit_client_id,
            client_secret=config.reddit_client_secret,
            user_agent=config.reddit_user_agent
        )
        
    async def fetch_financial_posts(
        self,
        subreddits: List[str],
        ticker: Optional[str] = None,
        limit: int = 100,
        time_filter: str = "day"
    ) -> List[Dict[str, Any]]:
        """Fetch financial posts from specified subreddits"""
        posts = []
        
        for subreddit_name in subreddits:
            try:
                subreddit = self.reddit.subreddit(subreddit_name)
                submissions = subreddit.hot(limit=limit)
                
                for submission in submissions:
                    # Filter by ticker if specified
                    if ticker and ticker.upper() not in submission.title.upper():
                        continue
                        
                    post_data = {
                        'post_id': submission.id,
                        'subreddit': subreddit_name,
                        'title': submission.title,
                        'content': submission.selftext,
                        'author': str(submission.author),
                        'created_at': datetime.fromtimestamp(submission.created_utc),
                        'upvotes': submission.ups,
                        'downvotes': submission.downs,
                        'comment_count': submission.num_comments,
                        'url': submission.url,
                        'permalink': submission.permalink
                    }
                    posts.append(post_data)
                    
            except Exception as e:
                # Log error but continue processing other subreddits
                print(f"Error fetching from {subreddit_name}: {e}")
                continue
                
        return posts

LLM Sentiment Analysis

OpenRouter Integration

# tradingagents/domains/socialmedia/services.py
from typing import Dict, Any, Optional, Tuple
import openai
from tradingagents.config import TradingAgentsConfig

class SentimentAnalyzer:
    def __init__(self, config: TradingAgentsConfig):
        self.config = config
        self.client = openai.OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=config.openrouter_api_key
        )
    
    async def analyze_sentiment(self, text: str) -> Tuple[float, str, float]:
        """
        Analyze sentiment of social media post
        Returns: (score, label, confidence)
        """
        prompt = f"""
        Analyze the financial sentiment of this social media post.
        
        Post: "{text}"
        
        Return sentiment as JSON with:
        - score: float from -1.0 (very negative) to +1.0 (very positive)
        - label: "positive", "negative", or "neutral"  
        - confidence: float from 0.0 to 1.0 indicating confidence
        
        Focus on financial and trading sentiment, not general sentiment.
        """
        
        try:
            response = await self.client.chat.completions.create(
                model=self.config.quick_think_llm,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=100,
                temperature=0.1
            )
            
            result = json.loads(response.choices[0].message.content)
            return result['score'], result['label'], result['confidence']
            
        except Exception as e:
            # Return neutral sentiment on error
            return 0.0, "neutral", 0.0

Vector Embeddings and Search

Embedding Generation

# tradingagents/domains/socialmedia/embeddings.py
import openai
from typing import List, Optional
from tradingagents.config import TradingAgentsConfig

class EmbeddingGenerator:
    def __init__(self, config: TradingAgentsConfig):
        self.config = config
        self.client = openai.OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=config.openrouter_api_key
        )
    
    async def generate_embedding(self, text: str) -> Optional[List[float]]:
        """Generate vector embedding for text"""
        try:
            response = await self.client.embeddings.create(
                model="text-embedding-3-small",
                input=text,
                encoding_format="float"
            )
            return response.data[0].embedding
        except Exception as e:
            print(f"Embedding generation failed: {e}")
            return None
    
    def prepare_text_for_embedding(self, post: Dict[str, Any]) -> str:
        """Combine title and content for embedding"""
        title = post.get('title', '')
        content = post.get('content', '')
        return f"{title} {content}".strip()

Repository Implementation

SocialRepository with PostgreSQL

# tradingagents/domains/socialmedia/repositories.py
from typing import List, Optional, Dict, Any
from sqlalchemy.orm import Session
from sqlalchemy import desc, and_, text
from tradingagents.domains.socialmedia.entities import SocialMediaPostEntity
from tradingagents.domains.socialmedia.models import SocialPost, SocialContext
from tradingagents.database import get_db_session
from datetime import datetime, timedelta

class SocialRepository:
    def __init__(self):
        self.session = get_db_session()
    
    async def save_posts(self, posts: List[SocialPost]) -> List[str]:
        """Save social media posts with deduplication"""
        saved_ids = []
        
        for post in posts:
            # Check for existing post
            existing = self.session.query(SocialMediaPostEntity).filter(
                SocialMediaPostEntity.post_id == post.post_id
            ).first()
            
            if existing:
                continue  # Skip duplicates
                
            entity = SocialMediaPostEntity.from_domain(post)
            self.session.add(entity)
            saved_ids.append(post.post_id)
        
        self.session.commit()
        return saved_ids
    
    async def get_posts_for_ticker(
        self, 
        ticker: str, 
        days: int = 7,
        limit: int = 50
    ) -> List[SocialPost]:
        """Get social media posts for specific ticker"""
        cutoff_date = datetime.now() - timedelta(days=days)
        
        results = self.session.query(SocialMediaPostEntity).filter(
            and_(
                SocialMediaPostEntity.ticker == ticker,
                SocialMediaPostEntity.created_at >= cutoff_date
            )
        ).order_by(desc(SocialMediaPostEntity.created_at)).limit(limit).all()
        
        return [entity.to_domain() for entity in results]
    
    async def vector_similarity_search(
        self, 
        query_embedding: List[float], 
        ticker: Optional[str] = None,
        limit: int = 10
    ) -> List[SocialPost]:
        """Find similar posts using vector search"""
        query = self.session.query(SocialMediaPostEntity)
        
        if ticker:
            query = query.filter(SocialMediaPostEntity.ticker == ticker)
        
        # Vector similarity search using pgvectorscale
        query = query.order_by(
            text(f"embedding <-> '{query_embedding}'")
        ).limit(limit)
        
        results = query.all()
        return [entity.to_domain() for entity in results]

Service Layer

SocialMediaService

# tradingagents/domains/socialmedia/services.py
from typing import List, Optional, Dict, Any
from tradingagents.domains.socialmedia.repositories import SocialRepository
from tradingagents.domains.socialmedia.clients import RedditClient
from tradingagents.domains.socialmedia.models import SocialPost, SocialContext
from tradingagents.config import TradingAgentsConfig

class SocialMediaService:
    def __init__(self, config: TradingAgentsConfig):
        self.config = config
        self.repository = SocialRepository()
        self.reddit_client = RedditClient(config)
        self.sentiment_analyzer = SentimentAnalyzer(config)
        self.embedding_generator = EmbeddingGenerator(config)
    
    async def collect_social_data(
        self, 
        ticker: Optional[str] = None,
        subreddits: Optional[List[str]] = None
    ) -> SocialContext:
        """Main entry point for social media data collection"""
        
        if not subreddits:
            subreddits = ['wallstreetbets', 'investing', 'stocks', 'SecurityAnalysis']
        
        # Fetch posts from Reddit
        raw_posts = await self.reddit_client.fetch_financial_posts(
            subreddits=subreddits,
            ticker=ticker,
            limit=100
        )
        
        # Process posts: sentiment analysis + embeddings
        processed_posts = []
        for raw_post in raw_posts:
            # Generate sentiment
            text = f"{raw_post['title']} {raw_post['content']}"
            score, label, confidence = await self.sentiment_analyzer.analyze_sentiment(text)
            
            # Generate embedding
            embedding = await self.embedding_generator.generate_embedding(text)
            
            post = SocialPost(
                **raw_post,
                sentiment_score=score,
                sentiment_label=label,
                sentiment_confidence=confidence
            )
            processed_posts.append(post)
        
        # Save to database
        await self.repository.save_posts(processed_posts)
        
        # Return context
        return SocialContext(
            posts=processed_posts,
            ticker=ticker,
            total_posts=len(processed_posts),
            sentiment_summary=self._calculate_sentiment_summary(processed_posts)
        )
    
    def _calculate_sentiment_summary(self, posts: List[SocialPost]) -> Dict[str, Any]:
        """Calculate aggregate sentiment metrics"""
        if not posts:
            return {}
        
        scores = [p.sentiment_score for p in posts if p.sentiment_score is not None]
        labels = [p.sentiment_label for p in posts if p.sentiment_label]
        
        return {
            'avg_sentiment': sum(scores) / len(scores) if scores else 0.0,
            'positive_count': labels.count('positive'),
            'negative_count': labels.count('negative'),
            'neutral_count': labels.count('neutral'),
            'total_posts': len(posts)
        }

AgentToolkit Integration

RAG-Enhanced Methods

# tradingagents/agents/libs/agent_toolkit.py (additions)

async def get_reddit_news(self, ticker: str, days: int = 7) -> str:
    """Get Reddit posts related to a ticker with RAG context"""
    try:
        # Get recent posts for ticker
        posts = await self.social_service.repository.get_posts_for_ticker(
            ticker=ticker,
            days=days,
            limit=20
        )
        
        if not posts:
            return f"No Reddit posts found for {ticker} in the last {days} days."
        
        # Format for agent consumption
        context = f"Reddit Social Media Context for {ticker} ({len(posts)} posts):\n\n"
        
        for post in posts[:10]:  # Limit to top 10
            sentiment_emoji = {"positive": "📈", "negative": "📉", "neutral": "➡️"}.get(post.sentiment_label, "")
            context += f"{sentiment_emoji} r/{post.subreddit} - {post.title}\n"
            context += f"   Sentiment: {post.sentiment_label} ({post.sentiment_score:.2f})\n"
            context += f"   Engagement: {post.upvotes} upvotes, {post.comment_count} comments\n"
            if post.content:
                context += f"   Content: {post.content[:200]}...\n"
            context += "\n"
        
        return context
        
    except Exception as e:
        return f"Error fetching Reddit data for {ticker}: {str(e)}"

async def get_reddit_stock_info(self, ticker: str, query: Optional[str] = None) -> str:
    """Get Reddit stock information with semantic search"""
    try:
        if query:
            # Generate embedding for semantic search
            query_embedding = await self.social_service.embedding_generator.generate_embedding(query)
            if query_embedding:
                posts = await self.social_service.repository.vector_similarity_search(
                    query_embedding=query_embedding,
                    ticker=ticker,
                    limit=10
                )
            else:
                posts = await self.social_service.repository.get_posts_for_ticker(ticker, days=7)
        else:
            posts = await self.social_service.repository.get_posts_for_ticker(ticker, days=7)
        
        if not posts:
            return f"No relevant Reddit discussions found for {ticker}."
        
        # Aggregate sentiment and key insights
        sentiment_summary = self.social_service._calculate_sentiment_summary(posts)
        
        context = f"Reddit Stock Analysis for {ticker}:\n\n"
        context += f"Overall Sentiment: {sentiment_summary.get('avg_sentiment', 0):.2f}/1.0\n"
        context += f"Posts: {sentiment_summary.get('positive_count', 0)} positive, "
        context += f"{sentiment_summary.get('negative_count', 0)} negative, "
        context += f"{sentiment_summary.get('neutral_count', 0)} neutral\n\n"
        
        context += "Key Discussions:\n"
        for post in posts[:5]:
            context += f"• {post.title} (r/{post.subreddit})\n"
            context += f"  Sentiment: {post.sentiment_label} ({post.sentiment_score:.2f})\n"
        
        return context
        
    except Exception as e:
        return f"Error analyzing Reddit stock info for {ticker}: {str(e)}"

Dagster Pipeline

# tradingagents/data/assets/social_media.py
from dagster import asset, AssetExecutionContext
from tradingagents.domains.socialmedia.services import SocialMediaService
from tradingagents.config import TradingAgentsConfig

@asset(
    group_name="social_media",
    description="Collect Reddit posts from financial subreddits with sentiment analysis"
)
async def reddit_financial_posts(context: AssetExecutionContext) -> Dict[str, Any]:
    """Daily collection of Reddit financial posts"""
    
    config = TradingAgentsConfig.from_env()
    social_service = SocialMediaService(config)
    
    # Collect from financial subreddits
    subreddits = ['wallstreetbets', 'investing', 'stocks', 'SecurityAnalysis']
    
    total_collected = 0
    results = {}
    
    for subreddit in subreddits:
        try:
            social_context = await social_service.collect_social_data(
                subreddits=[subreddit]
            )
            
            results[subreddit] = {
                'posts_collected': len(social_context.posts),
                'sentiment_summary': social_context.sentiment_summary
            }
            total_collected += len(social_context.posts)
            
            context.log.info(f"Collected {len(social_context.posts)} posts from r/{subreddit}")
            
        except Exception as e:
            context.log.error(f"Failed to collect from r/{subreddit}: {e}")
            results[subreddit] = {'error': str(e)}
    
    context.log.info(f"Total posts collected: {total_collected}")
    return results

Testing Strategy

Test Structure

tests/domains/socialmedia/
├── conftest.py                    # Fixtures and test setup
├── test_reddit_client.py          # API integration tests with VCR
├── test_social_repository.py      # PostgreSQL database tests  
├── test_social_service.py         # Business logic with mocks
├── test_sentiment_analyzer.py     # LLM sentiment analysis tests
├── test_embedding_generator.py    # Vector embedding tests
└── fixtures/                      # VCR cassettes and test data
    └── reddit_api_responses.yaml

Key Test Patterns

# tests/domains/socialmedia/test_social_service.py
import pytest
from unittest.mock import AsyncMock, MagicMock
from tradingagents.domains.socialmedia.services import SocialMediaService

@pytest.mark.asyncio
async def test_collect_social_data_success(mock_social_service):
    """Test successful social media data collection"""
    # Mock Reddit API response
    mock_posts = [
        {
            'post_id': 'abc123',
            'title': 'AAPL to the moon!',
            'subreddit': 'wallstreetbets',
            # ... other fields
        }
    ]
    
    mock_social_service.reddit_client.fetch_financial_posts.return_value = mock_posts
    mock_social_service.sentiment_analyzer.analyze_sentiment.return_value = (0.8, 'positive', 0.9)
    
    result = await mock_social_service.collect_social_data(ticker='AAPL')
    
    assert len(result.posts) == 1
    assert result.posts[0].sentiment_label == 'positive'
    assert result.sentiment_summary['positive_count'] == 1

Dependencies

Technical Dependencies

Reddit API access (PRAW or Reddit API client)
OpenRouter API for LLM sentiment analysis
PostgreSQL with TimescaleDB and pgvectorscale extensions
Existing database infrastructure from news domain
OpenRouter configuration in TradingAgentsConfig
Dagster orchestration framework for scheduled execution

Reference Implementations

News domain patterns: Follow NewsService, NewsRepository, NewsArticleEntity patterns for consistency
Database schema: Mirror NewsArticleEntity vector embedding approach for social posts
Agent integration: Follow existing AgentToolkit get_news() pattern for social media methods
Testing approach: Apply news domain testing patterns: VCR for API, real DB for repositories

Success Criteria

Functionality

Daily Reddit collection with sentiment analysis and vector search
Seamless integration with existing multi-agent trading framework
RAG-enhanced social context for AI agents

Performance

< 2 second social context queries
< 100ms repository operations
Efficient vector similarity search

Quality

85%+ test coverage matching project standards
Comprehensive error handling and resilience
Data quality monitoring and validation

Integration

Seamless AgentToolkit RAG integration for AI agents
Architecture and patterns match successful news domain implementation
Consistent with existing TradingAgents configuration and conventions

Implementation Approach

Complete domain implementation following successful news domain patterns:

Database migration from file storage to PostgreSQL
Entity models with TimescaleDB and vector support
Reddit client implementation with rate limiting
Repository layer with vector search capabilities
Service layer with sentiment analysis and embedding generation
AgentToolkit integration with RAG-enhanced methods
Dagster pipeline for automated daily collection
Comprehensive testing with VCR mocking and real database tests

This comprehensive implementation transforms the social media domain from basic stubs into a production-ready system that seamlessly integrates with the existing TradingAgents framework.

28 KiB Raw Blame History

Social Media Domain Specification

Feature Overview

User Story

Acceptance Criteria

Daily Data Collection

LLM Sentiment Analysis

Agent Integration

Business Rules and Constraints

Data Collection Rules

Data Management

Scope Definition

Included Features ✅

Excluded Features ❌

Technical Implementation Details

Architecture Pattern

Current Implementation Status

Missing Components

Existing Stub Components

Database Integration

PostgreSQL Schema Design

Entity Model

Reddit API Integration

RedditClient Implementation

LLM Sentiment Analysis

OpenRouter Integration

Vector Embeddings and Search

Embedding Generation

Repository Implementation

SocialRepository with PostgreSQL

Service Layer

SocialMediaService

AgentToolkit Integration

RAG-Enhanced Methods

Dagster Pipeline

Social Media Collection Asset

Testing Strategy

Test Structure

Key Test Patterns

Dependencies

Technical Dependencies

Reference Implementations

Success Criteria

Functionality

Performance

Quality

Integration

Implementation Approach

28 KiB

Raw Blame History