
Embedding Provider Separation - Implementation Summary

Overview

This document summarizes the changes made to separate embedding configuration from chat model configuration in the TradingAgents framework.

Branch Information

  • Branch Name: feature/separate-embedding-client
  • Base Branch: main
  • Status: Ready for review/merge

Problem Statement

Previously, the TradingAgents memory system used the same backend_url for both chat models and embeddings. This caused critical failures when:

  1. Using OpenRouter for chat (OpenRouter does not offer an OpenAI-compatible embeddings endpoint)
  2. Using Anthropic or Google for chat (neither provides embeddings through the chat API)
  3. The embedding endpoint returned HTML error pages instead of JSON
  4. Users wanted to mix providers (e.g., OpenRouter for chat, OpenAI for embeddings)

Example Error:

AttributeError: 'str' object has no attribute 'data'
# Caused by: OpenRouter returned HTML page instead of embedding JSON

Solution

Implemented a comprehensive separation of embedding and chat model configurations with three key features:

1. Separate Embedding Client Configuration

New configuration parameters independent of chat LLM settings:

config = {
    # Chat LLM settings
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    
    # NEW: Separate embedding settings
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

2. Multiple Provider Support

  • OpenAI: Production-grade embeddings (recommended)
  • Ollama: Local embeddings for offline/development use
  • None: Disable memory system entirely
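Sketched as configuration, the three options look like this (a sketch only; the values are the typical defaults described in this document, not requirements):

```python
# OpenAI: production-grade hosted embeddings (recommended)
openai_cfg = {
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

# Ollama: local embeddings for offline/development use
ollama_cfg = {
    "embedding_provider": "ollama",
    "embedding_backend_url": "http://localhost:11434/v1",
    "embedding_model": "nomic-embed-text",
    "enable_memory": True,
}

# None: disable the memory system entirely
disabled_cfg = {
    "embedding_provider": "none",
    "enable_memory": False,
}
```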

3. Graceful Fallback

  • System continues to operate when embeddings fail
  • Comprehensive error logging
  • Memory operations return empty results instead of crashing
  • Agents function without historical context when memory is disabled
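The fallback contract can be illustrated with a minimal stand-in (this is not the real FinancialSituationMemory class, just the behavioral sketch: disabled memory always returns empty results rather than raising):

```python
class MemoryStub:
    """Illustrative stand-in for the graceful-fallback behavior."""

    def __init__(self, enabled: bool):
        self._enabled = enabled

    def is_enabled(self) -> bool:
        return self._enabled

    def get_memories(self, query: str, n: int = 2) -> list:
        if not self._enabled:
            return []  # safe default: agents proceed without historical context
        raise NotImplementedError("real similarity lookup omitted in this sketch")


stub = MemoryStub(enabled=False)
assert stub.get_memories("market situation") == []  # never crashes when disabled
```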

Files Modified

Core Framework

  1. tradingagents/default_config.py

    • Added 4 new configuration parameters for embeddings
    • Maintains backward compatibility with existing configs
  2. tradingagents/agents/utils/memory.py

    • Complete refactor of FinancialSituationMemory class
    • Added provider-specific initialization logic
    • Implemented graceful error handling
    • Added is_enabled() method
    • Added comprehensive logging
    • All methods now return safe defaults on failure
  3. tradingagents/graph/trading_graph.py

    • Added _configure_embeddings() method for smart defaults
    • Separated chat LLM initialization from embedding setup
    • Added memory status logging
    • Updated reflect_and_remember() to respect memory settings

CLI/User Interface

  1. cli/utils.py

    • Added select_embedding_provider() function
    • Returns tuple: (provider, backend_url, model)
    • Interactive selection with clear descriptions
    • Code formatting improvements
  2. cli/main.py

    • Added Step 7: Embedding Provider selection
    • Updated get_user_selections() to include embedding settings
    • Updated run_analysis() to configure embedding from user selections
    • Improved formatting and code style consistency
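The tuple returned by select_embedding_provider() feeds directly into the run configuration. A hypothetical wiring sketch (variable names are illustrative; the real code lives in cli/utils.py and cli/main.py as listed above):

```python
# Shape matches select_embedding_provider(): (provider, backend_url, model)
selection = ("openai", "https://api.openai.com/v1", "text-embedding-3-small")
provider, backend_url, model = selection

config = {
    "embedding_provider": provider,
    "embedding_backend_url": backend_url,
    "embedding_model": model,
    "enable_memory": provider != "none",  # "none" disables the memory system
}
```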

Documentation

  1. docs/EMBEDDING_CONFIGURATION.md (NEW)

    • Comprehensive guide for embedding configuration
    • Common scenarios and examples
    • Troubleshooting section
    • API reference
    • Migration guide
  2. docs/EMBEDDING_MIGRATION.md (THIS FILE)

    • Implementation summary
    • Technical details
    • Testing recommendations

Technical Details

Configuration Priority

The system applies configuration in this order:

  1. Explicit user configuration (highest priority)
  2. Provider-specific defaults
  3. Fallback defaults (lowest priority)

Example logic:

def _configure_embeddings(self):
    if "embedding_provider" not in self.config:
        self.config["embedding_provider"] = "openai"  # Safe default
    
    if "embedding_backend_url" not in self.config:
        if self.config["embedding_provider"] == "ollama":
            self.config["embedding_backend_url"] = "http://localhost:11434/v1"
        else:
            self.config["embedding_backend_url"] = "https://api.openai.com/v1"
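The priority order can be demonstrated with a self-contained restatement of the same logic (the proxy URL below is a placeholder): dict.setdefault only fills gaps, so an explicit user value is never overridden.

```python
def configure_embeddings(config: dict) -> dict:
    config.setdefault("embedding_provider", "openai")  # safe default
    if config["embedding_provider"] == "ollama":
        config.setdefault("embedding_backend_url", "http://localhost:11434/v1")
    else:
        config.setdefault("embedding_backend_url", "https://api.openai.com/v1")
    return config


user = {"embedding_backend_url": "https://my-proxy.example/v1"}
configure_embeddings(user)
assert user["embedding_backend_url"] == "https://my-proxy.example/v1"  # explicit value kept
assert user["embedding_provider"] == "openai"                          # gap filled with default
```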

Error Handling Strategy

Memory system implements defensive programming:

from typing import List, Optional

def get_embedding(self, text: str) -> Optional[List[float]]:
    if not self.enabled or not self.client:
        return None  # Safe fallback
    
    try:
        response = self.client.embeddings.create(...)
        return response.data[0].embedding
    except Exception as e:
        logger.error(f"Failed to get embedding: {e}")
        return None  # Never crash, return None

All callers handle None gracefully:

def add_situations(...):
    for situation in situations:
        embedding = self.get_embedding(situation)
        if embedding is None:
            logger.warning("Skipping situation due to embedding failure")
            continue  # Skip this item, process others

Backward Compatibility

Existing configurations continue to work without modification:

Old config (still works):

config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
}
# Embeddings auto-configured to use OpenAI

New config (explicit control):

config = {
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
}
# Full control over both chat and embeddings

Testing Recommendations

Unit Tests

# Test memory initialization with different providers
def test_memory_openai_provider():
    config = {
        "embedding_provider": "openai",
        "embedding_backend_url": "https://api.openai.com/v1",
        "enable_memory": True,
    }
    memory = FinancialSituationMemory("test", config)
    assert memory.is_enabled()

def test_memory_disabled():
    config = {"embedding_provider": "none", "enable_memory": False}
    memory = FinancialSituationMemory("test", config)
    assert not memory.is_enabled()
    assert memory.get_memories("test") == []

def test_memory_graceful_failure():
    config = {
        "embedding_provider": "openai",
        "embedding_backend_url": "https://invalid-url.example/v1",
        "enable_memory": True,
    }
    memory = FinancialSituationMemory("test", config)
    # Should disable itself on connection failure
    result = memory.get_memories("test")
    assert result == []

Integration Tests

# Test full graph with different configurations
def test_graph_with_openrouter_and_openai_embeddings():
    config = {
        "llm_provider": "openrouter",
        "backend_url": "https://openrouter.ai/api/v1",
        "embedding_provider": "openai",
        "embedding_backend_url": "https://api.openai.com/v1",
    }
    graph = TradingAgentsGraph(["market"], config=config)
    # Should initialize without errors
    assert graph.bull_memory.is_enabled()

def test_graph_with_disabled_memory():
    config = {
        "llm_provider": "openai",
        "backend_url": "https://api.openai.com/v1",
        "enable_memory": False,
    }
    graph = TradingAgentsGraph(["market"], config=config)
    # Should work without memory
    assert not graph.bull_memory.is_enabled()

Manual Testing Scenarios

  1. OpenRouter + OpenAI embeddings

    export OPENROUTER_API_KEY="sk-or-..."
    export OPENAI_API_KEY="sk-..."
    python -m cli.main
    # Select OpenRouter for chat, OpenAI for embeddings
    
  2. All Ollama (local)

    ollama pull llama3.1
    ollama pull nomic-embed-text
    python -m cli.main
    # Select Ollama for both chat and embeddings
    
  3. Disabled memory

    python -m cli.main
    # Select any chat provider, disable memory
    # Verify agents work without errors
    

Breaking Changes

None - This is a backward-compatible enhancement.

Existing code continues to work without modification. New features are opt-in.

Dependencies

No new dependencies added. Uses existing packages:

  • openai (already required)
  • chromadb (already required)

Performance Impact

  • Minimal: Embedding initialization is one-time cost
  • Memory: No additional memory overhead when disabled
  • Latency: No impact on chat model latency
  • Cost: Allows optimization by choosing cheaper embedding providers

Security Considerations

  • API keys for different providers should be stored separately
  • Follow least-privilege principle: use separate keys for chat vs embeddings
  • Embedding data sent to configured provider (ensure compliance)

Example .env:

# Separate keys for different services
OPENAI_API_KEY="sk-..."          # For embeddings
OPENROUTER_API_KEY="sk-or-..."   # For chat models
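At startup, the two keys can be picked up independently, one per service. A minimal sketch (the function name is illustrative; the environment variable names match the .env example above):

```python
import os

def resolve_keys(env=os.environ) -> dict:
    """Read separate credentials for chat vs embeddings (least privilege)."""
    return {
        "embedding_key": env.get("OPENAI_API_KEY"),     # embeddings only
        "chat_key": env.get("OPENROUTER_API_KEY"),      # chat models only
    }

keys = resolve_keys({"OPENAI_API_KEY": "sk-...", "OPENROUTER_API_KEY": "sk-or-..."})
```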

Future Enhancements

Potential improvements for future versions:

  1. Additional embedding providers:

    • HuggingFace embeddings
    • Cohere embeddings
    • Azure OpenAI embeddings
  2. Embedding caching:

    • Cache embeddings to disk
    • Reduce API calls for repeated situations
  3. Embedding fine-tuning:

    • Support for custom fine-tuned embedding models
    • Domain-specific financial embeddings
  4. Async embeddings:

    • Batch embedding requests
    • Parallel processing for large memory operations
  5. Embedding quality metrics:

    • Track similarity score distributions
    • Alert on low-quality matches
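To give a flavor of item 2 (embedding caching), one possible shape is a content-addressed disk cache. Purely illustrative; nothing like this exists in the codebase yet, and `compute` stands in for the real embedding call:

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".embedding_cache")

def cached_embedding(text: str, compute) -> list:
    """Return a cached vector for `text`, computing and storing it on a miss."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())  # cache hit: no API call
    vector = compute(text)
    path.write_text(json.dumps(vector))
    return vector
```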

Migration Checklist

For users upgrading to this version:

  • Review current configuration
  • Identify chat provider (OpenRouter, Anthropic, etc.)
  • Decide on embedding strategy:
    • Use OpenAI for embeddings (recommended)
    • Use Ollama for local embeddings
    • Disable memory if not needed
  • Update .env with necessary API keys
  • Test configuration in development
  • Monitor logs for embedding-related warnings
  • Verify memory is working as expected

Rollback Plan

If issues arise:

  1. Immediate: Set enable_memory: False to disable embeddings
  2. Code: Remove embedding-specific config, system uses defaults
  3. Branch: Revert to previous commit before this feature
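The immediate option is a one-flag change: with memory disabled, no embedding client is created and memory methods return safe defaults, so the rest of the graph runs unchanged.

```python
# Immediate rollback: disable memory, keep the existing chat configuration.
config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
    "enable_memory": False,  # embeddings bypassed entirely
}
```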

Support

For questions or issues:

  1. Check docs/EMBEDDING_CONFIGURATION.md for detailed guide
  2. Review error logs for specific failure messages
  3. Try with enable_memory: False to isolate issue
  4. Open GitHub issue with:
    • Configuration used
    • Error messages/logs
    • Provider information

Conclusion

This implementation successfully addresses the embedding/chat provider separation issue while maintaining backward compatibility and adding robust error handling. The system now supports flexible provider configurations and gracefully handles failures.

Key Achievements:

  • Separate embedding and chat configurations
  • Multiple embedding provider support
  • Graceful degradation on failures
  • Backward compatible
  • Comprehensive documentation
  • CLI integration
  • Zero new dependencies