# Embedding Provider Separation - Implementation Summary

## Overview
This document summarizes the changes made to separate embedding configuration from chat model configuration in the TradingAgents framework.
## Branch Information

- **Branch Name:** `feature/separate-embedding-client`
- **Base Branch:** `main`
- **Status:** Ready for review/merge

## Problem Statement
Previously, the TradingAgents memory system used the same backend_url for both chat models and embeddings. This caused critical failures when:
- Using OpenRouter for chat (doesn't support OpenAI embedding endpoints)
- Using Anthropic/Google for chat (don't provide embeddings)
- The embedding endpoint returned HTML error pages instead of JSON
- Users wanted to mix providers (e.g., OpenRouter for chat, OpenAI for embeddings)
**Example error:**

```
AttributeError: 'str' object has no attribute 'data'
# Caused by: OpenRouter returned HTML page instead of embedding JSON
```

## Solution
Implemented a comprehensive separation of embedding and chat model configurations with three key features:
### 1. Separate Embedding Client Configuration
New configuration parameters independent of chat LLM settings:
```python
config = {
    # Chat LLM settings
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",

    # NEW: Separate embedding settings
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}
```
### 2. Multiple Provider Support

- **OpenAI:** Production-grade embeddings (recommended)
- **Ollama:** Local embeddings for offline/development use
- **None:** Disable memory system entirely
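The three options above map to config fragments like the following. These are illustrative sketches based on the key names introduced in this document; the exact model names and URLs are the defaults discussed elsewhere in this guide, not hard requirements.

```python
# OpenAI (recommended for production)
openai_cfg = {
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

# Ollama (local/offline development)
ollama_cfg = {
    "embedding_provider": "ollama",
    "embedding_backend_url": "http://localhost:11434/v1",
    "embedding_model": "nomic-embed-text",
    "enable_memory": True,
}

# None (memory system disabled entirely)
disabled_cfg = {
    "embedding_provider": "none",
    "enable_memory": False,
}
```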
### 3. Graceful Fallback
- System continues to operate when embeddings fail
- Comprehensive error logging
- Memory operations return empty results instead of crashing
- Agents function without historical context when memory is disabled
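The degradation contract implied by these bullets can be sketched as below. `StubMemory` stands in for the real `FinancialSituationMemory`, and the `build_context` helper and `"recommendation"` field are illustrative, not framework API:

```python
class StubMemory:
    """Stand-in for FinancialSituationMemory to show the fallback contract."""

    def __init__(self, enabled: bool):
        self._enabled = enabled

    def is_enabled(self) -> bool:
        return self._enabled

    def get_memories(self, situation: str) -> list:
        # Real implementation queries the vector store; returns [] when disabled
        return [{"recommendation": "hold"}] if self._enabled else []


def build_context(memory: StubMemory, situation: str) -> str:
    if not memory.is_enabled():
        return ""  # Agent proceeds without historical context
    matches = memory.get_memories(situation)
    return "\n".join(m["recommendation"] for m in matches)
```

With memory disabled the agent simply receives an empty context string instead of crashing.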
## Files Modified

### Core Framework
- `tradingagents/default_config.py`
  - Added 4 new configuration parameters for embeddings
  - Maintains backward compatibility with existing configs
- `tradingagents/agents/utils/memory.py`
  - Complete refactor of the `FinancialSituationMemory` class
  - Added provider-specific initialization logic
  - Implemented graceful error handling
  - Added `is_enabled()` method
  - Added comprehensive logging
  - All methods now return safe defaults on failure
- `tradingagents/graph/trading_graph.py`
  - Added `_configure_embeddings()` method for smart defaults
  - Separated chat LLM initialization from embedding setup
  - Added memory status logging
  - Updated `reflect_and_remember()` to respect memory settings
### CLI/User Interface

- `cli/utils.py`
  - Added `select_embedding_provider()` function
    - Returns tuple: `(provider, backend_url, model)`
    - Interactive selection with clear descriptions
  - Code formatting improvements
- `cli/main.py`
  - Added Step 7: Embedding Provider selection
  - Updated `get_user_selections()` to include embedding settings
  - Updated `run_analysis()` to configure embeddings from user selections
  - Improved formatting and code style consistency
### Documentation

- `docs/EMBEDDING_CONFIGURATION.md` (NEW)
  - Comprehensive guide for embedding configuration
  - Common scenarios and examples
  - Troubleshooting section
  - API reference
  - Migration guide
- `docs/EMBEDDING_MIGRATION.md` (THIS FILE)
  - Implementation summary
  - Technical details
  - Testing recommendations
## Technical Details

### Configuration Priority
The system applies configuration in this order:
1. Explicit user configuration (highest priority)
2. Provider-specific defaults
3. Fallback defaults (lowest priority)
Example logic:

```python
def _configure_embeddings(self):
    if "embedding_provider" not in self.config:
        self.config["embedding_provider"] = "openai"  # Safe default
    if "embedding_backend_url" not in self.config:
        if self.config["embedding_provider"] == "ollama":
            self.config["embedding_backend_url"] = "http://localhost:11434/v1"
        else:
            self.config["embedding_backend_url"] = "https://api.openai.com/v1"
```
### Error Handling Strategy
Memory system implements defensive programming:
```python
def get_embedding(self, text: str) -> Optional[List[float]]:
    if not self.enabled or not self.client:
        return None  # Safe fallback
    try:
        response = self.client.embeddings.create(...)
        return response.data[0].embedding
    except Exception as e:
        logger.error(f"Failed to get embedding: {e}")
        return None  # Never crash, return None
```
All callers handle None gracefully:
```python
def add_situations(...):
    for situation in situations:
        embedding = self.get_embedding(situation)
        if embedding is None:
            logger.warning("Skipping situation due to embedding failure")
            continue  # Skip this item, process others
```
### Backward Compatibility
Existing configurations continue to work without modification:
**Old config (still works):**

```python
config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
}
# Embeddings auto-configured to use OpenAI
```
**New config (explicit control):**

```python
config = {
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
}
# Full control over both chat and embeddings
```
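The default-filling behaviour that keeps old configs working can be sketched as a free function mirroring the `_configure_embeddings()` logic shown earlier (the function name here is illustrative; the framework implements this as a method on the graph):

```python
def fill_embedding_defaults(config: dict) -> dict:
    """Return a copy of config with embedding defaults applied."""
    config = dict(config)  # don't mutate the caller's dict
    config.setdefault("embedding_provider", "openai")
    if "embedding_backend_url" not in config:
        if config["embedding_provider"] == "ollama":
            config["embedding_backend_url"] = "http://localhost:11434/v1"
        else:
            config["embedding_backend_url"] = "https://api.openai.com/v1"
    return config


# An old-style config gains the new keys without any user changes:
old = {"llm_provider": "openai", "backend_url": "https://api.openai.com/v1"}
resolved = fill_embedding_defaults(old)
```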
## Testing Recommendations

### Unit Tests
```python
# Test memory initialization with different providers
def test_memory_openai_provider():
    config = {
        "embedding_provider": "openai",
        "embedding_backend_url": "https://api.openai.com/v1",
        "enable_memory": True,
    }
    memory = FinancialSituationMemory("test", config)
    assert memory.is_enabled()


def test_memory_disabled():
    config = {"embedding_provider": "none", "enable_memory": False}
    memory = FinancialSituationMemory("test", config)
    assert not memory.is_enabled()
    assert memory.get_memories("test") == []


def test_memory_graceful_failure():
    config = {
        "embedding_provider": "openai",
        "embedding_backend_url": "https://invalid-url.example/v1",
        "enable_memory": True,
    }
    memory = FinancialSituationMemory("test", config)
    # Should disable itself on connection failure
    result = memory.get_memories("test")
    assert result == []
```
### Integration Tests
```python
# Test full graph with different configurations
def test_graph_with_openrouter_and_openai_embeddings():
    config = {
        "llm_provider": "openrouter",
        "backend_url": "https://openrouter.ai/api/v1",
        "embedding_provider": "openai",
        "embedding_backend_url": "https://api.openai.com/v1",
    }
    graph = TradingAgentsGraph(["market"], config=config)
    # Should initialize without errors
    assert graph.bull_memory.is_enabled()


def test_graph_with_disabled_memory():
    config = {
        "llm_provider": "openai",
        "backend_url": "https://api.openai.com/v1",
        "enable_memory": False,
    }
    graph = TradingAgentsGraph(["market"], config=config)
    # Should work without memory
    assert not graph.bull_memory.is_enabled()
```
### Manual Testing Scenarios
1. **OpenRouter + OpenAI embeddings**

   ```bash
   export OPENROUTER_API_KEY="sk-or-..."
   export OPENAI_API_KEY="sk-..."
   python -m cli.main
   # Select OpenRouter for chat, OpenAI for embeddings
   ```

2. **All Ollama (local)**

   ```bash
   ollama pull llama3.1
   ollama pull nomic-embed-text
   python -m cli.main
   # Select Ollama for both chat and embeddings
   ```

3. **Disabled memory**

   ```bash
   python -m cli.main
   # Select any chat provider, disable memory
   # Verify agents work without errors
   ```
## Breaking Changes

**None.** This is a backward-compatible enhancement.
Existing code continues to work without modification. New features are opt-in.
## Dependencies
No new dependencies added. Uses existing packages:
- `openai` (already required)
- `chromadb` (already required)
## Performance Impact
- Minimal: Embedding initialization is one-time cost
- Memory: No additional memory overhead when disabled
- Latency: No impact on chat model latency
- Cost: Allows optimization by choosing cheaper embedding providers
## Security Considerations
- API keys for different providers should be stored separately
- Follow least-privilege principle: use separate keys for chat vs embeddings
- Embedding data sent to configured provider (ensure compliance)
Example `.env`:

```bash
# Separate keys for different services
OPENAI_API_KEY="sk-..."        # For embeddings
OPENROUTER_API_KEY="sk-or-..." # For chat models
```
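A least-privilege lookup keeping chat and embedding credentials separate might look like the following. The env var names match the `.env` example above; the helper itself is illustrative and not part of the framework:

```python
import os
from typing import Optional

# Map (role, provider) to the env var holding its key; extend as needed.
KEY_ENV = {
    ("chat", "openrouter"): "OPENROUTER_API_KEY",
    ("chat", "openai"): "OPENAI_API_KEY",
    ("embedding", "openai"): "OPENAI_API_KEY",
}


def api_key(role: str, provider: str) -> Optional[str]:
    """Return the API key for a (role, provider) pair, or None (e.g. local Ollama)."""
    var = KEY_ENV.get((role, provider))
    return os.environ.get(var) if var else None
```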
## Future Enhancements
Potential improvements for future versions:
1. **Additional embedding providers**
   - HuggingFace embeddings
   - Cohere embeddings
   - Azure OpenAI embeddings
2. **Embedding caching**
   - Cache embeddings to disk
   - Reduce API calls for repeated situations
3. **Embedding fine-tuning**
   - Support for custom fine-tuned embedding models
   - Domain-specific financial embeddings
4. **Async embeddings**
   - Batch embedding requests
   - Parallel processing for large memory operations
5. **Embedding quality metrics**
   - Track similarity score distributions
   - Alert on low-quality matches
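The disk-cache idea floated above could be sketched as follows. Nothing here exists in the framework yet; the class and its layout (one JSON file per `(model, text)` hash) are hypothetical:

```python
import hashlib
import json
import os
from typing import List, Optional


class EmbeddingCache:
    """Hypothetical on-disk cache to avoid re-embedding repeated situations."""

    def __init__(self, path: str = ".embedding_cache"):
        self.path = path
        os.makedirs(path, exist_ok=True)

    def _file(self, text: str, model: str) -> str:
        # Key on model + text so switching models never returns stale vectors
        digest = hashlib.sha256(f"{model}:{text}".encode()).hexdigest()
        return os.path.join(self.path, digest + ".json")

    def get(self, text: str, model: str) -> Optional[List[float]]:
        try:
            with open(self._file(text, model)) as f:
                return json.load(f)
        except FileNotFoundError:
            return None  # cache miss; caller falls through to the API

    def put(self, text: str, model: str, embedding: List[float]) -> None:
        with open(self._file(text, model), "w") as f:
            json.dump(embedding, f)
```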
## Migration Checklist
For users upgrading to this version:
- Review current configuration
- Identify chat provider (OpenRouter, Anthropic, etc.)
- Decide on embedding strategy:
  - Use OpenAI for embeddings (recommended)
  - Use Ollama for local embeddings
  - Disable memory if not needed
- Update `.env` with necessary API keys
- Test configuration in development
- Monitor logs for embedding-related warnings
- Verify memory is working as expected
## Rollback Plan
If issues arise:
- **Immediate:** Set `enable_memory: False` to disable embeddings
- **Code:** Remove embedding-specific config; the system falls back to defaults
- **Branch:** Revert to the commit before this feature
## Support
For questions or issues:
- Check `docs/EMBEDDING_CONFIGURATION.md` for the detailed guide
- Review error logs for specific failure messages
- Try with `enable_memory: False` to isolate the issue
- Open a GitHub issue with:
  - Configuration used
  - Error messages/logs
  - Provider information
## Conclusion
This implementation successfully addresses the embedding/chat provider separation issue while maintaining backward compatibility and adding robust error handling. The system now supports flexible provider configurations and gracefully handles failures.
**Key Achievements:**
- ✅ Separate embedding and chat configurations
- ✅ Multiple embedding provider support
- ✅ Graceful degradation on failures
- ✅ Backward compatible
- ✅ Comprehensive documentation
- ✅ CLI integration
- ✅ Zero new dependencies