# Embedding Provider Separation - Implementation Summary

## Overview

This document summarizes the changes made to separate embedding configuration from chat model configuration in the TradingAgents framework.

## Branch Information

- **Branch Name**: `feature/separate-embedding-client`
- **Base Branch**: `main`
- **Status**: Ready for review/merge

## Problem Statement

Previously, the TradingAgents memory system used the same `backend_url` for both chat models and embeddings. This caused critical failures when:

1. Using **OpenRouter** for chat (it does not support OpenAI embedding endpoints)
2. Using **Anthropic/Google** for chat (neither provides OpenAI-compatible embeddings)
3. The embedding endpoint returned HTML error pages instead of JSON
4. Users wanted to mix providers (e.g., OpenRouter for chat, OpenAI for embeddings)

**Example Error**:

```python
AttributeError: 'str' object has no attribute 'data'
# Caused by: OpenRouter returned HTML page instead of embedding JSON
```
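For illustration only, a minimal sketch of the old failure mode. The model names and exact client usage here are assumptions, not the framework's code; the point is that one shared base URL served both kinds of traffic:

```python
from openai import OpenAI

# Before this change, a single URL was used for both chat and embeddings.
shared_url = "https://openrouter.ai/api/v1"
client = OpenAI(base_url=shared_url, api_key="sk-or-...")

# Chat completions work, because OpenRouter speaks the OpenAI chat API...
client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)

# ...but the embeddings call targets an endpoint OpenRouter does not serve.
# In the original report, the HTML error page surfaced downstream as the
# AttributeError shown above instead of valid embedding JSON.
client.embeddings.create(model="text-embedding-3-small", input="some situation")
```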
## Solution

Implemented a comprehensive separation of embedding and chat model configurations with three key features:

### 1. Separate Embedding Client Configuration

New configuration parameters independent of chat LLM settings:

```python
config = {
    # Chat LLM settings
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",

    # NEW: Separate embedding settings
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}
```
### 2. Multiple Provider Support

- **OpenAI**: Production-grade embeddings (recommended)
- **Ollama**: Local embeddings for offline/development use
- **None**: Disable memory system entirely
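For reference, minimal configurations for each option might look like the following. This is a sketch using only the keys introduced by this change; the Ollama URL and model name match the defaults used elsewhere in this document:

```python
# OpenAI embeddings (recommended)
openai_embeddings = {
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

# Ollama embeddings (local/offline)
ollama_embeddings = {
    "embedding_provider": "ollama",
    "embedding_backend_url": "http://localhost:11434/v1",
    "embedding_model": "nomic-embed-text",
    "enable_memory": True,
}

# Memory disabled entirely
no_memory = {
    "embedding_provider": "none",
    "enable_memory": False,
}
```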
### 3. Graceful Fallback

- System continues to operate when embeddings fail
- Comprehensive error logging
- Memory operations return empty results instead of crashing
- Agents function without historical context when memory is disabled
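In practice this means callers can query memory unconditionally. A sketch of the intended caller-side behavior, using the `is_enabled()` and `get_memories()` methods described in this document (the collection name, query string, and `config` dict are illustrative):

```python
from tradingagents.agents.utils.memory import FinancialSituationMemory

memory = FinancialSituationMemory("bull_memory", config)  # config as in the examples above

if memory.is_enabled():
    # Returns matches when embeddings work, and an empty list on any failure,
    # so callers never need a try/except around memory lookups.
    past_lessons = memory.get_memories("Tech sector showing high volatility")
else:
    past_lessons = []  # memory disabled: agents run without historical context
```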
## Files Modified

### Core Framework

1. **`tradingagents/default_config.py`**
   - Added 4 new configuration parameters for embeddings
   - Maintains backward compatibility with existing configs

2. **`tradingagents/agents/utils/memory.py`**
   - Complete refactor of `FinancialSituationMemory` class
   - Added provider-specific initialization logic
   - Implemented graceful error handling
   - Added `is_enabled()` method
   - Added comprehensive logging
   - All methods now return safe defaults on failure

3. **`tradingagents/graph/trading_graph.py`**
   - Added `_configure_embeddings()` method for smart defaults
   - Separated chat LLM initialization from embedding setup
   - Added memory status logging
   - Updated `reflect_and_remember()` to respect memory settings
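The last point above can be pictured as a simple guard at the top of the method. This is a hypothetical sketch; the attribute names and exact check are assumptions, not the actual `trading_graph.py` code:

```python
# Inside TradingAgentsGraph.reflect_and_remember (hypothetical sketch):
if not self.config.get("enable_memory", True):
    return  # memory disabled: skip reflection instead of raising
if not self.bull_memory.is_enabled():
    return  # embeddings unavailable: nothing can be stored
# ... existing reflection logic runs only when memory is usable ...
```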
### CLI/User Interface

4. **`cli/utils.py`**
   - Added `select_embedding_provider()` function
   - Returns tuple: (provider, backend_url, model); see the sketch after this list
   - Interactive selection with clear descriptions
   - Code formatting improvements

5. **`cli/main.py`**
   - Added Step 7: Embedding Provider selection
   - Updated `get_user_selections()` to include embedding settings
   - Updated `run_analysis()` to configure embedding from user selections
   - Improved formatting and code style consistency
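As a rough illustration of the helper added in item 4, a sketch under the assumption that it maps each provider to the defaults used elsewhere in this document. The real function is interactive and may use a richer prompt library; this is not the actual `cli/utils.py` implementation:

```python
# Hypothetical sketch of the CLI helper.
EMBEDDING_PROVIDERS = {
    "openai": ("https://api.openai.com/v1", "text-embedding-3-small"),
    "ollama": ("http://localhost:11434/v1", "nomic-embed-text"),
    "none": (None, None),  # memory disabled
}

def select_embedding_provider():
    """Prompt for an embedding provider and return (provider, backend_url, model)."""
    choice = input(f"Embedding provider {list(EMBEDDING_PROVIDERS)}: ").strip().lower()
    if choice not in EMBEDDING_PROVIDERS:
        choice = "openai"  # safe default, mirroring _configure_embeddings()
    backend_url, model = EMBEDDING_PROVIDERS[choice]
    return choice, backend_url, model
```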
### Documentation

6. **`docs/EMBEDDING_CONFIGURATION.md`** (NEW)
   - Comprehensive guide for embedding configuration
   - Common scenarios and examples
   - Troubleshooting section
   - API reference
   - Migration guide

7. **`docs/EMBEDDING_MIGRATION.md`** (THIS FILE)
   - Implementation summary
   - Technical details
   - Testing recommendations
## Technical Details

### Configuration Priority

The system applies configuration in this order:

1. **Explicit user configuration** (highest priority)
2. **Provider-specific defaults**
3. **Fallback defaults** (lowest priority)

Example logic:

```python
def _configure_embeddings(self):
    if "embedding_provider" not in self.config:
        self.config["embedding_provider"] = "openai"  # Safe default

    if "embedding_backend_url" not in self.config:
        if self.config["embedding_provider"] == "ollama":
            self.config["embedding_backend_url"] = "http://localhost:11434/v1"
        else:
            self.config["embedding_backend_url"] = "https://api.openai.com/v1"
```
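The same pattern naturally extends to the embedding model. A sketch assuming the model defaults used elsewhere in this document; not necessarily the exact code in `trading_graph.py`:

```python
    # continuing inside _configure_embeddings() (hypothetical):
    if "embedding_model" not in self.config:
        if self.config["embedding_provider"] == "ollama":
            self.config["embedding_model"] = "nomic-embed-text"
        else:
            self.config["embedding_model"] = "text-embedding-3-small"
```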
### Error Handling Strategy

The memory system implements defensive programming:

```python
def get_embedding(self, text: str) -> Optional[List[float]]:
    if not self.enabled or not self.client:
        return None  # Safe fallback

    try:
        response = self.client.embeddings.create(...)
        return response.data[0].embedding
    except Exception as e:
        logger.error(f"Failed to get embedding: {e}")
        return None  # Never crash, return None
```

All callers handle `None` gracefully:

```python
def add_situations(...):
    for situation in situations:
        embedding = self.get_embedding(situation)
        if embedding is None:
            logger.warning("Skipping situation due to embedding failure")
            continue  # Skip this item, process others
```
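The query path follows the same convention: it returns an empty list rather than raising. A hypothetical sketch; the collection attribute and parameter names are assumptions, while the ChromaDB `query` call itself is the standard API:

```python
def get_memories(self, current_situation, n_matches=2):
    if not self.enabled or not self.client:
        return []  # memory disabled: no matches, no exception

    embedding = self.get_embedding(current_situation)
    if embedding is None:
        return []  # embedding failed: degrade to "no historical context"

    try:
        results = self.situation_collection.query(
            query_embeddings=[embedding], n_results=n_matches
        )
    except Exception as e:
        logger.error(f"Memory query failed: {e}")
        return []
    # Flatten the ChromaDB result into a simple list of matched documents.
    return results.get("documents", [[]])[0]
```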
### Backward Compatibility

Existing configurations continue to work without modification:

**Old config** (still works):

```python
config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
}
# Embeddings auto-configured to use OpenAI
```

**New config** (explicit control):

```python
config = {
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
}
# Full control over both chat and embeddings
```
## Testing Recommendations

### Unit Tests

```python
from tradingagents.agents.utils.memory import FinancialSituationMemory

# Test memory initialization with different providers
def test_memory_openai_provider():
    config = {
        "embedding_provider": "openai",
        "embedding_backend_url": "https://api.openai.com/v1",
        "enable_memory": True,
    }
    memory = FinancialSituationMemory("test", config)
    assert memory.is_enabled()

def test_memory_disabled():
    config = {"embedding_provider": "none", "enable_memory": False}
    memory = FinancialSituationMemory("test", config)
    assert not memory.is_enabled()
    assert memory.get_memories("test") == []

def test_memory_graceful_failure():
    config = {
        "embedding_provider": "openai",
        "embedding_backend_url": "https://invalid-url.example/v1",
        "enable_memory": True,
    }
    memory = FinancialSituationMemory("test", config)
    # Should disable itself on connection failure
    result = memory.get_memories("test")
    assert result == []
```
### Integration Tests

```python
from tradingagents.graph.trading_graph import TradingAgentsGraph

# Test full graph with different configurations
def test_graph_with_openrouter_and_openai_embeddings():
    config = {
        "llm_provider": "openrouter",
        "backend_url": "https://openrouter.ai/api/v1",
        "embedding_provider": "openai",
        "embedding_backend_url": "https://api.openai.com/v1",
    }
    graph = TradingAgentsGraph(["market"], config=config)
    # Should initialize without errors
    assert graph.bull_memory.is_enabled()

def test_graph_with_disabled_memory():
    config = {
        "llm_provider": "openai",
        "backend_url": "https://api.openai.com/v1",
        "enable_memory": False,
    }
    graph = TradingAgentsGraph(["market"], config=config)
    # Should work without memory
    assert not graph.bull_memory.is_enabled()
```
### Manual Testing Scenarios

1. **OpenRouter + OpenAI embeddings**

   ```bash
   export OPENROUTER_API_KEY="sk-or-..."
   export OPENAI_API_KEY="sk-..."
   python -m cli.main
   # Select OpenRouter for chat, OpenAI for embeddings
   ```

2. **All Ollama (local)**

   ```bash
   ollama pull llama3.1
   ollama pull nomic-embed-text
   python -m cli.main
   # Select Ollama for both chat and embeddings
   ```

3. **Disabled memory**

   ```bash
   python -m cli.main
   # Select any chat provider, disable memory
   # Verify agents work without errors
   ```
## Breaking Changes

**None** - This is a backward-compatible enhancement.

Existing code continues to work without modification. New features are opt-in.

## Dependencies

No new dependencies added. Uses existing packages:

- `openai` (already required)
- `chromadb` (already required)
## Performance Impact

- **Minimal**: Embedding initialization is a one-time cost
- **Memory**: No additional memory overhead when disabled
- **Latency**: No impact on chat model latency
- **Cost**: Allows optimization by choosing cheaper embedding providers

## Security Considerations

- API keys for different providers should be stored separately
- Follow the least-privilege principle: use separate keys for chat vs. embeddings
- Embedding data is sent to the configured provider (ensure compliance)

Example `.env`:

```bash
# Separate keys for different services
OPENAI_API_KEY="sk-..."        # For embeddings
OPENROUTER_API_KEY="sk-or-..." # For chat models
```
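To make the separation concrete, a sketch of how two clients could be constructed from those keys. The variable names are illustrative; the framework reads its keys through its own configuration:

```python
import os
from openai import OpenAI

# Chat traffic goes to OpenRouter with its own key...
chat_client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# ...while embedding traffic goes to OpenAI with a separate key.
embedding_client = OpenAI(
    base_url="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)
```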
## Future Enhancements

Potential improvements for future versions:

1. **Additional embedding providers**:
   - HuggingFace embeddings
   - Cohere embeddings
   - Azure OpenAI embeddings

2. **Embedding caching** (see the sketch after this list):
   - Cache embeddings to disk
   - Reduce API calls for repeated situations

3. **Embedding fine-tuning**:
   - Support for custom fine-tuned embedding models
   - Domain-specific financial embeddings

4. **Async embeddings**:
   - Batch embedding requests
   - Parallel processing for large memory operations

5. **Embedding quality metrics**:
   - Track similarity score distributions
   - Alert on low-quality matches
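Item 2 is the most mechanical of these. One possible shape for it, purely as a hypothetical sketch (the cache location and function names are made up):

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".embedding_cache")  # illustrative location

def cached_embedding(text, compute_embedding):
    """Return a cached embedding for `text`, computing and storing it on a miss."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())
    embedding = compute_embedding(text)  # e.g. FinancialSituationMemory.get_embedding
    if embedding is not None:
        path.write_text(json.dumps(embedding))
    return embedding
```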
## Migration Checklist

For users upgrading to this version:

- [ ] Review current configuration
- [ ] Identify chat provider (OpenRouter, Anthropic, etc.)
- [ ] Decide on embedding strategy:
  - [ ] Use OpenAI for embeddings (recommended)
  - [ ] Use Ollama for local embeddings
  - [ ] Disable memory if not needed
- [ ] Update `.env` with necessary API keys
- [ ] Test configuration in development
- [ ] Monitor logs for embedding-related warnings
- [ ] Verify memory is working as expected
## Rollback Plan

If issues arise:

1. **Immediate**: Set `enable_memory: False` to disable embeddings
2. **Code**: Remove embedding-specific config; the system falls back to defaults
3. **Branch**: Revert to the commit before this feature

## Support

For questions or issues:

1. Check `docs/EMBEDDING_CONFIGURATION.md` for the detailed guide
2. Review error logs for specific failure messages
3. Try with `enable_memory: False` to isolate the issue
4. Open a GitHub issue with:
   - Configuration used
   - Error messages/logs
   - Provider information
## Conclusion

This implementation successfully addresses the embedding/chat provider separation issue while maintaining backward compatibility and adding robust error handling. The system now supports flexible provider configurations and gracefully handles failures.

**Key Achievements**:

- ✅ Separate embedding and chat configurations
- ✅ Multiple embedding provider support
- ✅ Graceful degradation on failures
- ✅ Backward compatible
- ✅ Comprehensive documentation
- ✅ CLI integration
- ✅ Zero new dependencies