Embedding Configuration Guide
Overview
This guide explains the new separated embedding configuration feature in TradingAgents. The system now allows you to use different providers for chat models and embeddings, enabling more flexible deployment scenarios.
Key Features
- Separate Embedding Client: Chat models and embedding models use independent configurations
- Multiple Embedding Providers: Support for OpenAI, Ollama (local), or disabled memory
- Graceful Fallback: System continues to operate even when embeddings are unavailable
- Provider Independence: Use OpenRouter/Anthropic for chat while using OpenAI for embeddings
Why This Matters
Previously, the memory system used the same backend URL as the chat model, causing issues when:
- Using OpenRouter (which doesn't support OpenAI embedding endpoints)
- Using Anthropic or Google for chat (the memory system could not obtain embeddings through those chat backends)
- Running in environments without embedding access
Now you can:
- Use OpenRouter/Anthropic/Google for chat models
- Use OpenAI for embeddings (recommended)
- Use Ollama for local embeddings
- Disable memory entirely if needed
Configuration Options
Via CLI (Interactive)
When running the CLI, you'll see a new Step 7 for embedding configuration:
python -m cli.main
You'll be prompted to select:
- OpenAI (recommended) - Uses OpenAI's embedding API
- Ollama (local) - Uses local Ollama embedding models
- Disable Memory - Runs without memory/context retrieval
Via Code (Direct Configuration)
Update your configuration dictionary:
from tradingagents.graph.trading_graph import TradingAgentsGraph

config = {
    # Chat LLM settings (can be any provider)
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "deep_think_llm": "deepseek/deepseek-chat-v3-0324:free",
    "quick_think_llm": "meta-llama/llama-3.3-8b-instruct:free",
    # Embedding settings (separate from chat)
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
    # Other settings...
}

graph = TradingAgentsGraph(selected_analysts=["market", "news"], config=config)
Configuration Parameters
embedding_provider
- Type: string
- Options: "openai", "ollama", "none"
- Default: "openai"
- Description: The embedding service provider
embedding_backend_url
- Type: string
- Default: "https://api.openai.com/v1" (for OpenAI)
- Description: API endpoint URL for embeddings
embedding_model
- Type: string
- Default: "text-embedding-3-small" (for OpenAI)
- Description: The embedding model to use
enable_memory
- Type: boolean
- Default: True
- Description: Enable/disable the memory system
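The four parameters above can be checked before a graph is built. The following is a minimal sketch, not part of TradingAgents: `validate_embedding_config` is a hypothetical helper that only encodes the types, options, and defaults listed in this section.

```python
from typing import Any, Dict, List

# Options documented for embedding_provider in this guide.
VALID_EMBEDDING_PROVIDERS = {"openai", "ollama", "none"}


def validate_embedding_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means nothing was flagged.

    Hypothetical helper for illustration only.
    """
    problems: List[str] = []

    # A missing key falls back to the documented default, "openai".
    provider = config.get("embedding_provider", "openai")
    if provider not in VALID_EMBEDDING_PROVIDERS:
        problems.append(
            f"embedding_provider must be one of "
            f"{sorted(VALID_EMBEDDING_PROVIDERS)}, got {provider!r}"
        )

    # Both of these are documented as strings; absent keys are allowed.
    for key in ("embedding_backend_url", "embedding_model"):
        value = config.get(key)
        if value is not None and not isinstance(value, str):
            problems.append(f"{key} must be a string")

    # enable_memory is documented as a boolean that defaults to True.
    if not isinstance(config.get("enable_memory", True), bool):
        problems.append("enable_memory must be a boolean")

    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem at once, which fits a config that is usually edited by hand.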
Common Scenarios
Scenario 1: OpenRouter for Chat + OpenAI for Embeddings
Best for: Cost-effective chat with reliable embeddings
config = {
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "deep_think_llm": "deepseek/deepseek-chat-v3-0324:free",
    "quick_think_llm": "meta-llama/llama-3.3-8b-instruct:free",
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}
Required API Keys:
- OPENROUTER_API_KEY (for chat)
- OPENAI_API_KEY (for embeddings)
Scenario 2: All Local with Ollama
Best for: Complete offline/local deployment
config = {
    "llm_provider": "ollama",
    "backend_url": "http://localhost:11434/v1",
    "deep_think_llm": "llama3.1",
    "quick_think_llm": "llama3.2",
    "embedding_provider": "ollama",
    "embedding_backend_url": "http://localhost:11434/v1",
    "embedding_model": "nomic-embed-text",
    "enable_memory": True,
}
Prerequisites:
- Ollama installed and running
- Models pulled, one per command (ollama pull takes a single model name):
ollama pull llama3.1
ollama pull llama3.2
ollama pull nomic-embed-text
Scenario 3: Anthropic for Chat, No Memory
Best for: Using providers without embedding support
config = {
    "llm_provider": "anthropic",
    "backend_url": "https://api.anthropic.com/",
    "deep_think_llm": "claude-sonnet-4-0",
    "quick_think_llm": "claude-3-5-haiku-latest",
    "embedding_provider": "none",
    "enable_memory": False,
}
Note: Memory and context retrieval will be disabled.
Scenario 4: OpenAI for Everything (Default)
Best for: Simplicity and full feature support
config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
    "deep_think_llm": "o4-mini",
    "quick_think_llm": "gpt-4o-mini",
    # Embeddings will auto-configure to use OpenAI
}
Environment Variables
Set the appropriate API keys based on your configuration:
# For OpenAI (chat or embeddings)
export OPENAI_API_KEY="sk-..."
# For OpenRouter (chat)
export OPENROUTER_API_KEY="sk-or-..."
# For Anthropic (chat)
export ANTHROPIC_API_KEY="sk-ant-..."
# For Google (chat)
export GOOGLE_API_KEY="..."
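Missing keys are easiest to catch before a run starts. The sketch below is a hypothetical preflight check, not a TradingAgents function: `missing_api_keys` and the `API_KEY_ENV` table are illustrative names, the `"google"` provider string is an assumption, and the check assumes Ollama needs no key and that an unset `llm_provider` means OpenAI.

```python
import os
from typing import Any, Dict, List, Mapping, Optional

# Provider name -> environment variable, taken from the exports above.
# The "google" key is an assumed provider string.
API_KEY_ENV = {
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def missing_api_keys(
    config: Dict[str, Any], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """List the environment variables this config needs but that are unset.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    needed: List[str] = []

    # The chat provider always needs its key (Ollama and unknown names are skipped).
    chat_provider = config.get("llm_provider", "openai")
    if chat_provider in API_KEY_ENV:
        needed.append(API_KEY_ENV[chat_provider])

    # The embedding provider only matters while memory is enabled.
    if config.get("enable_memory", True):
        embedding_provider = config.get("embedding_provider", "openai")
        if embedding_provider in API_KEY_ENV:
            needed.append(API_KEY_ENV[embedding_provider])

    # dict.fromkeys removes duplicates while keeping order.
    return [name for name in dict.fromkeys(needed) if not env.get(name)]
```

Calling `missing_api_keys(config)` with no second argument reads the real process environment; passing a plain dict makes the check easy to test.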
Graceful Degradation
The memory system gracefully handles failures:
- Embedding API Unavailable: Returns empty memories, logs warning, continues execution
- Invalid Configuration: Disables memory, logs error, continues execution
- Network Errors: Skips memory operations, logs error, continues execution
Example log output when embeddings fail:
WARNING: Failed to initialize embedding client: Connection error. Memory will be disabled.
INFO: Memory disabled for bull_memory
INFO: Memory disabled for bear_memory
...
The agents continue to function without memory-based context.
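The fallback behavior described above follows a common pattern: catch the failure, log it, and return an empty result so the caller can carry on. The sketch below illustrates that pattern only. `safe_get_memories` is a hypothetical wrapper, and it is not how TradingAgents implements its memory system internally.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("memory_fallback_sketch")


def safe_get_memories(
    lookup: Callable[[str], List[Dict]], current_situation: str
) -> List[Dict]:
    """Run a memory lookup, returning an empty list if it fails.

    `lookup` stands in for any function that embeds the situation and
    queries stored memories (hypothetical, for illustration only).
    """
    try:
        return lookup(current_situation)
    except Exception as exc:  # network errors, auth errors, bad config, ...
        logger.warning(
            "Memory lookup failed (%s); continuing without memories.", exc
        )
        return []
```

Because the wrapper always returns a list, the calling agent code needs no separate branch for the failure case.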
Checking Memory Status
You can check if memory is enabled:
# After initializing the graph
print(f"Bull memory enabled: {graph.bull_memory.is_enabled()}")
print(f"Bear memory enabled: {graph.bear_memory.is_enabled()}")
Migration Guide
From Previous Version
If you have existing code using the old configuration:
Old (single backend for everything):
config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
}
New (explicit embedding config):
config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
    # Add these for explicit control:
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
}
Note: The old configuration still works! The system auto-configures embeddings based on smart defaults.
Smart Defaults
If you don't specify embedding configuration, the system applies these rules:
- embedding_provider: Defaults to "openai"
- embedding_backend_url:
  - "openai" → "https://api.openai.com/v1"
  - "ollama" → "http://localhost:11434/v1"
- embedding_model:
  - "openai" → "text-embedding-3-small"
  - "ollama" → "nomic-embed-text"
- enable_memory: Defaults to True
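The rules above can be written as a small resolution function. This is a sketch of the documented behavior, not the project's actual code: `apply_embedding_defaults` is a hypothetical name, and the handling of `"none"` (no URL or model filled in) is an assumption.

```python
from typing import Any, Dict

# Per-provider defaults, copied from the rules listed above.
DEFAULT_BACKEND_URLS = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434/v1",
}
DEFAULT_MODELS = {
    "openai": "text-embedding-3-small",
    "ollama": "nomic-embed-text",
}


def apply_embedding_defaults(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with missing embedding keys filled in.

    Hypothetical helper for illustration only.
    """
    resolved = dict(config)  # never mutate the caller's dict

    provider = resolved.setdefault("embedding_provider", "openai")
    resolved.setdefault("enable_memory", True)

    # Only known providers have a default URL and model; "none" gets neither.
    if provider in DEFAULT_BACKEND_URLS:
        resolved.setdefault("embedding_backend_url", DEFAULT_BACKEND_URLS[provider])
        resolved.setdefault("embedding_model", DEFAULT_MODELS[provider])

    return resolved
```

Using `setdefault` means any value set explicitly by the caller wins over the default, which matches the backward-compatibility note in the migration section.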
Troubleshooting
Issue: "Failed to get embedding: 401 Unauthorized"
Cause: Missing or invalid API key for embedding provider
Solution:
export OPENAI_API_KEY="your-actual-key"
Issue: "Memory disabled for all agents"
Cause: Embedding provider set to "none" or initialization failed
Solution: Check your embedding_provider setting and API keys
Issue: OpenRouter returns HTML instead of embeddings
Cause: Trying to use OpenRouter backend for embeddings (not supported)
Solution: Set separate embedding provider:
config["embedding_provider"] = "openai"
config["embedding_backend_url"] = "https://api.openai.com/v1"
Issue: "ChromaDB collection creation failed"
Cause: ChromaDB initialization error
Solution:
- Ensure ChromaDB is installed: pip install chromadb
- Check disk space and permissions
- Set enable_memory: False to bypass
Performance Considerations
Embedding Costs
| Provider | Model | Cost per 1M tokens | Speed |
|---|---|---|---|
| OpenAI | text-embedding-3-small | ~$0.02 | Fast |
| OpenAI | text-embedding-3-large | ~$0.13 | Fast |
| Ollama | nomic-embed-text | Free | Medium (local) |
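For a rough budget, the per-million-token prices in the table turn into a one-line calculation. The sketch below uses the approximate figures from the table as-is. `estimate_embedding_cost` is a hypothetical helper, and real billing may differ from these estimates.

```python
# Approximate USD prices per 1M tokens, copied from the table above.
PRICE_PER_MILLION_TOKENS_USD = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "nomic-embed-text": 0.0,  # local Ollama model, no API charge
}


def estimate_embedding_cost(model: str, total_tokens: int) -> float:
    """Estimate embedding spend in USD for a given token volume.

    Hypothetical helper for illustration only.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS_USD[model] * total_tokens / 1_000_000
```

For example, embedding 5 million tokens with text-embedding-3-small comes to roughly 5 × $0.02 = $0.10, while the same volume with text-embedding-3-large comes to roughly 5 × $0.13 = $0.65.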
Memory Impact
- With Memory: Agents retrieve historical context to inform their decisions
- Without Memory: Faster initialization, no embedding costs, stateless
Best Practices
- Production: Use OpenAI embeddings for reliability
- Development: Use Ollama for cost-free testing
- CI/CD: Disable memory (enable_memory: False) for faster tests
- Multi-provider: Use different providers for chat and embeddings to optimize cost/performance
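One way to apply these practices is to keep a small set of embedding overrides per environment and merge the right one into a base config. This is only a sketch: the profile names and the `with_embedding_profile` helper are hypothetical, and the values simply restate the recommendations above.

```python
from typing import Any, Dict

# Embedding overrides per environment, following the practices above.
# Profile names are illustrative and not defined by TradingAgents.
EMBEDDING_PROFILES: Dict[str, Dict[str, Any]] = {
    "production": {"embedding_provider": "openai", "enable_memory": True},
    "development": {"embedding_provider": "ollama", "enable_memory": True},
    "ci": {"embedding_provider": "none", "enable_memory": False},
}


def with_embedding_profile(base_config: Dict[str, Any], profile: str) -> Dict[str, Any]:
    """Return a new config with the chosen profile's embedding settings applied.

    Hypothetical helper for illustration only.
    """
    if profile not in EMBEDDING_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    # Later keys win, so the profile overrides any embedding keys in the base.
    return {**base_config, **EMBEDDING_PROFILES[profile]}
```

Chat settings in the base config are left untouched, so the same chat provider can be paired with a different embedding setup in each environment.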
API Reference
FinancialSituationMemory
from typing import Any, Dict, List, Tuple

# Interface summary: signatures only, method bodies omitted.
class FinancialSituationMemory:
    def __init__(self, name: str, config: Dict[str, Any]) -> None: ...

    def is_enabled(self) -> bool:
        """Check if memory is enabled and functioning."""

    def add_situations(self, situations_and_advice: List[Tuple[str, str]]) -> bool:
        """Add financial situations and recommendations to memory."""

    def get_memories(self, current_situation: str, n_matches: int = 1) -> List[Dict]:
        """Retrieve matching memories for the current situation."""
Example Usage
from tradingagents.agents.utils.memory import FinancialSituationMemory

config = {
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

memory = FinancialSituationMemory("test_memory", config)

if memory.is_enabled():
    # Add memories
    memory.add_situations([
        ("High volatility market", "Reduce position sizes"),
        ("Strong uptrend", "Consider scaling in"),
    ])

    # Query memories
    matches = memory.get_memories("Market showing volatility", n_matches=2)
    for match in matches:
        print(f"Score: {match['similarity_score']:.2f}")
        print(f"Recommendation: {match['recommendation']}")
Support
For issues or questions:
- Check the main README
- Review error logs for specific failure messages
- Open an issue on GitHub with configuration details
Changelog
Version 2.0 (Current)
- ✅ Separated embedding configuration from chat LLM
- ✅ Support for multiple embedding providers
- ✅ Graceful fallback when embeddings unavailable
- ✅ CLI step for embedding provider selection
- ✅ Smart defaults for backward compatibility
Version 1.0 (Legacy)
- Single backend URL for all operations
- Embedding failures caused system crashes
- No provider flexibility