TradingAgents/docs/EMBEDDING_CONFIGURATION.md

Embedding Configuration Guide

Overview

This guide explains the new separated embedding configuration feature in TradingAgents. The system now allows you to use different providers for chat models and embeddings, enabling more flexible deployment scenarios.

Key Features

  1. Separate Embedding Client: Chat models and embedding models use independent configurations
  2. Multiple Embedding Providers: Support for OpenAI embeddings, local Ollama embeddings, or running with memory disabled
  3. Graceful Fallback: System continues to operate even when embeddings are unavailable
  4. Provider Independence: Use OpenRouter/Anthropic for chat while using OpenAI for embeddings

Why This Matters

Previously, the memory system used the same backend URL as the chat model, causing issues when:

  • Using OpenRouter (which doesn't support OpenAI embedding endpoints)
  • Using Anthropic or Google for chat (which don't provide embeddings)
  • Running in environments without embedding access

Now you can:

  • Use OpenRouter/Anthropic/Google for chat models
  • Use OpenAI for embeddings (recommended)
  • Use Ollama for local embeddings
  • Disable memory entirely if needed

Configuration Options

Via CLI (Interactive)

When running the CLI, you'll see a new Step 7 for embedding configuration:

python -m cli.main

You'll be prompted to select:

  1. OpenAI (recommended) - Uses OpenAI's embedding API
  2. Ollama (local) - Uses local Ollama embedding models
  3. Disable Memory - Runs without memory/context retrieval

Via Code (Direct Configuration)

Update your configuration dictionary:

from tradingagents.graph.trading_graph import TradingAgentsGraph

config = {
    # Chat LLM settings (can be any provider)
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "deep_think_llm": "deepseek/deepseek-chat-v3-0324:free",
    "quick_think_llm": "meta-llama/llama-3.3-8b-instruct:free",
    
    # Embedding settings (separate from chat)
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
    
    # Other settings...
}

graph = TradingAgentsGraph(selected_analysts=["market", "news"], config=config)

Configuration Parameters

embedding_provider

  • Type: string
  • Options: "openai", "ollama", "none"
  • Default: "openai"
  • Description: The embedding service provider

embedding_backend_url

  • Type: string
  • Default: "https://api.openai.com/v1" (for OpenAI)
  • Description: API endpoint URL for embeddings

embedding_model

  • Type: string
  • Default: "text-embedding-3-small" (for OpenAI)
  • Description: The embedding model to use

enable_memory

  • Type: boolean
  • Default: True
  • Description: Enable/disable the memory system

Common Scenarios

Scenario 1: OpenRouter for Chat + OpenAI for Embeddings

Best for: Cost-effective chat with reliable embeddings

config = {
    "llm_provider": "openrouter",
    "backend_url": "https://openrouter.ai/api/v1",
    "deep_think_llm": "deepseek/deepseek-chat-v3-0324:free",
    "quick_think_llm": "meta-llama/llama-3.3-8b-instruct:free",
    
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

Required API Keys:

  • OPENROUTER_API_KEY (for chat)
  • OPENAI_API_KEY (for embeddings)

Scenario 2: All Local with Ollama

Best for: Complete offline/local deployment

config = {
    "llm_provider": "ollama",
    "backend_url": "http://localhost:11434/v1",
    "deep_think_llm": "llama3.1",
    "quick_think_llm": "llama3.2",
    
    "embedding_provider": "ollama",
    "embedding_backend_url": "http://localhost:11434/v1",
    "embedding_model": "nomic-embed-text",
    "enable_memory": True,
}

Prerequisites:

  • Ollama installed and running
  • Models pulled: ollama pull llama3.1 && ollama pull llama3.2 && ollama pull nomic-embed-text

Scenario 3: Anthropic for Chat, No Memory

Best for: Using providers without embedding support

config = {
    "llm_provider": "anthropic",
    "backend_url": "https://api.anthropic.com/",
    "deep_think_llm": "claude-sonnet-4-0",
    "quick_think_llm": "claude-3-5-haiku-latest",
    
    "embedding_provider": "none",
    "enable_memory": False,
}

Note: Memory and context retrieval will be disabled.

Scenario 4: OpenAI for Everything (Default)

Best for: Simplicity and full feature support

config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
    "deep_think_llm": "o4-mini",
    "quick_think_llm": "gpt-4o-mini",
    
    # Embeddings will auto-configure to use OpenAI
}

Environment Variables

Set the appropriate API keys based on your configuration:

# For OpenAI (chat or embeddings)
export OPENAI_API_KEY="sk-..."

# For OpenRouter (chat)
export OPENROUTER_API_KEY="sk-or-..."

# For Anthropic (chat)
export ANTHROPIC_API_KEY="sk-ant-..."

# For Google (chat)
export GOOGLE_API_KEY="..."
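As a quick sanity check before a run, you can verify that the keys your configuration needs are actually set. This is an illustrative helper, not part of TradingAgents; the provider-to-key mapping is an assumption based on the variables listed above.

```python
import os

def check_required_keys(config):
    """Return the names of required API keys that are missing from the environment."""
    # Assumed mapping from provider name to environment variable (see list above).
    key_for_provider = {
        "openai": "OPENAI_API_KEY",
        "openrouter": "OPENROUTER_API_KEY",
        "anthropic": "ANTHROPIC_API_KEY",
        "google": "GOOGLE_API_KEY",
    }
    needed = set()
    chat_key = key_for_provider.get(config.get("llm_provider"))
    if chat_key:
        needed.add(chat_key)
    # Embedding keys only matter when memory is on.
    if config.get("enable_memory", True):
        emb_key = key_for_provider.get(config.get("embedding_provider"))
        if emb_key:
            needed.add(emb_key)
    return sorted(k for k in needed if not os.environ.get(k))
```

Calling it with the Scenario 1 config, for example, reports both keys if neither is exported.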

Graceful Degradation

The memory system gracefully handles failures:

  1. Embedding API Unavailable: Returns empty memories, logs warning, continues execution
  2. Invalid Configuration: Disables memory, logs error, continues execution
  3. Network Errors: Skips memory operations, logs error, continues execution

Example log output when embeddings fail:

WARNING: Failed to initialize embedding client: Connection error. Memory will be disabled.
INFO: Memory disabled for bull_memory
INFO: Memory disabled for bear_memory
...

The agents continue to function without memory-based context.
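The fallback behavior described above follows a standard pattern: attempt to build the embedding client at startup, and degrade to a disabled no-op memory on any failure. A simplified sketch of that pattern (not the actual TradingAgents implementation; `client_factory` and `query` are hypothetical names):

```python
class SafeMemory:
    """Wraps an embedding client; degrades to a no-op when unavailable."""

    def __init__(self, name, client_factory):
        self.name = name
        try:
            self.client = client_factory()
            self.enabled = True
        except Exception as exc:  # broad by design: any init failure disables memory
            print(f"WARNING: Failed to initialize embedding client: {exc}. "
                  f"Memory will be disabled.")
            self.client = None
            self.enabled = False

    def is_enabled(self):
        return self.enabled

    def get_memories(self, situation, n_matches=1):
        # Disabled memory returns no matches instead of raising.
        if not self.enabled:
            return []
        return self.client.query(situation, n_matches)
```

Callers never have to distinguish the two states: a disabled memory simply returns empty results.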

Checking Memory Status

You can check if memory is enabled:

# After initializing the graph
print(f"Bull memory enabled: {graph.bull_memory.is_enabled()}")
print(f"Bear memory enabled: {graph.bear_memory.is_enabled()}")

Migration Guide

From Previous Version

If you have existing code using the old configuration:

Old (single backend for everything):

config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
}

New (explicit embedding config):

config = {
    "llm_provider": "openai",
    "backend_url": "https://api.openai.com/v1",
    # Add these for explicit control:
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
}

Note: The old configuration still works! The system auto-configures embeddings based on smart defaults.

Smart Defaults

If you don't specify embedding configuration, the system applies these rules:

  1. embedding_provider: Defaults to "openai"
  2. embedding_backend_url:
    • "openai" → "https://api.openai.com/v1"
    • "ollama" → "http://localhost:11434/v1"
  3. embedding_model:
    • "openai" → "text-embedding-3-small"
    • "ollama" → "nomic-embed-text"
  4. enable_memory: Defaults to True
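The rules above amount to filling in any missing embedding keys from the provider choice. A sketch of that resolution logic (assumed behavior, for illustration; the real defaulting lives inside TradingAgents):

```python
# Per-provider defaults, mirroring the rules listed above.
_DEFAULTS = {
    "openai": {
        "embedding_backend_url": "https://api.openai.com/v1",
        "embedding_model": "text-embedding-3-small",
    },
    "ollama": {
        "embedding_backend_url": "http://localhost:11434/v1",
        "embedding_model": "nomic-embed-text",
    },
}

def resolve_embedding_config(config):
    """Return a copy of config with embedding defaults filled in."""
    resolved = dict(config)
    provider = resolved.setdefault("embedding_provider", "openai")
    resolved.setdefault("enable_memory", True)
    # Only fill keys the user did not set explicitly.
    for key, value in _DEFAULTS.get(provider, {}).items():
        resolved.setdefault(key, value)
    return resolved
```

Explicit settings always win; `setdefault` only supplies values for keys that are absent.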

Troubleshooting

Issue: "Failed to get embedding: 401 Unauthorized"

Cause: Missing or invalid API key for embedding provider

Solution:

export OPENAI_API_KEY="your-actual-key"

Issue: "Memory disabled for all agents"

Cause: Embedding provider set to "none" or initialization failed

Solution: Check your embedding_provider setting and API keys

Issue: OpenRouter returns HTML instead of embeddings

Cause: Trying to use OpenRouter backend for embeddings (not supported)

Solution: Set separate embedding provider:

config["embedding_provider"] = "openai"
config["embedding_backend_url"] = "https://api.openai.com/v1"

Issue: "ChromaDB collection creation failed"

Cause: ChromaDB initialization error

Solution:

  • Ensure ChromaDB is installed: pip install chromadb
  • Check disk space and permissions
  • Set enable_memory: False to bypass

Performance Considerations

Embedding Costs

| Provider | Model                  | Cost per 1M tokens | Speed          |
|----------|------------------------|--------------------|----------------|
| OpenAI   | text-embedding-3-small | ~$0.02             | Fast           |
| OpenAI   | text-embedding-3-large | ~$0.13             | Fast           |
| Ollama   | nomic-embed-text       | Free               | Medium (local) |
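For a rough sense of scale, a back-of-envelope calculation (prices as listed above; verify current rates, and the 200-token average is an assumption):

```python
# Estimated cost of embedding 10,000 situation descriptions
situations = 10_000
tokens_each = 200          # assumed average length per situation
price_per_million = 0.02   # text-embedding-3-small, from the table above
cost = situations * tokens_each / 1_000_000 * price_per_million
print(f"${cost:.2f}")  # → $0.04
```

Even heavy memory use with the small model tends to stay in the cents range.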

Memory Impact

  • With Memory: Agents use historical context, better decisions
  • Without Memory: Faster initialization, no embedding costs, stateless

Best Practices

  1. Production: Use OpenAI embeddings for reliability
  2. Development: Use Ollama for cost-free testing
  3. CI/CD: Disable memory (enable_memory: False) for faster tests
  4. Multi-provider: Use different providers for chat and embeddings to optimize cost/performance
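These practices can be captured in a small helper that selects embedding settings from an environment name. This is a hypothetical convenience, not part of TradingAgents; the `TRADING_ENV` variable name is an assumption, so adapt it to your setup:

```python
import os

def embedding_settings(env=None):
    """Pick embedding settings per the best practices above.

    env is one of "production", "development", "ci"; defaults to the
    TRADING_ENV environment variable (an assumed variable name).
    """
    env = env or os.environ.get("TRADING_ENV", "development")
    if env == "production":
        # Reliable hosted embeddings
        return {
            "embedding_provider": "openai",
            "embedding_backend_url": "https://api.openai.com/v1",
            "embedding_model": "text-embedding-3-small",
            "enable_memory": True,
        }
    if env == "ci":
        # Fastest startup, no external calls
        return {"embedding_provider": "none", "enable_memory": False}
    # development: free local embeddings via Ollama
    return {
        "embedding_provider": "ollama",
        "embedding_backend_url": "http://localhost:11434/v1",
        "embedding_model": "nomic-embed-text",
        "enable_memory": True,
    }

config = {"llm_provider": "openai", **embedding_settings()}
```

Merging the returned dict into your config keeps chat and embedding choices independent, as in the scenarios above.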

API Reference

FinancialSituationMemory

class FinancialSituationMemory:
    def __init__(self, name: str, config: Dict[str, Any])
    
    def is_enabled(self) -> bool:
        """Check if memory is enabled and functioning."""
    
    def add_situations(self, situations_and_advice: List[Tuple[str, str]]) -> bool:
        """Add financial situations and recommendations to memory."""
    
    def get_memories(self, current_situation: str, n_matches: int = 1) -> List[Dict]:
        """Retrieve matching memories for the current situation."""

Example Usage

from tradingagents.agents.utils.memory import FinancialSituationMemory

config = {
    "embedding_provider": "openai",
    "embedding_backend_url": "https://api.openai.com/v1",
    "embedding_model": "text-embedding-3-small",
    "enable_memory": True,
}

memory = FinancialSituationMemory("test_memory", config)

if memory.is_enabled():
    # Add memories
    memory.add_situations([
        ("High volatility market", "Reduce position sizes"),
        ("Strong uptrend", "Consider scaling in"),
    ])
    
    # Query memories
    matches = memory.get_memories("Market showing volatility", n_matches=2)
    for match in matches:
        print(f"Score: {match['similarity_score']:.2f}")
        print(f"Recommendation: {match['recommendation']}")

Support

For issues or questions:

  1. Check the main README
  2. Review error logs for specific failure messages
  3. Open an issue on GitHub with configuration details

Changelog

Version 2.0 (Current)

  • Separated embedding configuration from chat LLM
  • Support for multiple embedding providers
  • Graceful fallback when embeddings unavailable
  • CLI step for embedding provider selection
  • Smart defaults for backward compatibility

Version 1.0 (Legacy)

  • Single backend URL for all operations
  • Embedding failures caused system crashes
  • No provider flexibility