# Ollama Models for TradingAgents
## ✅ Verified Tool-Compatible Models
These models support **tool calling / function calling**, which TradingAgents requires:
### Recommended Models
| Model | Size | Speed | Quality | Command |
|-------|------|-------|---------|---------|
| **llama3.2** ⭐ | 3B | Fast | Good | `ollama pull llama3.2` |
| llama3.2:1b | 1B | Fastest | Moderate | `ollama pull llama3.2:1b` |
| llama3.1 | 8B | Medium | Better | `ollama pull llama3.1` |
| mistral-nemo | 12B | Medium | Better | `ollama pull mistral-nemo` |
| qwen2.5 | 7B | Fast | Good | `ollama pull qwen2.5` |
### ⭐ Best Choice for Most Users
```bash
ollama pull llama3.2
```
**Why llama3.2?**
- ✅ Supports tool calling
- ✅ Fast inference
- ✅ Good quality
- ✅ Reasonable memory usage (~4GB)
## Model Details
### llama3.2 (RECOMMENDED)
- **Variants**: 1B, 3B (default)
- **Best For**: General trading analysis
- **Memory**: ~2-4GB
- **Speed**: 2-3 minutes per analysis
- **Tools**: ✅ Full support
```bash
# Default (3B)
ollama pull llama3.2
# Smallest (1B) - fastest
ollama pull llama3.2:1b
```
### llama3.1
- **Variants**: 8B, 70B, 405B
- **Best For**: Higher quality analysis
- **Memory**: ~8GB+ (for 8B)
- **Speed**: 3-5 minutes per analysis
- **Tools**: ✅ Full support
```bash
# Most common (8B)
ollama pull llama3.1
# High quality (70B) - requires powerful GPU
ollama pull llama3.1:70b
```
### mistral-nemo
- **Size**: 12B
- **Best For**: Balanced quality/speed
- **Memory**: ~12GB
- **Speed**: 3-4 minutes per analysis
- **Tools**: ✅ Full support
```bash
ollama pull mistral-nemo
```
### qwen2.5
- **Variants**: 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B
- **Best For**: Multilingual analysis
- **Memory**: Varies (7B ~7GB)
- **Speed**: Fast
- **Tools**: ✅ Full support
```bash
# Default (7B)
ollama pull qwen2.5
# Smaller variants
ollama pull qwen2.5:3b
ollama pull qwen2.5:1.5b
```
## ❌ Models That DON'T Support Tools
These models will **NOT work** with TradingAgents:
- ❌ `llama3` (original)
- ❌ `llama2`
- ❌ `mistral` (v0.1-0.2)
- ❌ `codellama` (designed for code, not tool calling)
- ❌ Most older models
## Quick Start
### 1. Install Ollama
Download from: https://ollama.ai
### 2. Pull a Model
```bash
# RECOMMENDED
ollama pull llama3.2
# OR for better quality (slower)
ollama pull llama3.1
# OR for Mistral
ollama pull mistral-nemo
```
### 3. Verify Model Works
```bash
ollama list
```
You should see your model listed.
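If you prefer to verify programmatically, a running Ollama server exposes its installed models at `GET /api/tags` (the server listens on `localhost:11434` by default). A minimal sketch using only the standard library:

```python
import json
import urllib.request

def parse_model_names(tags_json: str) -> list:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_json)
    return [m["name"] for m in data.get("models", [])]

def list_installed_models(base_url: str = "http://localhost:11434") -> list:
    """Query a running Ollama server for its installed models."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return parse_model_names(resp.read().decode())

# With the server running and llama3.2 pulled:
#   list_installed_models()  ->  e.g. ['llama3.2:latest', ...]
```

This is the same information `ollama list` prints, just in machine-readable form.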
### 4. Use in TradingAgents
When running the CLI, select:
- **Provider**: Ollama
- **Quick-Thinking LLM**: llama3.2 (or your choice)
- **Deep-Thinking LLM**: llama3.2 (or your choice)
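If you script the library instead of using the CLI, the same choices boil down to pointing the LLM backend at Ollama's OpenAI-compatible endpoint (`http://localhost:11434/v1` by default) and naming the models. The key names below are assumptions for illustration only, not a documented TradingAgents API; check your installed version's default config for the real names.

```python
# Sketch only: these key names are assumptions, not a documented TradingAgents API.
ollama_config = {
    "llm_provider": "ollama",                    # assumed key
    "backend_url": "http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    "quick_think_llm": "llama3.2",               # fast model for routine steps
    "deep_think_llm": "llama3.2",                # swap in llama3.1 for higher quality
}
```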
## Performance Comparison
### Speed Test (Single AAPL Analysis)
| Model | Time | Memory | Quality |
|-------|------|--------|---------|
| llama3.2:1b | ~1-2 min | 2GB | ⭐⭐⭐ |
| llama3.2 (3B) | ~2-3 min | 4GB | ⭐⭐⭐⭐ |
| llama3.1 (8B) | ~3-5 min | 8GB | ⭐⭐⭐⭐⭐ |
| mistral-nemo | ~3-4 min | 12GB | ⭐⭐⭐⭐⭐ |
| qwen2.5 | ~2-3 min | 7GB | ⭐⭐⭐⭐ |
*Times approximate on modern consumer hardware (RTX 3060+)*
## Advanced Options
### Different Model Sizes
Many models ship in multiple sizes. List the variants you have installed locally:
```bash
ollama list | grep llama3.2
```
Pull specific variants:
```bash
# Smallest llama3.2
ollama pull llama3.2:1b
# Default llama3.2
ollama pull llama3.2
# Latest llama3.2
ollama pull llama3.2:latest
```
### Check Model Info
```bash
ollama show llama3.2
```
### Remove Models
```bash
ollama rm llama3
ollama rm mistral
```
## Troubleshooting
### Error: "does not support tools"
**Problem**: You're using a model that doesn't support function calling.
**Solution**: Switch to a supported model:
```bash
ollama pull llama3.2
```
### Slow Performance
**Solution 1**: Use a smaller model
```bash
ollama pull llama3.2:1b
```
**Solution 2**: Check GPU usage
```bash
# List loaded models; the PROCESSOR column shows GPU vs. CPU
ollama ps
```
### Out of Memory
**Solution**: Use a smaller model or reduce the context length
```bash
# Smallest option
ollama pull llama3.2:1b
```
## Recommendations by Use Case
### Development & Testing
**Fastest**: `llama3.2:1b`
```bash
ollama pull llama3.2:1b
```
### Production (Free/Local)
**Balanced**: `llama3.2` (3B default)
```bash
ollama pull llama3.2
```
### High Quality (Local)
**Best**: `llama3.1` (8B)
```bash
ollama pull llama3.1
```
### Budget GPU
**Efficient**: `qwen2.5:3b`
```bash
ollama pull qwen2.5:3b
```
## Future Models
New models are constantly being released. Check for tool support:
1. Visit: https://ollama.ai/library
2. Look for "Tools" or "Function Calling" in the model description
3. Test with: `python quick_test_ollama.py`
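As a rough idea of what such a check does (this is an illustrative sketch, not the project's `quick_test_ollama.py`): send a chat request with a dummy tool to Ollama's `/api/chat` endpoint. Models without tool support reject the request with the "does not support tools" error shown in Troubleshooting above.

```python
import json
import urllib.error
import urllib.request

def build_tool_probe(model: str) -> dict:
    """Chat payload carrying one dummy tool, used to probe tool-calling support."""
    return {
        "model": model,
        "stream": False,
        "messages": [{"role": "user", "content": "What time is it?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_time",  # hypothetical tool, only used as a probe
                "description": "Return the current time",
                "parameters": {"type": "object", "properties": {}},
            },
        }],
    }

def supports_tools(model: str, base_url: str = "http://localhost:11434") -> bool:
    """True if the model accepts a tools array, False on a 'does not support tools' error."""
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(build_tool_probe(model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req)
        return True
    except urllib.error.HTTPError as err:
        return "does not support tools" not in err.read().decode()
```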
## Summary
**Best for most users**: `llama3.2`
**Best quality (local)**: `llama3.1`
**Fastest**: `llama3.2:1b`
**Balanced**: `mistral-nemo` or `qwen2.5`
**Command to get started:**
```bash
ollama pull llama3.2
```
Then run:
```bash
python -m cli.main
```
And select **Ollama** as your provider! 🚀