532 lines
13 KiB
Markdown
532 lines
13 KiB
Markdown
# TradingAgents Security Audit Report
|
|
|
|
**Date:** 2025-11-14
|
|
**Auditor:** Claude (AI Security Analysis)
|
|
**Project:** TradingAgents - Multi-Agents LLM Financial Trading Framework
|
|
**Version:** Current main branch
|
|
|
|
---
|
|
|
|
## Executive Summary
|
|
|
|
This security audit identifies critical and moderate security vulnerabilities in the TradingAgents codebase, along with recommendations for remediation. The project handles sensitive financial data and API keys, making security a top priority.
|
|
|
|
### Risk Summary
|
|
- **Critical Issues:** 3
|
|
- **High Issues:** 5
|
|
- **Medium Issues:** 7
|
|
- **Low Issues:** 4
|
|
|
|
---
|
|
|
|
## Critical Security Issues
|
|
|
|
### 1. Path Traversal Vulnerability (CRITICAL)
|
|
**File:** `cli/main.py:757`
|
|
**Risk Level:** Critical
|
|
**CVSS Score:** 8.6
|
|
|
|
**Issue:**
|
|
```python
|
|
results_dir = Path(config["results_dir"]) / selections["ticker"] / selections["analysis_date"]
|
|
```
|
|
|
|
User-controlled input (`ticker` and `analysis_date`) is used directly in file path construction without sanitization.
|
|
|
|
**Attack Scenario:**
|
|
```python
|
|
ticker = "../../../etc/passwd"
|
|
analysis_date = "../../secrets"
|
|
# Results in: ./results/../../../etc/passwd/../../secrets
|
|
```
|
|
|
|
**Remediation:**
|
|
```python
|
|
import re
|
|
from pathlib import Path
|
|
|
|
def sanitize_path_component(value):
|
|
"""Sanitize user input for safe file path usage."""
|
|
# Remove any path traversal attempts
|
|
value = value.replace('..', '').replace('/', '').replace('\\', '')
|
|
# Allow only alphanumeric, dash, underscore
|
|
value = re.sub(r'[^a-zA-Z0-9_-]', '_', value)
|
|
return value
|
|
|
|
# Usage
|
|
results_dir = Path(config["results_dir"]) / sanitize_path_component(selections["ticker"]) / sanitize_path_component(selections["analysis_date"])
|
|
```
|
|
|
|
### 2. Hardcoded Developer Path Exposure (CRITICAL)
|
|
**File:** `tradingagents/default_config.py:6`
|
|
**Risk Level:** Critical
|
|
**CVSS Score:** 7.5
|
|
|
|
**Issue:**
|
|
```python
|
|
"data_dir": "/Users/yluo/Documents/Code/ScAI/FR1-data",
|
|
```
|
|
|
|
Exposes developer's local file system structure and potentially identifies system users.
|
|
|
|
**Remediation:**
|
|
```python
|
|
"data_dir": os.getenv("TRADINGAGENTS_DATA_DIR", "./data"),
|
|
```
|
|
|
|
### 3. No Input Validation on External API Calls (CRITICAL)
|
|
**File:** `tradingagents/dataflows/googlenews_utils.py:60-64`
|
|
**Risk Level:** Critical
|
|
**CVSS Score:** 8.1
|
|
|
|
**Issue:**
|
|
```python
|
|
url = (
|
|
f"https://www.google.com/search?q={query}"
|
|
f"&tbs=cdr:1,cd_min:{start_date},cd_max:{end_date}"
|
|
f"&tbm=nws&start={offset}"
|
|
)
|
|
```
|
|
|
|
User input is directly interpolated into URLs without encoding or validation.
|
|
|
|
**Remediation:**
|
|
```python
|
|
from urllib.parse import quote_plus
|
|
|
|
url = (
|
|
f"https://www.google.com/search?q={quote_plus(query)}"
|
|
f"&tbs=cdr:1,cd_min:{quote_plus(start_date)},cd_max:{quote_plus(end_date)}"
|
|
f"&tbm=nws&start={int(offset)}"
|
|
)
|
|
```
|
|
|
|
---
|
|
|
|
## High Security Issues
|
|
|
|
### 4. Missing API Key Validation (HIGH)
|
|
**File:** `tradingagents/dataflows/openai.py:7`
|
|
**Risk Level:** High
|
|
|
|
**Issue:**
|
|
The OpenAI client is initialized without checking if API key is set, leading to unclear error messages.
|
|
|
|
**Remediation:**
|
|
```python
|
|
import os
|
|
|
|
def get_openai_client():
|
|
api_key = os.getenv("OPENAI_API_KEY")
|
|
if not api_key:
|
|
raise ValueError(
|
|
"OPENAI_API_KEY environment variable is not set. "
|
|
"Please set it in your .env file or environment."
|
|
)
|
|
config = get_config()
|
|
return OpenAI(base_url=config["backend_url"], api_key=api_key)
|
|
```
|
|
|
|
### 5. No Rate Limiting Protection (HIGH)
|
|
**File:** Multiple data vendor files
|
|
**Risk Level:** High
|
|
|
|
**Issue:**
|
|
No centralized rate limiting for API calls could lead to:
|
|
- API quota exhaustion
|
|
- Service denial
|
|
- Unexpected costs
|
|
|
|
**Remediation:**
|
|
Implement a rate limiter using `ratelimit` library or custom implementation:
|
|
|
|
```python
|
|
from functools import wraps
|
|
import time
|
|
from collections import deque
|
|
|
|
class RateLimiter:
|
|
def __init__(self, max_calls, period):
|
|
self.max_calls = max_calls
|
|
self.period = period
|
|
self.calls = deque()
|
|
|
|
def __call__(self, func):
|
|
@wraps(func)
|
|
def wrapper(*args, **kwargs):
|
|
now = time.time()
|
|
# Remove old calls
|
|
while self.calls and self.calls[0] < now - self.period:
|
|
self.calls.popleft()
|
|
|
|
if len(self.calls) >= self.max_calls:
|
|
sleep_time = self.period - (now - self.calls[0])
|
|
if sleep_time > 0:
|
|
time.sleep(sleep_time)
|
|
self.calls.popleft()
|
|
|
|
self.calls.append(time.time())
|
|
return func(*args, **kwargs)
|
|
return wrapper
|
|
|
|
# Usage
|
|
@RateLimiter(max_calls=60, period=60) # 60 calls per minute
|
|
def make_api_call():
|
|
pass
|
|
```
|
|
|
|
### 6. Insufficient Error Handling in API Calls (HIGH)
|
|
**File:** `tradingagents/dataflows/alpha_vantage_common.py:66`
|
|
**Risk Level:** High
|
|
|
|
**Issue:**
|
|
```python
|
|
response = requests.get(API_BASE_URL, params=api_params)
|
|
response.raise_for_status()
|
|
```
|
|
|
|
No timeout specified, could lead to hung connections.
|
|
|
|
**Remediation:**
|
|
```python
|
|
response = requests.get(
|
|
API_BASE_URL,
|
|
params=api_params,
|
|
timeout=30, # 30 second timeout
|
|
verify=True # Ensure SSL verification
|
|
)
|
|
```
|
|
|
|
### 7. Debug Mode Enabled in Production Examples (HIGH)
|
|
**File:** Multiple files
|
|
**Risk Level:** High
|
|
|
|
**Issue:**
|
|
Documentation examples show `debug=True`:
|
|
```python
|
|
ta = TradingAgentsGraph(debug=True, config=config)
|
|
```
|
|
|
|
**Remediation:**
|
|
Update all examples to:
|
|
```python
|
|
# For production
|
|
ta = TradingAgentsGraph(debug=False, config=config)
|
|
|
|
# For development only
|
|
# ta = TradingAgentsGraph(debug=True, config=config)
|
|
```
|
|
|
|
### 8. No Test Coverage (HIGH)
|
|
**File:** Project-wide
|
|
**Risk Level:** High
|
|
|
|
**Issue:**
|
|
No unit tests or integration tests found. This makes it difficult to:
|
|
- Verify security fixes
|
|
- Prevent regressions
|
|
- Ensure code quality
|
|
|
|
**Remediation:**
|
|
Create comprehensive test suite. Example structure:
|
|
|
|
```
|
|
tests/
|
|
├── __init__.py
|
|
├── unit/
|
|
│ ├── test_security.py
|
|
│ ├── test_config.py
|
|
│ ├── test_dataflows.py
|
|
│ └── test_agents.py
|
|
├── integration/
|
|
│ ├── test_trading_graph.py
|
|
│ └── test_api_vendors.py
|
|
└── security/
|
|
├── test_input_validation.py
|
|
├── test_path_traversal.py
|
|
└── test_api_key_handling.py
|
|
```
|
|
|
|
---
|
|
|
|
## Medium Security Issues
|
|
|
|
### 9. Exposed Global State (MEDIUM)
|
|
**File:** `tradingagents/dataflows/alpha_vantage_common.py:57`
|
|
**Risk Level:** Medium
|
|
|
|
**Issue:**
|
|
```python
|
|
current_entitlement = globals().get('_current_entitlement')
|
|
```
|
|
|
|
Using globals for configuration is error-prone and not thread-safe.
|
|
|
|
**Remediation:**
|
|
Use configuration objects or dependency injection instead.
|
|
|
|
### 10. Web Scraping User-Agent Spoofing (MEDIUM)
|
|
**File:** `tradingagents/dataflows/googlenews_utils.py:48-54`
|
|
**Risk Level:** Medium
|
|
|
|
**Issue:**
|
|
User-Agent spoofing may violate Google's Terms of Service.
|
|
|
|
**Remediation:**
|
|
- Use official Google News API if available
|
|
- Clearly document scraping behavior
|
|
- Implement respectful rate limiting
|
|
- Consider alternative news sources with official APIs
|
|
|
|
### 11. No Secret Scanning in CI/CD (MEDIUM)
|
|
**Risk Level:** Medium
|
|
|
|
**Issue:**
|
|
No automated secret scanning in version control.
|
|
|
|
**Remediation:**
|
|
Add `.gitguard.yml` or use GitHub secret scanning:
|
|
|
|
```yaml
|
|
# .github/workflows/security.yml
|
|
name: Security Scan
|
|
on: [push, pull_request]
|
|
jobs:
|
|
secret-scan:
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- uses: actions/checkout@v2
|
|
- name: TruffleHog Secret Scan
|
|
uses: trufflesecurity/trufflehog@main
|
|
with:
|
|
path: ./
|
|
base: ${{ github.event.repository.default_branch }}
|
|
```
|
|
|
|
### 12. Insufficient Logging of Security Events (MEDIUM)
|
|
**Risk Level:** Medium
|
|
|
|
**Issue:**
|
|
No logging of:
|
|
- Failed authentication attempts
|
|
- API rate limit hits
|
|
- Unusual activity patterns
|
|
|
|
**Remediation:**
|
|
Implement security event logging:
|
|
|
|
```python
|
|
import logging
|
|
|
|
security_logger = logging.getLogger('security')
|
|
security_logger.setLevel(logging.INFO)
|
|
|
|
# Log security events
|
|
security_logger.warning(f"API rate limit exceeded for vendor: {vendor}")
|
|
security_logger.info(f"API key rotation detected")
|
|
```
|
|
|
|
### 13. No Dependency Pinning (MEDIUM)
|
|
**File:** `requirements.txt`
|
|
**Risk Level:** Medium
|
|
|
|
**Issue:**
|
|
Dependencies specified without version pins:
|
|
```
|
|
typing-extensions
|
|
langchain-openai
|
|
pandas
|
|
```
|
|
|
|
**Remediation:**
|
|
Pin all dependencies with specific versions:
|
|
```
|
|
typing-extensions==4.15.0
|
|
langchain-openai==1.0.2
|
|
pandas==2.3.3
|
|
```
|
|
|
|
Or use `requirements-lock.txt`:
|
|
```bash
|
|
pip freeze > requirements-lock.txt
|
|
```
|
|
|
|
### 14. Missing Security Headers (MEDIUM)
|
|
**Risk Level:** Medium (if web interface is added)
|
|
|
|
**Remediation:**
|
|
If adding web interface, implement security headers:
|
|
```python
|
|
security_headers = {
|
|
'X-Content-Type-Options': 'nosniff',
|
|
'X-Frame-Options': 'DENY',
|
|
'X-XSS-Protection': '1; mode=block',
|
|
'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
|
|
'Content-Security-Policy': "default-src 'self'"
|
|
}
|
|
```
|
|
|
|
### 15. No Data Validation Schema (MEDIUM)
|
|
**Risk Level:** Medium
|
|
|
|
**Issue:**
|
|
No validation of API responses before processing.
|
|
|
|
**Remediation:**
|
|
Use Pydantic for data validation:
|
|
|
|
```python
|
|
from pydantic import BaseModel, validator
|
|
|
|
class StockData(BaseModel):
|
|
ticker: str
|
|
date: str
|
|
price: float
|
|
|
|
@validator('ticker')
|
|
def validate_ticker(cls, v):
|
|
if not v.isalnum() or len(v) > 10:
|
|
raise ValueError('Invalid ticker symbol')
|
|
return v.upper()
|
|
```
|
|
|
|
---
|
|
|
|
## Low Security Issues
|
|
|
|
### 16. Verbose Error Messages (LOW)
|
|
**Risk Level:** Low
|
|
|
|
**Issue:**
|
|
Error messages may leak sensitive information in production.
|
|
|
|
**Remediation:**
|
|
Implement error handling that logs detailed errors but shows generic messages to users.
|
|
|
|
### 17. No Security.md File (LOW)
|
|
**Risk Level:** Low
|
|
|
|
**Remediation:**
|
|
Create SECURITY.md for responsible disclosure.
|
|
|
|
### 18. No Code Signing (LOW)
|
|
**Risk Level:** Low
|
|
|
|
**Remediation:**
|
|
Consider signing releases with GPG.
|
|
|
|
### 19. No Dependency Vulnerability Scanning (LOW)
|
|
**Risk Level:** Low
|
|
|
|
**Remediation:**
|
|
Add `pip-audit` or `safety` to CI/CD:
|
|
```bash
|
|
pip install pip-audit
|
|
pip-audit
|
|
```
|
|
|
|
---
|
|
|
|
## Compliance Considerations
|
|
|
|
### Financial Data Handling
|
|
1. **GDPR**: If handling EU user data
|
|
2. **SOC 2**: For service providers
|
|
3. **PCI DSS**: If handling payment data (future)
|
|
|
|
### Recommendations:
|
|
- Document data retention policies
|
|
- Implement data encryption at rest
|
|
- Add audit logging
|
|
- Create data access controls
|
|
|
|
---
|
|
|
|
## Security Best Practices for Contributors
|
|
|
|
### 1. Environment Setup
|
|
- Never commit `.env` files
|
|
- Use `.env.example` as template
|
|
- Rotate API keys regularly
|
|
|
|
### 2. Code Review Checklist
|
|
- [ ] No hardcoded secrets
|
|
- [ ] Input validation on all user inputs
|
|
- [ ] Error handling doesn't leak sensitive info
|
|
- [ ] Dependencies are pinned and scanned
|
|
- [ ] Tests added for security-critical code
|
|
|
|
### 3. Secure Development Guidelines
|
|
- Always validate and sanitize user input
|
|
- Use parameterized queries (if SQL is added)
|
|
- Implement least privilege principle
|
|
- Log security events
|
|
- Keep dependencies updated
|
|
|
|
---
|
|
|
|
## Automated Security Tools Recommendations
|
|
|
|
1. **Static Analysis:**
|
|
- `bandit` - Python security linter
|
|
- `semgrep` - Lightweight static analysis
|
|
|
|
2. **Dependency Scanning:**
|
|
- `pip-audit` - Check for known vulnerabilities
|
|
- `safety` - Check Python dependencies
|
|
|
|
3. **Secret Scanning:**
|
|
- `trufflehog` - Find secrets in git history
|
|
- `gitleaks` - Detect hardcoded secrets
|
|
|
|
4. **Dynamic Analysis:**
|
|
- `pytest-security` - Security testing framework
|
|
|
|
---
|
|
|
|
## Immediate Action Items
|
|
|
|
### Priority 1 (Critical - Fix Immediately)
|
|
1. Fix path traversal vulnerability in cli/main.py
|
|
2. Remove hardcoded path from default_config.py
|
|
3. Add input validation to URL construction
|
|
|
|
### Priority 2 (High - Fix This Week)
|
|
1. Add API key validation
|
|
2. Implement rate limiting
|
|
3. Add timeouts to all network requests
|
|
4. Pin all dependencies
|
|
5. Add basic test coverage
|
|
|
|
### Priority 3 (Medium - Fix This Month)
|
|
1. Implement comprehensive logging
|
|
2. Add secret scanning to CI/CD
|
|
3. Create SECURITY.md
|
|
4. Add data validation schemas
|
|
|
|
### Priority 4 (Low - Fix When Possible)
|
|
1. Improve error messages
|
|
2. Add code signing
|
|
3. Document security practices
|
|
|
|
---
|
|
|
|
## Conclusion
|
|
|
|
The TradingAgents framework has several security issues that should be addressed before production use. The critical issues around path traversal and input validation pose immediate risks and should be fixed as the highest priority.
|
|
|
|
The project would benefit from:
|
|
1. Comprehensive test coverage
|
|
2. Automated security scanning
|
|
3. Clear security documentation
|
|
4. Regular security audits
|
|
|
|
---
|
|
|
|
## References
|
|
|
|
- OWASP Top 10: https://owasp.org/www-project-top-ten/
|
|
- Python Security Best Practices: https://python.readthedocs.io/en/stable/library/security_warnings.html
|
|
- CWE-22 Path Traversal: https://cwe.mitre.org/data/definitions/22.html
|
|
- CWE-89 SQL Injection: https://cwe.mitre.org/data/definitions/89.html
|