TradingAgents/SECURITY_AUDIT.md

13 KiB

TradingAgents Security Audit Report

Date: 2025-11-14 Auditor: Claude (AI Security Analysis) Project: TradingAgents - Multi-Agents LLM Financial Trading Framework Version: Current main branch


Executive Summary

This security audit identifies critical and moderate security vulnerabilities in the TradingAgents codebase, along with recommendations for remediation. The project handles sensitive financial data and API keys, making security a top priority.

Risk Summary

  • Critical Issues: 3
  • High Issues: 5
  • Medium Issues: 7
  • Low Issues: 4

Critical Security Issues

1. Path Traversal Vulnerability (CRITICAL)

File: cli/main.py:757 Risk Level: Critical CVSS Score: 8.6

Issue:

results_dir = Path(config["results_dir"]) / selections["ticker"] / selections["analysis_date"]

User-controlled input (ticker and analysis_date) is used directly in file path construction without sanitization.

Attack Scenario:

ticker = "../../../etc/passwd"
analysis_date = "../../secrets"
# Results in: ./results/../../../etc/passwd/../../secrets

Remediation:

import re
from pathlib import Path

def sanitize_path_component(value):
    """Sanitize user input for safe file path usage."""
    # Remove any path traversal attempts
    value = value.replace('..', '').replace('/', '').replace('\\', '')
    # Allow only alphanumeric, dash, underscore
    value = re.sub(r'[^a-zA-Z0-9_-]', '_', value)
    return value

# Usage
results_dir = Path(config["results_dir"]) / sanitize_path_component(selections["ticker"]) / sanitize_path_component(selections["analysis_date"])

2. Hardcoded Developer Path Exposure (CRITICAL)

File: tradingagents/default_config.py:6 Risk Level: Critical CVSS Score: 7.5

Issue:

"data_dir": "/Users/yluo/Documents/Code/ScAI/FR1-data",

Exposes developer's local file system structure and potentially identifies system users.

Remediation:

"data_dir": os.getenv("TRADINGAGENTS_DATA_DIR", "./data"),

3. No Input Validation on External API Calls (CRITICAL)

File: tradingagents/dataflows/googlenews_utils.py:60-64 Risk Level: Critical CVSS Score: 8.1

Issue:

url = (
    f"https://www.google.com/search?q={query}"
    f"&tbs=cdr:1,cd_min:{start_date},cd_max:{end_date}"
    f"&tbm=nws&start={offset}"
)

User input is directly interpolated into URLs without encoding or validation.

Remediation:

from urllib.parse import quote_plus

url = (
    f"https://www.google.com/search?q={quote_plus(query)}"
    f"&tbs=cdr:1,cd_min:{quote_plus(start_date)},cd_max:{quote_plus(end_date)}"
    f"&tbm=nws&start={int(offset)}"
)

High Security Issues

4. Missing API Key Validation (HIGH)

File: tradingagents/dataflows/openai.py:7 Risk Level: High

Issue: The OpenAI client is initialized without checking if API key is set, leading to unclear error messages.

Remediation:

import os

def get_openai_client():
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        raise ValueError(
            "OPENAI_API_KEY environment variable is not set. "
            "Please set it in your .env file or environment."
        )
    config = get_config()
    return OpenAI(base_url=config["backend_url"], api_key=api_key)

5. No Rate Limiting Protection (HIGH)

File: Multiple data vendor files Risk Level: High

Issue: No centralized rate limiting for API calls could lead to:

  • API quota exhaustion
  • Service denial
  • Unexpected costs

Remediation: Implement a rate limiter using ratelimit library or custom implementation:

from functools import wraps
import time
from collections import deque

class RateLimiter:
    def __init__(self, max_calls, period):
        self.max_calls = max_calls
        self.period = period
        self.calls = deque()

    def __call__(self, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            now = time.time()
            # Remove old calls
            while self.calls and self.calls[0] < now - self.period:
                self.calls.popleft()

            if len(self.calls) >= self.max_calls:
                sleep_time = self.period - (now - self.calls[0])
                if sleep_time > 0:
                    time.sleep(sleep_time)
                self.calls.popleft()

            self.calls.append(time.time())
            return func(*args, **kwargs)
        return wrapper

# Usage
@RateLimiter(max_calls=60, period=60)  # 60 calls per minute
def make_api_call():
    pass

6. Insufficient Error Handling in API Calls (HIGH)

File: tradingagents/dataflows/alpha_vantage_common.py:66 Risk Level: High

Issue:

response = requests.get(API_BASE_URL, params=api_params)
response.raise_for_status()

No timeout specified, could lead to hung connections.

Remediation:

response = requests.get(
    API_BASE_URL,
    params=api_params,
    timeout=30,  # 30 second timeout
    verify=True  # Ensure SSL verification
)

7. Debug Mode Enabled in Production Examples (HIGH)

File: Multiple files Risk Level: High

Issue: Documentation examples show debug=True:

ta = TradingAgentsGraph(debug=True, config=config)

Remediation: Update all examples to:

# For production
ta = TradingAgentsGraph(debug=False, config=config)

# For development only
# ta = TradingAgentsGraph(debug=True, config=config)

8. No Test Coverage (HIGH)

File: Project-wide Risk Level: High

Issue: No unit tests or integration tests found. This makes it difficult to:

  • Verify security fixes
  • Prevent regressions
  • Ensure code quality

Remediation: Create comprehensive test suite. Example structure:

tests/
├── __init__.py
├── unit/
│   ├── test_security.py
│   ├── test_config.py
│   ├── test_dataflows.py
│   └── test_agents.py
├── integration/
│   ├── test_trading_graph.py
│   └── test_api_vendors.py
└── security/
    ├── test_input_validation.py
    ├── test_path_traversal.py
    └── test_api_key_handling.py

Medium Security Issues

9. Exposed Global State (MEDIUM)

File: tradingagents/dataflows/alpha_vantage_common.py:57 Risk Level: Medium

Issue:

current_entitlement = globals().get('_current_entitlement')

Using globals for configuration is error-prone and not thread-safe.

Remediation: Use configuration objects or dependency injection instead.

10. Web Scraping User-Agent Spoofing (MEDIUM)

File: tradingagents/dataflows/googlenews_utils.py:48-54 Risk Level: Medium

Issue: User-Agent spoofing may violate Google's Terms of Service.

Remediation:

  • Use official Google News API if available
  • Clearly document scraping behavior
  • Implement respectful rate limiting
  • Consider alternative news sources with official APIs

11. No Secret Scanning in CI/CD (MEDIUM)

Risk Level: Medium

Issue: No automated secret scanning in version control.

Remediation: Add .gitguard.yml or use GitHub secret scanning:

# .github/workflows/security.yml
name: Security Scan
on: [push, pull_request]
jobs:
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: TruffleHog Secret Scan
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}

12. Insufficient Logging of Security Events (MEDIUM)

Risk Level: Medium

Issue: No logging of:

  • Failed authentication attempts
  • API rate limit hits
  • Unusual activity patterns

Remediation: Implement security event logging:

import logging

security_logger = logging.getLogger('security')
security_logger.setLevel(logging.INFO)

# Log security events
security_logger.warning(f"API rate limit exceeded for vendor: {vendor}")
security_logger.info(f"API key rotation detected")

13. No Dependency Pinning (MEDIUM)

File: requirements.txt Risk Level: Medium

Issue: Dependencies specified without version pins:

typing-extensions
langchain-openai
pandas

Remediation: Pin all dependencies with specific versions:

typing-extensions==4.15.0
langchain-openai==1.0.2
pandas==2.3.3

Or use requirements-lock.txt:

pip freeze > requirements-lock.txt

14. Missing Security Headers (MEDIUM)

Risk Level: Medium (if web interface is added)

Remediation: If adding web interface, implement security headers:

security_headers = {
    'X-Content-Type-Options': 'nosniff',
    'X-Frame-Options': 'DENY',
    'X-XSS-Protection': '1; mode=block',
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
    'Content-Security-Policy': "default-src 'self'"
}

15. No Data Validation Schema (MEDIUM)

Risk Level: Medium

Issue: No validation of API responses before processing.

Remediation: Use Pydantic for data validation:

from pydantic import BaseModel, validator

class StockData(BaseModel):
    ticker: str
    date: str
    price: float

    @validator('ticker')
    def validate_ticker(cls, v):
        if not v.isalnum() or len(v) > 10:
            raise ValueError('Invalid ticker symbol')
        return v.upper()

Low Security Issues

16. Verbose Error Messages (LOW)

Risk Level: Low

Issue: Error messages may leak sensitive information in production.

Remediation: Implement error handling that logs detailed errors but shows generic messages to users.

17. No Security.md File (LOW)

Risk Level: Low

Remediation: Create SECURITY.md for responsible disclosure.

18. No Code Signing (LOW)

Risk Level: Low

Remediation: Consider signing releases with GPG.

19. No Dependency Vulnerability Scanning (LOW)

Risk Level: Low

Remediation: Add pip-audit or safety to CI/CD:

pip install pip-audit
pip-audit

Compliance Considerations

Financial Data Handling

  1. GDPR: If handling EU user data
  2. SOC 2: For service providers
  3. PCI DSS: If handling payment data (future)

Recommendations:

  • Document data retention policies
  • Implement data encryption at rest
  • Add audit logging
  • Create data access controls

Security Best Practices for Contributors

1. Environment Setup

  • Never commit .env files
  • Use .env.example as template
  • Rotate API keys regularly

2. Code Review Checklist

  • No hardcoded secrets
  • Input validation on all user inputs
  • Error handling doesn't leak sensitive info
  • Dependencies are pinned and scanned
  • Tests added for security-critical code

3. Secure Development Guidelines

  • Always validate and sanitize user input
  • Use parameterized queries (if SQL is added)
  • Implement least privilege principle
  • Log security events
  • Keep dependencies updated

Automated Security Tools Recommendations

  1. Static Analysis:

    • bandit - Python security linter
    • semgrep - Lightweight static analysis
  2. Dependency Scanning:

    • pip-audit - Check for known vulnerabilities
    • safety - Check Python dependencies
  3. Secret Scanning:

    • trufflehog - Find secrets in git history
    • gitleaks - Detect hardcoded secrets
  4. Dynamic Analysis:

    • pytest-security - Security testing framework

Immediate Action Items

Priority 1 (Critical - Fix Immediately)

  1. Fix path traversal vulnerability in cli/main.py
  2. Remove hardcoded path from default_config.py
  3. Add input validation to URL construction

Priority 2 (High - Fix This Week)

  1. Add API key validation
  2. Implement rate limiting
  3. Add timeouts to all network requests
  4. Pin all dependencies
  5. Add basic test coverage

Priority 3 (Medium - Fix This Month)

  1. Implement comprehensive logging
  2. Add secret scanning to CI/CD
  3. Create SECURITY.md
  4. Add data validation schemas

Priority 4 (Low - Fix When Possible)

  1. Improve error messages
  2. Add code signing
  3. Document security practices

Conclusion

The TradingAgents framework has several security issues that should be addressed before production use. The critical issues around path traversal and input validation pose immediate risks and should be fixed as the highest priority.

The project would benefit from:

  1. Comprehensive test coverage
  2. Automated security scanning
  3. Clear security documentation
  4. Regular security audits

References