TradingAgents/CONTRIBUTING_SECURITY.md

# Security Best Practices for Contributors

Thank you for contributing to TradingAgents! Security is a top priority for this project. This guide outlines security best practices all contributors should follow.

## Table of Contents

1. [General Security Principles](#general-security-principles)
2. [Code Security Guidelines](#code-security-guidelines)
3. [Secure Coding Checklist](#secure-coding-checklist)
4. [Common Vulnerabilities to Avoid](#common-vulnerabilities-to-avoid)
5. [Security Testing](#security-testing)
6. [Code Review Security Focus](#code-review-security-focus)
7. [Tools and Resources](#tools-and-resources)

---

## General Security Principles

### Defense in Depth
Always implement multiple layers of security:
- Input validation
- Output encoding
- Least privilege
- Fail securely

### Secure by Default
- Default configurations should be secure
- Require explicit opt-in for insecure features
- Never expose sensitive data by default

### Zero Trust
- Validate all inputs, even from trusted sources
- Never assume data is safe
- Always sanitize before use

---

## Code Security Guidelines

### 1. Input Validation

**ALWAYS validate all user inputs:**

```python
# ✗ BAD - No validation
def get_stock_data(ticker: str):
    return fetch_data(ticker)

# ✓ GOOD - Proper validation
from tradingagents.security import validate_ticker

def get_stock_data(ticker: str):
    ticker = validate_ticker(ticker)  # Raises ValueError if invalid
    return fetch_data(ticker)
```

**Use the provided validators:**

```python
from tradingagents.security import (
    validate_ticker,
    validate_date,
    sanitize_path_component,
    validate_api_key
)

# Validate ticker symbols
ticker = validate_ticker(user_input)

# Validate dates
date = validate_date(user_input_date)

# Sanitize file paths
safe_filename = sanitize_path_component(user_filename)

# Validate API keys
api_key = validate_api_key(os.getenv("API_KEY"), "API_KEY")
```

### 2. Path Traversal Prevention

**NEVER use user input directly in file paths:**

```python
# ✗ BAD - Path traversal vulnerability
from pathlib import Path
results_dir = Path("./results") / user_ticker / user_date

# ✓ GOOD - Sanitized path
from tradingagents.security import sanitize_path_component
results_dir = Path("./results") / sanitize_path_component(user_ticker) / sanitize_path_component(user_date)
```

### 3. Secret Management

**NEVER hardcode secrets:**

```python
# ✗ BAD - Hardcoded secret
API_KEY = "sk-1234567890abcdef"

# ✗ BAD - Hardcoded path with personal info
DATA_DIR = "/Users/john/Documents/private-data"

# ✓ GOOD - Environment variable
import os
API_KEY = os.getenv("API_KEY")
if not API_KEY:
    raise ValueError("API_KEY environment variable not set")

# ✓ GOOD - Configurable path
DATA_DIR = os.getenv("TRADINGAGENTS_DATA_DIR", "./data")
```

### 4. API Security

**Always implement proper error handling and timeouts:**

```python
# ✗ BAD - No timeout, poor error handling
response = requests.get(url)
data = response.json()

# ✓ GOOD - Timeout and proper error handling
try:
    response = requests.get(
        url,
        timeout=30,  # 30 second timeout
        verify=True  # Verify SSL certificates
    )
    response.raise_for_status()
    data = response.json()
except requests.exceptions.Timeout:
    logger.error("Request timed out")
    raise
except requests.exceptions.RequestException as e:
    logger.error(f"Request failed: {e}")
    raise
```

**Use rate limiting:**

```python
from tradingagents.security import RateLimiter

# Apply rate limiting to API calls
@RateLimiter(max_calls=60, period=60)  # 60 calls per minute
def fetch_stock_data(ticker: str):
    return api.get_stock_data(ticker)
```

### 5. URL Construction

**Always encode user input in URLs:**

```python
# ✗ BAD - Direct string interpolation
url = f"https://api.example.com/search?q={user_query}"

# ✓ GOOD - Proper URL encoding
from urllib.parse import quote_plus
url = f"https://api.example.com/search?q={quote_plus(user_query)}"

# ✓ BETTER - Use params argument
import requests
response = requests.get(
    "https://api.example.com/search",
    params={"q": user_query}  # Automatically encoded
)
```

### 6. SQL Injection Prevention

**If you add database functionality:**

```python
# ✗ BAD - SQL injection vulnerability
query = f"SELECT * FROM stocks WHERE ticker = '{user_ticker}'"

# ✓ GOOD - Parameterized query
query = "SELECT * FROM stocks WHERE ticker = ?"
cursor.execute(query, (user_ticker,))
```

### 7. Safe File Operations

**Always validate file paths and use safe methods:**

```python
# ✗ BAD - Unsafe file read
with open(user_provided_path, 'r') as f:
    data = f.read()

# ✓ GOOD - Validated and safe
from pathlib import Path
from tradingagents.security import sanitize_path_component

safe_path = Path("./data") / sanitize_path_component(user_filename)
# Ensure path is within allowed directory
if not safe_path.resolve().is_relative_to(Path("./data").resolve()):
    raise ValueError("Invalid file path")

with open(safe_path, 'r') as f:
    data = f.read()
```

### 8. Logging Security

**Never log sensitive information:**

```python
# ✗ BAD - Logging sensitive data
logger.info(f"API request with key: {api_key}")
logger.debug(f"User password: {password}")

# ✓ GOOD - Safe logging
logger.info(f"API request initiated")
logger.debug(f"User ID: {user_id}")  # OK to log non-sensitive IDs

# ✓ GOOD - Redact sensitive data
def redact_api_key(key: str) -> str:
    if len(key) > 8:
        return f"{key[:4]}...{key[-4:]}"
    return "***"

logger.info(f"Using API key: {redact_api_key(api_key)}")
```

### 9. Error Messages

**Don't leak sensitive information in error messages:**

```python
# ✗ BAD - Leaking internal paths
except Exception as e:
    return f"Error reading file: {str(e)}"  # Might expose paths

# ✓ GOOD - Generic error message
except FileNotFoundError:
    logger.error(f"File not found: {safe_path}")
    return "The requested file was not found"
except Exception as e:
    logger.error(f"Unexpected error: {e}")
    return "An error occurred while processing your request"
```

### 10. Type Hints and Validation

**Use type hints for better security:**

```python
from typing import Dict, List, Optional
from datetime import datetime

# ✓ GOOD - Clear types make validation easier
def analyze_stock(
    ticker: str,
    start_date: str,
    end_date: str,
    config: Optional[Dict] = None
) -> Dict[str, float]:
    """
    Analyze stock performance.

    Args:
        ticker: Stock ticker symbol (validated)
        start_date: Start date in YYYY-MM-DD format
        end_date: End date in YYYY-MM-DD format
        config: Optional configuration dictionary

    Returns:
        Dictionary with analysis results

    Raises:
        ValueError: If inputs are invalid
    """
    ticker = validate_ticker(ticker)
    start_date = validate_date(start_date)
    end_date = validate_date(end_date)

    # Implementation...
    return {"return": 0.15}
```

---

## Secure Coding Checklist

Before submitting a pull request, verify:

- [ ] **Input Validation**
  - [ ] All user inputs are validated
  - [ ] Used appropriate validators from `tradingagents.security`
  - [ ] Edge cases are handled

- [ ] **Path Security**
  - [ ] No user input in file paths without sanitization
  - [ ] Used `sanitize_path_component` for file operations
  - [ ] Paths are restricted to allowed directories

- [ ] **Secret Management**
  - [ ] No hardcoded API keys, passwords, or secrets
  - [ ] Environment variables used for configuration
  - [ ] No personal file paths or usernames in code

- [ ] **API Security**
  - [ ] All HTTP requests have timeouts
  - [ ] SSL verification is enabled
  - [ ] Error handling is comprehensive
  - [ ] Rate limiting is applied where appropriate

- [ ] **Error Handling**
  - [ ] Errors are logged appropriately
  - [ ] Sensitive data is not in error messages
  - [ ] Failures are handled gracefully

- [ ] **Logging**
  - [ ] No sensitive data in logs
  - [ ] Appropriate log levels used
  - [ ] Security events are logged

- [ ] **Code Quality**
  - [ ] Type hints added
  - [ ] Docstrings include security notes
  - [ ] Code is readable and maintainable

- [ ] **Testing**
  - [ ] Security tests added
  - [ ] Edge cases tested
  - [ ] Error paths tested

---

## Common Vulnerabilities to Avoid

### 1. Path Traversal (CWE-22)

```python
# ✗ VULNERABLE
user_file = request.get("file")
with open(f"./data/{user_file}", 'r') as f:
    # Attacker could use: ../../../../etc/passwd
    data = f.read()

# ✓ SECURE
from tradingagents.security import sanitize_path_component
user_file = sanitize_path_component(request.get("file"))
safe_path = Path("./data") / user_file
if not safe_path.resolve().is_relative_to(Path("./data").resolve()):
    raise ValueError("Invalid path")
```

### 2. SQL Injection (CWE-89)

```python
# ✗ VULNERABLE
ticker = request.get("ticker")
query = f"SELECT * FROM stocks WHERE ticker = '{ticker}'"
# Attacker could use: AAPL'; DROP TABLE stocks; --

# ✓ SECURE
query = "SELECT * FROM stocks WHERE ticker = ?"
cursor.execute(query, (ticker,))
```

### 3. Command Injection (CWE-78)

```python
# ✗ VULNERABLE
import subprocess
user_input = request.get("command")
subprocess.run(f"process_data {user_input}", shell=True)

# ✓ SECURE - Don't use shell=True, validate inputs
allowed_commands = ['analyze', 'report', 'export']
if user_input not in allowed_commands:
    raise ValueError("Invalid command")
subprocess.run(["process_data", user_input], shell=False)
```

### 4. SSRF (Server-Side Request Forgery) (CWE-918)

```python
# ✗ VULNERABLE
user_url = request.get("data_source")
response = requests.get(user_url)

# ✓ SECURE
from tradingagents.security.validators import validate_url
user_url = validate_url(user_url, allowed_schemes=['https'])
# URL validator blocks private IPs, localhost, etc.
response = requests.get(user_url, timeout=10)
```

### 5. Insecure Deserialization (CWE-502)

```python
# ✗ VULNERABLE
import pickle
data = pickle.loads(user_provided_data)

# ✓ SECURE - Use safe serialization
import json
try:
    data = json.loads(user_provided_data)
except json.JSONDecodeError:
    raise ValueError("Invalid JSON data")
```

### 6. Information Disclosure (CWE-200)

```python
# ✗ VULNERABLE
@app.route("/debug")
def debug():
    return {"config": app.config, "env": os.environ}

# ✓ SECURE
@app.route("/health")
def health():
    return {"status": "healthy", "version": VERSION}
```

### 7. Insufficient Logging & Monitoring (CWE-778)

```python
# ✗ INSUFFICIENT
def transfer_funds(amount):
    # No logging of financial transaction
    execute_transfer(amount)

# ✓ SECURE
import logging
security_logger = logging.getLogger('security')

def transfer_funds(amount, user_id):
    security_logger.info(
        f"Transfer initiated",
        extra={
            "user_id": user_id,
            "amount": amount,
            "timestamp": datetime.now().isoformat()
        }
    )
    try:
        execute_transfer(amount)
        security_logger.info(f"Transfer completed for user {user_id}")
    except Exception as e:
        security_logger.error(f"Transfer failed for user {user_id}: {e}")
        raise
```

---

## Security Testing

### Write Security Tests

```python
# tests/security/test_input_validation.py
import pytest
from tradingagents.security import validate_ticker, sanitize_path_component

def test_validate_ticker_prevents_path_traversal():
    """Test that ticker validation prevents path traversal."""
    malicious_inputs = [
        "../../../etc/passwd",
        "..\\..\\..\\windows\\system32",
        "ticker/../../../secrets"
    ]

    for malicious in malicious_inputs:
        with pytest.raises(ValueError, match="Invalid ticker"):
            validate_ticker(malicious)

def test_sanitize_path_component():
    """Test path sanitization."""
    assert sanitize_path_component("../etc/passwd") == "etcpasswd"
    assert sanitize_path_component("normal_file.txt") == "normal_file.txt"
    assert ".." not in sanitize_path_component("../../data")

def test_api_key_validation():
    """Test API key validation."""
    from tradingagents.security import validate_api_key

    # Should pass
    validate_api_key("sk-1234567890abcdef", "TEST_KEY")

    # Should fail
    with pytest.raises(ValueError):
        validate_api_key(None, "TEST_KEY")

    with pytest.raises(ValueError):
        validate_api_key("", "TEST_KEY")
```

### Run Security Scans

```bash
# Static security analysis
bandit -r tradingagents/ -ll

# Dependency vulnerability scan
safety check
pip-audit

# Secret scanning
gitleaks detect --source=. --verbose
```

---

## Code Review Security Focus

When reviewing code, check for:

### Input Validation
- [ ] All user inputs are validated
- [ ] Validation happens on server side
- [ ] Whitelist approach used where possible

### Authentication & Authorization
- [ ] Proper authentication checks
- [ ] Authorization before sensitive operations
- [ ] Session management is secure

### Data Protection
- [ ] Sensitive data is encrypted
- [ ] No secrets in code or logs
- [ ] Proper error handling doesn't leak info

### API Security
- [ ] Rate limiting implemented
- [ ] Timeouts configured
- [ ] SSL/TLS used for all connections

### Dependencies
- [ ] Dependencies are up to date
- [ ] No known vulnerabilities
- [ ] Minimal dependencies used

---

## Tools and Resources

### Security Tools

1. **Bandit** - Python security linter
   ```bash
   pip install bandit
   bandit -r tradingagents/
   ```

2. **Safety** - Dependency vulnerability scanner
   ```bash
   pip install safety
   safety check
   ```

3. **pip-audit** - Another dependency scanner
   ```bash
   pip install pip-audit
   pip-audit
   ```

4. **Gitleaks** - Secret scanning
   ```bash
   docker run -v $(pwd):/path zricethezav/gitleaks:latest detect --source="/path"
   ```

5. **Pre-commit hooks** - Automated checks
   ```bash
   pip install pre-commit
   pre-commit install
   ```

### Resources

- [OWASP Top 10](https://owasp.org/www-project-top-ten/)
- [CWE Top 25](https://cwe.mitre.org/top25/)
- [Python Security](https://python.readthedocs.io/en/stable/library/security_warnings.html)
- [NIST Guidelines](https://www.nist.gov/cybersecurity)

---

## Questions?

If you have security questions or concerns:
- Email: yijia.xiao@cs.ucla.edu
- Review: [SECURITY.md](SECURITY.md)
- Check: [SECURITY_AUDIT.md](SECURITY_AUDIT.md)

**Remember: When in doubt, ask before committing!**

Thank you for keeping TradingAgents secure!