238 lines
7.5 KiB
Markdown
238 lines
7.5 KiB
Markdown
# PR #281 Critical Security Fixes
|
||
|
||
**Priority**: CRITICAL
|
||
**Impact**: Prevents path traversal attacks, data loss, and unauthorized file access
|
||
**Estimated Total Time**: 15-20 minutes
|
||
|
||
---
|
||
|
||
## Fix 1: ChromaDB Reset Flag - Production Hardening
|
||
|
||
**File**: `/tradingagents/agents/utils/memory.py`
|
||
**Line**: 13
|
||
**Severity**: HIGH - Allows complete database deletion
|
||
**Time to Apply**: 2 minutes
|
||
|
||
### Why This Matters
|
||
Setting `allow_reset=True` in production allows anyone with access to completely wipe the ChromaDB database. This is a data loss risk and should only be enabled in development/testing environments.
|
||
|
||
### BEFORE
|
||
```python
|
||
def __init__(self, name, config):
|
||
if config["backend_url"] == "http://localhost:11434/v1":
|
||
self.embedding = "nomic-embed-text"
|
||
else:
|
||
self.embedding = "text-embedding-3-small"
|
||
self.client = OpenAI(base_url=config["backend_url"])
|
||
self.chroma_client = chromadb.Client(Settings(allow_reset=True)) # ⚠️ DANGEROUS
|
||
self.situation_collection = self.chroma_client.create_collection(name=name)
|
||
```
|
||
|
||
### AFTER
|
||
```python
|
||
def __init__(self, name, config):
|
||
if config["backend_url"] == "http://localhost:11434/v1":
|
||
self.embedding = "nomic-embed-text"
|
||
else:
|
||
self.embedding = "text-embedding-3-small"
|
||
self.client = OpenAI(base_url=config["backend_url"])
|
||
self.chroma_client = chromadb.Client(Settings(allow_reset=False)) # ✓ SECURE
|
||
self.situation_collection = self.chroma_client.create_collection(name=name)
|
||
```
|
||
|
||
---
|
||
|
||
## Fix 2: Input Validation - Prevent Path Traversal
|
||
|
||
**File**: `/tradingagents/dataflows/local.py`
|
||
**Lines**: 11-50, 51-84, and similar patterns throughout
|
||
**Severity**: CRITICAL - Allows arbitrary file access
|
||
**Time to Apply**: 8-10 minutes
|
||
|
||
### Why This Matters
|
||
Ticker symbols are directly interpolated into file paths without validation. An attacker could provide input like `../../etc/passwd` or `../../../sensitive_data` to access files outside the intended directory.
|
||
|
||
### BEFORE
|
||
```python
|
||
def get_YFin_data_window(
|
||
symbol: Annotated[str, "ticker symbol of the company"],
|
||
curr_date: Annotated[str, "Start date in yyyy-mm-dd format"],
|
||
look_back_days: Annotated[int, "how many days to look back"],
|
||
) -> str:
|
||
# calculate past days
|
||
date_obj = datetime.strptime(curr_date, "%Y-%m-%d")
|
||
before = date_obj - relativedelta(days=look_back_days)
|
||
start_date = before.strftime("%Y-%m-%d")
|
||
|
||
# read in data
|
||
data = pd.read_csv(
|
||
os.path.join(
|
||
DATA_DIR,
|
||
f"market_data/price_data/{symbol}-YFin-data-2015-01-01-2025-03-25.csv", # ⚠️ VULNERABLE
|
||
)
|
||
)
|
||
```
|
||
|
||
### AFTER
|
||
```python
|
||
import re
|
||
|
||
def validate_ticker_symbol(symbol: str) -> str:
|
||
"""
|
||
Validate and sanitize ticker symbol to prevent path traversal.
|
||
|
||
Args:
|
||
symbol: Ticker symbol to validate
|
||
|
||
Returns:
|
||
Sanitized ticker symbol
|
||
|
||
Raises:
|
||
ValueError: If ticker contains invalid characters
|
||
"""
|
||
# Ticker symbols should only contain alphanumeric characters, dots, and hyphens
|
||
if not re.match(r'^[A-Za-z0-9.\-]+$', symbol):
|
||
raise ValueError(f"Invalid ticker symbol: {symbol}")
|
||
|
||
# Prevent path traversal patterns
|
||
if '..' in symbol or '/' in symbol or '\\' in symbol:
|
||
raise ValueError(f"Invalid ticker symbol: {symbol}")
|
||
|
||
# Limit length (typical tickers are 1-5 characters, extended can be longer)
|
||
if len(symbol) > 10:
|
||
raise ValueError(f"Ticker symbol too long: {symbol}")
|
||
|
||
return symbol.upper() # Normalize to uppercase
|
||
|
||
|
||
def get_YFin_data_window(
|
||
symbol: Annotated[str, "ticker symbol of the company"],
|
||
curr_date: Annotated[str, "Start date in yyyy-mm-dd format"],
|
||
look_back_days: Annotated[int, "how many days to look back"],
|
||
) -> str:
|
||
# Validate ticker symbol
|
||
symbol = validate_ticker_symbol(symbol) # ✓ SECURE
|
||
|
||
# calculate past days
|
||
date_obj = datetime.strptime(curr_date, "%Y-%m-%d")
|
||
before = date_obj - relativedelta(days=look_back_days)
|
||
start_date = before.strftime("%Y-%m-%d")
|
||
|
||
# read in data
|
||
data = pd.read_csv(
|
||
os.path.join(
|
||
DATA_DIR,
|
||
f"market_data/price_data/{symbol}-YFin-data-2015-01-01-2025-03-25.csv", # ✓ SAFE NOW
|
||
)
|
||
)
|
||
```
|
||
|
||
### Additional Changes Required
|
||
Apply the `validate_ticker_symbol()` call to ALL functions in `local.py` that accept a ticker parameter:
|
||
- `get_YFin_data()` - line 51
|
||
- `get_finnhub_news()` - line 85
|
||
- `get_finnhub_company_insider_sentiment()` - line 120
|
||
- `get_finnhub_company_insider_transactions()` - line 157
|
||
- `get_data_in_range()` - line 194
|
||
- `get_simfin_balance_sheet()` - line 227
|
||
- `get_simfin_cashflow()` - line 274
|
||
- `get_simfin_income_statements()` - line 321
|
||
|
||
**Pattern to apply:**
|
||
```python
|
||
def function_name(ticker: str, ...):
|
||
ticker = validate_ticker_symbol(ticker) # Add this as first line
|
||
# ... rest of function
|
||
```
|
||
|
||
---
|
||
|
||
## Fix 3: CLI Input Validation
|
||
|
||
**File**: `/cli/main.py`
|
||
**Lines**: 499-501, 438
|
||
**Severity**: HIGH - Entry point for malicious input
|
||
**Time to Apply**: 3-5 minutes
|
||
|
||
### Why This Matters
|
||
The CLI accepts ticker symbols without validation, which feeds directly into the vulnerable file path operations in `local.py`. This is the primary attack vector.
|
||
|
||
### BEFORE
|
||
```python
|
||
def get_ticker():
|
||
"""Get ticker symbol from user input."""
|
||
return typer.prompt("", default="SPY") # ⚠️ NO VALIDATION
|
||
```
|
||
|
||
### AFTER
|
||
```python
|
||
def get_ticker():
|
||
"""Get ticker symbol from user input with validation."""
|
||
while True:
|
||
ticker = typer.prompt("", default="SPY")
|
||
try:
|
||
# Validate ticker format (alphanumeric, dots, hyphens only)
|
||
if not ticker or len(ticker) > 10:
|
||
console.print("[red]Error: Ticker must be 1-10 characters[/red]")
|
||
continue
|
||
|
||
# Check for path traversal attempts
|
||
if '..' in ticker or '/' in ticker or '\\' in ticker:
|
||
console.print("[red]Error: Invalid characters in ticker symbol[/red]")
|
||
continue
|
||
|
||
# Validate characters
|
||
if not all(c.isalnum() or c in '.-' for c in ticker):
|
||
console.print("[red]Error: Ticker can only contain letters, numbers, dots, and hyphens[/red]")
|
||
continue
|
||
|
||
return ticker.upper() # ✓ SECURE AND NORMALIZED
|
||
except Exception as e:
|
||
console.print(f"[red]Error validating ticker: {e}[/red]")
|
||
```
|
||
|
||
---
|
||
|
||
## Testing Recommendations
|
||
|
||
After applying these fixes, test with these attack vectors to ensure they're blocked:
|
||
|
||
```bash
|
||
# Test CLI with malicious input
|
||
python -m cli.main analyze
|
||
# Try entering: ../../etc/passwd
|
||
# Try entering: ../../../sensitive_file
|
||
# Try entering: AAPL/../../../etc/hosts
|
||
|
||
# Test programmatically
|
||
python -c "
|
||
from tradingagents.dataflows.local import validate_ticker_symbol
|
||
try:
|
||
validate_ticker_symbol('../../etc/passwd')
|
||
print('FAIL: Attack not blocked')
|
||
except ValueError:
|
||
print('PASS: Attack blocked')
|
||
"
|
||
```
|
||
|
||
---
|
||
|
||
## Summary
|
||
|
||
| Fix | File | Lines Changed | Time | Risk Reduced |
|
||
|-----|------|---------------|------|--------------|
|
||
| ChromaDB Reset | `memory.py` | 1 | 2 min | Data loss |
|
||
| Path Traversal | `local.py` | ~30 | 10 min | File access |
|
||
| CLI Validation | `cli/main.py` | ~20 | 5 min | Attack vector |
|
||
|
||
**Total Estimated Time**: 15-20 minutes
|
||
**Security Impact**: Prevents critical path traversal and data loss vulnerabilities
|
||
|
||
---
|
||
|
||
## References
|
||
|
||
- CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
|
||
- CWE-73: External Control of File Name or Path
|
||
- OWASP Top 10: A01:2021 – Broken Access Control
|