Concurrent Scanner Execution
Overview
Implemented concurrent scanner execution using Python's ThreadPoolExecutor to improve discovery pipeline performance by 25-30%.
Performance Results
Concurrent (8 workers): 42-43 seconds
Sequential (1 worker): 54-56 seconds
Improvement: 25-30% faster ⚡ (roughly a 1.3x speedup, or 20-25% less wall-clock time)
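The headline figure can be read two ways, as a speedup factor or as wall-clock time saved. A short calculation, using only the timing ranges listed above, shows how the two relate (the function and variable names are illustrative):

```python
# Measured wall-clock ranges from the results above, in seconds.
sequential_s = (54, 56)
concurrent_s = (42, 43)

def speedup_range(seq, conc):
    """Return (min, max) speedup factor: sequential time / concurrent time."""
    return (min(seq) / max(conc), max(seq) / min(conc))

def time_saved_range(seq, conc):
    """Return (min, max) fraction of wall-clock time saved."""
    return (1 - max(conc) / min(seq), 1 - min(conc) / max(seq))

low, high = speedup_range(sequential_s, concurrent_s)
print(f"speedup: {low:.2f}x to {high:.2f}x")

saved_low, saved_high = time_saved_range(sequential_s, concurrent_s)
print(f"time saved: {saved_low:.0%} to {saved_high:.0%}")
```

The speedup factor lands between about 1.26x and 1.33x, which is where the "25-30% faster" figure comes from; the share of wall-clock time saved is somewhat lower, about 20-25%.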
Configuration
Add the following under the "discovery" section of your config, or rely on the defaults in default_config.py:
"scanner_execution": {
    "concurrent": True,      # Enable parallel execution
    "max_workers": 8,        # Max concurrent scanner threads
    "timeout_seconds": 30,   # Per-scanner timeout
}
How It Works
Thread Pool Execution
- Scanner Preparation: All enabled scanners are instantiated and validated
- Concurrent Dispatch: Scanners submitted to ThreadPoolExecutor
- State Isolation: Each scanner gets a copy of state (thread-safe)
- Result Collection: Candidates collected as scanners complete
- Log Merging: Tool logs merged back into main state
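The five steps above can be sketched end to end as follows. This is a minimal illustration, not the code in discovery_graph.py: `FakeScanner`, `run_concurrently`, and the placeholder tickers are hypothetical stand-ins, and timeouts and error handling are left out to keep the flow visible.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

class FakeScanner:
    """Stand-in for a real scanner (hypothetical; not the project's class)."""
    def __init__(self, name, candidates):
        self.name = name
        self._candidates = candidates

    def scan(self, state):
        state["tool_logs"].append(f"{self.name} ran")
        return list(self._candidates)

def run_concurrently(scanners, state, max_workers=8):
    """Run every scanner in a thread pool and merge logs back into `state`."""
    all_candidates = {}
    # State isolation: each scanner gets its own dict and its own log list.
    jobs = [(s, {**state, "tool_logs": []}) for s in scanners]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Concurrent dispatch: one future per scanner.
        future_to_job = {pool.submit(s.scan, st): (s, st) for s, st in jobs}
        # Result collection: handle each scanner as soon as it finishes.
        for future in as_completed(future_to_job):
            scanner, scanner_state = future_to_job[future]
            all_candidates[scanner.name] = future.result()
            # Log merging happens here, in the main thread only.
            state["tool_logs"].extend(scanner_state["tool_logs"])
    return all_candidates

state = {"tool_logs": []}
scanners = [
    FakeScanner("market_movers", ["AAA", "BBB"]),
    FakeScanner("insider_buying", ["CCC"]),
]
results = run_concurrently(scanners, state)
print(results)
```

Because the merge runs in the main thread, the shared `state` is never written to from a worker thread.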
Timeout Handling
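from concurrent.futures import TimeoutError as FutureTimeoutError

# Per-scanner timeout (not a global timeout). Wait on each future directly:
# as_completed() yields only finished futures, so result(timeout=...) on
# those would never raise. The clock starts when the wait begins.
for future, name in future_to_scanner.items():  # assumes values are scanner names
    try:
        result = future.result(timeout=timeout_seconds)
        # Process result
    except FutureTimeoutError:
        # Scanner timed out; continue with the others
        logger.warning(f"Scanner {name} timed out")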
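Key insight: A per-scanner timeout, rather than one global timeout, means a single slow scanner cannot stall the whole batch. Python cannot forcibly stop a running thread, so a timed-out scanner keeps running in the background until it returns; only its result is abandoned.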
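A small runnable sketch of this behavior, assuming two hypothetical scanners (`fast_scanner`, `slow_scanner`) and a helper named `collect` that is not part of the project. Each future is waited on with its own timeout, and the pool is shut down without blocking on stragglers:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError

def fast_scanner():
    return ["AAA"]

def slow_scanner():
    time.sleep(0.5)  # Simulates a slow API call
    return ["BBB"]

def collect(scanners, timeout_seconds):
    """Return ({name: candidates}, [names that timed out])."""
    results, timed_out = {}, []
    pool = ThreadPoolExecutor(max_workers=len(scanners))
    future_to_name = {pool.submit(fn): name for name, fn in scanners.items()}
    for future, name in future_to_name.items():
        try:
            # The timeout is measured from the moment this wait begins.
            results[name] = future.result(timeout=timeout_seconds)
        except FutureTimeoutError:
            timed_out.append(name)
    # Do not block on stragglers; their threads finish in the background.
    pool.shutdown(wait=False)
    return results, timed_out

results, timed_out = collect(
    {"fast": fast_scanner, "slow": slow_scanner}, timeout_seconds=0.1
)
print(results, timed_out)
```

Since every wait starts when the loop reaches that future, a scanner late in the list may effectively get more than `timeout_seconds` of total run time. A stricter variant would track a deadline per scanner and pass only the remaining time to `result()`.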
Error Isolation
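def run_scanner(scanner_info):
    # Assumes scanner_info is a (scanner, name, pipeline, state_copy) tuple
    scanner, name, pipeline, state_copy = scanner_info
    try:
        candidates = scanner.scan_with_validation(state_copy)
        return (name, pipeline, candidates, None)
    except Exception as e:
        # Return the error instead of raising, so other scanners keep running
        return (name, pipeline, [], str(e))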
Key insight: Each scanner runs in isolation. One failure doesn't stop others.
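To see the isolation in action, the sketch below runs one healthy and one failing scanner through the same wrapper. `GoodScanner`, `BrokenScanner`, the "demo" pipeline name, and the tuple layout of `scanner_info` are assumptions made for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

class GoodScanner:
    def scan_with_validation(self, state):
        return ["AAA", "BBB"]

class BrokenScanner:
    def scan_with_validation(self, state):
        raise RuntimeError("HTTP 500 error")

def run_scanner(scanner_info):
    """Run one scanner; report failure as data instead of raising."""
    scanner, name, pipeline, state_copy = scanner_info
    try:
        candidates = scanner.scan_with_validation(state_copy)
        return (name, pipeline, candidates, None)
    except Exception as e:
        return (name, pipeline, [], str(e))

jobs = [
    (GoodScanner(), "good_scanner", "demo", {}),
    (BrokenScanner(), "broken_scanner", "demo", {}),
]
with ThreadPoolExecutor(max_workers=2) as pool:
    # map() returns outcomes in the same order as `jobs`.
    outcomes = list(pool.map(run_scanner, jobs))

for name, pipeline, candidates, error in outcomes:
    status = f"failed: {error}" if error else f"{len(candidates)} candidates"
    print(f"{name}: {status}")
```

Returning the error as the fourth tuple element lets the caller log it and carry on, rather than wrapping every `future.result()` call in its own try/except.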
Why ThreadPoolExecutor?
I/O-Bound Operations
Scanners spend most time waiting for:
- API responses (Reddit, Finnhub, Alpha Vantage)
- Network requests (news, fundamentals)
- Database queries
CPU time is minimal compared to I/O waits.
GIL Not a Problem
Python's Global Interpreter Lock (GIL) is not a bottleneck for I/O-bound code because:
- Threads release the GIL while blocked on I/O
- Multiple threads can therefore wait on I/O concurrently
- Only one thread executes Python bytecode at a time, but each scanner's CPU-bound work is brief
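The effect is easy to reproduce with simulated I/O. In this sketch, `fake_api_call` stands in for a network request by sleeping (which releases the GIL, as blocking I/O does); the delays are arbitrary and much shorter than real API latency:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_api_call(delay):
    time.sleep(delay)  # Blocking wait; the GIL is released while sleeping
    return delay

def timed(fn):
    """Return how long fn() takes, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

delays = [0.1] * 8

def run_sequential():
    for d in delays:
        fake_api_call(d)

def run_threaded():
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(fake_api_call, delays))

sequential_s = timed(run_sequential)
threaded_s = timed(run_threaded)
print(f"sequential: {sequential_s:.2f}s, threaded: {threaded_s:.2f}s")
```

The sequential run takes about the sum of the delays, while the threaded run takes about as long as the single longest one. Real scanners do some CPU work between waits, so the gain measured in practice is smaller than in this idealized case.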
State Management
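# Thread-safe pattern
scanner_state = state.copy()      # Shallow copy: each thread gets its own dict
scanner_state["tool_logs"] = []   # Fresh list, because copy() shares nested objects
scanner.scan(scanner_state)       # The thread never writes to the shared state
# Merge results after completion (in the main thread)
state["tool_logs"].extend(scanner_state["tool_logs"])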
Key insight: Copying state dict is cheap (<1ms) compared to API latency (5-10s).
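One caveat worth checking when applying this pattern: `dict.copy()` is a shallow copy, so nested lists such as `tool_logs` are still shared between the original and the copy. The sketch below demonstrates the pitfall and two ways around it; the contents of `state` are made up for the example.

```python
import copy

state = {"tool_logs": ["pipeline started"], "run_id": 1}

# Shallow copy: a new dict, but the nested list is the same object.
shallow = state.copy()
assert shallow is not state
assert shallow["tool_logs"] is state["tool_logs"]

# Option 1: shallow copy plus a fresh list for the field scanners write to.
isolated = {**state, "tool_logs": []}
isolated["tool_logs"].append("scanner ran")
assert state["tool_logs"] == ["pipeline started"]

# Option 2: deep copy, which is safer but costs more for large states.
deep = copy.deepcopy(state)
deep["tool_logs"].append("scanner ran")
assert state["tool_logs"] == ["pipeline started"]

# Merge in the main thread once the scanner has finished.
state["tool_logs"].extend(isolated["tool_logs"])
print(state["tool_logs"])
```

Option 1 keeps the copy cheap, but it only protects the fields that are explicitly replaced; any other nested object that scanners mutate would need the same treatment or a deep copy.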
Testing
Run comprehensive tests:
# Full test suite
python tests/test_concurrent_scanners.py
# Quick verification
python verify_concurrent_execution.py
Test coverage:
- ✅ Concurrent execution works
- ✅ Sequential fallback when disabled
- ✅ Timeout handling (graceful degradation)
- ✅ Error isolation (one failure doesn't stop others)
- ✅ Same candidates found in both modes
Disabling Concurrent Execution
Set concurrent: False to revert to sequential execution:
config["discovery"]["scanner_execution"]["concurrent"] = False
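A minimal sketch of how such a switch might be honored, assuming the config layout shown above. `DEFAULTS`, `execution_settings`, and `run_all` are hypothetical names, not the project's API, and scanners are reduced to zero-argument callables:

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULTS = {"concurrent": True, "max_workers": 8, "timeout_seconds": 30}

def execution_settings(config):
    """Merge user settings over the documented defaults."""
    user = config.get("discovery", {}).get("scanner_execution", {})
    return {**DEFAULTS, **user}

def run_all(scan_functions, config):
    """Run zero-argument scan callables; return results in input order."""
    settings = execution_settings(config)
    if not settings["concurrent"] or len(scan_functions) <= 1:
        # Sequential path: simple to step through in a debugger.
        return [fn() for fn in scan_functions]
    workers = min(settings["max_workers"], len(scan_functions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fn) for fn in scan_functions]
        return [f.result() for f in futures]

scans = [lambda: ["AAA"], lambda: ["BBB", "CCC"]]
sequential = run_all(
    scans, {"discovery": {"scanner_execution": {"concurrent": False}}}
)
concurrent_results = run_all(scans, {})
print(sequential == concurrent_results)
```

Both paths return results in input order, which makes it straightforward to assert that the two modes find the same candidates.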
Useful for:
- Debugging individual scanners
- Environments with limited resources
- Rate limit testing
Performance Tips
- Optimal Worker Count: 8 workers balances parallelism with resource usage
  - Too few: Underutilized (scanners wait in queue)
  - Too many: Thread overhead, potential rate limiting
- Timeout Configuration: 30s per scanner is reasonable
  - Too short: Legitimate slow scanners timeout
  - Too long: Keeps slow scanners running unnecessarily
- Enable for Production: Always use concurrent mode unless debugging
Monitoring
Concurrent execution logs scanner completion:
Running 8 scanners concurrently (max 8 workers)...
✓ market_movers: 10 candidates
✓ insider_buying: 20 candidates
⏱️ slow_scanner: timeout after 30s
⚠️ broken_scanner: HTTP 500 error
✓ volume_accumulation: 2 candidates
Next Steps
Remaining performance optimizations:
- Rate Limiting: Add exponential backoff for API calls
- TTL Caching: Time-based cache for expensive operations
- Circuit Breaker: Auto-disable consistently failing scanners
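As an illustration of the first item, exponential backoff can be sketched in a few lines. This is not part of the current implementation; `call_with_backoff`, its parameters, and the `flaky` function are hypothetical, and the `sleep` argument is injectable so that the example runs without actually waiting:

```python
import time

def call_with_backoff(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

# Usage: a call that fails twice before succeeding.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("rate limited")
    return "ok"

delays = []
result = call_with_backoff(flaky, sleep=delays.append)  # Record delays instead of sleeping
print(result, delays)
```

A production version would likely catch only retryable errors (for example rate-limit responses) and add random jitter so that concurrent scanners do not retry in lockstep.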
Implementation Files
- tradingagents/default_config.py - Configuration
- tradingagents/graph/discovery_graph.py - Execution logic
- tests/test_concurrent_scanners.py - Test suite
- verify_concurrent_execution.py - Quick verification