diff --git a/docs/architecture/orchestrator-validation.md b/docs/architecture/orchestrator-validation.md
new file mode 100644
index 00000000..52b8f431
--- /dev/null
+++ b/docs/architecture/orchestrator-validation.md
@@ -0,0 +1,299 @@
+# Orchestrator Configuration Validation
+
+Status: implemented (2026-04-16)
+Audience: orchestrator users, backend maintainers
+Scope: LLMRunner configuration validation and error classification
+
+## Overview
+
+`orchestrator/llm_runner.py` implements three layers of configuration validation to catch errors before expensive graph initialization or API calls:
+
+1. **Provider × Base URL Matrix Validation** - detects provider/endpoint mismatches
+2. **Timeout Configuration Validation** - warns when timeouts may be insufficient
+3. **Runtime Error Classification** - categorizes failures into actionable reason codes
+
+## 1. Provider × Base URL Matrix Validation
+
+### Purpose
+
+Prevent wasted initialization time and API calls when provider and base_url are incompatible.
+
+### Implementation
+
+`LLMRunner._detect_provider_mismatch()` validates provider × base_url combinations using a pattern matrix:
+
+```python
+_PROVIDER_BASE_URL_PATTERNS = {
+    "anthropic": [r"api\.anthropic\.com", r"api\.minimaxi\.com/anthropic"],
+    "openai": [r"api\.openai\.com"],
+    "google": [r"generativelanguage\.googleapis\.com"],
+    "xai": [r"api\.x\.ai"],
+    "ollama": [r"localhost:\d+", r"127\.0\.0\.1:\d+", r"ollama"],
+    "openrouter": [r"openrouter\.ai"],
+}
+```
+
+### Validation Logic
+
+1. Extract `llm_provider` and `backend_url` from `trading_agents_config`
+2. Look up the expected URL patterns for the provider
+3. Check whether `backend_url` matches any expected pattern (regex)
+4. If no match is found, return mismatch details before graph initialization
+
+### Error Response
+
+When a mismatch is detected, `get_signal()` returns:
+
+```python
+Signal(
+    degraded=True,
+    reason_code="provider_mismatch",
+    metadata={
+        "data_quality": {
+            "state": "provider_mismatch",
+            "provider": "google",
+            "backend_url": "https://api.openai.com/v1",
+            "expected_patterns": [r"generativelanguage\.googleapis\.com"],
+        }
+    }
+)
+```
+
+### Examples
+
+**Valid configurations:**
+- `anthropic` + `https://api.minimaxi.com/anthropic` ✓
+- `openai` + `https://api.openai.com/v1` ✓
+- `ollama` + `http://localhost:11434` ✓
+
+**Invalid configurations (detected):**
+- `google` + `https://api.openai.com/v1` → `provider_mismatch`
+- `xai` + `https://api.minimaxi.com/anthropic` → `provider_mismatch`
+- `ollama` + `https://api.openai.com/v1` → `provider_mismatch`
+
+### Design Notes
+
+- Uses the **original provider name** (not the canonical one) for validation
+  - `ollama`, `openrouter`, and `openai` share the same canonical provider (`openai`) but have different URL patterns
+  - Validation must distinguish between them
+- Validation runs **before** `TradingAgentsGraph` initialization
+  - Saves ~5-10s of initialization time on mismatch
+  - Avoids confusing error messages from LangChain/provider SDKs
+
+## 2. Timeout Configuration Validation
+
+### Purpose
+
+Warn users when timeout settings may be insufficient for their analyst profile, preventing unexpected research degradation.
+
+### Implementation
+
+`LLMRunner._validate_timeout_config()` checks timeout sufficiency based on analyst count:
+
+```python
+_RECOMMENDED_TIMEOUTS = {
+    1: {"analyst": 75.0, "research": 30.0},   # single analyst
+    2: {"analyst": 90.0, "research": 45.0},   # two analysts
+    3: {"analyst": 105.0, "research": 60.0},  # three analysts
+    4: {"analyst": 120.0, "research": 75.0},  # four analysts
+}
+```
+
+### Validation Logic
+
+1. Extract `selected_analysts` from `trading_agents_config` (default: 4 analysts)
+2. Extract `analyst_node_timeout_secs` and `research_node_timeout_secs`
+3. Compare against the recommended thresholds for the analyst count
+4. Log a `WARNING` if the configured timeout is below the recommended threshold
+
+### Warning Example
+
+```
+LLMRunner: analyst_node_timeout_secs=75.0s may be insufficient for 4 analyst(s) (recommended: 120.0s)
+```
+
+### Design Notes
+
+- **Non-blocking validation** - logs a warning but does not prevent initialization
+  - Different LLM providers have vastly different speeds (MiniMax vs OpenAI)
+  - Users may have profiled their specific setup and chosen lower timeouts intentionally
+- **Conservative recommendations** - thresholds assume slower providers
+  - Based on real profiling data from the MiniMax Anthropic-compatible endpoint
+  - Users with faster providers can safely ignore the warnings
+- **Runs at `__init__` time** - warns early, before any API calls
+
+### Timeout Calculation Rationale
+
+Multi-analyst execution is **serial** for analysts, **parallel** for research:
+
+```
+Total time ≈ (analyst_count × analyst_timeout) + research_timeout + trading + risk + portfolio
+```
+
+For 4 analysts with a 75s timeout each:
+- Analyst phase: ~300s (serial)
+- Research phase: ~30s (parallel bull/bear)
+- Trading phase: ~15s
+- Risk phase: ~10s
+- Portfolio phase: ~10s
+- **Total: ~365s** (6+ minutes)
+
+The recommended 120s per analyst assumes:
+- Some analysts may time out and degrade
+- The degraded path still completes within the timeout
+- Total execution stays under reasonable bounds (~8-10 minutes)
+
+## 3. Runtime Error Classification
+
+### Purpose
+
+Categorize runtime failures into actionable reason codes for debugging and monitoring.
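As a hedged sketch of this classification (the helper name `classify_error` is illustrative, not the shipped API; the marker strings mirror `_looks_like_provider_auth_failure` in `orchestrator/llm_runner.py`):

```python
# Illustrative sketch of the runtime error classification, not the shipped code.
# Marker strings mirror _looks_like_provider_auth_failure in llm_runner.py.

_AUTH_MARKERS = (
    "authentication_error",
    "login fail",
    "invalid api key",
    "unauthorized",
    "error code: 401",
)


def classify_error(exc: Exception) -> str:
    """Map a propagate() exception to a reason code string."""
    text = str(exc).lower()
    # Note: the runner matches "Unsupported LLM provider" case-sensitively;
    # lowercasing here keeps the sketch compact.
    if "unsupported llm provider" in text:
        return "provider_mismatch"
    if any(marker in text for marker in _AUTH_MARKERS):
        return "provider_auth_failed"
    return "llm_signal_failed"
```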
+
+### Error Taxonomy
+
+Defined in `orchestrator/contracts/error_taxonomy.py`:
+
+```python
+class ReasonCode(str, Enum):
+    CONFIG_INVALID = "config_invalid"
+    PROVIDER_MISMATCH = "provider_mismatch"
+    PROVIDER_AUTH_FAILED = "provider_auth_failed"
+    LLM_INIT_FAILED = "llm_init_failed"
+    LLM_SIGNAL_FAILED = "llm_signal_failed"
+    LLM_UNKNOWN_RATING = "llm_unknown_rating"
+    # ... (quant-related codes omitted)
+```
+
+### Classification Logic
+
+`LLMRunner.get_signal()` catches exceptions from `propagate()` and classifies them:
+
+1. **Provider mismatch** (pre-initialization)
+   - Detected by `_detect_provider_mismatch()` before graph creation
+   - Returns `provider_mismatch` immediately
+
+2. **Provider auth failure** (runtime)
+   - Detected by the `_looks_like_provider_auth_failure()` heuristic
+   - Markers: `"authentication_error"`, `"login fail"`, `"invalid api key"`, `"unauthorized"`, `"error code: 401"`
+   - Returns `provider_auth_failed`
+
+3. **Generic LLM failure** (runtime)
+   - Any other exception from `propagate()`
+   - Returns `llm_signal_failed`
+
+### Error Response Structure
+
+All error signals include:
+
+```python
+Signal(
+    degraded=True,
+    reason_code="<classified reason code>",
+    direction=0,
+    confidence=0.0,
+    metadata={
+        "error": "<exception message>",
+        "data_quality": {
+            "state": "<mirrors reason_code>",
+            # ... additional context
+        }
+    }
+)
+```
+
+### Design Notes
+
+- **Fail-fast on config errors** - mismatch is detected before expensive operations
+- **Heuristic auth detection** - no API call overhead; relies on error message patterns
+- **Structured metadata** - `data_quality.state` mirrors `reason_code` for consistency
+
+## 4. Testing
+
+### Test Coverage
+
+`orchestrator/tests/test_llm_runner.py` includes:
+
+**Provider matrix validation:**
+- `test_detect_provider_mismatch_google_with_openai_url`
+- `test_detect_provider_mismatch_xai_with_anthropic_url`
+- `test_detect_provider_mismatch_ollama_with_openai_url`
+- `test_detect_provider_mismatch_valid_anthropic_minimax`
+- `test_detect_provider_mismatch_valid_openai`
+
+**Timeout validation:**
+- `test_timeout_validation_warns_for_multiple_analysts_low_timeout`
+- `test_timeout_validation_no_warn_for_single_analyst`
+- `test_timeout_validation_no_warn_for_sufficient_timeout`
+
+**Error classification:**
+- `test_get_signal_classifies_provider_auth_failure`
+- `test_get_signal_returns_provider_mismatch_before_graph_init`
+- `test_get_signal_returns_reason_code_on_propagate_failure`
+
+### Running Tests
+
+```bash
+cd /path/to/TradingAgents
+python -m pytest orchestrator/tests/test_llm_runner.py -v
+```
+
+## 5. Maintenance
+
+### Adding New Providers
+
+When adding a new provider to `tradingagents/llm_clients/factory.py`:
+
+1. Add the URL pattern to `_PROVIDER_BASE_URL_PATTERNS` in `llm_runner.py`
+2. Add test cases for valid and invalid configurations
+3. Update this documentation
+
+### Adjusting Timeout Recommendations
+
+If profiling shows different timeout requirements:
+
+1. Update `_RECOMMENDED_TIMEOUTS` in `llm_runner.py`
+2. Document the rationale in this file
+3. Update test expectations if needed
+
+### Extending Error Classification
+
+To add new reason codes:
+
+1. Add the code to the `ReasonCode` enum in `contracts/error_taxonomy.py`
+2. Add detection logic in `LLMRunner.get_signal()`
+3. Add a test case in `test_llm_runner.py`
+4. Update this documentation
+
+## 6. Known Limitations
+
+### API Key Validation
+
+The current implementation does **not** validate API key validity before graph initialization:
+
+- **Limitation**: Expired/invalid keys are only detected during the first `propagate()` call
+- **Impact**: ~5-10s wasted on graph initialization before the auth failure surfaces
+- **Rationale**: Lightweight key validation would require provider-specific API calls, adding latency and complexity
+- **Mitigation**: Auth failures are still classified correctly as `provider_auth_failed`
+
+### Provider Pattern Maintenance
+
+URL patterns must be manually kept in sync with provider changes:
+
+- **Risk**: A provider changes its base URL structure (e.g., API versioning)
+- **Mitigation**: Validation is non-blocking; mismatches are logged but don't prevent operation
+- **Future**: Consider moving the patterns to `tradingagents/llm_clients/factory.py` as part of `ProviderSpec`
+
+### Timeout Recommendations
+
+Recommendations are based on MiniMax profiling and may not generalize:
+
+- **Risk**: Faster providers (OpenAI GPT-4) may trigger unnecessary warnings
+- **Mitigation**: Warnings are advisory only; users can ignore them if they've profiled their setup
+- **Future**: Consider provider-specific timeout recommendations
+
+## 7. Related Documentation
+
+- `docs/contracts/result-contract-v1alpha1.md` - Signal contract structure
+- `docs/architecture/research-provenance.md` - Research degradation semantics
+- `docs/migration/rollback-notes.md` - Backend migration status
+- `orchestrator/contracts/error_taxonomy.py` - Complete reason code list
diff --git a/docs/migration/rollback-notes.md b/docs/migration/rollback-notes.md
index e973f24d..ffad0844 100644
--- a/docs/migration/rollback-notes.md
+++ b/docs/migration/rollback-notes.md
@@ -14,13 +14,19 @@ Mainline has moved beyond pure planning, but it has not finished the full bounda
 - result contracts are persisted via `result_store.py`;
 - `/ws/analysis/{task_id}` and `/ws/orchestrator` already wrap payloads with `contract_version`;
 - recommendation and task-status reads already depend on application-layer shaping more than route-local reconstruction.
-- `Phase 5` is **not complete**:
-  - `web_dashboard/backend/main.py` is still too large;
-  - route-local orchestration has not been fully deleted;
+- `Phase 5` is **partially landed** via the task lifecycle boundary slice:
+  - `status/list/cancel` now route through backend task services instead of route-local orchestration;
+  - `web_dashboard/backend/main.py` is still too large outside that slice;
+  - reports/export and other residual route-local orchestration are still pending;
 - compatibility fields still coexist with the newer contract-first path.
 
 Also note that research provenance / node guard / profiling work is now landed on the orchestrator side.
That effort complements the backend migration but should not be confused with “application boundary fully complete.” +**Recent improvements (2026-04-16)**: +- Orchestrator error classification now includes comprehensive provider × base_url matrix validation +- Timeout configuration validation warns when analyst/research timeouts may be insufficient for multi-analyst profiles +- All provider mismatches (anthropic, openai, google, xai, ollama, openrouter) are now detected before graph initialization + ## 1. Migration objective Move backend delivery code from route-local orchestration to an application-service layer without changing the quant+LLM merge kernel behavior. @@ -80,6 +86,7 @@ Rollback: Current status: - partially complete on mainline via `analysis_service.py`, `job_service.py`, and `result_store.py` +- task lifecycle (`status/list/cancel`) is now service-routed - not complete enough yet to claim `main.py` is only a thin adapter ## Phase 2: dual-read for task status diff --git a/orchestrator/contracts/error_taxonomy.py b/orchestrator/contracts/error_taxonomy.py index d6f1fc3d..d733bcfa 100644 --- a/orchestrator/contracts/error_taxonomy.py +++ b/orchestrator/contracts/error_taxonomy.py @@ -14,6 +14,7 @@ class ReasonCode(str, Enum): LLM_SIGNAL_FAILED = "llm_signal_failed" LLM_UNKNOWN_RATING = "llm_unknown_rating" PROVIDER_MISMATCH = "provider_mismatch" + PROVIDER_AUTH_FAILED = "provider_auth_failed" BOTH_SIGNALS_UNAVAILABLE = "both_signals_unavailable" diff --git a/orchestrator/examples/validation_examples.py b/orchestrator/examples/validation_examples.py new file mode 100644 index 00000000..ddd8f151 --- /dev/null +++ b/orchestrator/examples/validation_examples.py @@ -0,0 +1,150 @@ +#!/usr/bin/env python3 +""" +Orchestrator configuration validation examples. + +Demonstrates provider mismatch detection and timeout validation. 
+""" + +import logging +import sys +from pathlib import Path + +# Add parent directories to path +repo_root = Path(__file__).parent.parent.parent +sys.path.insert(0, str(repo_root)) + +from orchestrator.config import OrchestratorConfig +from orchestrator.llm_runner import LLMRunner + +logging.basicConfig(level=logging.WARNING, format='%(levelname)s: %(message)s') + + +def example_1_provider_mismatch(): + """Example 1: Provider mismatch detection.""" + print("=" * 60) + print("Example 1: Provider Mismatch Detection") + print("=" * 60) + + # Invalid: Google provider with OpenAI URL + cfg = OrchestratorConfig( + cache_dir="/tmp/orchestrator_validation_example", + trading_agents_config={ + "llm_provider": "google", + "backend_url": "https://api.openai.com/v1", + }, + ) + + runner = LLMRunner(cfg) + signal = runner.get_signal("AAPL", "2024-01-02") + + print(f"\nConfiguration:") + print(f" Provider: google") + print(f" Base URL: https://api.openai.com/v1") + print(f"\nResult:") + print(f" Degraded: {signal.degraded}") + print(f" Reason: {signal.reason_code}") + print(f" Message: {signal.metadata.get('error', 'N/A')}") + print(f" Expected patterns: {signal.metadata.get('data_quality', {}).get('expected_patterns', [])}") + print() + + +def example_2_valid_configuration(): + """Example 2: Valid configuration (no mismatch).""" + print("=" * 60) + print("Example 2: Valid Configuration") + print("=" * 60) + + # Valid: Anthropic provider with MiniMax Anthropic-compatible URL + cfg = OrchestratorConfig( + cache_dir="/tmp/orchestrator_validation_example", + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "selected_analysts": ["market"], + "analyst_node_timeout_secs": 75.0, + }, + ) + + runner = LLMRunner(cfg) + mismatch = runner._detect_provider_mismatch() + + print(f"\nConfiguration:") + print(f" Provider: anthropic") + print(f" Base URL: https://api.minimaxi.com/anthropic") + print(f" Selected analysts: 
['market']") + print(f" Analyst timeout: 75.0s") + print(f"\nResult:") + print(f" Mismatch detected: {mismatch is not None}") + if mismatch: + print(f" Details: {mismatch}") + else: + print(f" Status: Configuration is valid ✓") + print() + + +def example_3_timeout_warning(): + """Example 3: Timeout configuration warning.""" + print("=" * 60) + print("Example 3: Timeout Configuration Warning") + print("=" * 60) + + # Warning: 4 analysts with insufficient timeout + print("\nConfiguration:") + print(f" Provider: anthropic") + print(f" Base URL: https://api.minimaxi.com/anthropic") + print(f" Selected analysts: ['market', 'social', 'news', 'fundamentals']") + print(f" Analyst timeout: 75.0s (recommended: 120.0s)") + print(f"\nExpected warning:") + + cfg = OrchestratorConfig( + cache_dir="/tmp/orchestrator_validation_example", + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "selected_analysts": ["market", "social", "news", "fundamentals"], + "analyst_node_timeout_secs": 75.0, + }, + ) + + # Warning will be logged during initialization + runner = LLMRunner(cfg) + print() + + +def example_4_multiple_mismatches(): + """Example 4: Multiple provider mismatch scenarios.""" + print("=" * 60) + print("Example 4: Multiple Provider Mismatch Scenarios") + print("=" * 60) + + scenarios = [ + ("xai", "https://api.minimaxi.com/anthropic"), + ("ollama", "https://api.openai.com/v1"), + ("openrouter", "https://api.anthropic.com/v1"), + ] + + for provider, url in scenarios: + cfg = OrchestratorConfig( + cache_dir="/tmp/orchestrator_validation_example", + trading_agents_config={ + "llm_provider": provider, + "backend_url": url, + }, + ) + + runner = LLMRunner(cfg) + signal = runner.get_signal("AAPL", "2024-01-02") + + print(f"\n {provider} + {url}") + print(f" → Degraded: {signal.degraded}, Reason: {signal.reason_code}") + + +if __name__ == "__main__": + example_1_provider_mismatch() + example_2_valid_configuration() + 
example_3_timeout_warning() + example_4_multiple_mismatches() + + print("=" * 60) + print("All examples completed") + print("=" * 60) diff --git a/orchestrator/llm_runner.py b/orchestrator/llm_runner.py index c14f5ce2..53e165da 100644 --- a/orchestrator/llm_runner.py +++ b/orchestrator/llm_runner.py @@ -1,6 +1,7 @@ import json import logging import os +import re from datetime import datetime, timezone from orchestrator.config import OrchestratorConfig @@ -10,6 +11,31 @@ from tradingagents.agents.utils.agent_states import extract_research_provenance logger = logging.getLogger(__name__) +# Provider × base_url validation matrix +# Note: ollama/openrouter share openai's canonical provider but have different URL patterns +_PROVIDER_BASE_URL_PATTERNS = { + "anthropic": [r"api\.anthropic\.com", r"api\.minimaxi\.com/anthropic"], + "openai": [r"api\.openai\.com"], + "google": [r"generativelanguage\.googleapis\.com"], + "xai": [r"api\.x\.ai"], + "ollama": [r"localhost:\d+", r"127\.0\.0\.1:\d+", r"ollama"], + "openrouter": [r"openrouter\.ai"], +} + +# Precompile regex patterns for efficiency +_COMPILED_PATTERNS = { + provider: [re.compile(pattern) for pattern in patterns] + for provider, patterns in _PROVIDER_BASE_URL_PATTERNS.items() +} + +# Recommended timeout thresholds by analyst count +_RECOMMENDED_TIMEOUTS = { + 1: {"analyst": 75.0, "research": 30.0}, + 2: {"analyst": 90.0, "research": 45.0}, + 3: {"analyst": 105.0, "research": 60.0}, + 4: {"analyst": 120.0, "research": 75.0}, +} + def _build_data_quality(state: str, **details): payload = {"state": state} @@ -24,12 +50,53 @@ def _extract_research_metadata(final_state: dict | None) -> dict | None: return extract_research_provenance(debate_state) +def _looks_like_provider_auth_failure(exc: Exception) -> bool: + text = str(exc).lower() + markers = ( + "authentication_error", + "login fail", + "please carry the api secret key", + "invalid api key", + "unauthorized", + "error code: 401", + ) + return any(marker in text for 
marker in markers) + + class LLMRunner: def __init__(self, config: OrchestratorConfig): self._config = config self._graph = None # Lazy-initialized on first get_signal() call (requires API key) self.cache_dir = config.cache_dir os.makedirs(self.cache_dir, exist_ok=True) + self._validate_timeout_config() + + def _validate_timeout_config(self): + """Warn if timeout configuration may be insufficient for selected analysts.""" + trading_cfg = self._config.trading_agents_config or {} + selected_analysts = trading_cfg.get("selected_analysts", ["market", "social", "news", "fundamentals"]) + analyst_count = len(selected_analysts) if selected_analysts else 4 + + analyst_timeout = float(trading_cfg.get("analyst_node_timeout_secs", 75.0)) + research_timeout = float(trading_cfg.get("research_node_timeout_secs", 30.0)) + + # Get recommended thresholds (use max if analyst_count > 4) + recommended = _RECOMMENDED_TIMEOUTS.get(analyst_count, _RECOMMENDED_TIMEOUTS[4]) + + warnings = [] + if analyst_timeout < recommended["analyst"]: + warnings.append( + f"analyst_node_timeout_secs={analyst_timeout:.1f}s may be insufficient " + f"for {analyst_count} analyst(s) (recommended: {recommended['analyst']:.1f}s)" + ) + if research_timeout < recommended["research"]: + warnings.append( + f"research_node_timeout_secs={research_timeout:.1f}s may be insufficient " + f"for {analyst_count} analyst(s) (recommended: {recommended['research']:.1f}s)" + ) + + for warning in warnings: + logger.warning("LLMRunner: %s", warning) def _get_graph(self): """Lazy-initialize TradingAgentsGraph (heavy, requires API key at init time).""" @@ -43,42 +110,39 @@ class LLMRunner: return self._graph def _detect_provider_mismatch(self): + """Validate provider × base_url compatibility using pattern matrix. + + Uses the original provider name (not canonical) for validation since + ollama/openrouter share openai's canonical provider but have different URLs. 
+ """ trading_cfg = self._config.trading_agents_config or {} provider = str(trading_cfg.get("llm_provider", "")).lower() base_url = str(trading_cfg.get("backend_url", "") or "").lower() + if not provider or not base_url: return None - if provider == "anthropic" and "/anthropic" not in base_url: - return { - "provider": provider, - "backend_url": trading_cfg.get("backend_url"), - } - if provider in {"openai", "openrouter", "ollama", "xai"} and "/anthropic" in base_url: - return { - "provider": provider, - "backend_url": trading_cfg.get("backend_url"), - } - return None + + # Use original provider name for pattern matching (not canonical) + # This handles ollama/openrouter which share openai's canonical provider + compiled_patterns = _COMPILED_PATTERNS.get(provider, []) + if not compiled_patterns: + # No validation rules defined for this provider + return None + + for pattern in compiled_patterns: + if pattern.search(base_url): + return None # Match found, no mismatch + + # No pattern matched - return raw patterns for error message + return { + "provider": provider, + "backend_url": trading_cfg.get("backend_url"), + "expected_patterns": _PROVIDER_BASE_URL_PATTERNS[provider], + } def get_signal(self, ticker: str, date: str) -> Signal: """获取指定股票在指定日期的 LLM 信号,带缓存。""" - safe_ticker = ticker.replace("/", "_") # sanitize for filesystem (e.g. 
BRK/B) - cache_path = os.path.join(self.cache_dir, f"{safe_ticker}_{date}.json") - - if os.path.exists(cache_path): - logger.info("LLMRunner: cache hit for %s %s", ticker, date) - with open(cache_path, "r", encoding="utf-8") as f: - data = json.load(f) - # Use stored direction/confidence directly to avoid re-mapping drift - return Signal( - ticker=ticker, - direction=data["direction"], - confidence=data["confidence"], - source="llm", - timestamp=datetime.fromisoformat(data["timestamp"]), - metadata=data, - ) - + # Validate configuration first (lightweight, prevents returning stale cache on config errors) mismatch = self._detect_provider_mismatch() if mismatch is not None: return build_error_signal( @@ -94,6 +158,25 @@ class LLMRunner: }, ) + # Check cache after validation + safe_ticker = ticker.replace("/", "_") + cache_path = os.path.join(self.cache_dir, f"{safe_ticker}_{date}.json") + + try: + with open(cache_path, "r", encoding="utf-8") as f: + data = json.load(f) + logger.info("LLMRunner: cache hit for %s %s", ticker, date) + return Signal( + ticker=ticker, + direction=data["direction"], + confidence=data["confidence"], + source="llm", + timestamp=datetime.fromisoformat(data["timestamp"]), + metadata=data, + ) + except FileNotFoundError: + pass # Continue to LLM call + try: _final_state, processed_signal = self._get_graph().propagate(ticker, date) rating = processed_signal if isinstance(processed_signal, str) else str(processed_signal) @@ -118,6 +201,11 @@ class LLMRunner: "timestamp": now.isoformat(), "ticker": ticker, "date": date, + "decision_structured": ( + (_final_state or {}).get("final_trade_decision_structured") + if isinstance(_final_state, dict) + else None + ), "data_quality": data_quality, "research": research_metadata, "sample_quality": ( @@ -142,6 +230,16 @@ class LLMRunner: reason_code = ReasonCode.LLM_SIGNAL_FAILED.value if "Unsupported LLM provider" in str(e): reason_code = ReasonCode.PROVIDER_MISMATCH.value + elif 
_looks_like_provider_auth_failure(e): + reason_code = ReasonCode.PROVIDER_AUTH_FAILED.value + + # Map reason code to data quality state + state_map = { + ReasonCode.PROVIDER_MISMATCH.value: "provider_mismatch", + ReasonCode.PROVIDER_AUTH_FAILED.value: "provider_auth_failed", + } + state = state_map.get(reason_code, "unknown") + return build_error_signal( ticker=ticker, source="llm", @@ -149,7 +247,7 @@ class LLMRunner: message=str(e), metadata={ "data_quality": _build_data_quality( - "provider_mismatch" if reason_code == ReasonCode.PROVIDER_MISMATCH.value else "unknown", + state, provider=(self._config.trading_agents_config or {}).get("llm_provider"), backend_url=(self._config.trading_agents_config or {}).get("backend_url"), ), diff --git a/orchestrator/tests/test_llm_runner.py b/orchestrator/tests/test_llm_runner.py index 23ddedac..c5889657 100644 --- a/orchestrator/tests/test_llm_runner.py +++ b/orchestrator/tests/test_llm_runner.py @@ -1,4 +1,5 @@ """Tests for LLMRunner.""" +import logging import sys from types import ModuleType @@ -9,9 +10,34 @@ from orchestrator.contracts.error_taxonomy import ReasonCode from orchestrator.llm_runner import LLMRunner +def _clear_runtime_llm_env(monkeypatch): + for env_name in ( + "TRADINGAGENTS_LLM_PROVIDER", + "TRADINGAGENTS_BACKEND_URL", + "TRADINGAGENTS_MODEL", + "TRADINGAGENTS_DEEP_MODEL", + "TRADINGAGENTS_QUICK_MODEL", + "ANTHROPIC_BASE_URL", + "OPENAI_BASE_URL", + "ANTHROPIC_API_KEY", + "MINIMAX_API_KEY", + "OPENAI_API_KEY", + ): + monkeypatch.delenv(env_name, raising=False) + + @pytest.fixture -def runner(tmp_path): - cfg = OrchestratorConfig(cache_dir=str(tmp_path)) +def runner(tmp_path, monkeypatch): + _clear_runtime_llm_env(monkeypatch) + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "deep_think_llm": "MiniMax-M2.7-highspeed", + "quick_think_llm": "MiniMax-M2.7-highspeed", + }, + ) return 
LLMRunner(cfg) @@ -69,11 +95,20 @@ def test_get_graph_preserves_explicit_empty_selected_analysts(monkeypatch, tmp_p def test_get_signal_returns_reason_code_on_propagate_failure(monkeypatch, tmp_path): + _clear_runtime_llm_env(monkeypatch) class BrokenGraph: def propagate(self, ticker, date): raise RuntimeError("graph unavailable") - cfg = OrchestratorConfig(cache_dir=str(tmp_path)) + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "deep_think_llm": "MiniMax-M2.7-highspeed", + "quick_think_llm": "MiniMax-M2.7-highspeed", + }, + ) runner = LLMRunner(cfg) monkeypatch.setattr(runner, "_get_graph", lambda: BrokenGraph()) @@ -84,6 +119,34 @@ def test_get_signal_returns_reason_code_on_propagate_failure(monkeypatch, tmp_pa assert signal.metadata["error"] == "graph unavailable" +def test_get_signal_classifies_provider_auth_failure(monkeypatch, tmp_path): + _clear_runtime_llm_env(monkeypatch) + + class BrokenGraph: + def propagate(self, ticker, date): + raise RuntimeError( + "Error code: 401 - {'type': 'error', 'error': {'type': 'authentication_error', 'message': \"login fail: Please carry the API secret key in the Authorization field\"}}" + ) + + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "deep_think_llm": "MiniMax-M2.7-highspeed", + "quick_think_llm": "MiniMax-M2.7-highspeed", + }, + ) + runner = LLMRunner(cfg) + monkeypatch.setattr(runner, "_get_graph", lambda: BrokenGraph()) + + signal = runner.get_signal("AAPL", "2024-01-02") + + assert signal.degraded is True + assert signal.reason_code == ReasonCode.PROVIDER_AUTH_FAILED.value + assert signal.metadata["data_quality"]["state"] == "provider_auth_failed" + + def test_get_signal_returns_provider_mismatch_before_graph_init(tmp_path): cfg = OrchestratorConfig( cache_dir=str(tmp_path), 
@@ -102,6 +165,7 @@ def test_get_signal_returns_provider_mismatch_before_graph_init(tmp_path): def test_get_signal_persists_research_provenance_on_success(monkeypatch, tmp_path): + _clear_runtime_llm_env(monkeypatch) class SuccessfulGraph: def propagate(self, ticker, date): return { @@ -113,9 +177,22 @@ def test_get_signal_persists_research_provenance_on_success(monkeypatch, tmp_pat "covered_dimensions": ["market"], "manager_confidence": None, } + , + "final_trade_decision_structured": { + "rating": "BUY", + "hold_subtype": "N/A", + }, }, "BUY" - cfg = OrchestratorConfig(cache_dir=str(tmp_path)) + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "deep_think_llm": "MiniMax-M2.7-highspeed", + "quick_think_llm": "MiniMax-M2.7-highspeed", + }, + ) runner = LLMRunner(cfg) monkeypatch.setattr(runner, "_get_graph", lambda: SuccessfulGraph()) @@ -125,3 +202,128 @@ def test_get_signal_persists_research_provenance_on_success(monkeypatch, tmp_pat assert signal.metadata["research"]["research_status"] == "degraded" assert signal.metadata["sample_quality"] == "degraded_research" assert signal.metadata["data_quality"]["state"] == "research_degraded" + assert signal.metadata["decision_structured"]["rating"] == "BUY" + + +# Phase 2: Provider matrix validation tests +def test_detect_provider_mismatch_google_with_openai_url(tmp_path): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "google", + "backend_url": "https://api.openai.com/v1", + }, + ) + runner = LLMRunner(cfg) + signal = runner.get_signal("AAPL", "2024-01-02") + + assert signal.degraded is True + assert signal.reason_code == ReasonCode.PROVIDER_MISMATCH.value + + +def test_detect_provider_mismatch_xai_with_anthropic_url(tmp_path): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "xai", + "backend_url": 
"https://api.minimaxi.com/anthropic", + }, + ) + runner = LLMRunner(cfg) + signal = runner.get_signal("AAPL", "2024-01-02") + + assert signal.degraded is True + assert signal.reason_code == ReasonCode.PROVIDER_MISMATCH.value + + +def test_detect_provider_mismatch_ollama_with_openai_url(tmp_path): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "ollama", + "backend_url": "https://api.openai.com/v1", + }, + ) + runner = LLMRunner(cfg) + signal = runner.get_signal("AAPL", "2024-01-02") + + assert signal.degraded is True + assert signal.reason_code == ReasonCode.PROVIDER_MISMATCH.value + + +def test_detect_provider_mismatch_valid_anthropic_minimax(tmp_path): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + }, + ) + runner = LLMRunner(cfg) + mismatch = runner._detect_provider_mismatch() + + assert mismatch is None + + +def test_detect_provider_mismatch_valid_openai(tmp_path): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "openai", + "backend_url": "https://api.openai.com/v1", + }, + ) + runner = LLMRunner(cfg) + mismatch = runner._detect_provider_mismatch() + + assert mismatch is None + + +# Phase 3: Timeout configuration validation tests +def test_timeout_validation_warns_for_multiple_analysts_low_timeout(tmp_path, caplog): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "selected_analysts": ["market", "social", "news", "fundamentals"], + "analyst_node_timeout_secs": 75.0, + }, + ) + with caplog.at_level(logging.WARNING): + runner = LLMRunner(cfg) + + assert any("analyst_node_timeout_secs=75.0s may be insufficient" in record.message for record in caplog.records) + + +def test_timeout_validation_no_warn_for_single_analyst(tmp_path, 
caplog): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "selected_analysts": ["market"], + "analyst_node_timeout_secs": 75.0, + }, + ) + with caplog.at_level(logging.WARNING): + runner = LLMRunner(cfg) + + assert not any("may be insufficient" in record.message for record in caplog.records) + + +def test_timeout_validation_no_warn_for_sufficient_timeout(tmp_path, caplog): + cfg = OrchestratorConfig( + cache_dir=str(tmp_path), + trading_agents_config={ + "llm_provider": "anthropic", + "backend_url": "https://api.minimaxi.com/anthropic", + "selected_analysts": ["market", "social", "news", "fundamentals"], + "analyst_node_timeout_secs": 120.0, + "research_node_timeout_secs": 75.0, + }, + ) + with caplog.at_level(logging.WARNING): + runner = LLMRunner(cfg) + + assert not any("may be insufficient" in record.message for record in caplog.records)
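
For reference, the timeout sufficiency check these tests exercise can be reproduced standalone. A minimal sketch, with thresholds copied from `_RECOMMENDED_TIMEOUTS` in `orchestrator/llm_runner.py`; the helper name `timeout_warnings` is illustrative, not the real API:

```python
# Minimal sketch of the timeout sufficiency check; thresholds copied from
# _RECOMMENDED_TIMEOUTS in orchestrator/llm_runner.py. The helper name
# `timeout_warnings` is illustrative, not the shipped API.

_RECOMMENDED_TIMEOUTS = {
    1: {"analyst": 75.0, "research": 30.0},
    2: {"analyst": 90.0, "research": 45.0},
    3: {"analyst": 105.0, "research": 60.0},
    4: {"analyst": 120.0, "research": 75.0},
}


def timeout_warnings(analyst_count: int, analyst_timeout: float,
                     research_timeout: float) -> list[str]:
    """Return advisory warnings; an empty list means the config looks sufficient."""
    # Counts above 4 fall back to the 4-analyst thresholds, as in the runner.
    rec = _RECOMMENDED_TIMEOUTS.get(analyst_count, _RECOMMENDED_TIMEOUTS[4])
    warnings = []
    if analyst_timeout < rec["analyst"]:
        warnings.append(
            f"analyst_node_timeout_secs={analyst_timeout:.1f}s may be insufficient "
            f"for {analyst_count} analyst(s) (recommended: {rec['analyst']:.1f}s)"
        )
    if research_timeout < rec["research"]:
        warnings.append(
            f"research_node_timeout_secs={research_timeout:.1f}s may be insufficient "
            f"for {analyst_count} analyst(s) (recommended: {rec['research']:.1f}s)"
        )
    return warnings
```

For example, `timeout_warnings(4, 75.0, 75.0)` reproduces the single analyst-timeout warning asserted in `test_timeout_validation_warns_for_multiple_analysts_low_timeout`, while `timeout_warnings(1, 75.0, 30.0)` returns no warnings.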