TradingAgents — Test Suite Reference
Last verified: 2026-03-24
Test counts (current): 793 passed · 14 skipped (unit+portfolio+cli default run) · 101 integration · 1 e2e
Table of Contents
- Overview
- Three-Tier Architecture
- Libraries and Tools
- Fixtures Reference
- Markers Reference
- Test File Catalogue
- Execution Flow Diagrams
- How to Run Tests
- Mock Patterns
- What is MagicMock?
- Adding New Tests — Checklist
Overview
The test suite enforces a strict network isolation policy: the default pytest run
(used in CI) cannot make any real socket connections. Tests that need live APIs are
placed in separate directories and are excluded from the default run via addopts in
pyproject.toml.
tests/
├── conftest.py ← root fixtures (shared across all tiers)
├── unit/ ← offline, <5 s total, default run
│ ├── conftest.py ← mock factories (yfinance, AV, LLM)
│ ├── agents/ ← agent-specific unit tests
│ └── test_*.py
├── integration/ ← live APIs, excluded from default run
│ ├── conftest.py ← VCR config + live key fixtures
│ └── test_*_live.py
├── e2e/ ← real LLM pipeline, manual only
│ ├── conftest.py
│ └── test_*.py
├── portfolio/ ← portfolio module unit tests (no DB)
│ ├── conftest.py
│ └── test_*.py
├── cli/ ← CLI module unit tests
│ └── test_*.py
└── cassettes/ ← recorded HTTP responses (VCR)
Three-Tier Architecture
| Tier | Directory | Default run? | Network? | Speed | Purpose |
|---|---|---|---|---|---|
| Unit | tests/unit/, tests/portfolio/, tests/cli/ | ✅ yes | ❌ blocked by pytest-socket | < 5 s | Validate logic, parsing, routing with mocks |
| Integration | tests/integration/ | ❌ ignored | ✅ real APIs | seconds–minutes | Validate vendor API contracts, live data shapes |
| E2E | tests/e2e/ | ❌ ignored | ✅ real LLM + APIs | minutes | Validate the full multi-agent pipeline |
Why three tiers?
- Fast feedback loop — developers get a pass/fail signal in under 5 seconds on every commit.
- No flaky CI — CI never fails due to API rate limits, network timeouts, or key rotation.
- Live API contract tests — integration tests confirm the real API shape hasn't drifted from mocks.
- Full pipeline validation — e2e tests confirm all agents wire together correctly end-to-end.
Libraries and Tools
pytest >=9.0.2
The test runner. Key configuration lives in pyproject.toml under
[tool.pytest.ini_options]:
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--ignore=tests/integration --ignore=tests/e2e --disable-socket --allow-unix-socket -x -q"
markers = [
"integration: tests that hit real external APIs",
"e2e: tests that hit real LLM APIs (manual trigger only)",
"vcr: tests that use VCR cassette recording",
"slow: tests that take a long time to run",
"paid_tier: tests requiring a paid Finnhub subscription",
]
Key flags explained:
| Flag | Effect |
|---|---|
| --ignore=tests/integration | Excludes the entire integration/ directory from the default run |
| --ignore=tests/e2e | Excludes the entire e2e/ directory from the default run |
| --disable-socket | Blocks TCP/UDP sockets created through Python's socket module; any such network call raises SocketBlockedError |
| --allow-unix-socket | Permits Unix domain socket connections (needed by some local processes) |
| -x | Stop at the first failure (fast feedback in CI) |
| -q | Quiet mode: minimal output |
pytest-socket >=0.7.0
Adds the --disable-socket and --allow-hosts CLI flags and the @pytest.mark.allow_hosts marker.
How it works:
At test startup it replaces socket.socket with a guarded subclass whose __new__
raises pytest_socket.SocketBlockedError (a RuntimeError subclass) whenever code
tries to create a TCP/UDP socket. Unix domain sockets are allowed through when
--allow-unix-socket is set.
Impact on the project:
- All unit tests run with sockets blocked — any accidental real API call immediately fails with a clear error message.
- The root conftest.py's av_api_key fixture catches RuntimeError so that @pytest.mark.integration tests that depend on it auto-skip rather than error when run in a socket-blocked context.
- yfinance uses curl_cffi (libcurl), which bypasses Python's socket module. This is why yfinance-backed tests must use mocks rather than relying on --disable-socket alone.
pytest-recording >=0.13.2 + vcrpy >=6.0.2
VCR.py records real HTTP responses to YAML "cassette" files, then replays them offline in subsequent runs.
Configuration (in tests/integration/conftest.py):
@pytest.fixture(scope="module")
def vcr_config():
    return {
        "cassette_library_dir": "tests/cassettes",
        "match_on": ["method", "scheme", "host", "port", "path"],
        "filter_headers": ["Authorization", "Cookie", "X-Api-Key"],
        "filter_query_parameters": ["apikey", "token"],
        "decode_compressed_response": True,
    }
Key settings:
| Setting | Value | Why |
|---|---|---|
| match_on | method, scheme, host, port, path | Ignores query string changes (e.g., different API keys), matches by URL shape |
| filter_headers | Auth headers | Strips secrets before writing to cassette files |
| filter_query_parameters | apikey, token | Strips API keys from recorded URLs |
| decode_compressed_response | True | Ensures gzip/brotli responses are stored as readable text |
Note: VCR.py cannot intercept curl_cffi (yfinance's HTTP backend). Therefore, cassettes are only used for requests-based vendors (Alpha Vantage, Finnhub). yfinance integration tests run live.
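To make the effect of match_on and filter_query_parameters concrete, here is a minimal standard-library sketch of the two ideas. It does not call VCR.py; scrub_url, match_key, and the api.example.com URLs are hypothetical and only approximate what the settings above are described as doing.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Mirrors the filter_query_parameters setting shown above.
FILTERED_PARAMS = {"apikey", "token"}


def scrub_url(url: str) -> str:
    """Drop secret-bearing query parameters, keeping the rest in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FILTERED_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def match_key(method: str, url: str) -> tuple:
    """Build a key from method, scheme, host, port, and path (no query string)."""
    parts = urlsplit(url)
    port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
    return (method.upper(), parts.scheme, parts.hostname, port, parts.path)


recorded = "https://api.example.com/query?function=OVERVIEW&symbol=AAPL&apikey=SECRET"
replayed = "https://api.example.com/query?function=OVERVIEW&symbol=AAPL&apikey=demo"

scrubbed = scrub_url(recorded)
same_request = match_key("GET", recorded) == match_key("get", replayed)
```

One consequence worth keeping in mind: under this matching rule, two requests that differ only in their query string produce the same key, so it is worth checking cassette contents when a single endpoint path serves many functions.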
unittest.mock (stdlib)
Python's built-in mocking library. The project uses four primitives heavily:
| Primitive | Use case |
|---|---|
| patch(target) | Temporarily replace a module-level name (e.g., requests.get) |
| patch.dict(os.environ, {...}) | Inject temporary env vars without touching the real environment |
| MagicMock() | Create a flexible mock object whose attributes and methods are created on first access |
| PropertyMock | Mock @property descriptors on classes (e.g., yf.Ticker.info) |
pandas / numpy (test helpers)
Used only inside test helpers to build realistic DataFrame fixtures that match yfinance's actual return shapes. No pandas assertions are made directly — output is always validated as a formatted string.
Fixtures Reference
Fixtures are defined at three levels; pytest resolves them from the innermost conftest outward.
Root: tests/conftest.py
Available to all tiers.
_set_alpha_vantage_demo_key (autouse)
@pytest.fixture(autouse=True)
def _set_alpha_vantage_demo_key(monkeypatch):
    ...
- Scope: function (default)
- Effect: Sets ALPHA_VANTAGE_API_KEY=demo in the test environment if the variable is not already present.
- Why autouse: Prevents tests from accidentally hitting Alpha Vantage with a real key or failing because the key is missing. Every test runs with a known safe value.
av_api_key
@pytest.fixture
def av_api_key():
    ...
- Scope: function
- Effect: Returns the Alpha Vantage API key ("demo" by default). If the Alpha Vantage endpoint is unreachable (network blocked, CI sandbox, or pytest-socket active), the test is automatically skipped.
- Why: Allows the same integration test file to run both in development (live) and in CI (skipped gracefully) without any test code changes.
- Catches: socket.error, OSError, RuntimeError (covers SocketBlockedError).
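The reachability guard described above could look roughly like the following. This is a sketch under assumptions, not the project's fixture: endpoint_reachable and the host name are hypothetical, and the real fixture would call pytest.skip() where this function returns False.

```python
import socket
from unittest.mock import patch


def endpoint_reachable(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection can be opened, False otherwise.

    socket.error is an alias of OSError, so catching OSError covers both.
    RuntimeError is caught as well because a socket-blocking plugin may
    raise a RuntimeError subclass instead of an OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except (OSError, RuntimeError):
        return False


# Simulate a socket-blocked environment without touching the network.
with patch("socket.create_connection", side_effect=RuntimeError("socket blocked (simulated)")):
    reachable_when_blocked = endpoint_reachable("example.invalid")

# Simulate an ordinary network failure.
with patch("socket.create_connection", side_effect=OSError("network unreachable (simulated)")):
    reachable_when_offline = endpoint_reachable("example.invalid")
```

A fixture built on this helper would return the key when the check succeeds and skip the test otherwise, which keeps the skip decision out of the test bodies.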
av_config
@pytest.fixture
def av_config():
    ...
- Scope: function
- Effect: Returns a copy of DEFAULT_CONFIG with scanner_data vendor overridden to "alpha_vantage".
- Why: Tests that want to exercise the Alpha Vantage scanner code path without touching the real config.
Unit tier: tests/unit/conftest.py
Available only within tests/unit/.
mock_yf_screener
@pytest.fixture
def mock_yf_screener():
    # Returns a factory: _make(quotes) → {"quotes": quotes}
    ...
- Scope: function
- Effect: Factory that builds a minimal yfinance screener response dict.
- Why: yfinance's Screener object is hard to instantiate offline; this factory lets tests inject arbitrary screener data.
mock_yf_download
@pytest.fixture
def mock_yf_download():
    # Returns a factory: _make(symbols, periods, base_price) → MultiIndex DataFrame
    ...
- Scope: function
- Effect: Factory that builds a MultiIndex Close DataFrame matching yfinance's download() output shape.
- Why: Tests for functions that process downloaded price data need a realistic DataFrame; this factory provides one without any network calls.
mock_av_request
@pytest.fixture
def mock_av_request():
    # Returns a factory: _make(responses: dict) → fake _rate_limited_request
    ...
- Scope: function
- Effect: Factory that builds a drop-in replacement for alpha_vantage_common._rate_limited_request. The responses dict maps function_name → return_value. Supports both plain values and callables (for dynamic responses).
- Why: Lets unit tests exercise AV parsing code without any HTTP calls or rate-limit logic.
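A factory of this kind might be sketched as follows. The call signature (function_name, params) is an assumption, since the real _rate_limited_request signature is not shown here, and make_fake_av_request is a hypothetical name.

```python
import json


def make_fake_av_request(responses: dict):
    """Build a stand-in request function backed by canned responses.

    Each value in `responses` is either a plain return value or a callable
    that receives the request params and returns a value.
    """

    def fake_request(function_name, params=None, **kwargs):
        # Assumed signature; adjust to match the function being replaced.
        if function_name not in responses:
            raise KeyError(f"no canned response for {function_name!r}")
        value = responses[function_name]
        return value(params) if callable(value) else value

    return fake_request


fake = make_fake_av_request(
    {
        "TOP_GAINERS_LOSERS": '{"top_gainers": []}',
        "GLOBAL_QUOTE": lambda params: json.dumps({"symbol": params["symbol"]}),
    }
)

static_result = fake("TOP_GAINERS_LOSERS")
dynamic_result = fake("GLOBAL_QUOTE", {"symbol": "IBM"})
```

Raising on an unknown function name, rather than returning a default, makes a test fail loudly when the code under test requests something the test did not anticipate.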
mock_llm
@pytest.fixture
def mock_llm():
    # Returns a factory: _make(content) → MagicMock LLM
    ...
- Scope: function
- Effect: Factory that builds a MagicMock that implements .invoke() and .ainvoke() returning a canned content string.
- Why: Agent tests need an LLM object but must not make real API calls.
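One way such an LLM factory could be written is shown below. It is a sketch only: make_mock_llm is a hypothetical name, and a SimpleNamespace stands in for the message object so the example has no third-party dependencies.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


def make_mock_llm(content: str) -> MagicMock:
    """Build a fake LLM whose sync and async calls return the same canned reply."""
    reply = SimpleNamespace(content=content)  # stand-in for a real message object
    llm = MagicMock()
    llm.invoke.return_value = reply
    # ainvoke must be awaitable, so it needs AsyncMock rather than MagicMock.
    llm.ainvoke = AsyncMock(return_value=reply)
    return llm


llm = make_mock_llm("BUY")
sync_reply = llm.invoke("prompt")
async_reply = asyncio.run(llm.ainvoke("prompt"))
```

Both call paths are recorded, so a test can still assert on the prompt with llm.invoke.assert_called_once_with("prompt").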
Integration tier: tests/integration/conftest.py
Available only within tests/integration/.
vcr_config (module-scoped)
@pytest.fixture(scope="module")
def vcr_config():
    return { "cassette_library_dir": "tests/cassettes", ... }
- Scope: module (shared across all tests in a module)
- Effect: Provides VCR.py configuration — cassette directory, match rules, secret filtering.
- Why module-scoped: Cassette config is the same for all tests in a file; no need to recreate per-test.
av_api_key (integration override)
@pytest.fixture
def av_api_key():
    return os.environ.get("ALPHA_VANTAGE_API_KEY", "demo")
- Scope: function
- Effect: Returns the API key directly without a reachability check. Integration tests assume the network is available.
- Why override: Integration tests are only run when the developer explicitly requests them (pytest tests/integration/), so a reachability guard is unnecessary.
E2E tier: tests/e2e/conftest.py
pytest_collection_modifyitems hook
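from pathlib import Path
def pytest_collection_modifyitems(config, items):
    e2e_dir = Path(__file__).parent
    for item in items:
        # The hook sees every collected item, so tag only those under tests/e2e/.
        if e2e_dir in item.path.parents:
            item.add_marker(pytest.mark.e2e)
            item.add_marker(pytest.mark.slow)
- Effect: Automatically tags every test in tests/e2e/ with both @pytest.mark.e2e and @pytest.mark.slow, so no manual decoration is needed.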
Markers Reference
| Marker | Applied by | Meaning | Tests using it |
|---|---|---|---|
| integration | @pytest.mark.integration on class/function | Test hits a real external API | tests/unit/test_alpha_vantage_scanner.py, tests/integration/*.py, some tests/unit/test_*.py integration classes |
| e2e | e2e conftest hook (auto-applied) | Test runs real LLM pipeline | all of tests/e2e/ |
| slow | e2e conftest hook (auto-applied) | Test takes >30 s | all of tests/e2e/ |
| vcr | @pytest.mark.vcr on function | Test replays VCR cassette | (available, not yet widely used) |
| paid_tier | @pytest.mark.paid_tier | Requires paid Finnhub subscription | tests/integration/test_finnhub_live.py |
| skip | @pytest.mark.skip | Unconditionally skipped | paid-tier Finnhub tests |
| skipif | @pytest.mark.skipif(not KEY, ...) | Conditionally skipped | tests/integration/test_finnhub_live.py, tests/integration/test_nlm_live.py |
Test File Catalogue
Unit tests (tests/unit/)
| File | # Tests (approx.) | What it covers | Key mocks used |
|---|---|---|---|
| test_alpha_vantage_exceptions.py | 7 | AV exception hierarchy + error-handling branches | requests.get (side_effect) |
| test_alpha_vantage_integration.py | ~36 | AV data layer: stock, fundamentals, news, indicators | requests.get (mock response) |
| test_alpha_vantage_scanner.py | 10 (skipped) | AV scanner: gainers, losers, indices, sectors, news | Real API (auto-skipped via av_api_key) |
| test_config_wiring.py | 15 | AgentState fields, new tool exports, config defaults | Import-only |
| test_debate_rounds.py | 17 | ConditionalLogic: debate and risk routing thresholds | None (pure logic) |
| test_e2e_api_integration.py | 19 | route_to_vendor + full yfinance+AV pipeline | yf.Ticker, requests.get |
| test_env_override.py | 15 | TRADINGAGENTS_* env vars override DEFAULT_CONFIG | importlib.reload, patch.dict |
| test_finnhub_integration.py | ~100 | Finnhub data layer: all endpoints, exception types | requests.get (mock response) |
| test_finnhub_scanner_utils.py | 10 | _safe_fmt and other Finnhub scanner utility functions | None (pure logic) |
| test_incident_fixes.py | 15 | _load_or_fetch_ohlcv cache, YFinanceError propagation | yf.Ticker, tmp_path |
| test_industry_deep_dive.py | 12 | _extract_top_sectors() + run_tool_loop nudge | MagicMock LLM, ToolMessage |
| test_json_utils.py | 15 | extract_json: fences, think-tags, malformed input | None (pure logic) |
| test_langgraph_engine_extraction.py | 14 | _map_langgraph_event: event type mapping, metadata extraction | MagicMock events |
| test_langgraph_engine_run_modes.py | 28 | run_scan/pipeline/portfolio/auto: phase coordination, skip logic | MagicMock store, AsyncMock graph |
| test_macro_bridge.py | ~12 | Macro JSON parsing, filtering, report rendering | tmp_path |
| test_macro_regime.py | ~32 | VIX signals, credit spread, breadth, regime classifier | pd.Series, patch (yfinance) |
| test_notebook_sync.py | 5 | sync_to_notebooklm subprocess flow | subprocess.run |
| test_peer_comparison.py | ~18 | Sector peers, relative performance, comparison report | yf.Ticker, yf.Sector |
| test_portfolio_tools.py | 23 | portfolio_tools.py: in-memory data, no DB | MagicMock repo |
| test_scanner_fallback.py | 2 | AV scanner raises on total failure | _fetch_global_quote side_effect |
| test_scanner_graph.py | 4 | ScannerGraph + ScannerGraphSetup compile correctly | ScannerGraph._create_llm |
| test_scanner_mocked.py | ~57 | yfinance + AV scanner functions, route_to_vendor routing | yf.Screener, requests.get |
| test_security_notebook_sync.py | 3 | Shell injection guard in sync_to_notebooklm | None |
| test_stockstats_utils.py | ~20 | get_stockstats_indicator + stockstats formatting | yf.download mock |
| test_ttm_analysis.py | ~21 | TTM metric computation, report formatting | yf.Ticker (quarterly data) |
| test_vendor_failfast.py | 11 | Fail-fast routing (ADR 011), error chaining | requests.get, MagicMock |
| test_yfinance_integration.py | ~48 | yfinance data layer: OHLCV, fundamentals, news | yf.Ticker, yf.Search |
| agents/test_analyst_agents.py | ~15 | Analyst agent node wiring + prompt formatting | MagicMock LLM |
Portfolio tests (tests/portfolio/)
These tests cover the portfolio management module with in-memory data only — no database connection is required.
| File | # Tests | What it covers | Key mocks used |
|---|---|---|---|
| test_candidate_prioritizer.py | 10 | Candidate scoring + ranking algorithm | None (pure logic) |
| test_config.py | 8 | Portfolio config defaults + env var overrides | patch.dict |
| test_models.py | 23 | Portfolio, Holding, Trade, PortfolioSnapshot dataclasses | None (pure logic) |
| test_report_store.py | 23 | ReportStore save/load cycle, JSON serialization | tmp_path |
| test_repository.py | 16 | PortfolioRepository: CRUD operations | MagicMock Supabase client |
| test_risk_evaluator.py | 28 | Risk constraint checks: position limits, sector limits, cash floor | None (pure logic) |
| test_risk_metrics.py | 48 | Volatility, Sharpe, max drawdown, beta metrics | pd.Series |
| test_trade_executor.py | 10 | TradeExecutor: sell/buy ordering, constraint pre-flight, snapshot | MagicMock repo |
CLI tests (tests/cli/)
| File | # Tests | What it covers | Key mocks used |
|---|---|---|---|
| test_main.py | ~8 | extract_content_string, CLI argument parsing | patch (LLM), tmp_path |
| test_stats_handler.py | ~6 | StatsCallbackHandler: token counting, threading safety | MagicMock LLM callbacks |
Integration tests (tests/integration/)
| File | # Tests | What it covers | Requires |
|---|---|---|---|
| test_alpha_vantage_live.py | 3 | Live AV _make_api_request: key errors, timeout, success | Network |
| test_finnhub_live.py | ~41 | All Finnhub free-tier + paid-tier endpoints (live HTTP) | FINNHUB_API_KEY |
| test_finviz_live.py | 27 | All three Finviz smart-money screener tools (live HTTP) | Network; auto-skips if finvizfinance not installed |
| test_nlm_live.py | 1 | NotebookLM source CRUD via nlm CLI | NOTEBOOKLM_ID + nlm binary |
| test_scanner_live.py | ~23 | yfinance scanner tools + AV routing (live yfinance + AV) | Network; ALPHA_VANTAGE_API_KEY for AV tests |
| test_stockstats_live.py | ~6 | get_stockstats_indicator against live yfinance data | Network |
E2E tests (tests/e2e/)
| File | # Tests | What it covers | Requires |
|---|---|---|---|
| test_llm_e2e.py | 1 | Full run_scan() pipeline: file output validation | LLM API key + network |
Execution Flow Diagrams
Default pytest run (CI / development)
flowchart TD
A([pytest invoked]) --> B{addopts applied}
B --> C["--ignore=tests/integration\n--ignore=tests/e2e"]
B --> D["--disable-socket\n--allow-unix-socket"]
B --> E["-x -q"]
C --> F["Collect tests/unit/**, tests/portfolio/**, tests/cli/**"]
D --> G[pytest-socket patches socket.socket.__new__]
F --> H{For each test}
H --> I{Needs network?}
I -- "yes (real call)" --> J["SocketBlockedError raised\n→ test FAILS immediately"]
I -- "no (mocked)" --> K[Test runs offline]
K --> L{Uses av_api_key fixture?}
L -- yes --> M["root conftest tries socket.connect()\n→ catches RuntimeError\n→ pytest.skip()"]
L -- no --> N[Run test body]
M --> O([Test SKIPPED])
N --> P{assertions pass?}
P -- yes --> Q([Test PASSED])
P -- no --> R([Test FAILED])
E --> S["-x: stop at first failure"]
Integration test run (pytest tests/integration/)
flowchart TD
A([pytest tests/integration/]) --> B[No --disable-socket\nNetwork allowed]
B --> C{FINNHUB_API_KEY set?}
C -- no --> D["pytestmark skipif\n→ entire test_finnhub_live.py SKIPPED"]
C -- yes --> E[Finnhub live tests run]
B --> F{ALPHA_VANTAGE_API_KEY set?}
F -- no --> G["av_api_key returns 'demo'\n(limited data)"]
F -- yes --> H["av_api_key returns real key"]
G & H --> I[AV live tests run]
B --> J{NOTEBOOKLM_ID + nlm CLI set?}
J -- no --> K["pytest.mark.skipif\n→ test_nlm_live.py SKIPPED"]
J -- yes --> L[NotebookLM live test runs]
B --> M[Scanner live tests run\nagainst real yfinance API]
B --> N{finvizfinance installed?}
N -- no --> O["pytestmark skipif\n→ test_finviz_live.py SKIPPED"]
N -- yes --> P[Finviz screener tests run\nno API key needed]
E & I & K & L & M & O & P --> Q([Results reported])
Mock data flow (unit test)
flowchart LR
subgraph "Test body"
T[Test function]
end
subgraph "Mocks / patches"
P1["patch('requests.get')\nreturns mock response"]
P2["patch('yf.Ticker')\nreturns MagicMock"]
P3["patch.dict(os.environ)\ninjects API key"]
end
subgraph "Code under test"
F1["AV data function\nalpha_vantage_stock.get_stock()"]
F2["yfinance data function\ny_finance.get_YFin_data_online()"]
F3["route_to_vendor()\nvendor router"]
end
T --> P1 --> F1 --> R1[Formatted string result]
T --> P2 --> F2 --> R2[Formatted string result]
T --> P3 --> F3 --> R1
T --> P3 --> F3 --> R2
R1 --> A1{assert isinstance result str}
R2 --> A2{assert 'AAPL' in result}
pytest-socket protection flow
flowchart TD
A[Test starts] --> B["pytest-socket active\n(--disable-socket)"]
B --> C{Test tries to\nopen a socket?}
C -- "No socket call" --> D[Test continues normally]
C -- "Python socket.socket()" --> E["SocketBlockedError raised\n(RuntimeError subclass)"]
C -- "curl_cffi / libcurl\n(yfinance)" --> F["⚠️ Bypasses pytest-socket!\nMust use mock instead"]
E --> G{Is test using\nav_api_key fixture?}
G -- yes --> H["RuntimeError caught in fixture\n→ pytest.skip()"]
G -- no --> I[Test FAILS]
F --> J["Use patch('yf.Ticker') to mock\nbefore calling yfinance code"]
VCR cassette lifecycle (integration)
flowchart TD
A["pytest tests/integration/ --record-mode=new_episodes"] --> B{Cassette file\nexists?}
B -- no --> C[Make real HTTP request to API]
C --> D[Write response to\ntests/cassettes/<name>.yaml]
D --> E[Test asserts on response]
B -- yes --> F[Load response from cassette]
F --> E
E --> G{Test passes?}
G -- yes --> H([✅ Pass])
G -- no --> I([❌ Fail])
A2["pytest tests/integration/\n(default — no --record-mode)"] --> F
How to Run Tests
Quick reference
| What to run | Command | Expected result |
|---|---|---|
| All unit tests (CI-safe) | pytest | 793 passed, 14 skipped, < 5 s |
| Unit tests only | pytest tests/unit/ | ~600 passed |
| Portfolio tests only | pytest tests/portfolio/ | ~166 passed |
| CLI tests only | pytest tests/cli/ | ~14 passed |
| All integration tests | pytest tests/integration/ -v | varies (network required) |
| Finviz integration | pytest tests/integration/test_finviz_live.py -v | 27 tests, network required |
| Finnhub integration | FINNHUB_API_KEY=<key> pytest tests/integration/test_finnhub_live.py -v | requires key |
| E2E pipeline | pytest tests/e2e/ -v | requires LLM key + network |
Install dependencies first
pip install -e .
pip install pytest pytest-socket
Default (unit + portfolio + cli, CI-safe)
The default pytest invocation is controlled by addopts in pyproject.toml:
addopts = "--ignore=tests/integration --ignore=tests/e2e --disable-socket --allow-unix-socket -x -q"
Run it with:
# Simplest — uses addopts automatically
pytest
# Equivalent explicit form
pytest tests/ --ignore=tests/integration --ignore=tests/e2e \
--disable-socket --allow-unix-socket -x -q
Expected output:
793 passed, 14 skipped in 4.9s
What -x does: stops at the first failing test.
What -q does: minimal output (dots + summary).
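What --disable-socket does: blocks network calls made through Python's socket module, so an accidental HTTP call via requests fails with SocketBlockedError. Clients that bypass the socket module (yfinance's curl_cffi) are not caught and must be mocked.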
To see all failures instead of stopping at the first:
pytest tests/ --ignore=tests/integration --ignore=tests/e2e \
--disable-socket --allow-unix-socket -q
# (omit -x)
Run a single test file
# Unit file
pytest tests/unit/test_langgraph_engine_run_modes.py -v \
--disable-socket --allow-unix-socket
# Portfolio file
pytest tests/portfolio/test_trade_executor.py -v \
--disable-socket --allow-unix-socket
# CLI file
pytest tests/cli/test_stats_handler.py -v \
--disable-socket --allow-unix-socket
Run a single test class or test function
# Single class
pytest tests/unit/test_langgraph_engine_run_modes.py::TestRunAutoTickerSource -v \
--disable-socket --allow-unix-socket
# Single test function
pytest tests/unit/test_langgraph_engine_run_modes.py::TestRunAutoTickerSource::test_run_auto_gets_tickers_from_scan_report -v \
--disable-socket --allow-unix-socket
Integration tests (requires network)
Integration tests make real HTTP calls to external APIs. They are excluded from the default run but can be run explicitly at any time.
# All integration tests (everything in tests/integration/)
pytest tests/integration/ -v
# Finviz smart-money screener tests (no API key needed — free public screener)
pytest tests/integration/test_finviz_live.py -v
# Alpha Vantage live tests ('demo' key works for basic calls)
pytest tests/integration/test_alpha_vantage_live.py -v
# Finnhub live tests (free-tier endpoints only)
FINNHUB_API_KEY=your_key pytest tests/integration/test_finnhub_live.py -v \
-m "integration and not paid_tier"
# Finnhub — all tests including paid-tier (requires premium subscription)
FINNHUB_API_KEY=your_key pytest tests/integration/test_finnhub_live.py -v
# Live scanner tests (hits real yfinance + Alpha Vantage APIs)
pytest tests/integration/test_scanner_live.py -v
# Live stockstats tests
pytest tests/integration/test_stockstats_live.py -v
# NotebookLM tests (requires NOTEBOOKLM_ID env var and nlm CLI binary)
NOTEBOOKLM_ID=your_id pytest tests/integration/test_nlm_live.py -v
Tip: When running integration tests from a network-blocked environment (e.g., CI without outbound access), tests guarded by a reachability check auto-skip rather than fail: they detect the blocked socket and call pytest.skip().
E2E tests (requires LLM API key + network, manual only)
# Requires a real LLM API key (e.g., OPENAI_API_KEY) and network access.
# Takes several minutes to complete.
pytest tests/e2e/ -v
Run only tests with a specific marker
# All integration-marked tests across the entire test suite
pytest tests/ --override-ini="addopts=" -m integration -v
# All tests except slow ones
pytest tests/ --override-ini="addopts=" -m "not slow" \
--disable-socket --allow-unix-socket
# Only paid_tier Finnhub tests (requires premium key)
FINNHUB_API_KEY=your_key pytest tests/integration/ -m paid_tier -v
--override-ini="addopts=" clears the default addopts from pyproject.toml so you can pass your own flags instead.
Re-record VCR cassettes
# Record only new requests (existing cassettes kept)
pytest tests/integration/ --record-mode=new_episodes
# Re-record everything from scratch
pytest tests/integration/ --record-mode=all
Mock Patterns
Pattern 1 — Mock requests.get for Alpha Vantage / Finnhub
Used in: test_alpha_vantage_integration.py, test_finnhub_integration.py,
test_scanner_mocked.py, test_vendor_failfast.py
import json
from unittest.mock import patch, MagicMock

def _mock_response(payload, status_code=200):
    resp = MagicMock()
    resp.status_code = status_code
    resp.text = json.dumps(payload) if isinstance(payload, dict) else payload
    resp.json.return_value = payload if isinstance(payload, dict) else {}
    resp.raise_for_status = MagicMock()
    return resp

def test_something():
    with patch("tradingagents.dataflows.alpha_vantage_common.requests.get",
               return_value=_mock_response({"Symbol": "AAPL"})):
        result = get_fundamentals("AAPL")
    assert "AAPL" in result
Pattern 2 — Mock yf.Ticker for yfinance
Used in: test_yfinance_integration.py, test_e2e_api_integration.py,
test_scanner_mocked.py, test_peer_comparison.py
import pandas as pd
from unittest.mock import patch, MagicMock, PropertyMock

def _make_ohlcv():
    idx = pd.date_range("2024-01-02", periods=3, freq="B", tz="America/New_York")
    return pd.DataFrame(
        {"Open": [150.0, 151.0, 152.0], "Close": [152.0, 153.0, 154.0],
         "High": [155.0, 156.0, 157.0], "Low": [148.0, 149.0, 150.0],
         "Volume": [1_000_000] * 3},
        index=idx,
    )

def test_something():
    mock_ticker = MagicMock()
    mock_ticker.history.return_value = _make_ohlcv()
    # For .info (a property):
    type(mock_ticker).info = PropertyMock(return_value={"longName": "Apple Inc."})
    with patch("tradingagents.dataflows.y_finance.yf.Ticker", return_value=mock_ticker):
        result = get_YFin_data_online("AAPL", "2024-01-02", "2024-01-05")
    assert "AAPL" in result
Pattern 3 — Mock requests.get for error branches
Used in: test_alpha_vantage_exceptions.py, test_vendor_failfast.py
import pytest
import requests as _requests
from unittest.mock import patch

def test_timeout_raises_correct_exception():
    with patch(
        "tradingagents.dataflows.alpha_vantage_common.requests.get",
        side_effect=_requests.exceptions.Timeout("simulated timeout"),
    ):
        with pytest.raises(ThirdPartyTimeoutError):
            _make_api_request("TIME_SERIES_DAILY", {"symbol": "IBM"})
Pattern 4 — Reload config module to test env var overrides
Used in: test_env_override.py
import importlib
import os
from unittest.mock import patch

class TestEnvOverrides:
    def _reload_config(self):
        import tradingagents.default_config as mod
        importlib.reload(mod)
        return mod.DEFAULT_CONFIG

    def test_llm_provider_override(self):
        with patch.dict(os.environ, {"TRADINGAGENTS_LLM_PROVIDER": "anthropic"}):
            cfg = self._reload_config()
        assert cfg["llm_provider"] == "anthropic"
Why importlib.reload? DEFAULT_CONFIG is built at module import time. To test different env var values, the module must be re-evaluated. The _reload_config helper also patches dotenv.load_dotenv to prevent .env files from interfering with isolated env patches (that patch is omitted from the snippet above).
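The import-time behaviour can be demonstrated on a throwaway module. The sketch below writes a hypothetical demo_default_config.py to a temporary directory and reads a made-up DEMO_LLM_PROVIDER variable; neither name belongs to the project.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path
from unittest.mock import patch

# Hypothetical config module: the dict is built once, at import time.
MODULE_SOURCE = (
    "import os\n"
    "DEFAULT_CONFIG = {\n"
    "    'llm_provider': os.environ.get('DEMO_LLM_PROVIDER', 'openai'),\n"
    "}\n"
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_default_config.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        # First import with the variable absent: the default is captured.
        with patch.dict(os.environ):
            os.environ.pop("DEMO_LLM_PROVIDER", None)
            import demo_default_config as mod
            before = mod.DEFAULT_CONFIG["llm_provider"]

        with patch.dict(os.environ, {"DEMO_LLM_PROVIDER": "anthropic"}):
            # Changing the environment alone does not touch the existing dict.
            stale = mod.DEFAULT_CONFIG["llm_provider"]
            # Reloading re-executes the module body and rebuilds the dict.
            importlib.reload(mod)
            fresh = mod.DEFAULT_CONFIG["llm_provider"]
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_default_config", None)
```

The stale value is the reason a plain patch.dict is not enough on its own for config that is computed at import time.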
Pattern 5 — Mock LLM for agent / tool-loop tests
Used in: test_industry_deep_dive.py
from unittest.mock import MagicMock
from langchain_core.messages import AIMessage

def _make_llm(content: str):
    msg = AIMessage(content=content, tool_calls=[])
    llm = MagicMock()
    llm.invoke.return_value = msg
    return llm
Pattern 6 — Local-file fixtures with autouse
Used in: tests/unit/test_finnhub_integration.py
import pytest

@pytest.fixture(autouse=True)
def set_fake_api_key(monkeypatch):
    """Inject a dummy API key so every test bypasses the missing-key guard."""
    monkeypatch.setenv("FINNHUB_API_KEY", "test_key")
monkeypatch is a built-in pytest fixture. autouse=True makes it apply
automatically to every test in the file without explicit declaration.
What is MagicMock?
MagicMock is the workhorse of Python's unittest.mock library — the
standard tool for replacing real objects with controllable stand-ins during
tests. Understanding it is essential for reading and writing tests in this
project.
The core idea: a pretend object
When you write:
from unittest.mock import MagicMock
repo = MagicMock()
repo is now an object that:
- accepts any attribute access (repo.some_attr → another MagicMock)
- accepts any method call (repo.get_portfolio() → another MagicMock)
- records everything so you can ask later: "was this called? with what?"
No database, no HTTP, no file system — just a pretend object you fully control.
Setting return values
By default every method returns a new MagicMock. You can override this:
repo = MagicMock()
repo.get_portfolio.return_value = {"portfolio_id": "p1", "cash": 50_000.0}
result = repo.get_portfolio("p1") # returns the dict you set, not a MagicMock
print(result["cash"]) # 50000.0
This is critical. If you forget return_value=..., the method returns a
MagicMock, which is truthy. Code that does if repo.load_data(...):
will take the "data exists" branch even when it shouldn't.
The truthy problem (and why it caused the mock trade bug)
This project has guard clauses like:
if not force and store.load_execution_result(date, portfolio_id):
    # skip: already done
    return
If store = MagicMock() and load_execution_result is never given a
return_value, the call returns a new MagicMock, which is truthy.
The entire Phase 3 (portfolio execution / trades) gets skipped even
though no trades ever ran.
The fix: always set return_value=None for methods that are supposed to
return "nothing found":
mock_store = MagicMock()
mock_store.load_scan.return_value = scan_data # some data
mock_store.load_execution_result.return_value = None # ← explicitly "not found"
mock_store.load_pm_decision.return_value = None # ← explicitly "not found"
mock_store.load_analysis.return_value = None # ← explicitly "not found"
This is the pattern used in TestRunAutoTickerSource._make_mock_store().
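The difference between an unpinned and a pinned mock is easy to reproduce in isolation. In this sketch, run_phase is a hypothetical stand-in for the guard clause above, not the project's function.

```python
from unittest.mock import MagicMock


def run_phase(store, force=False):
    """Hypothetical phase runner using the same guard shape as above."""
    if not force and store.load_execution_result("2026-03-24", "p1"):
        return "skipped"
    return "executed"


# No return_value set: the call returns a MagicMock, which is truthy.
unpinned = MagicMock()
outcome_unpinned = run_phase(unpinned)

# Explicit "not found": the guard falls through and the phase runs.
pinned = MagicMock()
pinned.load_execution_result.return_value = None
outcome_pinned = run_phase(pinned)
```

Both mocks record the call, so a test that only asserts load_execution_result was called would pass in either case; asserting on the outcome is what exposes the difference.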
Checking calls (assertions)
repo = MagicMock()
repo.add_holding("p1", "AAPL", 10, 150.0)
# Was it called?
repo.add_holding.assert_called_once()
# Was it called with these exact arguments?
repo.add_holding.assert_called_once_with("p1", "AAPL", 10, 150.0)
# How many times?
assert repo.add_holding.call_count == 1
# What args did the last call use?
args, kwargs = repo.add_holding.call_args
assert args[1] == "AAPL"
Raising exceptions from mocks
repo.add_holding.side_effect = InsufficientCashError("Not enough cash")
# Now calling add_holding() will raise the error instead of returning a value
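side_effect can also be a callable, or an iterable of values/exceptions that are
returned (or raised) in order on successive calls. The iterable is not recycled:
once it is exhausted, the next call raises StopIteration.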
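A short sketch of the list and callable forms of side_effect, using only unittest.mock. The fetch and double mocks and their payloads are illustrative names, not project code.

```python
from unittest.mock import MagicMock

# Iterable form: items are consumed in order; exception instances are raised.
fetch = MagicMock()
fetch.side_effect = [TimeoutError("first call fails"), {"price": 101.5}]

try:
    fetch("AAPL")
    first = "no error"
except TimeoutError:
    first = "timeout"

second = fetch("AAPL")

# The list is now exhausted; a further call raises StopIteration.
try:
    fetch("AAPL")
    third = "value"
except StopIteration:
    third = "exhausted"

# Callable form: the function's return value becomes the mock's return value.
double = MagicMock(side_effect=lambda x: x * 2)
doubled = double(21)
```

The iterable form is handy for retry logic, where the first attempt should fail and a later one succeed.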
AsyncMock — for async def functions
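LangGraph uses async for and await extensively. Use AsyncMock for methods that
are awaited. For methods consumed with async for, assign an async generator
function instead: calling an AsyncMock returns a coroutine, which async for
cannot iterate.
from unittest.mock import AsyncMock, MagicMock
mock_graph = MagicMock()
# Awaited method (ainvoke is used here as an illustrative name):
mock_graph.ainvoke = AsyncMock(return_value={})
# Streamed method consumed with async for: use an async generator
async def fake_stream(*args, **kwargs):
    yield {"event": "on_chain_end", "data": {"output": {}}}
mock_graph.astream_events = fake_stream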
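To check that an async-generator stand-in behaves as expected under async for, it can be driven with asyncio.run. This is a self-contained sketch; the event payloads and the ainvoke attribute are illustrative assumptions.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def fake_stream(*args, **kwargs):
    # An async generator yields events one at a time, like a streaming API.
    yield {"event": "on_chain_start"}
    yield {"event": "on_chain_end"}


mock_graph = MagicMock()
mock_graph.astream_events = fake_stream
mock_graph.ainvoke = AsyncMock(return_value={"status": "ok"})  # awaited method


async def consume():
    # async for works because calling fake_stream returns an async iterator.
    events = [e["event"] async for e in mock_graph.astream_events({})]
    # await works because calling an AsyncMock returns a coroutine.
    result = await mock_graph.ainvoke({})
    return events, result


events, result = asyncio.run(consume())
```

Assigning a plain function means calls to astream_events are not recorded; if call assertions are needed, wrapping it as MagicMock(side_effect=fake_stream) keeps both behaviours.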
patch() — replacing real objects temporarily
MagicMock creates the stand-in; patch() swaps it in for the duration
of a with block (or test function):
from unittest.mock import patch, MagicMock

def test_run_portfolio_skips_when_already_done():
    mock_store = MagicMock()
    mock_store.load_execution_result.return_value = {"summary": "done"}
    with patch("agent_os.backend.services.langgraph_engine.ReportStore",
               return_value=mock_store):
        # Inside this block, calling ReportStore(...) in langgraph_engine returns mock_store
        ...
The target string must be the import path where the name is used, not
where it is defined. If langgraph_engine.py does
from tradingagents.portfolio.report_store import ReportStore, you patch
agent_os.backend.services.langgraph_engine.ReportStore.
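The "patch where it is used" rule can be verified with two throwaway modules built in memory. The module names demo_report_store and demo_engine are hypothetical, and exec is used only to keep the demonstration in a single file.

```python
import sys
import types
from unittest.mock import patch

# Module that defines the class.
store_mod = types.ModuleType("demo_report_store")
exec(
    "class ReportStore:\n"
    "    def label(self):\n"
    "        return 'real'\n",
    store_mod.__dict__,
)
sys.modules["demo_report_store"] = store_mod

# Module that imports the name and uses it.
engine_mod = types.ModuleType("demo_engine")
sys.modules["demo_engine"] = engine_mod
exec(
    "from demo_report_store import ReportStore\n"
    "def run():\n"
    "    return ReportStore().label()\n",
    engine_mod.__dict__,
)

# Patching the defining module leaves demo_engine's own binding untouched.
with patch("demo_report_store.ReportStore") as fake:
    fake.return_value.label.return_value = "mock"
    patched_at_definition = engine_mod.run()

# Patching the name where it is looked up takes effect.
with patch("demo_engine.ReportStore") as fake:
    fake.return_value.label.return_value = "mock"
    patched_at_use = engine_mod.run()

after_patch = engine_mod.run()

# Clean up the throwaway modules.
for name in ("demo_report_store", "demo_engine"):
    sys.modules.pop(name, None)
```

The first patch silently does nothing for the engine, which is why a mis-targeted patch tends to surface as a test that unexpectedly touches the real object.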
PropertyMock — for @property attributes
from unittest.mock import MagicMock, PropertyMock
mock_ticker = MagicMock()
type(mock_ticker).info = PropertyMock(return_value={"longName": "Apple Inc."})
# mock_ticker.info now returns the dict (not a MagicMock)
Quick reference table
| Tool | Use case |
|---|---|
| MagicMock() | Stand-in for any object (repo, client, graph) |
| mock.method.return_value = x | Make a method return x |
| mock.method.side_effect = exc | Make a method raise exc |
| mock.method.assert_called_once_with(...) | Assert exact call args |
| AsyncMock() | Stand-in for async def functions / methods |
| patch("module.Name") | Swap a real class/function with a mock temporarily |
| patch.dict(os.environ, {...}) | Inject env vars for the duration of a test |
| PropertyMock | Mock @property descriptors |
Adding New Tests — Checklist
When adding a test to this project, choose the right tier and follow the corresponding checklist.
Unit test (default tier — 95% of cases)
- File goes in tests/unit/test_<module>.py
- No real network calls. All HTTP must be mocked with patch.
- yfinance: use patch("...yf.Ticker", ...) and never call yfinance directly.
- AV / Finnhub: use patch("...requests.get", return_value=_mock_response(...)).
- Use monkeypatch.setenv or patch.dict(os.environ, ...) for env var tests.
- When mocking a store/repo, always pin load_* methods to None to avoid truthy MagicMock accidentally triggering "already done" skip branches.
- Do NOT use @pytest.mark.integration: that signals the test is being tracked for future migration, not that it's already mocked.
- Run pytest tests/unit/ -x --disable-socket --allow-unix-socket to confirm the test passes offline.
Portfolio module test (no DB needed)
- File goes in tests/portfolio/test_<component>.py.
- Use MagicMock() for the Supabase client / PortfolioRepository if testing higher-level logic. Do NOT use a real DB connection.
- Run pytest tests/portfolio/ -x --disable-socket --allow-unix-socket.
Integration test (live API needed)
- File goes in tests/integration/test_<vendor>_live.py.
- Class or function decorated with @pytest.mark.integration.
- Use the av_api_key fixture (or a similar guard) to auto-skip when the API is unavailable.
- For APIs with no key (e.g., Finviz): use pytest.mark.skipif(not lib_available, ...) so the file is skipped if the library is not installed.
- For Finnhub paid-tier endpoints: add both @pytest.mark.paid_tier and @pytest.mark.skip so they are documented but never run accidentally.
- Do NOT add the file path to addopts's --ignore list; it is already covered by --ignore=tests/integration.
E2E test (full pipeline)
- File goes in tests/e2e/test_<feature>_e2e.py.
- The conftest auto-applies @pytest.mark.e2e and @pytest.mark.slow.
- Mock only filesystem paths and CLI prompts, not LLM or data APIs.
- Document required env vars in the module docstring.