- active.json: updated days_elapsed from hypothesis runner
- hypotheses.py: black formatting applied by pre-commit hook
- .gitignore: local additions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Uses the google-genai SDK with gemini-2.5-flash-lite, the same model the
discovery pipeline already uses, so no new secret is needed (GOOGLE_API_KEY).
Removed ANTHROPIC_API_KEY from hypothesis-runner.yml.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
days_elapsed counts entries in picks_log, so running on weekends would
inflate the counter with noise picks. Exit early on Saturday/Sunday.
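
A minimal sketch of the weekend guard, assuming the runner has a single
entry point. The names is_weekend and main are illustrative, not the
actual functions in the runner, and the local system date is assumed to
be the relevant calendar.

```python
import datetime as dt
from typing import Optional


def is_weekend(day: dt.date) -> bool:
    """Return True on Saturday (weekday 5) or Sunday (weekday 6)."""
    return day.weekday() >= 5


def main(today: Optional[dt.date] = None) -> int:
    """Hypothetical entry point: skip the run entirely on weekends."""
    today = today or dt.date.today()
    if is_weekend(today):
        # Exit before anything is appended to picks_log, so
        # days_elapsed is not incremented by a non-trading day.
        print(f"{today.isoformat()} is a weekend, skipping run")
        return 0
    # ... weekday work would go here ...
    return 0
```

Returning 0 on the skip path keeps a scheduled workflow green on weekends.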
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Before concluding a hypothesis, check if the scanner's source file
changed on main since created_at. If it did, the baseline picks in
performance_database.json reflect the updated code for the later part
of the experiment, which can confound the comparison.
When drift is detected, a warning is embedded in:
- the concluded .md doc (blockquote below Decision)
- the PR comment (blockquote in the conclusion body)
The programmatic decision is not overridden — the warning is purely
informational, allowing the reviewer to judge whether the result is
trustworthy.
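
One way the drift check could look, assuming created_at is an ISO
timestamp and the scanner path is known. commits_since and drift_warning
are hypothetical names, and the warning wording is illustrative only.

```python
import subprocess
from typing import List, Optional


def commits_since(path: str, since_iso: str, repo_dir: str = ".",
                  branch: str = "main") -> List[str]:
    """List commit hashes on `branch` that touched `path` after `since_iso`.

    Note: git's --since filters on committer date.
    """
    result = subprocess.run(
        ["git", "log", branch, f"--since={since_iso}", "--format=%H",
         "--", path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def drift_warning(path: str, commits: List[str]) -> Optional[str]:
    """Build a markdown blockquote warning, or None when nothing changed."""
    if not commits:
        return None
    short = ", ".join(c[:7] for c in commits)
    return (
        f"> **Warning:** `{path}` changed on main in {len(commits)} "
        f"commit(s) since the hypothesis was created ({short}). "
        "Baseline picks may partly reflect the updated code."
    )
```

The warning is returned as text only, so the caller can embed it without
touching the accept/reject decision.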
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
load_dotenv() in tradingagents/config.py searches the cwd for .env.
Worktrees in /tmp/ don't have one, so symlink the main repo's .env
into the worktree root before running discovery.
In CI, secrets are passed directly as env vars, so the symlink step is a no-op.
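
A rough sketch of the symlink step, assuming both paths are known to the
caller. link_env is a hypothetical helper name, not the one in the repo.

```python
from pathlib import Path


def link_env(main_repo: Path, worktree: Path) -> bool:
    """Symlink main_repo/.env into the worktree root.

    Returns True if a link was created, False if the step was a no-op.
    """
    source = main_repo / ".env"
    target = worktree / ".env"
    if not source.is_file():
        # No .env to link (for example in CI, where secrets arrive
        # as environment variables instead).
        return False
    if target.exists() or target.is_symlink():
        # Leave an existing file or link alone.
        return False
    target.symlink_to(source.resolve())
    return True
```

Linking rather than copying means later edits to the main repo's .env are
picked up by the worktree without another sync.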
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pending hypotheses queue by priority and promote when a slot opens,
rather than pausing a running experiment mid-streak.
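
A sketch of the promotion rule, under assumptions: each hypothesis is a
dict with "status" and "priority" keys, a larger priority number wins,
and max_active is a hypothetical slot limit. The real schema may differ.

```python
from typing import Dict, List


def promote_pending(hypotheses: List[Dict], max_active: int) -> List[Dict]:
    """Promote the highest-priority pending hypotheses into free slots.

    Running hypotheses are never paused; promotion only happens when
    fewer than max_active are running.
    """
    running = [h for h in hypotheses if h["status"] == "running"]
    # sorted() is stable, so equal priorities keep their queue order.
    pending = sorted(
        (h for h in hypotheses if h["status"] == "pending"),
        key=lambda h: -h["priority"],
    )
    free_slots = max(0, max_active - len(running))
    promoted = pending[:free_slots]
    for h in promoted:
        h["status"] = "running"
    return promoted
```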
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When ANTHROPIC_API_KEY is set, conclude_hypothesis now:
- Loads the scanner domain file for context
- Calls claude-haiku-4-5-20251001 for a 3–5 sentence interpretation
- Embeds the analysis in the concluded .md doc and PR comment
The LLM enriches the conclusion with sample-size caveats, market
context, and a follow-up hypothesis suggestion — without overriding
the programmatic accept/reject decision.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements compute_7d_return, compute_metrics, load_baseline_metrics,
and make_decision functions with full TDD coverage (11 tests passing).
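
For orientation, a sketch of what three of these functions might look
like. The signatures, the metric names, and the min_samples threshold are
all assumptions made for illustration, not the committed code.

```python
from typing import Dict, List


def compute_7d_return(entry_price: float, exit_price: float) -> float:
    """Percent return between entry and the price 7 days later."""
    return (exit_price - entry_price) * 100 / entry_price


def compute_metrics(returns: List[float]) -> Dict[str, float]:
    """Summarise returns as count, win rate, and average return."""
    n = len(returns)
    if n == 0:
        return {"count": 0, "win_rate": 0.0, "avg_return": 0.0}
    wins = sum(1 for r in returns if r > 0)
    return {"count": n, "win_rate": wins / n, "avg_return": sum(returns) / n}


def make_decision(hypothesis: Dict[str, float], baseline: Dict[str, float],
                  min_samples: int = 10) -> str:
    """Assumed rule: accept only if the hypothesis beats the baseline."""
    if hypothesis["count"] < min_samples:
        return "inconclusive"
    better_avg = hypothesis["avg_return"] > baseline["avg_return"]
    no_worse_wins = hypothesis["win_rate"] >= baseline["win_rate"]
    return "accept" if better_avg and no_worse_wins else "reject"
```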
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers 5 tasks: knowledge base structure, /iterate command,
/research-strategy command, and two GitHub Actions workflows with
rolling PR logic.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
At most one open PR per skill at any time. Daily runs push onto the
existing branch and update the PR description. Merging resets the cycle.
Prevents PR accumulation from unreviewed automated runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>