# ChromaDB Collection Tests - Test Suite Documentation ## Purpose This test suite addresses **Issue #30**: Fix ChromaDB collection [bull_memory] already exists error. The issue occurs when `FinancialSituationMemory` is instantiated multiple times with the same collection name. The current implementation uses `create_collection()`, which raises an error if the collection already exists. ## Solution Change from `create_collection()` to `get_or_create_collection()` in `tradingagents/agents/utils/memory.py` line 29. ## TDD Status: RED Phase ✓ Tests written FIRST before implementation (Test-Driven Development). All new tests currently FAIL as expected. ### Test Results (Before Implementation) ``` FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_memory_uses_get_or_create_collection FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_idempotent_collection_creation FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_existing_data_preserved_on_reinitialization FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_multiple_collections_coexist FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_collection_creation_without_openrouter FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_empty_collection_name_handled FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_collection_creation_with_ollama FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_add_situations_to_reinitialized_collection FAILED tests/test_openrouter.py::TestChromaDBCollectionHandling::test_get_memories_from_reinitialized_collection 9 failed, 1 passed in 1.85s ``` ### Existing Tests: All Passing ✓ ``` 30 passed in 2.05s ``` ## Test Suite Overview ### Test Class: `TestChromaDBCollectionHandling` Location: `/Users/andrewkaszubski/Dev/TradingAgents/tests/test_openrouter.py` ### Unit Tests (10 tests) #### 1. `test_memory_uses_get_or_create_collection` **Purpose**: Primary test - verifies `get_or_create_collection()` is called instead of `create_collection()` **Test Pattern**: - Arrange: Mock ChromaDB client with both methods - Act: Create `FinancialSituationMemory` instance - Assert: - `get_or_create_collection()` called once with correct name - `create_collection()` NOT called **Expected Failure**: ``` AssertionError: Expected 'get_or_create_collection' to be called once. Called 0 times. ``` #### 2. `test_idempotent_collection_creation` **Purpose**: Test creating same collection twice does not raise error **Test Pattern**: - Arrange: Same collection name used twice - Act: Create two `FinancialSituationMemory` instances with same name - Assert: - Both instances created successfully - `get_or_create_collection()` called twice - Both calls use same collection name **Expected Failure**: ``` AssertionError: assert 0 == 2 (call_count mismatch) ``` #### 3. `test_existing_data_preserved_on_reinitialization` **Purpose**: Verify existing collection data is not lost when re-initializing **Test Pattern**: - Arrange: Mock collection with 5 existing entries - Act: Create memory instance, simulate data exists, create second instance - Assert: Second instance sees existing data (count == 5) **Expected Failure**: ``` AssertionError: assert 0 == 5 (collection.count() returns 0 instead of 5) ``` **Edge Case Coverage**: Data preservation, reinitialization behavior #### 4. `test_multiple_collections_coexist` **Purpose**: Verify different collection names can be created independently **Test Pattern**: - Arrange: Three different collection names - Act: Create memory instances for "bull_memory", "bear_memory", "neutral_memory" - Assert: - All instances created successfully - `get_or_create_collection()` called 3 times - Each call uses correct name **Expected Failure**: ``` assert 0 == 3 (len([]) - no collections created) ``` **Edge Case Coverage**: Multiple collections, naming isolation #### 5. `test_collection_creation_without_openrouter` **Purpose**: Verify fix works with non-OpenRouter providers (OpenAI, Anthropic, etc.) **Test Pattern**: - Arrange: OpenAI provider config - Act: Create memory instance - Assert: `get_or_create_collection()` still used **Expected Failure**: ``` AssertionError: Expected 'get_or_create_collection' to be called once. Called 0 times. ``` **Integration Coverage**: Cross-provider compatibility #### 6. `test_collection_name_with_special_characters` **Purpose**: Test collection names with underscores, hyphens, dots, uppercase **Test Pattern**: - Arrange: List of names with special characters - Act: Create memory for each name - Assert: All instances created successfully **Status**: PASSING (doesn't check method name, only creation success) **Edge Case Coverage**: Naming conventions, special characters #### 7. `test_empty_collection_name_handled` **Purpose**: Verify behavior with empty string as collection name **Test Pattern**: - Arrange: Empty string collection name - Act: Create memory instance - Assert: `get_or_create_collection()` called with empty string **Expected Failure**: ``` AssertionError: Expected 'get_or_create_collection' to be called once. Called 0 times. ``` **Edge Case Coverage**: Boundary conditions, invalid input #### 8. `test_collection_creation_with_ollama` **Purpose**: Verify fix works with Ollama local provider **Test Pattern**: - Arrange: Ollama provider config with local backend - Act: Create memory instance - Assert: `get_or_create_collection()` used **Expected Failure**: ``` AssertionError: Expected 'get_or_create_collection' to be called once. Called 0 times. ``` **Integration Coverage**: Ollama provider compatibility #### 9. `test_add_situations_to_reinitialized_collection` **Purpose**: Test adding data to collection that already has entries **Test Pattern**: - Arrange: Mock collection with 3 existing items - Act: Create memory, add 2 new situations - Assert: - IDs start at offset 3 (existing count) - New IDs are ["3", "4"] **Expected Failure**: ``` AssertionError: Expected 'add' to have been called once. Called 0 times. ``` **Integration Coverage**: Offset calculation, data append behavior #### 10. `test_get_memories_from_reinitialized_collection` **Purpose**: Test querying memories from re-initialized collection **Test Pattern**: - Arrange: Mock collection with existing data - Act: Create new instance, query memories - Assert: Existing data retrieved successfully **Expected Failure**: ``` assert 0 == 1 (len(results) - no results returned) ``` **Integration Coverage**: Query behavior, data retrieval ## Mock Updates ### Updated Fixture: `mock_chromadb` ```python @pytest.fixture def mock_chromadb(): """Mock ChromaDB client.""" with patch("tradingagents.agents.utils.memory.chromadb.Client") as mock: client_instance = Mock() collection_instance = Mock() collection_instance.count.return_value = 0 # Mock both create_collection (old) and get_or_create_collection (new) client_instance.create_collection.return_value = collection_instance client_instance.get_or_create_collection.return_value = collection_instance mock.return_value = client_instance yield mock ``` **Changes**: - Added `get_or_create_collection` mock alongside existing `create_collection` - Both return same collection instance for backward compatibility - Supports transition from old to new method ### Updated Test Reference Changed one existing test to use new mock: ```python # Before collection_mock = mock_chromadb.return_value.create_collection.return_value # After collection_mock = mock_chromadb.return_value.get_or_create_collection.return_value ``` ## Test Coverage ### Code Coverage Targets - **Target**: 80%+ coverage - **Focus Areas**: 1. Collection creation path (100%) 2. Idempotent behavior (100%) 3. Data preservation (100%) 4. Error handling (edge cases) ### Test Categories #### Unit Tests (10 tests) - Collection creation method verification - Idempotent creation behavior - Multiple collection handling - Provider compatibility #### Integration Tests (3 tests) - Data preservation on reinitialization - Adding situations to existing collection - Querying from re-initialized collection #### Edge Cases (4 tests) - Special characters in names - Empty collection name - Multiple collections coexisting - Cross-provider behavior ## Running Tests ### Run All ChromaDB Collection Tests ```bash python -m pytest tests/test_openrouter.py::TestChromaDBCollectionHandling --tb=line -q ``` ### Run Specific Test ```bash python -m pytest tests/test_openrouter.py::TestChromaDBCollectionHandling::test_memory_uses_get_or_create_collection -v ``` ### Run All Tests Except ChromaDB ```bash python -m pytest tests/test_openrouter.py -k "not TestChromaDBCollectionHandling" --tb=line -q ``` ### Verify Existing Tests Still Pass ```bash python -m pytest tests/test_openrouter.py --tb=line -q -k "not TestChromaDBCollectionHandling" # Expected: 30 passed ``` ## Next Steps (Implementation Phase) 1. **GREEN Phase**: Implement fix in `tradingagents/agents/utils/memory.py` ```python # Line 29 - Change from: self.situation_collection = self.chroma_client.create_collection(name=name) # To: self.situation_collection = self.chroma_client.get_or_create_collection(name=name) ``` 2. **Verify Tests Pass**: All 10 new tests should pass after implementation ```bash python -m pytest tests/test_openrouter.py::TestChromaDBCollectionHandling --tb=line -q # Expected: 10 passed ``` 3. **REFACTOR Phase**: Clean up if needed (likely not needed for this simple fix) 4. **Full Regression**: Run all tests to ensure no breaking changes ```bash python -m pytest tests/test_openrouter.py --tb=line -q # Expected: 40 passed (30 existing + 10 new) ``` ## Test Quality Metrics ### Test Structure: Arrange-Act-Assert ✓ All tests follow AAA pattern for clarity and maintainability. ### Test Isolation ✓ Each test is independent with proper mocking and fixtures. ### Clear Naming ✓ Test names describe behavior: `test___` ### Comprehensive Coverage ✓ - Unit tests: Method call verification - Integration tests: End-to-end workflows - Edge cases: Boundary conditions and special inputs ### Minimal Verbosity ✓ Using `pytest --tb=line -q` for concise output (prevents subprocess pipe deadlock per Issue #90) ## Issue References - **Primary Issue**: #30 - Fix ChromaDB collection [bull_memory] already exists error - **Related Issue**: #90 - pytest subprocess pipe deadlock (addressed by minimal verbosity) ## File Locations - **Test File**: `/Users/andrewkaszubski/Dev/TradingAgents/tests/test_openrouter.py` - **Implementation File**: `/Users/andrewkaszubski/Dev/TradingAgents/tradingagents/agents/utils/memory.py` (line 29) - **Documentation**: `/Users/andrewkaszubski/Dev/TradingAgents/tests/CHROMADB_COLLECTION_TESTS.md` ## Summary 10 comprehensive tests written FIRST (TDD red phase): - 9 tests currently FAILING (expected - no implementation yet) - 1 test PASSING (edge case that doesn't check method name) - 30 existing tests PASSING (backward compatibility verified) Tests cover: - Primary fix verification - Idempotent behavior - Data preservation - Multiple collections - Cross-provider compatibility - Edge cases (special chars, empty names, Ollama) - Integration scenarios (add/query after reinitialization) Ready for GREEN phase implementation.