The memory_index FT schema only materializes embedding, scope,
categories, created_at, and importance. ValkeyStorage._vector_search
was emitting metadata_filter clauses as @{key}:{value} against that
index, but metadata keys are user-defined and not part of the schema,
so FT.SEARCH would either return an error or silently return
results that the metadata predicate had not narrowed.
Move metadata_filter to a Python post-filter that runs after the
FT.SEARCH results are parsed. Overfetch from KNN (limit * 10, capped at 1000) when
metadata_filter is supplied so the post-filter still returns the
caller-requested number of hits in the common case, then truncate to
limit. Scope/categories predicates remain pushed down into FT.SEARCH
because they are valid index fields.
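
A minimal sketch of the overfetch-then-post-filter flow described above. The function names, the dict-shaped hits, and the exact-equality matching rule are assumptions made for illustration and are not taken from ValkeyStorage; only the limit * 10 factor and the 1000 cap come from this message.

```python
from __future__ import annotations

from typing import Any

OVERFETCH_FACTOR = 10  # from the commit message: limit * 10
OVERFETCH_CAP = 1000   # from the commit message: capped at 1000


def knn_fetch_size(limit: int, metadata_filter: dict[str, Any] | None) -> int:
    """Return how many KNN hits to request from FT.SEARCH."""
    if not metadata_filter:
        return limit
    # Assumption: never request fewer than the caller asked for.
    return max(limit, min(limit * OVERFETCH_FACTOR, OVERFETCH_CAP))


def matches_metadata_filter(metadata: dict[str, Any], metadata_filter: dict[str, Any]) -> bool:
    """Assumed rule: every filter key must be present with an equal value."""
    return all(k in metadata and metadata[k] == v for k, v in metadata_filter.items())


def post_filter(hits: list[dict[str, Any]], metadata_filter: dict[str, Any] | None, limit: int) -> list[dict[str, Any]]:
    """Apply the metadata predicate in Python, then truncate to limit."""
    if metadata_filter:
        hits = [h for h in hits if matches_metadata_filter(h.get("metadata", {}), metadata_filter)]
    return hits[:limit]
```

With a selective filter the overfetched window can still hold fewer than limit matches, which is why the message promises the requested count only "in the common case".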
Add TestValkeyStorageMetadataPostFilter and
TestValkeyStorageMatchesMetadataFilter test classes; update three
pre-existing tests that asserted the broken contract.
Closes #5794
Co-Authored-By: João <joao@crewai.com>
fix: async-safe embeddings and resilient drain_writes
Add bytes→float validators on MemoryRecord and ItemState to handle
Valkey returning embeddings as raw bytes. Make embed_texts() safe when
called from an async context by using a thread pool. Improve
drain_writes() with per-save timeouts and error logging instead of
raising on failure.
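
A rough sketch of the kind of bytes-to-float coercion such a validator might perform. It assumes the embedding is stored as packed little-endian float32 values, which this message does not state; decode_embedding is a hypothetical helper name, not the validator added in this change.

```python
import struct
from typing import Any


def decode_embedding(value: Any) -> Any:
    """Convert a packed float32 byte string into a list of floats.

    Non-bytes values (for example an existing list of floats) pass through
    unchanged so the helper can sit in front of normal validation.
    """
    if isinstance(value, (bytes, bytearray, memoryview)):
        raw = bytes(value)
        if len(raw) % 4 != 0:
            raise ValueError("embedding byte length is not a multiple of 4")
        count = len(raw) // 4
        # Assumption: little-endian float32, 4 bytes per component.
        return list(struct.unpack(f"<{count}f", raw))
    return value
```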
Part 3/4 of Valkey storage implementation.
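
One way to make a synchronous entry point safe to call from async code is to hand the work to a worker thread when an event loop is already running. The sketch below illustrates that pattern only; _embed_async and embed_texts_safe are hypothetical stand-ins, and the real embed_texts() may be structured differently.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def _embed_async(texts: list[str]) -> list[list[float]]:
    """Hypothetical async embedder; returns a fake one-dimensional vector."""
    await asyncio.sleep(0)
    return [[float(len(t))] for t in texts]


def embed_texts_safe(texts: list[str]) -> list[list[float]]:
    """Run the async embedder from sync code, with or without a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: asyncio.run() is safe to call directly.
        return asyncio.run(_embed_async(texts))
    # A loop is already running here, so asyncio.run() would raise.
    # Run the coroutine on its own loop in a worker thread instead.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(_embed_async(texts))).result()
```

Note that the calling thread still blocks on .result(), so this avoids the "event loop is already running" error rather than making the call non-blocking.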
feat(valkey): ValkeyStorage vector memory backend
Add ValkeyStorage, a distributed StorageBackend implementation using
Valkey-GLIDE with Valkey Search for vector similarity. Wire it into
Memory as the 'valkey' storage option. Pin scrapegraph-py<2 to fix
unrelated upstream breakage.
Part 4/4 of Valkey storage implementation.
fix: use datetime.utcnow() for last_accessed consistency
MemoryRecord defaults use utcnow() for created_at and last_accessed.
Match that in ValkeyStorage.update_record() to avoid timezone
inconsistency in recency scoring.
feat(valkey): shared cache config + ValkeyCache for A2A and file uploads
Extract duplicated Redis URL parsing into a shared cache_config utility.
Introduce ValkeyCache as a lightweight async key/value cache using
valkey-glide. Wire it into A2A task handling, agent card caching, and
file upload caching.
Part 1/4 of Valkey storage implementation.
fix: handle non-numeric database path in cache URL parsing
Extract _parse_db_from_path() helper that catches ValueError for
paths like /mydb and defaults to 0 with a warning, instead of
crashing.
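
A minimal sketch of the fallback behaviour described above, assuming the database index is the single path segment of the cache URL. The function name mirrors the helper but drops the leading underscore to signal that this is an illustration, not the code in cache_config.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


def parse_db_from_path(path: str) -> int:
    """Return the database index from a URL path such as "/2", defaulting to 0."""
    segment = path.lstrip("/")
    if not segment:
        return 0
    try:
        return int(segment)
    except ValueError:
        # Paths like "/mydb" are not numeric: warn and fall back to database 0.
        logger.warning("Non-numeric database path %r; defaulting to 0", path)
        return 0


# Usage with a parsed URL (the URL itself is just an example value).
db = parse_db_from_path(urlparse("redis://localhost:6379/2").path)
```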
fix: catch concurrent.futures.TimeoutError for Python 3.10 compat
In Python <3.11, concurrent.futures.TimeoutError is distinct from the
builtin TimeoutError. Catch both so the timeout warning path works
on all supported Python versions.
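
A small sketch of the compatibility pattern, assuming each pending save is a concurrent.futures.Future. wait_for_save is a hypothetical helper, not the actual drain_writes() code.

```python
import concurrent.futures
import logging

logger = logging.getLogger(__name__)


def wait_for_save(future: concurrent.futures.Future, timeout: float) -> bool:
    """Wait for one save; log and return False on timeout or error."""
    try:
        future.result(timeout=timeout)
        return True
    except (TimeoutError, concurrent.futures.TimeoutError):
        # On Python 3.11+ both names refer to the same class; on 3.10 the
        # concurrent.futures one is a separate exception type.
        logger.warning("Save did not finish within %.2fs", timeout)
        return False
    except Exception:
        logger.exception("Save failed")
        return False
```

Listing both exception classes in one tuple is harmless on newer interpreters, where they are the same class.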