Astra DB
DataStax Astra DB is a serverless vector-capable database built on Cassandra and made conveniently available through an easy-to-use JSON API.
AstraDBStore and AstraDBByteStore need the astrapy package to be installed:
%pip install --upgrade --quiet astrapy
The Store takes the following parameters:
- api_endpoint: Astra DB API endpoint. Looks like https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
- token: Astra DB token. Looks like AstraCS:6gBhNmsk135....
- collection_name: Astra DB collection name
- namespace: (Optional) Astra DB namespace
AstraDBStore
The AstraDBStore is an implementation of BaseStore that stores everything in your DataStax Astra DB instance.
The store keys must be strings and will be mapped to the _id field of the Astra DB document.
The store values can be any object that can be serialized by json.dumps.
In the database, entries will have the form:
{
  "_id": "<key>",
  "value": <value>
}
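To make the mapping concrete, here is a minimal sketch of how a key/value pair would translate into a document of that shape. The helper to_document is hypothetical and written only for illustration; it is not part of AstraDBStore or astrapy, and the store's actual internals may differ.

```python
import json
from typing import Any


def to_document(key: str, value: Any) -> dict:
    """Hypothetical helper: build a document of the form shown above."""
    if not isinstance(key, str):
        raise TypeError("store keys must be strings")
    # json.dumps raises TypeError if the value is not JSON-serializable.
    json.dumps(value)
    return {"_id": key, "value": value}


doc = to_document("k2", [0.1, 0.2, 0.3])
print(json.dumps(doc))
```

A value such as raw bytes would fail the json.dumps check, which is the case AstraDBByteStore is meant for.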
from langchain_community.storage import AstraDBStore
from getpass import getpass
ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass("ASTRA_DB_APPLICATION_TOKEN = ")
store = AstraDBStore(
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
    collection_name="my_store",
)
store.mset([("k1", "v1"), ("k2", [0.1, 0.2, 0.3])])
print(store.mget(["k1", "k2"]))
['v1', [0.1, 0.2, 0.3]]
Usage with CacheBackedEmbeddings
You can use AstraDBStore together with CacheBackedEmbeddings to cache the results of embedding computations.
Note that AstraDBStore stores the embeddings as lists of floats, without first converting them to bytes, so CacheBackedEmbeddings.from_bytes_store is not used here; the store is passed directly to the constructor.
from langchain.embeddings import CacheBackedEmbeddings
from langchain_openai import OpenAIEmbeddings
embeddings = CacheBackedEmbeddings(
    underlying_embeddings=OpenAIEmbeddings(),
    document_embedding_store=store,
)
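The caching pattern itself can be illustrated without any network access. The following is a conceptual sketch only: cached_embed and toy_embed are hypothetical names, a plain dict stands in for the store, and the texts are used directly as keys. It is not how CacheBackedEmbeddings is implemented; it only shows why a repeated text does not trigger a second embedding computation.

```python
from typing import Callable, Dict, List

Vector = List[float]


def cached_embed(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[Vector]],
    cache: Dict[str, Vector],
) -> List[Vector]:
    """Embed texts, computing vectors only for texts not yet in the cache."""
    missing = [t for t in texts if t not in cache]
    if missing:
        for text, vector in zip(missing, embed_fn(missing)):
            cache[text] = vector
    return [cache[t] for t in texts]


calls: List[List[str]] = []


def toy_embed(texts: List[str]) -> List[Vector]:
    """Stand-in for a real embedding model; records each batch it receives."""
    calls.append(list(texts))
    return [[float(len(t)), 0.0] for t in texts]


cache: Dict[str, Vector] = {}
first = cached_embed(["hello", "hi"], toy_embed, cache)
second = cached_embed(["hello", "hey"], toy_embed, cache)
print(calls)  # the second call only had to embed "hey"
```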
AstraDBByteStore
The AstraDBByteStore is an implementation of ByteStore that stores everything in your DataStax Astra DB instance.
The store keys must be strings and will be mapped to the _id field of the Astra DB document.
Byte values are encoded as base64 strings before being stored in Astra DB.
In the database, entries will have the form:
{
  "_id": "<key>",
  "value": "bytes encoded in base 64"
}
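As a rough illustration of that encoding step, the sketch below round-trips a bytes value through base64 using only the standard library. The helpers encode_value and decode_value are hypothetical, and the use of the standard base64 alphabet is an assumption; the store's own conversion code may differ in detail.

```python
import base64


def encode_value(value: bytes) -> str:
    """Hypothetical helper: bytes -> base64 text suitable for a JSON document."""
    return base64.b64encode(value).decode("ascii")


def decode_value(stored: str) -> bytes:
    """Hypothetical helper: base64 text -> original bytes."""
    return base64.b64decode(stored)


doc = {"_id": "k1", "value": encode_value(b"v1")}
print(doc)
print(decode_value(doc["value"]))
```

Because the stored value is plain text, it fits in a JSON document, and decoding it returns the original bytes unchanged.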
from langchain_community.storage import AstraDBByteStore
from getpass import getpass
ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass("ASTRA_DB_APPLICATION_TOKEN = ")
store = AstraDBByteStore(
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
    collection_name="my_store",
)
store.mset([("k1", b"v1"), ("k2", b"v2")])
print(store.mget(["k1", "k2"]))
[b'v1', b'v2']