NanoPQ-Retriever
Product Quantization (PQ) is a quantization algorithm that compresses database vectors, which makes approximate k-NN semantic search practical on large datasets. In a nutshell, each embedding is split into M subspaces (sub-vectors), and the sub-vectors in each subspace are clustered. Each sub-vector is then represented by the centroid of the cluster it falls into, so a full embedding is stored as M centroid indices.
This notebook shows how to use a retriever that relies on Product Quantization under the hood, as implemented by the nanopq package.
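To make the split, cluster, and map steps of Product Quantization concrete, here is a minimal pure-Python sketch. It is an illustration only and is not how nanopq or the retriever is implemented: the helper names (`kmeans`, `split`, `train_codebooks`, `encode`, `decode`, `approximate_distance`), the random toy data, and the choice of 4 subspaces with 4 clusters each are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10, seed=0):
    """Very small k-means; returns k centroids (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            buckets[nearest].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def split(vector, m):
    """Split a vector into m equally sized sub-vectors."""
    size = len(vector) // m
    return [vector[i * size:(i + 1) * size] for i in range(m)]


def train_codebooks(vectors, m, k):
    """Cluster each subspace independently; one codebook of k centroids per subspace."""
    return [kmeans([split(v, m)[s] for v in vectors], k, seed=s) for s in range(m)]


def encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = split(vector, len(codebooks))
    return [
        min(range(len(cb)), key=lambda c: squared_distance(sub, cb[c]))
        for sub, cb in zip(subs, codebooks)
    ]


def decode(code, codebooks):
    """Rebuild an approximate vector by concatenating the selected centroids."""
    return [x for idx, cb in zip(code, codebooks) for x in cb[idx]]


def approximate_distance(query, code, codebooks):
    """Distance between an uncompressed query and a compressed database vector."""
    subs = split(query, len(codebooks))
    return sum(squared_distance(sub, cb[idx]) for sub, cb, idx in zip(subs, codebooks, code))


# Toy data: 50 random 8-dimensional "embeddings" (made up for illustration).
rng = random.Random(42)
database = [[rng.random() for _ in range(8)] for _ in range(50)]

codebooks = train_codebooks(database, m=4, k=4)
codes = [encode(v, codebooks) for v in database]  # each vector becomes 4 small integers

query = database[0]
ranked = sorted(range(len(codes)), key=lambda i: approximate_distance(query, codes[i], codebooks))
top_k = ranked[:3]  # indices of the 3 approximate nearest neighbours
```

Here each 8-dimensional vector is stored as four centroid indices, and search compares the query against centroids rather than against the original vectors. That is the source of both the compression and the approximation.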
from langchain_community.retrievers import NanoPQRetriever
from langchain_openai import OpenAIEmbeddings
Create New Retriever with Texts
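# Keep clusters below the number of texts (nanopq requires this for training).
retriever = NanoPQRetriever.from_texts(
    ["foo", "bar", "world", "hello", "foo bar"], OpenAIEmbeddings(), clusters=2, subspace=2
)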
Use Retriever
We can now use the retriever!
result = retriever.invoke("foo")
result