NVIDIARerank

class langchain_nvidia_ai_endpoints.reranking.NVIDIARerank

Bases: BaseDocumentCompressor

LangChain Document Compressor that uses the NVIDIA NeMo Retriever Reranking API.

Create a new NVIDIARerank document compressor.

This class provides access to an NVIDIA NIM for reranking. By default, it connects to a hosted NIM, but it can be configured to connect to a local NIM using the base_url parameter. An API key is required to connect to the hosted NIM.

Parameters:

model (str) – The model to use for reranking.
nvidia_api_key (str) – The API key to use for connecting to the hosted NIM.
api_key (str) – Alternative to nvidia_api_key.
base_url (str) – The base URL of the NIM to connect to.
truncate (str) – Either “NONE” or “END”. Truncates input text if it exceeds the model’s context length. The default is model dependent and is likely to raise an error if an input is too long.

API Key: The recommended way to provide the API key is through the NVIDIA_API_KEY environment variable.
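As a rough illustration of the constructor options above, the sketch below gathers them into keyword arguments and builds the compressor. The helper names `rerank_kwargs` and `build_reranker` are hypothetical and not part of the library. The import path follows the class path given on this page. Options left as None are omitted so that the library's own defaults apply.

```python
def rerank_kwargs(model=None, base_url=None, truncate=None, api_key=None):
    """Collect constructor options, skipping any that were not provided.

    Hypothetical helper for illustration; not part of the library.
    """
    if truncate not in (None, "NONE", "END"):
        raise ValueError('truncate must be "NONE", "END", or None')
    options = {
        "model": model,
        "base_url": base_url,
        "truncate": truncate,
        "api_key": api_key,
    }
    # Drop unset options so the class falls back to its own defaults.
    return {name: value for name, value in options.items() if value is not None}


def build_reranker(**options):
    """Build an NVIDIARerank from the options accepted by rerank_kwargs."""
    # Imported lazily so rerank_kwargs can be used without the package installed.
    from langchain_nvidia_ai_endpoints.reranking import NVIDIARerank

    return NVIDIARerank(**rerank_kwargs(**options))
```

With NVIDIA_API_KEY set in the environment, `build_reranker()` would target the hosted NIM. A call such as `build_reranker(base_url="http://localhost:8000/v1")` would target a local NIM, where the URL is only a placeholder for wherever that NIM is actually served.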

param base_url: str [Required]: Base URL for model listing and invocation

param max_batch_size: int = 32

The maximum batch size.

Constraints:

minimum = 1

param model: str | None = None: The model to use for reranking.

param top_n: int = 5

The number of documents to return.

Constraints:

minimum = 0
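One plausible reading of how max_batch_size and top_n interact is sketched below: documents are scored in chunks of at most max_batch_size, the scores are merged, and the top_n highest-scoring documents are kept. This is an illustrative sketch under that assumption, not the library's confirmed implementation. The function `rerank_in_batches` and the toy scorer `word_overlap` are hypothetical stand-ins for the real reranking call.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_in_batches(
    query: str,
    passages: Sequence[str],
    score_batch: Callable[[str, Sequence[str]], List[float]],
    max_batch_size: int = 32,
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Score passages in batches, then keep the top_n highest-scoring ones."""
    # Mirror the documented constraints on the two parameters.
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    if top_n < 0:
        raise ValueError("top_n must be at least 0")

    scored: List[Tuple[str, float]] = []
    for start in range(0, len(passages), max_batch_size):
        batch = passages[start : start + max_batch_size]
        scores = score_batch(query, batch)  # one scoring call per batch
        scored.extend(zip(batch, scores))

    # Highest score first; ties keep their original relative order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def word_overlap(query: str, batch: Sequence[str]) -> List[float]:
    """Toy scorer: count of distinct query words present in each passage."""
    words = set(query.lower().split())
    return [float(len(words & set(text.lower().split()))) for text in batch]
```

Under this reading, top_n = 0 returns an empty result, and a smaller max_batch_size means more scoring calls for the same set of documents.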

param truncate: Literal['NONE', 'END'] | None = None: Truncate input text if it exceeds the model’s maximum token length. The default is model dependent and is likely to raise an error if an input is too long.

async acompress_documents(documents: Sequence[Document], query: str, callbacks: List[BaseCallbackHandler] | BaseCallbackManager | None = None) → Sequence[Document]

Async compress retrieved documents given the query context.

Parameters:

documents (Sequence[Document]) – The retrieved documents.
query (str) – The query context.
callbacks (List[BaseCallbackHandler] | BaseCallbackManager | None) – Optional callbacks to run during compression.

Returns:

The compressed documents.

Return type:

Sequence[Document]

compress_documents(documents: Sequence[Document], query: str, callbacks: List[BaseCallbackHandler] | BaseCallbackManager | None = None) → Sequence[Document]

Compress documents using the NVIDIA NeMo Retriever Reranking microservice API.

Parameters:

documents (Sequence[Document]) – A sequence of documents to compress.
query (str) – The query to use for compressing the documents.
callbacks (List[BaseCallbackHandler] | BaseCallbackManager | None) – Callbacks to run during the compression process.

Returns:

A sequence of compressed documents.

Return type:

Sequence[Document]
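A minimal sketch of calling compress_documents follows. The helper `rerank_and_summarize` is hypothetical. It accepts any object with a compress_documents(documents=..., query=...) method, such as an NVIDIARerank instance, and returns each document's text paired with a score. Reading the score from a "relevance_score" metadata key is an assumption not stated on this page, so the helper returns None when that key is absent.

```python
from typing import Any, List, Optional, Sequence, Tuple


def rerank_and_summarize(
    compressor: Any, documents: Sequence[Any], query: str
) -> List[Tuple[str, Optional[float]]]:
    """Rerank documents and return (page_content, score) pairs.

    The "relevance_score" metadata key is an assumption; the score is None
    for any returned document that does not carry it.
    """
    ranked = compressor.compress_documents(documents=documents, query=query)
    return [
        (doc.page_content, doc.metadata.get("relevance_score")) for doc in ranked
    ]
```

With the real class, the call might look like `rerank_and_summarize(NVIDIARerank(), docs, "what is a NIM?")`, where `docs` is a list of `Document` objects from `langchain_core.documents` and the query string is only an example.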

classmethod get_available_models(**kwargs: Any) → List[Model]

Get a list of available models that work with NVIDIARerank.

Parameters:

kwargs (Any)

Return type:

List[Model]
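The sketch below shows one way to turn the returned list into plain model names. The helpers `model_ids` and `list_rerank_model_ids` are hypothetical. The sketch assumes each Model object exposes an `id` attribute, which this page does not state, and falls back to `str(model)` otherwise.

```python
from typing import Any, Iterable, List


def model_ids(models: Iterable[Any]) -> List[str]:
    """Return the sorted identifiers of the given Model objects.

    Assumes each Model exposes an `id` attribute; falls back to str(model).
    """
    return sorted(str(getattr(model, "id", model)) for model in models)


def list_rerank_model_ids() -> List[str]:
    """Ask the endpoint for reranking models (needs the package and credentials)."""
    # Imported lazily so model_ids can be used without the package installed.
    from langchain_nvidia_ai_endpoints.reranking import NVIDIARerank

    return model_ids(NVIDIARerank.get_available_models())
```

One of the returned names could then be passed as the model parameter when constructing NVIDIARerank.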

property available_models: List[Model]: Get a list of available models that work with NVIDIARerank.