NVIDIARerank

class langchain_nvidia_ai_endpoints.reranking.NVIDIARerank

Bases: BaseDocumentCompressor

LangChain Document Compressor that uses the NVIDIA NeMo Retriever Reranking API.

Create a new NVIDIARerank document compressor.

This class provides access to an NVIDIA NIM for reranking. By default, it connects to a hosted NIM, but it can be configured to connect to a local NIM using the base_url parameter. An API key is required to connect to the hosted NIM.

Parameters:
  • model (str) – The model to use for reranking.

  • nvidia_api_key (str) – The API key to use for connecting to the hosted NIM.

  • api_key (str) – Alternative to nvidia_api_key.

  • base_url (str) – The base URL of the NIM to connect to.

  • truncate (str) – “NONE” or “END”. Truncates input text if it exceeds the model’s context length. The default is model dependent and is likely to raise an error if an input is too long.

API Key: The recommended way to provide the API key is through the NVIDIA_API_KEY environment variable.
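As a minimal sketch of the configuration described above, the helpers below collect the documented constructor parameters into keyword arguments and build the compressor. The names reranker_kwargs and make_reranker are hypothetical, the import assumes the langchain_nvidia_ai_endpoints package is installed, and the API key is expected to come from the NVIDIA_API_KEY environment variable.

```python
from typing import Any, Dict, Optional


def reranker_kwargs(
    model: Optional[str] = None,
    base_url: Optional[str] = None,
    truncate: Optional[str] = None,
    top_n: int = 5,
) -> Dict[str, Any]:
    """Collect constructor arguments, leaving unset options at their defaults."""
    if truncate not in (None, "NONE", "END"):
        raise ValueError("truncate must be 'NONE', 'END', or None")
    if top_n < 0:
        raise ValueError("top_n must be >= 0")
    kwargs: Dict[str, Any] = {"top_n": top_n}
    if model is not None:
        kwargs["model"] = model
    if base_url is not None:
        kwargs["base_url"] = base_url
    if truncate is not None:
        kwargs["truncate"] = truncate
    return kwargs


def make_reranker(**options: Any):
    """Build an NVIDIARerank instance (hypothetical helper)."""
    # Imported lazily so this module loads even without the package installed.
    from langchain_nvidia_ai_endpoints import NVIDIARerank

    # No key is passed here; it is expected in the NVIDIA_API_KEY variable.
    return NVIDIARerank(**reranker_kwargs(**options))
```

For example, make_reranker(truncate="END") would target the hosted NIM, while make_reranker(base_url="http://localhost:8000/v1") would target a local NIM at that assumed address.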

param base_url: str [Required]

Base URL for model listing and invocation.

param max_batch_size: int = 32

The maximum batch size.

Constraints:
  • minimum = 1
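To illustrate what a maximum batch size implies, the sketch below splits a list of inputs into consecutive chunks of at most max_batch_size items. The batches helper is hypothetical and is not taken from the library; how NVIDIARerank actually groups requests and merges their results is not specified here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], max_batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive chunks holding at most max_batch_size items each."""
    if max_batch_size < 1:
        # Mirrors the documented constraint: minimum = 1.
        raise ValueError("max_batch_size must be >= 1")
    for start in range(0, len(items), max_batch_size):
        yield list(items[start : start + max_batch_size])
```

With the default of 32, a list of 70 inputs would be split into chunks of 32, 32, and 6 items.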

param model: str | None = None

The model to use for reranking.

param top_n: int = 5

The number of documents to return.

Constraints:
  • minimum = 0

param truncate: Literal['NONE', 'END'] | None = None

Truncate input text if it exceeds the model’s maximum token length. The default is model dependent and is likely to raise an error if an input is too long.

async acompress_documents(documents: Sequence[Document], query: str, callbacks: List[BaseCallbackHandler] | BaseCallbackManager | None = None) → Sequence[Document]

Asynchronously compress the retrieved documents given the query context.

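Parameters:
  • documents (Sequence[Document]) – The retrieved documents.

  • query (str) – The query context.

  • callbacks (List[BaseCallbackHandler] | BaseCallbackManager | None) – Optional callbacks to run during compression.

Returns: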

The compressed documents.

Return type:

Sequence[Document]
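As a rough sketch of how the coroutine above might be used, the helper below reranks several independent (query, documents) jobs concurrently with asyncio.gather. The name rerank_jobs is hypothetical, and the compressor argument is assumed to be any object exposing acompress_documents with the signature shown above, such as an NVIDIARerank instance.

```python
import asyncio
from typing import Any, List, Sequence, Tuple


async def rerank_jobs(
    compressor: Any, jobs: Sequence[Tuple[str, Sequence[Any]]]
) -> List[Sequence[Any]]:
    """Rerank each (query, documents) job concurrently."""
    tasks = [
        compressor.acompress_documents(documents=documents, query=query)
        for query, documents in jobs
    ]
    # gather returns results in the same order as the submitted jobs.
    return list(await asyncio.gather(*tasks))
```

From synchronous code this could be driven with asyncio.run(rerank_jobs(reranker, jobs)). Each job carries its own document list so that concurrent calls do not share Document objects.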

compress_documents(documents: Sequence[Document], query: str, callbacks: List[BaseCallbackHandler] | BaseCallbackManager | None = None) → Sequence[Document]

Compress documents using the NVIDIA NeMo Retriever Reranking microservice API.

Parameters:
  • documents (Sequence[Document]) – A sequence of documents to compress.

  • query (str) – The query to use for compressing the documents.

  • callbacks (List[BaseCallbackHandler] | BaseCallbackManager | None) – Callbacks to run during the compression process.

Returns:

A sequence of compressed documents.

Return type:

Sequence[Document]
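The call described above can be sketched as follows. The helper rerank_texts is hypothetical; it accepts any compressor exposing compress_documents and returns the text of each kept document. Reading a score from metadata["relevance_score"] is an assumption about how results are annotated, so the sketch falls back to None when that key is absent.

```python
from typing import Any, List, Optional, Sequence, Tuple


def rerank_texts(
    compressor: Any, query: str, documents: Sequence[Any]
) -> List[Tuple[str, Optional[float]]]:
    """Return (text, score) pairs for the documents the compressor keeps."""
    ranked = compressor.compress_documents(documents=documents, query=query)
    # Assumption: a score may be stored under metadata["relevance_score"];
    # .get() yields None when it is absent.
    return [
        (doc.page_content, doc.metadata.get("relevance_score"))
        for doc in ranked
    ]
```

In practice the documents would be langchain_core.documents.Document objects, for example Document(page_content="some passage"), and the compressor would be an NVIDIARerank instance.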

classmethod get_available_models(**kwargs: Any) → List[Model]

Get a list of available models that work with NVIDIARerank.

Parameters:

kwargs (Any)

Return type:

List[Model]

property available_models: List[Model]

Get a list of available models that work with NVIDIARerank.
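A small sketch of listing the available reranking models is shown below. The helpers model_ids and list_rerank_models are hypothetical, and the sketch assumes each returned Model exposes its identifier as an id attribute; if it does not, the string form of the object is used instead.

```python
from typing import Any, Iterable, List


def model_ids(models: Iterable[Any]) -> List[str]:
    """Return sorted identifiers for the given Model objects."""
    # Assumption: each Model exposes its name as an `id` attribute;
    # fall back to str(model) if it does not.
    return sorted(str(getattr(m, "id", m)) for m in models)


def list_rerank_models() -> List[str]:
    """Query the classmethod and return the model identifiers."""
    # Imported lazily so this module loads even without the package installed.
    from langchain_nvidia_ai_endpoints import NVIDIARerank

    return model_ids(NVIDIARerank.get_available_models())
```

One of the returned identifiers could then be passed as the model parameter when constructing the compressor.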