What is Conversational Search?#
The goal of search is to retrieve information from a knowledge base that answers a user’s question. Until recently, this meant returning a list of documents likely to be relevant to the question. Originally, this ‘search relevance’ was computed using bag-of-words metrics like TF-IDF or BM25. With deep learning came dense retrieval, another kind of search relevance computation that uses neural networks to generate high-dimensional vector representations of documents that can be searched over. Dense retrieval has been shown to perform well, but the mode of interaction with the knowledge base is still a list of relevant documents.
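To make the keyword side of search relevance concrete, here is a toy BM25 scorer in pure Python. This is an illustrative sketch, not OpenSearch’s implementation; the parameter defaults (k1=1.2, b=0.75) are common conventions, and the sample documents are made up.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a toy BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # df: number of documents containing each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Inverse document frequency: rare terms count for more
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, saturated by k1 and normalized by document length
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "opensearch supports hybrid search",
    "dense retrieval uses neural embeddings",
    "keyword search ranks documents with bm25",
]
scores = bm25_scores("bm25 keyword search", docs)
best = scores.index(max(scores))  # the third document matches the most query terms
```

Dense retrieval replaces this term-matching score with a similarity (typically cosine or dot product) between learned query and document vectors, which lets semantically related documents match even without shared words.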
Today, generative AI and large language models (LLMs) enable a new mode of information retrieval, where the answer to a user’s question is synthesized by the LLM. Early attempts to use LLMs to answer questions on private data ran into issues, though - LLMs will confidently make things up, and they are not trained on private data. With the Aryn Conversational Search Stack, we take a different approach - we use a technique called retrieval-augmented generation (RAG) to extend LLMs to cover private datasets. The RAG architecture retrieves relevant sections of private data through hybrid search (vector + keyword search) and feeds them to an LLM to constrain the response and generate high-quality answers. Beyond this, we make these natural language searches conversational by enabling follow-up question-answering that keeps the understanding from prior questions. This gives a chat-like experience with data versus a one-off search query.
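The RAG flow described above can be sketched in a few lines. This is a minimal illustration, not the stack’s actual code: a keyword-overlap retriever stands in for hybrid search, and the assembled prompt would be sent to an LLM of your choice (the sample documents and prompt wording are assumptions).

```python
def retrieve(question, documents, top_k=2):
    """Stand-in retriever: rank documents by query-term overlap, return top_k."""
    q_terms = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question, passages):
    """Constrain the LLM's answer to the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

documents = [
    "Aryn's stack runs conversational search on OpenSearch.",
    "RAG feeds retrieved passages to an LLM.",
    "Unrelated note about shipping logistics.",
]
passages = retrieve("How does RAG use an LLM?", documents)
prompt = build_rag_prompt("How does RAG use an LLM?", passages)
# `prompt` would then be sent to the LLM; its response is grounded in the passages.
```

Because the prompt contains only retrieved passages, the LLM is steered away from fabricating answers and toward the private data it was never trained on.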
Search components in the Aryn Conversational Search Stack#
Aryn’s Conversational Search Stack uses OpenSearch and LLMs for its conversational search pipeline. Aryn added these conversational capabilities in OpenSearch v2.10. The conversational search architecture is built around OpenSearch’s document search core, using hybrid search (a combination of vector and keyword search) to retrieve the best documents from the knowledge base, and OpenSearch’s Search Pipelines to send those documents to an LLM through a RAG pipeline to generate an answer. The pipeline also returns the results of the hybrid search, so users can see which documents were used to generate the natural language answer. We also use conversation memory to store the history of each conversation so it can be used as context for the next user query.
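As a sketch of what such a request looks like, the snippet below builds an OpenSearch conversational search body as a Python dict. The index, field names, and conversation id are placeholders, and while the `hybrid` query and `generative_qa_parameters` extension follow the shapes introduced around OpenSearch 2.10, exact parameter names can vary by version - check your OpenSearch documentation before relying on them.

```python
def build_conversational_query(question, conversation_id, embedding):
    """Hybrid search (keyword + vector) plus RAG generation parameters.

    Field names ("text", "embedding") and sizes are illustrative assumptions.
    """
    return {
        "query": {
            "hybrid": {
                "queries": [
                    # Keyword (BM25) clause
                    {"match": {"text": {"query": question}}},
                    # Vector (k-NN) clause over a precomputed query embedding
                    {"knn": {"embedding": {"vector": embedding, "k": 10}}},
                ]
            }
        },
        "ext": {
            "generative_qa_parameters": {
                "llm_question": question,            # question passed to the LLM
                "conversation_id": conversation_id,  # links to conversation memory
                "context_size": 5,                   # passages handed to the LLM
            }
        },
    }

body = build_conversational_query(
    "Where is hybrid search configured?", "conv-123", [0.1, 0.2, 0.3]
)
# POST this body to /<index>/_search?search_pipeline=<your-rag-pipeline>
```

The search response then carries both the hybrid search hits and the generated answer, which is how the pipeline can show users the documents behind each natural language response.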
You can easily see conversational search in action with the Aryn Quickstart, which will ingest a sample dataset and make it available for search.
For more information about this functionality:
- Neural Search: neural search configuration
- Search Pipeline & RAG: search pipeline configuration