The Aryn Conversational Stack

Search your data with better quality and natural language.

The Aryn Conversational Stack is a set of open source, integrated components that makes it easy for developers to build applications that give high-quality, natural language answers to questions about unstructured enterprise data. Instead of bringing data to large language models (LLMs), Aryn reverses that paradigm and uses AI to augment dataflows. This approach adds intelligence from generative AI to both data enrichment and natural language pipelines.

Why Aryn?

Users across industry verticals now prefer to interact with their data through natural language for a variety of use cases, such as question answering over knowledge bases, chatbots for support or API documentation, and interactive research and discovery. Unfortunately, building high-quality conversational search applications like these on your own data is difficult today.

Developers can choose to build these applications with LLMs like GPT-4 or Llama 2, which generate natural language answers to questions on general knowledge. However, these generative AI models are not trained on private enterprise data, so they either refuse to answer questions that require this knowledge or simply hallucinate. Some developers try to fine-tune these models on their own corpus of data, but this technique is difficult, expensive, still prone to hallucinations, and needs to be adjusted whenever new data is added.

On the other hand, developers can augment an LLM with private data through a process called retrieval-augmented generation (RAG), where ground truth data is given to an LLM in a prompt. The LLM then generates a natural language answer using that data as context. This approach can give high-quality answers, but only with complex data enrichment and processing to extract semantic meaning from the data, careful selection of LLMs, and the integration of a number of different components. This is difficult, time-consuming, and hard to scale.
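
The RAG flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Aryn's implementation: the word-overlap retriever stands in for a real search engine, and the names `retrieve`, `build_prompt`, `answer`, and the `llm` callable are hypothetical placeholders for whatever retrieval system and model a developer chooses.

```python
import re
from typing import Callable, List


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank documents by how many words they share with the question."""
    q_terms = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_terms & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Place the retrieved passages in the prompt as context for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer(question: str, documents: List[str],
           llm: Callable[[str], str], k: int = 2) -> str:
    """Retrieve context, build the prompt, and hand it to the (assumed) LLM callable."""
    return llm(build_prompt(question, retrieve(question, documents, k)))
```

Here `llm` can be any function that takes a prompt string and returns text, such as a thin wrapper around a hosted model. Even in this toy form, answer quality depends on what `retrieve` returns, which is why data preparation and retrieval get so much attention in the rest of the stack.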

Aryn’s conversational search stack is a one-stop shop where developers can easily build high-quality, natural language search on their data, prototype quickly, and scale to production.

Why use the Aryn Conversational Stack?

Aryn provides an end-to-end architecture for building high-quality conversational search apps on enterprise data. This enables developers to get started quickly and easily scale to production. The stack consists of three main components:

  • Semantic data preparation with Sycamore, a robust and scalable, open source semantic data preparation system. Sycamore uses LLMs for unlocking the meaning of unstructured data and preparing it for search.

  • Hybrid search with OpenSearch, an enterprise-grade search platform with vector database and term-indexing capabilities. Hybrid search combines vector and keyword search for the best quality information retrieval.

  • Conversational memory and APIs in OpenSearch v2.10. This new functionality stores the history of conversations and orchestrates interactions with LLMs using retrieval-augmented generation (RAG) pipelines. RAG is a popular approach for feeding LLMs private data from which to synthesize answers.
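
To make the idea of hybrid search concrete, the sketch below combines a keyword score and a vector-similarity score into a single ranking. It is a toy illustration under stated assumptions, not how OpenSearch implements hybrid search: the term-count keyword score, the min-max normalization, the weighted average, and the hand-written embedding vectors are all choices made for this example, and `hybrid_rank` is a hypothetical helper name.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def keyword_score(query: str, doc: str) -> float:
    """Count how often the distinct query terms occur in the document."""
    counts = Counter(tokenize(doc))
    return float(sum(counts[term] for term in set(tokenize(query))))


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_rank(query: str, query_vec: List[float], docs: List[str],
                doc_vecs: List[List[float]], weight: float = 0.5) -> List[str]:
    """Rank docs by a weighted average of normalized keyword and vector scores."""
    kw = min_max([keyword_score(query, d) for d in docs])
    vec = min_max([cosine(query_vec, v) for v in doc_vecs])
    combined = [weight * k + (1.0 - weight) * v for k, v in zip(kw, vec)]
    order = sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
    return [docs[i] for i in order]
```

In this sketch, a document with no matching keywords can still rank well if its (assumed) embedding is close to the query's, and an exact term match still counts when embeddings are ambiguous. The `weight` parameter controls the balance between the two signals.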

Aryn helps developers deliver high-quality answers from enterprise data, and natural language interactions with it, by using semantic data preparation and generative AI in each stage of the search stack. Higher-quality data results in better retrieval and question answering, and Aryn gives developers the tools to instrument their data pipelines to extract and enrich data for search.

Finally, the Aryn conversational search stack is 100% open source (Apache License v2.0) with no vendor lock-in. Developers can customize the stack, including using the LLMs or LLM services of their choice, running in the cloud or on-premises, and configuring Sycamore and OpenSearch to meet specific requirements.

How do I get started?