AI Chatbot

The beAnywhere chatbot is a self-developed application based on a Large Language Model (LLM) optimized for text conversations. With this tool, visitors to our website can explore our content independently and interactively.

Technology

It is a Retrieval-Augmented Generation (RAG) application that combines the capabilities of LLMs with the domain knowledge of a company or organization to produce significantly more informative results. A large language model such as GPT-4 has been trained on vast amounts of text and therefore has an impressive ability to generate text or code. However, such models have no insight into organization-specific data and therefore cannot answer specific queries satisfactorily, if at all.

In a RAG application, a visitor's query is combined with our own information, retrieved via a semantic, vector-database-based search and a classic keyword search, and the resulting context is sent to the Foundation Model (the underlying large language model) together with the query. The generated output weighs this context against the LLM's own "knowledge" when producing the response.
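To make this request flow concrete, here is a minimal, self-contained Python sketch. It is an illustration only: the cosine-similarity scoring, the keyword weighting, and the prompt wording are simplified stand-ins, not our production implementation.

```python
from dataclasses import dataclass

# Hypothetical document chunk with a precomputed embedding vector.
@dataclass
class Chunk:
    text: str
    embedding: list[float]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_context(query: str, query_embedding: list[float],
                     chunks: list[Chunk], top_k: int = 3) -> list[str]:
    """Hybrid retrieval: semantic (vector) similarity plus a simple keyword match."""
    def score(chunk: Chunk) -> float:
        semantic = cosine_similarity(query_embedding, chunk.embedding)
        keyword = sum(1 for w in query.lower().split() if w in chunk.text.lower())
        return semantic + 0.1 * keyword  # weighting is an illustrative choice

    ranked = sorted(chunks, key=score, reverse=True)
    return [c.text for c in ranked[:top_k]]

def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context with the visitor's question."""
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Answer the question using the following website excerpts:\n"
        f"{context_block}\n\n"
        f"Question: {query}"
    )

# build_prompt(query, retrieve_context(...)) would then be sent to the
# foundation model (e.g. Mistral Large or Anthropic Claude) for the final answer.
```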

We use the following technologies and techniques:

  • A Foundation Model (LLM) optimized for multilingual text generation, such as Mistral Large or Anthropic Claude.
  • ETL (Extract, Transform, Load) pipelines for creating so-called embeddings, which are stored in a vector database. The relevant content of the beAnywhere website is parsed with an embedding model and converted into machine-readable vectors (a minimal sketch follows this list).
  • The embeddings are indexed in a special vector database.
  • Search queries are carried out using a semantic search and a keyword search.
  • The results are sent as context to the Foundation Model to generate the final response.
  • Prompt engineering is used to influence the tone and other aspects of the response (an example system prompt is sketched below).
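
The indexing side of the pipeline can be sketched in a similarly simplified way. The `chunk_text`, `embed`, and `VectorIndex` names below are hypothetical stand-ins: in production, a real embedding model and vector database take their place, but the Extract, Transform, Load steps follow the same outline.

```python
from dataclasses import dataclass, field

def chunk_text(text: str, max_words: int = 120) -> list[str]:
    """Transform: split extracted page text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text: str) -> list[float]:
    """Placeholder for the embedding model; returns a toy bag-of-characters vector."""
    vector = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vector[ord(ch) - ord("a")] += 1.0
    return vector

@dataclass
class VectorIndex:
    """Load: in-memory stand-in for the vector database."""
    entries: list[tuple[list[float], str]] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append((embed(text), text))

index = VectorIndex()
for page_text in ["Extracted content of one beAnywhere page ..."]:  # Extract
    for chunk in chunk_text(page_text):                             # Transform
        index.add(chunk)                                            # Load
```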
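
Prompt engineering, as mentioned in the last point, typically means wrapping the context and question in a system prompt that sets tone and ground rules. The wording below is purely illustrative, not our actual prompt:

```python
# Hypothetical system prompt; the wording is an illustrative assumption.
SYSTEM_PROMPT = (
    "You are the beAnywhere website assistant. "
    "Answer in the language of the question, in a friendly and concise tone. "
    "Base your answer only on the provided website excerpts; "
    "if the excerpts do not contain the answer, say so."
)
```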

The content of the answers is based on the combined "knowledge" of our website and the training data of the Foundation Model. Although we strive to provide high-quality answers through continuous development, you should always apply common sense to the answers you receive and consult other sources if necessary. RAG technology is an important step in reducing the so-called "hallucination" of LLMs; however, errors cannot be ruled out.

Data protection

The processing of data (user input and internal data) takes place exclusively in data centers within the European Union. The data (questions and answers) never leave the EU. Likewise, the data is not used for training LLMs, neither by us nor by the providers of the large language models. All data is transmitted in encrypted form (encrypted in transit) and stored encrypted for the duration of a request (encrypted at rest). CMEK (Customer-Managed Encryption Keys) are used for encryption.

No personal data (chatbot requests or responses) is stored. Likewise, no information such as IP addresses is stored.

For more information about Retrieval Augmented Generation and the use of artificial intelligence on websites, please contact us. With Gyden, our agency for cloud solutions and Gen AI development, we offer state-of-the-art, customized solutions for companies.