Enhancing Data Understanding - Conversations with Your Logs

by Rocco Gagliardi
on May 16, 2024
time to read: 6 minutes

Keypoints

Advanced data management with OpenSearch

  • OpenSearch's support for k-NN vector databases enables diverse applications like similarity search, classification, clustering, and time series analysis
  • The introduction of Piped Processing Language (PPL) in OpenSearch allows users to manipulate data efficiently
  • The OpenSearch Assistant Module facilitates the creation of AI assistants within OpenSearch Dashboards, integrating skills for specific tasks, an ML framework, and UI components
  • Local or external models can be employed with OpenSearch Assistant, offering flexibility in model selection based on use case and hardware constraints
  • Large Language Models (LLMs) in OpenSearch simplify data analysis through natural language interactions, encouraging iterative refinement of queries
  • The simplicity of prompts in OpenSearch Assistant belies the underlying complexity of integrating various technologies, ultimately enhancing user experiences in data exploration and analysis

This article is a continuation of the previous article, Transition to OpenSearch, in which we explored our move from Graylog to OpenSearch and the reasons behind it. Here we delve further into the capabilities of OpenSearch, focusing on its ability to integrate local or remote machine learning systems using the OpenSearch Assistant Module. This evolution represents a significant step towards advanced data management and offers opportunities to improve how we analyze our logs and extract information from them.

What it is

Generative AI is transforming how users interact with and derive insights from their data.

OpenSearch has supported the k-NN vector database since its inception, which is crucial in a wide range of applications, including similarity search, classification, clustering, and time series analysis.

For example, it allows:
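As one illustration, a similarity search over log embeddings can be sketched as below. This is a minimal sketch, not the setup used in this article: the field name `message_embedding`, the three-dimensional vectors, and the value of `k` are assumptions chosen for readability. The request bodies follow the `knn_vector` mapping and `knn` query shapes of the OpenSearch k-NN plugin; the code only builds the JSON and does not contact a cluster.

```python
import json
from typing import List


def knn_index_body(field: str, dimension: int) -> dict:
    """Index settings and mapping that enable k-NN search on one vector field."""
    return {
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def knn_query_body(field: str, vector: List[float], k: int) -> dict:
    """Search body asking for the k documents whose vectors are nearest to `vector`."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }


if __name__ == "__main__":
    # Hypothetical field name and toy 3-dimensional embedding.
    print(json.dumps(knn_index_body("message_embedding", 3), indent=2))
    print(json.dumps(knn_query_body("message_embedding", [0.1, 0.7, 0.2], k=5), indent=2))
```

The first body would be sent when creating the index, the second to the index's `_search` endpoint. In practice the vectors come from an embedding model and have hundreds of dimensions.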

OpenSearch has also introduced the Piped Processing Language (PPL), which lets users chain operations with the pipe character (|), each step working on the output of the previous one, to manipulate data and obtain complex results.
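To make the piping idea concrete, here is a small sketch that assembles a PPL query from individual stages. The index name `app_logs` and the fields `status` and `host` are assumptions for illustration only; the commands used (`where`, `stats`, `sort`, `head`) are standard PPL commands, and the request body is the shape expected by the `_plugins/_ppl` endpoint.

```python
import json


def build_ppl(index: str, *stages: str) -> str:
    """Join a source index and its processing stages with the PPL pipe character."""
    return " | ".join([f"source={index}", *stages])


def ppl_request_body(query: str) -> str:
    """JSON body for a POST request to the _plugins/_ppl endpoint."""
    return json.dumps({"query": query})


if __name__ == "__main__":
    query = build_ppl(
        "app_logs",                         # hypothetical index name
        "where status >= 500",              # keep only server errors
        "stats count() as errors by host",  # aggregate per host
        "sort - errors",                    # most affected hosts first
        "head 5",                           # top five only
    )
    print(query)
    print(ppl_request_body(query))
```

Each stage narrows or reshapes the result of the one before it, which is what makes PPL a convenient target for generated queries.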

The OpenSearch Assistant Module is a framework to orchestrate these different technologies to create AI assistants directly within OpenSearch Dashboards. It includes “skills” for specific tasks, an ML framework to integrate AI models, and UI components for conversational interactions.

These skills can be connected to an LLM to generate summaries from query results. Ensuring an intuitive and interactive user experience in OpenSearch Dashboards was as simple as incorporating the UI search bar component into our log exploration interface.

Supported models can be local or external. With local models, you can use pre-trained models or customize them to fit your use case, within the limits of your local hardware. Alternatively, you can opt for external models such as ChatGPT 3.5.
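For an external model, the link to OpenSearch is typically a connector registered through the ML Commons plugin. The sketch below builds such a connector definition for an OpenAI chat model. It is modeled on the general shape of ML Commons connector blueprints and is an assumption rather than the configuration used here: the connector name, the model identifier, and the placeholder key are illustrative, and the exact fields should be checked against the documentation for your OpenSearch version.

```python
import json


def openai_connector_body(api_key: str, model: str = "gpt-3.5-turbo") -> dict:
    """Connector definition intended for POST /_plugins/_ml/connectors/_create."""
    return {
        "name": "External chat model (example)",  # hypothetical name
        "description": "Connector to an externally hosted LLM",
        "version": 1,
        "protocol": "http",
        "parameters": {"endpoint": "api.openai.com", "model": model},
        "credential": {"openAI_key": api_key},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                # ${...} placeholders are resolved by OpenSearch, not by Python.
                "url": "https://${parameters.endpoint}/v1/chat/completions",
                "headers": {"Authorization": "Bearer ${credential.openAI_key}"},
                "request_body": (
                    '{ "model": "${parameters.model}", '
                    '"messages": ${parameters.messages} }'
                ),
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder key; never hard-code real credentials.
    print(json.dumps(openai_connector_body("<YOUR_API_KEY>"), indent=2))
```

A local model skips the connector and is instead registered and deployed inside the cluster, which is where the hardware limits mentioned above come into play.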

For details on the integration of OpenSearch Assistant: RFC – OpenSearch Assistant Toolkit

How it works

When details become too complex, images of monkeys with tambourines spinning in circles form in our minds. Imagine being able to interact with data by posing questions in natural language instead of regex. Thanks to LLMs, it is possible to simplify the analysis process, make it as interactive as we’ve been accustomed to since elementary school, and focus on results.

LLMs represent a turning point in the evolution of machines towards greater flexibility. Although they don’t always produce the perfect answer, this imprecision encourages users to interact and refine their requests. It is an iterative process that sometimes takes time but ultimately yields more relevant and useful results.

Among the dedicated “skills” that can be combined, one focuses on converting the user-provided natural language prompt into a PPL query. With the advent of LLMs, the days of mastering regex to manipulate data may be over: you can now ask the model to generate the expressions needed to filter and synthesize data. For the user, the OpenSearch Assistant module appears as a field where any question about the data can be entered. In the background, questions are interpreted and converted into PPL, which yields not only answers but also useful suggestions for refining queries further.
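The conversion step can be pictured roughly as follows. This is a conceptual sketch and not the actual implementation of the OpenSearch Assistant skill: the prompt wording, the function names `build_prompt` and `question_to_ppl`, the index `app_logs`, and the stand-in `fake_llm` are all hypothetical. The idea shown is simply that the question is wrapped in a prompt carrying the index schema, handed to a model, and the answer is checked before being used as a query.

```python
from typing import Callable, Dict

PROMPT_TEMPLATE = (
    "You translate questions about logs into PPL queries.\n"
    "Index: {index}\n"
    "Fields: {fields}\n"
    "Question: {question}\n"
    "Answer with a single PPL query and nothing else."
)


def build_prompt(question: str, index: str, fields: Dict[str, str]) -> str:
    """Embed the question and the index schema in the prompt sent to the model."""
    field_list = ", ".join(f"{name} ({ftype})" for name, ftype in fields.items())
    return PROMPT_TEMPLATE.format(index=index, fields=field_list, question=question)


def question_to_ppl(
    question: str, index: str, fields: Dict[str, str], llm: Callable[[str], str]
) -> str:
    """Ask the model for a PPL query and apply a cheap sanity check to its answer."""
    answer = llm(build_prompt(question, index, fields)).strip()
    # Only checks that the answer starts by reading from the expected index;
    # it does not validate the rest of the query.
    if not answer.replace(" ", "").startswith(f"source={index}"):
        raise ValueError(f"model did not return a PPL query for {index!r}: {answer!r}")
    return answer


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, returning a fixed answer."""
    return "source=app_logs | where status >= 500 | stats count() by host"


if __name__ == "__main__":
    schema = {"status": "integer", "host": "keyword"}  # hypothetical fields
    print(question_to_ppl("Which hosts return server errors?", "app_logs", schema, fake_llm))
```

Because the model's output is not guaranteed to be correct, a check of this kind, however simple, fits the iterative refine-and-retry loop described above.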

Let’s look at some simple examples. Our advice, though, is to download a docker-compose.yml file and run the cluster in your own infrastructure with your own data.

Show me possible problems.

We can start with a generic request to show the issues. Determining what to classify as a problem is the task of generic or customized “skills.”

Query Assistant - Show possible problems

Charting

Similarly, we can request to visualize the data.

Query Assistant - Create a chart

Enriching

The most interesting and promising aspect, considering the possible integrations that can be made, is the ability to add context to our data.

Query Assistant - Enrich the text

Summary

We examined OpenSearch’s evolution towards advanced data management and analysis, focusing on integrating machine learning capabilities. We explored how OpenSearch’s support for k-NN vector databases and the introduction of Piped Processing Language (PPL) enhance data manipulation efficiency.

The result is the OpenSearch Assistant Module, which facilitates the creation of AI assistants within OpenSearch Dashboards and the integration of local or external AI models. We also explored examples of Large Language Models (LLMs) simplifying data analysis through natural language interactions, despite occasional imprecision.

The apparent simplicity of a prompt masks the complex integration of various technologies and methodologies developed since the inception of OpenSearch, including support for vector databases and the implementation of straightforward yet interconnectable “skills”. Moreover, the integration of increasingly refined and compact pre-trained LLMs adds depth to the development of the OpenSearch Assistant Module.

About the Author

Rocco Gagliardi

Rocco Gagliardi has been working in IT since the 1980s and specialized in IT security in the 1990s. His main focus lies in security frameworks, network routing, firewalling and log management.
