Using AI with Your Own Data in Intuiface: A Simple Approach

RAG in Intuiface

AI-powered Intuiface experiences can leverage the vast domain knowledge of Large Language Models (LLMs) via the OpenAI Interface Asset. In many cases, this general knowledge is sufficient, as LLMs are trained on an extensive dataset spanning various topics.

However, what if you need AI to retrieve and generate responses based on your own proprietary data? This is where Retrieval-Augmented Generation (RAG) comes in. RAG enhances an LLM’s capabilities by allowing it to pull relevant information from external sources, enriching responses with domain-specific insights.

In this article, we explore a lightweight, cost-effective approach to RAG within Intuiface—one that doesn’t rely on expensive third-party services but instead utilizes clever prompt engineering. This “poor man’s RAG” method offers a practical way to integrate structured knowledge retrieval without complex backend infrastructure.

What We’re Building: An AI-Powered Aviation Museum

Consider an interactive display for an aviation museum where visitors can ask questions about specific aircraft and get accurate answers based on the museum’s own information, not just what the AI happens to know about planes. This example works well because each aircraft has specific technical details and historical facts that might not be accurately represented in an AI’s general knowledge.

Prerequisites

  • OpenAI API key
  • A structured database of information. We’ll use the following aircraft data as an example

Original aircraft database.txt (22.5 KB)

Step 1: Creating Your Database

We’ll start with a database of 10 aircraft, each with detailed descriptions from the museum. While it’s possible to send this entire database to the AI with each question, that would be inefficient and costly as your database grows. Instead, we’ll implement preprocessing to make it more efficient.

Step 2: Creating Smart Search Tags

The key optimization involves creating a set of searchable tags for each aircraft, essentially creating an efficient search index for the database.

We’ll use AI to help us create these tags with the following prompt:

You are tasked with creating metadata tags from artifact descriptions. For each description:

1. Extract all meaningful identifying terms that users might search for, including:
   - Official names and designations (e.g., "B-29", "CH-47")
   - Common names and nicknames (e.g., "Superfortress", "Chinook")
   - Registration/serial numbers (e.g., "G-BOAG")
   - Manufacturer names (e.g., "Boeing", "Douglas")
   - Time periods/conflicts (e.g., "WWII", "Vietnam War")
   - Basic type/role (e.g., "bomber", "helicopter", "transport")
   - Notable characteristics (e.g., "supersonic", "atomic")

2. Include variations of key terms:
   - Remove spaces and use underscores for multi-word terms
   - Include common abbreviations and acronyms
   - Include both hyphenated and non-hyphenated versions of designations

3. Format rules:
   - All terms should be lowercase
   - Separate terms with commas
   - No categorization or hierarchical structure
   - No duplicate terms

Input format:
ID: [number]
Name: [name]
[description]
###

Output format:
[ID];[name]|[comma-separated metadata tags]

Example input:
ID: 1
Name: CH-47 Chinook
The CH-47 Chinook is a twin-engine transport helicopter. Nicknamed "My Old Lady", it served in Vietnam...
###

Example output:
1;CH-47 Chinook|ch-47,ch47,chinook,ch-47_chinook,helicopter,transport,vietnam,my_old_lady,twin_engine

Process each description to create a rich but focused set of searchable metadata terms.

This process transforms our data as follows:

Before:

The Boeing B-17 Flying Fortress was a four-engine heavy bomber developed in the 1930s. This legendary aircraft played a crucial role in World War II's strategic bombing campaigns...

After:

1;B-17 Flying Fortress|b-17,b17,flying_fortress,boeing,bomber,heavy_bomber,wwii,world_war_2,four_engine,1930s

This transformation reduces our original 22.5 KB database to a more efficient 1.8 KB file, significantly optimizing API usage.

Refined Dataset for AI Retrieval (RAG Data).txt (1.8 KB)

Step 3: Setting Up the Search in Intuiface

The implementation in Intuiface involves creating a search interface where visitors can input their questions. The system sends this streamlined database along with their query to the AI using the following prompt:

You are an intelligent assistant tasked with maintaining and querying a dataset of aircraft and artifact metadata. The dataset is provided in the following format:

[ID];[name]|[comma-separated metadata tags]

This dataset has been loaded into your memory, and you will use it to answer user queries. Your goal is to:
- Keep the dataset in memory and avoid requesting it again.
- Parse user queries and match them against the metadata tags in the dataset.
- Return the best matching records in the following format [ID1];[name1]|[ID2];[name2]|... without any additional commentary, explanations, or formatting.
- If not good matches exist, just return the text NO MATCH

The dataset is now loaded into your memory. Proceed to answer user queries based on it.

DATASET:
[your metadata goes here]

This prompt structure ensures consistent, parseable responses that integrate smoothly with Intuiface.

Step 4: Displaying the Results

The AI returns matching aircraft IDs and names in a structured format. A custom Interface Asset in Intuiface parses these results and displays them in an asset grid collection. Users can then select an aircraft to view its complete details from the original database (Excel in this case).

Cost and Performance Considerations

This implementation offers several advantages:

  • The metadata file’s small size minimizes API costs
  • Focused metadata enables efficient searching
  • Database updates can be made without interface modifications
  • The approach is adaptable to various types of information - products, events, exhibits, etc.

Troubleshooting Tips

Common implementation considerations include:

  • Ensuring metadata tags encompass various search terms and phrases
  • Testing searches with different query formulations to validate matching
  • Monitoring API usage for cost optimization

Ready to Try It?

The demo is available for download and experimentation. It requires your OpenAI API key to function. While the demo features the aviation museum example, the methodology can be adapted for any type of information requiring AI-powered search capabilities.

This streamlined RAG implementation provides a practical balance of functionality and simplicity, enabling the creation of experiences that combine AI capabilities with domain-specific knowledge.

Download the Tag-Based RAG demo (10MB)

Special thanks to @Seb for creating a custom parser Interface Asset for this demo and to @pnelson of The Museum of Flight in Seattle for the continued R&D collaboration.

2 Likes