Accessible Interactive AI Kiosk (GPT-4 + Speech Recognition)

tosolini · March 27, 2023, 11:16am

Accessible AI kiosk

Introducing a new prototype that leverages the power of speech recognition and AI to enhance accessibility in interactive installations for public venues. Specifically, this demo draws inspiration from The Museum of Flight in Seattle and the recently available OpenAI GPT-3 IA developed by @Seb and available through the Marketplace.

The primary goal of this prototype is to assist museum visitors in easily finding exhibition areas that align with their interests by using natural language queries. The demo showcases how Intuiface can achieve this by utilizing the Speech Recognition IA as the input mechanism and specialized prompt engineering to instruct GPT-3 to provide personalized recommendations.

How does it work?

tap and hold

With a simple tap and hold of a button, visitors can speak a query, such as “Where can I find World War II airplanes?” The speech recognition technology then converts the query into text and combines it with a targeted prompt, which instructs the AI model to identify the best match among the list of galleries based on their descriptions.

The success of this system relies on crafting a detailed prompt that sets the stage for the AI to effectively do its job. To achieve this, the instructions are combined with user input in Excel before being passed on to the AI model. A microphone is required for this feature to work.

Opportunities and limitations

Using voice recognition technology has the potential to enhance accessibility for users interacting with kiosks in public spaces. The application of voice recognition extends beyond wayfinding to creating custom conversation agents trained on specific museums data that can respond to user inquiries using text-to-speech technology.

However, the effectiveness of voice recognition depends on the quality of the recognition engine. The Microsoft Speech Platform, upon which Intuiface Speech Recognition is based, may not be suitable for all visitors, particularly non-native English speakers or in noisy environments. Therefore, it may be beneficial for museums to explore other alternatives, such as OpenAI’s Whisper, to improve the accuracy and reliability of voice recognition technology.

Download the Accessible Interactive AI Kiosk demo.

Please note that to use this XP and customize it, you will need to sign up for the OpenAI API platform and obtain your own secret API key. Once you have your API key, simply switch to Edit mode in Composer, locate the Excel database file called XLS_PromptMaker under Interface Assets, and paste your API key into the appropriate field.

If you enjoyed this topic, check out also the AI-Powered Interactive Kiosk article.

shpeterson · April 26, 2023, 10:34pm

Sees lots of great possibilities here, and as usual, a wealth of resources.

tosolini · April 27, 2023, 1:17pm

@shpeterson Glad you enjoyed the prototype!

tosolini · October 5, 2023, 2:59am

This is a refresh of our original R&D prototype, made possible by two new OpenAI Interface Assets developed by magic @Seb. The goal of this demo remains the same: to allow a user to express an interest in a particular artifact / subject (using voice) and have AI suggest the best matching gallery / exhibit in the museum.

The significant improvements are in the quality of the voice recognition (now achieved through the much more accurate Whisper system) and the ability to interface with OpenAI GPT-4.

The key information about the museum galleries / exhibits is passed once to GPT-4 through an initial system prompt. Each subsequent input is treated as a follow-up request, resulting in saving tokens (and money), since OpenAI charges you by token usage.

The system prompt also instructs AI to provide the responses in a particular format:

ID|Gallery name|Gallery description

By organizing the response as pipe-delimited text, we can individually extract the ID, Gallery name, and Gallery description. In particular, the ID is being used to automatically turn visibility on/off of the red pins on the map.

Feel free to download this XP and use it in your own projects. The OpenAI GPT-4 and Whisper IA will need to run in Player NextGen mode. Make sure to add your own OpenAI API Key in the Excel.

Seb · October 5, 2023, 3:02pm

Thanks Paolo @tosolini for coming up with a great usage example of these new AI IAs (Artificial Intelligence - Interface Assets)!

If any of you readers want to test this and don’t have access to Player Next Gen yet, just ping me here

geoff · October 5, 2023, 3:41pm

Not only is this a great example, but it’s a really helpful look at how to efficiently engage with GPT to extract exactly what is needed from each response - no more, no less. It’s a great look at how to leverage AI in a public setting so you don’t risk unusual/unexpected responses.

tosolini · October 5, 2023, 4:12pm

@seb @geoff Thanks for the kind feedback. These new AI IAs are really giving Intuiface some new superpowers.