Introducing the Lakehouse Copilot v2

October 14, 2024

Discover how the new Lakehouse Copilot v2 simplifies data exploration with advanced AI insights.

By Theo Bell, Head of AI Product

We are pleased to announce the release of the Lakehouse Copilot version 2.1. This release incorporates feedback from our clients and recent innovations in the AI space and represents several months of development effort.

Key highlights include:

  • The addition of sample Diversity, Equity and Inclusion (DE&I) data from our data partner Denominator and company sustainability data from ESG Book
  • Upgrading to the new GPT-4o model from OpenAI
  • The introduction of a Data Discovery tab to give users insights into the data that is available to be explored in the Copilot
  • The addition of a Bring Your Own Data section that explains how Rimes can help bootstrap your AI journey safely and securely

DE&I Data from Denominator

Until now, the Lakehouse Copilot has primarily featured ETF data for querying. To provide more valuable insights to users while giving data partners the opportunity to showcase their capabilities, we have expanded the data available. Our first collaboration in this initiative is with Denominator, who provided sample security-level Diversity, Equity and Inclusion (DE&I) statistics for use in the Rimes AI Lab. Denominator supplied the data, example questions, and guidance on which properties to use to answer each query. Our AI Engineering Team then worked on prompt engineering, teaching the Copilot how to interact with this new data. This work included writing example SQL queries for few-shot prompting and evaluating the Copilot's accuracy.
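To make the few-shot idea concrete, here is a minimal sketch of how example questions and their SQL can be assembled into a prompt. It is an illustration only, not the Copilot's actual prompt: the `dei_scores` table, its columns, the example question, and the `build_few_shot_prompt` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FewShotExample:
    question: str
    sql: str


def build_few_shot_prompt(schema: str, examples: list[FewShotExample], question: str) -> str:
    """Assemble a text-to-SQL prompt: schema, worked examples, then the new question."""
    parts = ["You translate questions into SQL.", "Schema:", schema.strip(), ""]
    for ex in examples:
        parts += [f"Question: {ex.question}", f"SQL: {ex.sql}", ""]
    # The prompt ends with an open "SQL:" slot for the model to complete.
    parts += [f"Question: {question}", "SQL:"]
    return "\n".join(parts)


# Hypothetical table and columns, for illustration only.
SCHEMA = "dei_scores(security_id TEXT, company TEXT, gender_diversity_score REAL)"
EXAMPLES = [
    FewShotExample(
        "Which company has the highest gender diversity score?",
        "SELECT company FROM dei_scores ORDER BY gender_diversity_score DESC LIMIT 1;",
    ),
]

prompt = build_few_shot_prompt(SCHEMA, EXAMPLES, "What is the average gender diversity score?")
print(prompt)
```

The worked examples show the model which tables and properties to use for a given kind of question, which is the role the partner-supplied example questions and guidance play in the process described above.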

Sustainability Data from ESG Book

ESG Book is our second partner to provide sample data for the Rimes AI Lab. We've integrated their Disclosures Data feed from both the Sustainability Framework and Emissions Plus Framework, providing users with a comprehensive set of granular sustainability data in a single dataset. This allows for deep dives into various aspects of a company’s sustainability reporting.

Using the same process as we did for the Denominator data, we have trained the Lakehouse Copilot to answer questions about sustainability metrics for both companies and ETF constituents.

GPT-4o

When OpenAI announced the GPT-4o model, we were eager to test it. It promised to be more accurate, faster, and more cost-effective than GPT-4 Turbo, and we hoped it would allow us to simplify our Copilot by returning to a single mode of operation. Initially, we used GPT-4 Turbo to generate SQL code for the Copilot, but it was too slow for our needs. To address this, we also experimented with GPT-3.5 Turbo, which was almost 10x faster but less accurate. Based on client use cases, we released a two-mode version: “Quick Thought,” powered by the faster GPT-3.5, and “Deep Thought,” powered by the more accurate but slower GPT-4. However, this meant building and supporting two separate modes.

Through our evaluation framework and a set of human-validated queries, we found that GPT-4o was 60% faster and 7% more accurate than the other models. This enabled us to streamline the Copilot back to a single mode, reducing technical complexity and enhancing the user experience.
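As a rough sketch of what this kind of evaluation can look like (an assumption for illustration, not our actual framework), the snippet below scores a SQL generator against human-validated reference queries by comparing result rows, and records the average generation time. The `etf` table, the test cases, and `stub_model` are hypothetical stand-ins; a real run would call the model being evaluated.

```python
import sqlite3
import time
from typing import Callable


def evaluate(generate_sql: Callable[[str], str],
             cases: list[tuple[str, str]],
             conn: sqlite3.Connection) -> tuple[float, float]:
    """Return (accuracy, mean generation seconds) over human-validated cases.

    Each case is (question, reference_sql). A generated query counts as
    correct when it returns the same rows as the reference, ignoring order.
    """
    correct = 0
    elapsed = 0.0
    for question, reference_sql in cases:
        start = time.perf_counter()
        candidate_sql = generate_sql(question)
        elapsed += time.perf_counter() - start  # times generation only
        expected = sorted(conn.execute(reference_sql).fetchall(), key=repr)
        try:
            actual = sorted(conn.execute(candidate_sql).fetchall(), key=repr)
        except sqlite3.Error:
            actual = None  # invalid SQL counts as a miss
        correct += actual == expected
    return correct / len(cases), elapsed / len(cases)


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE etf (ticker TEXT, expense_ratio REAL)")
conn.executemany("INSERT INTO etf VALUES (?, ?)", [("AAA", 0.10), ("BBB", 0.25)])

CASES = [
    ("How many ETFs are there?", "SELECT COUNT(*) FROM etf"),
    ("Which ETF is cheapest?", "SELECT ticker FROM etf ORDER BY expense_ratio LIMIT 1"),
]


def stub_model(question: str) -> str:
    """Stand-in for a model call: answers the first case right, the second wrong."""
    answers = {
        "How many ETFs are there?": "SELECT COUNT(*) FROM etf",
        "Which ETF is cheapest?": "SELECT ticker FROM etf ORDER BY expense_ratio DESC LIMIT 1",
    }
    return answers[question]


accuracy, mean_latency = evaluate(stub_model, CASES, conn)
print(f"accuracy={accuracy:.0%}, mean generation time={mean_latency:.6f}s")
```

Comparing result rows rather than SQL text means two differently written but equivalent queries both count as correct. Running the same cases against each candidate model gives directly comparable accuracy and speed figures.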

Data Discovery Tab

As we expand the data available within the Lakehouse Copilot, we’ve introduced a new Data Discovery tab to make it easier for users to explore and access this growing dataset. The tab provides detailed information such as table names, data contents, and start dates, along with the ability to download a comprehensive data dictionary for the ETF data.

Bring Your Own Data

An important feature of the Lakehouse Copilot is its “bring your own data” (BYOD) functionality, which allows you to kickstart your AI journey with Rimes without requiring specialized technical skills or significant tech investment. We now explain how BYOD works directly within the Copilot, and our team is available to assist anyone who wants to learn more.

Summary

The past few months have been exciting for Copilot development, and we have even more enhancements planned for future phases, including:

  • Surfacing index and benchmark data
  • Reporting functionality
  • And much more!

Interested in learning more? Register here for a free trial of the Rimes Lakehouse Copilot. For existing users, you can access the Copilot directly here.
