Olympics, AI, and some BI
Now that the Olympics are heating up, I’m doing my stretches and getting ready to jump on the bandwagon. I consider myself a responsible band-wagoner; this means at least doing some basic research before pretending that I’ve cared about a specific sport my entire life. As most of you know, GPT is an incredible teacher, so that’s where I started.
I decided to chronicle the process to show where AI can most effectively play a role. This is Part 1, which covers the high-level learnings. In Part 2 and beyond, I’ll cover the process and the relevant code snippets.
Preparation:
We started by asking GPT some basic questions to understand the baseline.
Q1: How many Olympic medals has the United States won in its entire history? I want the sum of all medals including bronze, silver, and gold.
Rationale: I wanted to understand just how many medals are “a lot” of medals. Total noob, I know.
A1: As of now, the United States has won a total of 3,268 Olympic medals throughout its history. This includes 1,065 gold medals, 832 silver medals, and 771 bronze medals in the Summer Olympics, and an additional 330 medals (114 gold, 122 silver, and 94 bronze) in the Winter Olympics. [[link1](<https://en.wikipedia.org/wiki/United_States_at_the_Summer_Olympics>), [link2](<https://en.wikipedia.org/wiki/United_States_at_the_Olympics>)]
While interesting, this barely scratches the surface. As a side note, while editing I realized that GPT’s math doesn’t check out (the subtotals don’t add up to 3,268), which I’ll ignore for now. Let’s try something more complicated.
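For the curious, a quick check of the subtotals GPT cited shows the discrepancy:

```python
# Subtotals from GPT's answer (A1)
summer = 1065 + 832 + 771   # gold + silver + bronze, Summer Games
winter = 114 + 122 + 94     # gold + silver + bronze, Winter Games
total = summer + winter

print(summer, winter, total)  # 2668 330 2998 -- not the 3,268 GPT claimed
```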
Q2: Which country won its first gold medal in the 2020 Olympics?
Rationale: I’m loosely familiar with the historical dominance of the US/Russia/China/UK, but we shouldn’t forget to celebrate the first-time winners. Besides, I’m curious if they’ll keep the streak going!
A2:
1. Philippines: Weightlifter Hidilyn Diaz won the country’s first-ever gold medal in the women’s 55 kg weightlifting competition
2. Bermuda: Triathlete Flora Duffy won Bermuda’s first gold medal in the women’s triathlon.
3. Qatar: Qatar won its first gold medal in weightlifting with Fares Ibrahim Elbakh in the men’s 96 kg category and also won another gold in the men’s high jump with Mutaz Essa Barshim. [[Link](<https://www.nbcolympics.com/news/which-nations-have-claimed-their-first-ever-medals-tokyo>)]
The response is insightful, but it essentially ran an effective Google search and summarized what it read. Let’s dig deeper with questions that’ll require some reasoning:
Q3a: Is there any association between Olympic winners and non-winners in terms of their height and weight, specifically in swimming?
Rationale: being something short of a 6ft+ Olympic specimen myself, I was curious if there have been undersized, but successful swimmers. Wanting to understand the general trend, I didn’t ask that exact question.
This resulted in a more interesting response that included a few studies on the relationship between physical attributes and sports performance. I’ll summarize the result for succinctness (which admittedly is a summary of a summary):
A3a:
- Height Advantage: Taller swimmers generally have a competitive advantage. This is because greater height usually means longer limbs, which can cover more distance per stroke. This advantage is particularly pronounced in freestyle and backstroke events.
- Lean Body Mass: Successful swimmers tend to have a higher lean body mass. This contributes to better buoyancy and propulsion in the water.
- Biomechanical Advantage: A study published in the “Journal of Applied Physiology” highlights that taller swimmers with longer limbs have a mechanical advantage, leading to longer strokes and reduced drag.
- Height and Weight Profiles: According to data analyzed during the Olympics, most medal-winning swimmers have a height ranging from 6’1” to 6’5” and a weight that supports a high muscle-to-fat ratio.
Conclusion: While height and weight are significant factors, they are not the only determinants of success in swimming. Technique, training, and mental toughness also play crucial roles. Nonetheless, there is a clear association between these physical attributes and success in swimming, particularly at the elite level.
Sources:
• “Journal of Applied Physiology”
• “International Journal of Sports Medicine”
• Olympic athlete profiles and data analyses
This is the most insightful response thus far. It retrieves from several notable papers and directly touches on the question of height/weight. I hit it with a follow-up question next:
Q3b: Do you have any charts that you can share?
Rationale: I wasn’t sure what to expect, but I was hoping to see some kind of visualization to support the claim. It could be something it generated, or it could be a screenshot from a study. I’ll summarize the response for succinctness.
A3b:
Let me describe the intended chart and provide the code for you to run locally.
Description of the Chart
The chart is a scatter plot that visualizes the height and weight of famous Olympic swimmers, with the size of each point proportional to the number of medals they have won. This visualization helps in understanding any patterns or associations between these physical attributes and their success.
import matplotlib.pyplot as plt
# Sample data for Olympic swimmers' height and weight
data = {
'Name': ['Michael Phelps', 'Ian Thorpe', 'Katie Ledecky', 'Caeleb Dressel', 'Katinka Hosszu'],
'Height_cm': [193, 195, 183, 191, 175],
'Weight_kg': [88, 104, 73, 88, 68],
'Medals': [28, 9, 10, 7, 5]
}
# Scatter plot with point size proportional to medal count
plt.scatter(data['Height_cm'], data['Weight_kg'], s=[m * 20 for m in data['Medals']])
# ... omitting the rest of the plot config code (labels, annotations) for the post
plt.show()
I couldn’t help but laugh. GPT plotted 5 famous swimmers, their medal counts, and their respective biometric data. For fun, I ran the code and got the following plot:
What does this mean? Taller Olympians weigh more? Michael Phelps is the GOAT? Michael is leaner than Caeleb? Classic GPT… But, I still think it’s onto something interesting. So I dug deeper:
Q3c: Is there data beyond just 5 people? There should've been hundreds of swimming Olympians over time.
A3c:
Unfortunately, I’m unable to generate a comprehensive dataset of hundreds of Olympic swimmers here. However, I can guide you on how to find and analyze such data.
Steps to Obtain and Analyze a Larger Dataset
Omitted for succinctness, but it essentially listed a few orgs to download and prep the data.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset (replace 'path_to_your_dataset.csv' with your actual file path)
df = pd.read_csv('path_to_your_dataset.csv')
# ... omitting the plot config code for the post
plt.show()
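For a sense of where GPT’s suggestion leads once you actually have athlete-level data, here is a rough sketch of the medalist-vs-non-medalist comparison. The column names (`Sport`, `Height`, `Weight`, `Medal`) are assumptions, and the rows are a tiny invented sample standing in for a real file:

```python
import pandas as pd

# Tiny illustrative sample; the schema and values are assumptions,
# not taken from any real Olympic dataset.
df = pd.DataFrame({
    'Sport':  ['Swimming'] * 6,
    'Height': [193, 188, 175, 183, 170, 178],
    'Weight': [88, 85, 70, 73, 64, 72],
    'Medal':  ['Gold', 'Silver', None, 'Gold', None, None],
})

swim = df[df['Sport'] == 'Swimming'].copy()
swim['Medalist'] = swim['Medal'].notna()

# Compare average height/weight for medalists vs. non-medalists
summary = swim.groupby('Medalist')[['Height', 'Weight']].mean()
print(summary)
```

With a real dataset of hundreds of swimmers, the same three lines of grouping code would give a far more meaningful comparison than GPT’s five-swimmer sample.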
I let out a bigger laugh - it just told me to grab the data and plot it. Thanks, GPT. Now, to be clear, I’m not here to hate on GPT. Just the opposite. I believe in its potential and what it can unlock for our species; I’ve built a company betting on that very vision. But the reality is that GPT’s capabilities are limited where it doesn’t have the relevant data.
I recently listened to an episode of The Dwarkesh Podcast with Francois Chollet. Francois eloquently outlines the nuance of “intelligence” vs. “memory”. They’re far more intelligent than I am (and probably have bigger memory banks than I do, whatever that means), so I recommend you listen to the episode to get the full take. I’ll do my best to summarize here (mostly taken from the 0:28-18:52 mark).
Francois launched ARC (the Abstraction and Reasoning Corpus) as an IQ test for machine intelligence designed to be resistant to memorization. The puzzles require very little “knowledge” to solve, but are unlikely to have been encountered even by a model that has memorized the entire internet.
He argues that LLMs are very good at memorizing static programs. They have a large bank of “solution programs” they can call on to solve specific, relevant problems. It looks like it’s reasoning, but it’s doing on-the-fly fetch (retrieval) of the programs. If you scale up the size of your DB and cram more knowledge and patterns into it, you can improve on the existing benchmarks. But that process of optimizing for the benchmark is not necessarily increasing the “intelligence” of the system; it is increasing the “skill”. Sure, you’re increasing its usefulness, and its scope of applicability, but not its “intelligence.”
So how does that apply to the Olympics? In our example, GPT starts with the logical step of searching for and summarizing medal counts (itself multiple “sub-programs” by his definition). It conducts this for Q1, Q2, and Q3a. When Q3b arises, it pulls from its memory a handful of the top swimmers it knows about and decides that plotting their height/weight/medal count with the appropriate visualization - a bubble chart, which is perfect for 3 measures - is the way to go. Unfortunately, most of us will see the chart and argue that it doesn’t lead to the analysis we want… but it’s directionally close!
Ultimately, if the LLM hasn’t encountered any writing that outlines the specific details of what you’re inquiring about, it can’t quite reason out the next step without some intervention. I’m sure there are numerous facts that GPT can teach me about the Olympics. Its history, complexity, budget, impact - essentially all language about the Olympics on the internet. But pushing beyond that memorized information requires it to fall back on its logical reasoning, which currently falls short. I’m also certain that developments in coding agents from the likes of Cognition, GitHub Copilot, and OpenAI (we leverage several of these tools too) will improve reasoning and long-term planning for more complex problems.
Side note - this is related to another situation where I leveraged an LLM to generate demo data with unfavorable results - follow us for that future post.
So what does this mean for your Enterprise? Unlike the Olympics, it’s unlikely many people have written about your business’s private data. Companies like Writer and Glean will allow you to leverage your existing documents to fine-tune or augment via RAG, generating impressive ROI through faster knowledge discovery and content creation.
What about your analytics, you ask? Don’t worry - there’s no need to be nervous about GPT messing up basic math, as it did in Q1.
At HyperArc, we’re focused on amplifying human analysts by codifying their “memory.” What did they analyze? Why? What was the result? How do they relate? Why is it important to your business? We believe that human intuition is difficult to replace, and LLMs are great at memorizing just about everything. By building the first-of-its-kind “complete-memory” BI, we’re able to help you generate intelligence via your queries and their results. This leads to the generation of the “small programs” that Francois referenced that are specific to your business to augment your BI effort.
Let’s look at some examples:
While GPT gave me the “shell” to get started, I wrote the code - using my human intelligence - to extract, load, and transform some messy Olympics data that I discovered online. Incidentally, one of the datasets does stem from the source that GPT suggested. I’ll chronicle that data extraction process in another post.
The Methodology:
I typically engage in some exploratory analysis to get to know a new dataset. This means slicing the data quickly and seeing the result. In this case, that meant ad-hoc exploration around weight, height, and medal count (regardless of medal type). I did the general discovery, then selected a few specific sports, including Swimming, Gymnastics, and Basketball. I then filtered out sports with weight classes, which would skew this analysis. An entry-level analyst would be able to conduct these steps quickly.
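The weight-class filter is simple in code form. This is a toy illustration, not the actual implementation - the sport list and column name are placeholders:

```python
import pandas as pd

# Hypothetical athlete-level records; sport names are illustrative.
df = pd.DataFrame({
    'Sport':  ['Swimming', 'Weightlifting', 'Gymnastics', 'Boxing', 'Basketball'],
    'Height': [190, 170, 160, 175, 200],
})

# Weight-class sports cluster athletes by body size and would skew
# a height/weight analysis, so exclude them.
WEIGHT_CLASS_SPORTS = {'Weightlifting', 'Boxing', 'Judo', 'Wrestling', 'Taekwondo'}
filtered = df[~df['Sport'].isin(WEIGHT_CLASS_SPORTS)]

print(filtered['Sport'].tolist())  # ['Swimming', 'Gymnastics', 'Basketball']
```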
As a bonus, I also leveraged HyperGraph‘s auto-exploration agent to crawl the search spaces and automatically discover useful queries.
The entire manual exploration process took 15 minutes through our drag-and-drop UI.
The Result:
The analysis generated the following HyperGraph. The HyperGraph is a graphical representation of insights about all queries done through our BI platform, which powers intelligent insight search as well as augmented analytics.
Let’s re-ask the more complex Q3a from above:
Question: Is there any association between Olympic winners and non-winners in terms of their height and weight, specifically in swimming?
HyperGraph AI Answer:
Let’s look at the outcome:
The result first summarizes what the AI observed.
It then returns the relevant nodes, along with context for why it’s important. We can quickly see that it selects a node that directly compares the height/weight when grouped by the medal type. This yields the actual query results, executed by our query engine. GPT gets to avoid doing any math.
The real magic happens when you look a few nodes further down, where HyperGraph highlights several other nodes that might give us more context. I use a shortcut to compare those charts, and it pulls up a comparison of height/weight/medal for swimmers against the other sports that I explored.
To put a bow on it, I ask a simple question that I purposefully didn’t issue the appropriate query for:
Question: What country has the most medals?
HyperGraph AI Answer:
While I didn’t directly make this query, the HyperGraph Agent covered this question as part of the auto-exploration. We’ll go into more detail on the HyperGraph Agent in a future post.
What if we look at something that the HyperGraph Agent didn’t query on my behalf?
Question: What country has the most medals while having a low population size?
HyperGraph AI Answer:
Despite our dataset definitively lacking population data, HyperGraph AI leverages its contextual knowledge of world country populations to produce a relatively accurate result. The expected answers, after some Googling, are the Bahamas, Hungary, Finland, and Sweden. For such a vague question, this is incredibly neat to see.
Conclusion:
I don’t fault GPT for not having as detailed of an answer - it didn't have access to the raw data! So what are my takeaways?
Data is still key. None of the queries were possible without something resembling queryable data, and plenty of reconciliations were needed to clean up differently spelled names and annually changing events.
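As a tiny example of the kind of name reconciliation involved (the spelling variants here are invented for illustration, not taken from the real dataset):

```python
# Map differently spelled variants of the same athlete to a canonical form.
# The variants below are invented purely for illustration.
CANONICAL = {
    'phelps, michael': 'Michael Phelps',
    'michael phelps ii': 'Michael Phelps',
}

def normalize(name: str) -> str:
    """Collapse whitespace and casing, then look up a canonical spelling."""
    key = ' '.join(name.lower().split())
    return CANONICAL.get(key, key.title())

print(normalize('PHELPS, MICHAEL'))  # Michael Phelps
print(normalize('katie  ledecky'))   # Katie Ledecky
```

In practice this mapping table grows entry by entry as you spot mismatches, and the same idea applies to events that get renamed from one Games to the next.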
LLMs can rapidly accelerate how we learn about both conceptual knowledge and analytics around your private data. They aren’t amazing at math (yet), but have perfect memory of everything that they have been trained on.
To that point… it’s entirely possible that GPT would’ve known about all of the questions I asked, had it had access to sufficient training data. The internet starts to look small once you consider all of the possible human knowledge ever created. The future state is combining the power of AI while maintaining control and visibility for the analyst to bring that human knowledge to bear.
Why this matters:
Your analysts and data scientists issue queries every day. While a small percentage of those queries may make it into a dashboard, most will be stale and obscure in a matter of a few quarters. At the same time, there isn’t a public dataset being used for the GPT models that includes YOUR private business data. HyperArc automatically generates the building blocks of proprietary insights that will serve as the foundation of your future AI+BI efforts.
P.S.
The market is a bit nuts today, but go USA nonetheless!