Author here. I agree with you - the number of metrics I can experiment with in P...

Tade0 · on May 4, 2025

If I may, I would like to propose an, ahem, sport:

https://m.youtube.com/@JellesMarbleRuns

Greg Woods' commentary really brings this world of marble racing to life.

pncnmnp · on May 4, 2025

Hehehe! I love Jelle's Marble Runs - long-time subscriber. John Oliver introduced me to it - https://www.youtube.com/watch?v=z4gBMw64aqk

rybosome · on May 4, 2025

This is a great premise, and that underlying pipeline you mention sounds like a generally useful system for live commentary with the appropriate abstractions.

I’m curious to know more about how you retrieve from this ecosystem of data to add color. You mentioned nearest neighbor search, is that over game state? How is the data stored and queried?

pncnmnp · on May 4, 2025

Absolutely! I can elaborate on that part.

The code starts by simulating 15 tournament years (say, 2010 to 2024), each containing 4 Grand Slam tournaments held in a knockout format. There are 64 players in the pool, all starting with an initial Elo rating.

These players compete in the tournaments, with outcomes predicted from their Elo ratings. Ratings are updated after each match, and players are ranked solely by Elo. Once the simulation completes, it has generated a wealth of data: for each game, details such as points scored, points allowed, fastest ball speed, number of aces, point-by-point results, and more are simulated.
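To make the simulation loop above concrete, here is a minimal sketch, not the project's actual code. The starting rating of 1500, the K-factor of 32, the random bracket seeding, and all function names are my assumptions; only the 15 years, 4 tournaments per year, 64 players, and Elo-driven knockout come from the description.

```python
import random

K = 32  # assumed K-factor; the comment above does not state one

def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def play_match(ratings, a, b, rng):
    """Draw a winner from the Elo win probability, then update both ratings."""
    p_a = expected_score(ratings[a], ratings[b])
    winner, loser = (a, b) if rng.random() < p_a else (b, a)
    p_winner = p_a if winner == a else 1.0 - p_a
    delta = K * (1.0 - p_winner)  # upsets move ratings more
    ratings[winner] += delta
    ratings[loser] -= delta
    return winner

def run_knockout(ratings, players, rng):
    """Single-elimination bracket; len(players) must be a power of two."""
    field = list(players)
    rng.shuffle(field)  # assumed: random draw, no seeding
    while len(field) > 1:
        field = [play_match(ratings, field[i], field[i + 1], rng)
                 for i in range(0, len(field), 2)]
    return field[0]

rng = random.Random(0)
ratings = {f"player_{i}": 1500.0 for i in range(64)}  # assumed initial rating
champions = []
for year in range(2010, 2025):   # 15 tournament years
    for slam in range(4):        # 4 tournaments per year
        champions.append((year, slam, run_knockout(ratings, list(ratings), rng)))
```

Per-match details (aces, ball speed, point-by-point results) would be generated inside `play_match` in a fuller version; they are left out here to keep the sketch short.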

We can then cache and use this information for a ton of color commentary. For example, we can identify the GOATs of the game, highlight players who are performing exceptionally well, pinpoint underdogs, find matches similar to the one currently being played, etc.

However, I am just scratching the surface. Imagine having a function that considers "age" alongside Elo. Then you could simulate performance based on age as well, and show things like the younger generation overtaking older players, or veterans still competing despite being past their prime. With a function like this, you could simulate matches spanning the past 75-100 years, generating a ton of nice data to analyze.
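One way such an age-aware function could look, purely as an illustration: subtract a penalty from Elo that grows with distance from a peak age, then feed the adjusted ratings into the usual Elo win probability. The quadratic shape, the peak age of 26, and the penalty constant are all invented for this sketch.

```python
def age_adjusted_rating(elo, age, peak_age=26.0, penalty_per_year_sq=1.5):
    """Hypothetical adjustment: rating drops quadratically away from peak_age."""
    return elo - penalty_per_year_sq * (age - peak_age) ** 2

def win_probability(elo_a, age_a, elo_b, age_b):
    """Elo win probability for A, computed on age-adjusted ratings."""
    rating_a = age_adjusted_rating(elo_a, age_a)
    rating_b = age_adjusted_rating(elo_b, age_b)
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Same Elo, but one player is at peak age and the other is a 36-year-old veteran.
p_young = win_probability(1600.0, 26, 1600.0, 36)
```

With these made-up constants the veteran is penalized by 150 rating points, so the younger player is favored even at equal Elo.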

Data itself is not fun - you need nice metrics too, for fun correlations! See https://en.wikipedia.org/wiki/Baseball_statistics for inspiration. The metrics don’t have to be perfect; after all, humans aren’t perfect. The key is engagement.

To find similar games, I store and cache all historical matches in a KD-tree, then use a nearest-neighbor (NN) search to find matches similar to the current one - that's quite fast!
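For anyone curious what the KD-tree lookup might look like, here is a small pure-Python sketch. The feature vector (aces, fastest serve in km/h, points won), the match IDs, and the function names are made up for illustration; the real project may well use a library and different features.

```python
import heapq

def build_kdtree(points, depth=0):
    """points: list of (feature_vector, match_id). Returns nested tuples or None."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2  # median point becomes this node
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, k=1, depth=0, heap=None):
    """Return up to k (squared_distance, match_id) pairs, closest first."""
    top_level = heap is None
    if top_level:
        heap = []  # max-heap of the best k so far, via negated distances
    if node is not None:
        (vec, match_id), left, right = node
        d2 = sum((a - b) ** 2 for a, b in zip(vec, target))
        if len(heap) < k:
            heapq.heappush(heap, (-d2, match_id))
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, match_id))
        axis = depth % len(target)
        diff = target[axis] - vec[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        nearest(near, target, k, depth + 1, heap)
        # Visit the far side only if the splitting plane is closer than our worst hit.
        if len(heap) < k or diff * diff < -heap[0][0]:
            nearest(far, target, k, depth + 1, heap)
    if top_level:
        return sorted((-neg_d2, m_id) for neg_d2, m_id in heap)
    return None

# Invented historical matches: (aces, fastest serve km/h, points won)
history = [
    ((12.0, 210.0, 98.0), "2012-slam1-final"),
    ((3.0, 185.0, 70.0), "2015-slam3-r1"),
    ((11.0, 208.0, 101.0), "2019-slam2-semi"),
    ((7.0, 195.0, 85.0), "2021-slam4-qf"),
]
tree = build_kdtree(history)
similar = nearest(tree, (11.5, 209.0, 99.0), k=2)
```

In practice the features would probably need scaling first, since raw Euclidean distance lets large-valued columns like serve speed dominate.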

Some commentary can also be dynamically generated at runtime - for example, locker-room whispers. It is important to provide GPT with a decent historical window to avoid generating contradictory info in such cases.
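A rough sketch of that idea, assuming the "historical window" means the most recent lines of commentary already produced. The class name, window size, and prompt wording are all hypothetical, and no model call is shown; this only assembles the text you would send.

```python
from collections import deque

class CommentaryMemory:
    """Keeps the last max_lines generated lines so new prompts can include them."""

    def __init__(self, max_lines=20):
        self.lines = deque(maxlen=max_lines)  # oldest lines fall off automatically

    def remember(self, line):
        self.lines.append(line)

    def build_prompt(self, request):
        history = "\n".join(f"- {line}" for line in self.lines)
        return (
            "Earlier commentary (stay consistent with it):\n"
            f"{history}\n\n"
            f"Now write: {request}"
        )

memory = CommentaryMemory(max_lines=2)
memory.remember("Rumor has it player_7 switched coaches last week.")
memory.remember("player_7 looked sharp in practice this morning.")
memory.remember("player_12 is reportedly nursing a sore wrist.")
prompt = memory.build_prompt("a locker-room whisper about player_12")
```

The window size is a trade-off: too small and the model forgets what it already claimed, too large and the prompt gets long and expensive.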