Hacker News | mezark's comments

We look at how comparative advantage from economics applies to LLM inference - some GPUs are relatively better at FLOPs, others at memory bandwidth. What happens if you let each do what it’s best at?
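To make the comparative-advantage framing concrete, here is a minimal sketch (not from the linked post; the GPU names and spec numbers are made up). The idea: prefill is compute-bound and decode is memory-bandwidth-bound, so each phase goes to the GPU with the better FLOPs-to-bandwidth ratio for it.

```python
# Illustrative only: assign inference phases by comparative advantage.
def assign_phases(gpus):
    """gpus: dict name -> (tflops, bandwidth_gb_s).
    Prefill (compute-bound) goes to the GPU with the highest
    FLOPs-per-bandwidth ratio; decode (bandwidth-bound) to the lowest."""
    ranked = sorted(gpus, key=lambda n: gpus[n][0] / gpus[n][1], reverse=True)
    return {"prefill": ranked[0], "decode": ranked[-1]}

# Hypothetical specs: gpu_a is compute-rich, gpu_b relatively bandwidth-rich.
gpus = {"gpu_a": (989, 3350), "gpu_b": (312, 2039)}
print(assign_phases(gpus))  # prefill -> gpu_a, decode -> gpu_b
```

Note that gpu_a is better at both in absolute terms; comparative advantage says to compare the ratios, which is exactly what the sort key does.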


Huge congrats - the latency graphs in particular really show the value of these specialised systems!


TitanML Takeoff Inference Server demonstrating controlled generation
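For readers unfamiliar with "controlled generation": the core trick is constraining which tokens the model may emit next, typically by masking logits. A minimal, framework-free sketch (the vocabulary and logit values here are made up, and this is the general technique, not Takeoff's specific implementation):

```python
import math

def constrain(logits, vocab, allowed):
    # Set disallowed tokens' logits to -inf so softmax gives them probability 0.
    return [l if t in allowed else -math.inf for l, t in zip(logits, vocab)]

vocab = ["yes", "no", "maybe", "banana"]
logits = [1.2, 0.7, 2.5, 3.9]

# Force the model to answer only "yes" or "no".
masked = constrain(logits, vocab, {"yes", "no"})
best = vocab[max(range(len(vocab)), key=lambda i: masked[i])]
print(best)  # greedy pick among the allowed tokens: "yes"
```

Even though "banana" had the highest raw logit, the mask guarantees the output stays within the allowed set.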


Drop-in replacement for HF's TGI server. The fastest and easiest way to run LLM inference locally

GitHub: https://github.com/titanml/takeoff
Docs: https://docs.titanml.co/docs/titan-takeoff/getting-started
Discord: https://discord.gg/83RmHTjZgf
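Since the server is described as a drop-in replacement for Hugging Face's TGI, a client would talk to it through TGI's `/generate` endpoint. A hedged sketch - the host, port, and parameter values here are assumptions for illustration:

```python
def build_generate_request(prompt, max_new_tokens=64,
                           base_url="http://localhost:8080"):
    """Build a TGI-style /generate request as (url, payload)."""
    payload = {"inputs": prompt,
               "parameters": {"max_new_tokens": max_new_tokens}}
    return f"{base_url}/generate", payload

url, payload = build_generate_request("What is comparative advantage?")
print(url)
# To actually send it (requires a running server):
#   import requests; requests.post(url, json=payload).json()
```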


Falcon 7B running real time on CPU


The linked video seems to have no context provided. What is a TitanML server? Is 7B actually that useful? How does the model compare to others? Etc.


Hey there - TitanML is these guys: https://www.titanml.co/. I think the impressive thing isn't actually whether the model is good (although it is a good model, especially when fine-tuned) - but how fast this model runs on CPU with the TitanML server compared with before.


Annoying, because they stole my company's name (TitanML - https://www.titanml.co/). Fortunately they haven't trademarked it, but it's still not ideal.


I'm not a lawyer, but if you've been using it commercially, I believe you have the trademark. Theoretically, you could sue them for infringement, although I don't know whether you could claim damages or just get them to stop using it.

