We look at how comparative advantage from economics applies to LLM inference - some GPUs are relatively better at FLOPs, others at memory bandwidth. What happens if you let each do what it’s best at?
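To make the comparative-advantage idea concrete, here is a minimal Python sketch. The two GPUs and their specs are made-up numbers for illustration, not real hardware, and using the FLOPs-per-bandwidth ratio is just one simple way to express relative strength.

```python
# Hypothetical specs, invented for illustration (not real hardware numbers).
gpus = {
    "gpu_a": {"tflops": 1000.0, "bandwidth_tbps": 3.0},
    "gpu_b": {"tflops": 200.0, "bandwidth_tbps": 1.0},
}


def flops_per_bandwidth(spec):
    """TFLOPs available per TB/s of memory bandwidth."""
    return spec["tflops"] / spec["bandwidth_tbps"]


ratios = {name: flops_per_bandwidth(spec) for name, spec in gpus.items()}

# The GPU with the higher ratio is *relatively* better at compute-bound work;
# the one with the lower ratio is *relatively* better at bandwidth-bound work.
compute_specialist = max(ratios, key=ratios.get)
bandwidth_specialist = min(ratios, key=ratios.get)

print("compute-bound work ->", compute_specialist)
print("bandwidth-bound work ->", bandwidth_specialist)
```

Note that gpu_a is faster at both in absolute terms, yet gpu_b still holds the comparative advantage in bandwidth-bound work, because it gives up fewer FLOPs per unit of bandwidth.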
Hey there - TitanML is these guys: https://www.titanml.co/ . I think the impressive thing isn't whether the model is good (although it is a good model, especially when fine-tuned), but how much faster it runs on CPU with the TitanML server than it did before.
I'm not a lawyer, but if you've been using it commercially, I believe you have the trademark. Theoretically, you could sue them for infringement, although I don't know whether you could claim damages or just get them to stop using it.