The LLM outputs a vector of probabilities over tokens, and the user of the model picks one token from the list of most likely candidates using a random number.
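
As a rough illustration of that picking step, here is a minimal sketch in Python. It assumes the "most likely list" means the k highest-probability tokens (often called top-k sampling), which the text does not state explicitly. The function name `sample_top_k`, the parameter `k`, and the example probabilities are all made up for illustration.

```python
import random

def sample_top_k(probs, k=3, rng=random):
    """Pick a token index from the k most likely entries of probs (hypothetical helper)."""
    # Indices of the k highest-probability tokens: the "most likely list".
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Scale one random number in [0, 1) to the shortlist's total probability.
    total = sum(probs[i] for i in top)
    r = rng.random() * total
    # Walk the shortlist until the running sum passes the random number.
    cumulative = 0.0
    for i in top:
        cumulative += probs[i]
        if r < cumulative:
            return i
    return top[-1]  # guard against floating-point round-off

probs = [0.05, 0.50, 0.10, 0.30, 0.05]  # made-up probabilities for 5 tokens
token = sample_top_k(probs, k=3, rng=random.Random(0))
```

With these numbers the shortlist is tokens 1, 3 and 2, so `token` is always one of those three; setting `k=1` removes the randomness and always returns the single most likely token.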