Hacker News
cma | 15 days ago | on: The path to ubiquitous AI (17k tokens/sec)
When the small model drafts the predictions, the big model can verify them as a batch, so it runs closer to the speed at which it processes input tokens, provided the predictions are good and the work doesn't have to be redone.
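The draft-then-batch-verify idea described here is speculative decoding. A minimal sketch, using toy deterministic "models" rather than real LLMs (the token-update rules, function names, and draft length `k` are all illustrative assumptions):

```python
def big_model_next(token):
    # Expensive "target" model: a deterministic toy next-token rule.
    return (token * 3 + 1) % 97

def small_model_next(token):
    # Cheap draft model: agrees with the big model except when
    # the input token is a multiple of 5 (simulating imperfect drafts).
    return (token * 3 + 1) % 97 if token % 5 else (token + 1) % 97

def speculative_decode(start, n_tokens, k=4):
    """Generate n_tokens: draft k tokens sequentially with the small
    model, then verify them all against the big model at once."""
    out = [start]
    while len(out) < n_tokens + 1:
        # Draft phase: small model proposes k tokens one by one.
        draft, t = [], out[-1]
        for _ in range(k):
            t = small_model_next(t)
            draft.append(t)
        # Verify phase: big model scores every draft position.
        # Here it's a loop; on real hardware this is one parallel
        # forward pass, like processing input (prompt) tokens.
        verified = [big_model_next(x) for x in [out[-1]] + draft[:-1]]
        for d, v in zip(draft, verified):
            if d == v:
                out.append(d)       # draft accepted
            else:
                out.append(v)       # big model's token replaces the miss
                break               # later drafts are invalid; redraft
    return out[1:n_tokens + 1]
```

Because every accepted token matches what the big model would have produced, the output is identical to decoding with the big model alone; the speedup comes from verifying several positions per big-model pass instead of one.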