
Well the API calls worked perfectly. The LLM didn’t misinterpret that.

The data extraction via tesseract worked too.

The whisper transcript was pretty good. Not perfect, but when you do this daily you are easily able to work around things.

The summaries of the calls were very useful. I could easily verify those because I was on the calls.

The interview - again, the transcript is great. The bulleted narrative was guided - again - by me having been on the call. I verify the quotes against the transcript, and against the audio if I’ve got any doubts.

Scrapers - again, they worked fine. The LLM didn’t misinterpret anything.

Podcasts - as before. Easy.

Article to voice - what’s to misinterpret?

Your criticism sounds like a lot of waffle with no understanding of how to use these tools.



How do you know a summary of a podcast you haven't listened to is accurate?


Firstly I am not summarising the podcast, simply using whisper to make a transcript.

But even if I was, because I do this multiple times a day and have been for quite some time, I know how to check for errors.

One part of that is a “fact check” built into the prompt, another part is feeding the results of that prompt back into the API with a second prompt and the source material and asking it to verify that the output of the first prompt is accurate.
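For anyone wondering what that two-pass check looks like in practice, here is a rough sketch in Python. It is not my actual code: the prompt wording is made up for illustration, and `call_model` is a hypothetical stand-in for whatever function sends a prompt to your LLM API and returns the text of the reply.

```python
from typing import Callable

# Hypothetical prompt templates, written for illustration only.
FIRST_PROMPT = (
    "Summarise the source material below as bullet points.\n"
    "Fact check: only include claims the source directly supports.\n\n"
    "SOURCE:\n{source}"
)
VERIFY_PROMPT = (
    "Check the DRAFT against the SOURCE. Reply 'OK' if every claim is "
    "supported, otherwise list each unsupported claim on its own line.\n\n"
    "SOURCE:\n{source}\n\nDRAFT:\n{draft}"
)


def summarise_and_verify(
    source: str, call_model: Callable[[str], str]
) -> tuple[str, str]:
    """Run the first prompt, then feed its output back with the source.

    `call_model` is an assumed wrapper around an LLM API: it takes a
    prompt string and returns the model's reply as a string.
    """
    # Pass 1: produce the draft, with the fact-check instruction built in.
    draft = call_model(FIRST_PROMPT.format(source=source))
    # Pass 2: ask the model to verify the draft against the source material.
    verdict = call_model(VERIFY_PROMPT.format(source=source, draft=draft))
    return draft, verdict
```

Anything other than a clean verdict from the second pass is a cue to go back to the transcript or the audio yourself. The second pass is an extra filter, not a replacement for checking.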

However, the level of hallucination has dropped massively over time, and when you’re using LLMs all the time you quickly become attuned to what’s likely to cause hallucinations and how to mitigate them.

I don’t mean this in an unpleasant way, but this question - and many of the other comments responding to my initial description of how I use LLMs - feels like the sort of thing that people with slightly hand-wavy experience of LLMs think, having played with the free version of ChatGPT back in the day.

Claude 3.7 is far removed from ChatGPT at launch, and even now ChatGPT feels like a consumer-facing product while Claude 3.7 feels like a professional tool.

And when you couple that with detailed, tried-and-tested prompts via the API in a multistage process, it is incredibly powerful.



