Hacker News | latemedium's comments

We need to know if big AI labs are explicitly training models to generate SVGs of pelicans on bicycles. I wouldn't put it past them. But it would be pretty wild if they did!


I think part of the reason why only a few people write custom CUDA / Triton kernels is that it's really hard to do well. Languages like Mojo aim to make that much easier, so hopefully more people will be able to write them (and do other interesting things with GPUs that are too technically challenging right now).


The only question is whether there will be a benefit to writing your own kernels in something like Mojo, versus skipping that part altogether and using the already highly optimized primitives that frameworks like torch provide, especially when it comes to performance.


There's an important difference between Gemini and Claude that I'm not sure how to quantify. I often use shell-connected LLMs (LLMs with a shell tool enabled) to take care of basic CSV munging / file-sorting tasks for me - I work in data science so there's a lot of this. When I ask Claude to do something, it carefully looks at all the directories and files before doing anything. Gemini, on the other hand, blindly jumps in and just starts moving stuff around. Claude executes more tools and is a little slower, but it almost always gets the right answer because it appropriately gathers the right context before really trying to solve the problem. Gemini doesn't seem to do this at all, and it makes a world of difference for my set of problems. Curious to see if others have had the same experience or if it's just a quirk of my particular set of tasks.


Claude has always been the best at coding. No matter what all the benchmarks say, the people have spoken, and the consensus is that Claude is the best.


What's a shell-connected LLM, and how do you set that up?


Look up Claude Code, Cursor, Aider and VSCode's agent integration. Generally, tools to use AI more actively for development. There are others as well. Plenty of info around. Here's not the place for a tutorial.


My experience is starkly different. Today I used LLMs to:

1. Write python code for a new type of loss function I was considering

2. Perform lots of annoying CSV munging ("split this CSV into 4 equal parts", "convert paths in this column into absolute paths", "combine these and then split into 4 distinct subsets based on this field.." - they're great for that)

3. Expedite some basic shell operations like "generate softlinks for 100 randomly selected files in this directory"

4. Generate some summary plots of the data in the files I was working with

5. Not to mention extensive use in Cursor & GH Copilot

The tool (Claude 3.7 mostly, integrated with my shell so it can execute shell commands and run python locally) worked great in all cases. Yes, I could've done most of it myself, but I personally hate CSV munging and bulk file manipulations, and it's super nice to delegate that stuff to an LLM agent.
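For a sense of scale, a request like "split this CSV into 4 equal parts" typically comes back as a few lines of pandas. A minimal sketch of the kind of script such a request might produce (the file names and function name here are illustrative, not from any actual session):

```python
import numpy as np
import pandas as pd

def split_csv(path: str, n_parts: int = 4) -> list[int]:
    """Split a CSV into n near-equal parts, writing part{i}.csv files.

    Returns the row count of each part.
    """
    df = pd.read_csv(path)
    # np.array_split tolerates a length that isn't divisible by n_parts
    parts = np.array_split(df, n_parts)
    for i, part in enumerate(parts):
        part.to_csv(f"part{i}.csv", index=False)
    return [len(p) for p in parts]
```

The point isn't that this code is hard to write - it's that delegating it means never opening the pandas docs for a throwaway task.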

edit: formatting


These seem like fine use cases: trivial boilerplate stuff you’d otherwise have to search for and then munge to fit your exact need. An LLM can often do both steps for you. If it doesn’t work, you’ll know immediately and you can probably figure out whether it’s a quick fix or if the LLM is completely off-base.


That’s fair, but those are totally different use cases from the ones the linked post discusses.


The clickbait title is “I genuinely don’t understand how some people are still bullish about LLM”.

I guess the author can understand now?

When something was impossible only 3 years ago, barely worked 2 years ago, but works well now, there are very good reasons to be bullish, I suppose?

The hype cuts both ways.


> When something was impossible only 3 years ago, barely worked 2 years ago, but works well now

What exactly are you referring to? What are you saying works well now and did not years ago? Claude as a milestone of code writing?

Also, in that case, if the current apparent successes come from a realm of tentative responses, we would need proof that the unreliable has become reliable. The observer will say: "they were tentative before, they often look tentative now, so why should we think they will pass the threshold to a radical change?"


How did you integrate Claude into your shell?


I wrote my own tool for that a while back as an LLM plugin, so I can do this:

    llm cmd extract first frame of movie.mp4 as a jpeg using ffmpeg
I use that all the time, it works really well (defaulting to GPT-4o-mini because it's so cheap, but it works with Claude too): https://simonwillison.net/2024/Mar/26/llm-cmd/


I hacked something together a while back - a hotkey toggles between standard terminal mode and LLM mode. LLM mode interacts with Claude and has functions / tool calls to run shell commands, python code, web search, clipboard, and a few other things. For routine data science tasks it's been super useful. Claude 3.7 was a big step forward because it will often examine files before it begins manipulating them and double-check that things were done correctly afterwards (without prompting!). For me this works a lot better than other shell-integration solutions like Warp.
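The local plumbing for this kind of setup is mostly a dispatch table: the model emits a tool name plus arguments, and the client routes that to a local function and returns the output. A minimal sketch of that dispatch side, with the tool-call dict shape and function names being my own assumptions rather than any specific API:

```python
import subprocess

def run_shell(command: str) -> str:
    """Run a shell command locally and return its combined output.

    Hypothetical tool implementation; a real setup would add timeouts
    and some confirmation step before destructive commands.
    """
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

# Registry mapping tool names (as advertised to the model) to local functions.
TOOLS = {"run_shell": run_shell}

def dispatch(tool_call: dict) -> str:
    """Route a model-issued tool call to the matching local function."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])
```

The loop around this (send the tool result back to the model, let it decide the next call) is what produces the "examine first, then act, then verify" behavior described above.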


Claude Code is available directly from Anthropic, but you have to request an invite as it's in "Research Preview".

There are third party tools that do the same, though


I'm reminded of the metaphor that these models aren't constructed, they're "grown". It rings true in many ways - and in this context they're like organisms that must be studied using traditional scientific techniques that are more akin to biology than engineering.


Sort of.

We don’t precisely know the most fundamental workings of a living cell.

Our understanding of the fundamental physics of the universe has some holes.

But for LLMs and statistical models in general, we do know precisely what the fundamental pieces do. We know what processor instructions are being executed.

We could, given enough research, have absolutely perfect understanding of what is happening in a given model and why.

Idk if we’ll be able to do that in the physical sciences.


Having spent some time working with both molecular biologists and LLM folks, I think it's a pretty good analogy.

We know enough quantum mechanics to simulate the fundamental workings of a cell pretty well, but that's not a route to understanding. To explain anything, we need to move up an abstraction hierarchy to peptides, enzymes, receptors, etc. But note that we invented those categories in the first place -- nature doesn't divide up functionality into neat hierarchies like human designers do. So all these abstractions are leaky and incomplete. Molecular biologists are constantly discovering mechanisms that require breaking the current abstractions to explain.

Similarly, we understand floating point multiplication perfectly, but when we let 100 billion parameters set themselves through an opaque training process, we don't have good abstractions to use to understand what's going on in that set of weights. We don't have even the rough equivalent of the peptides or enzymes level yet. So this paper is progress toward that goal.

