I also like meaningful commit names, but I am sometimes guilty of “hope this works now” commits. They always follow a first fix that, it turns out, didn’t cut it.
I work on a lot of 2D systems, and the only way to debug is often to plot 1000s of results and visually check they behave as expected. Sometimes I will fix an issue, look at the results, and it seems resolved (it was present in, say, 100 cases), only to realize that there are still 5 cases where it persists. Sure, I could amend the last commit, but I actually keep it as a trace of “careful, this first version mostly did the job, but not quite”
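For what it's worth, the “plot thousands of results and eyeball them” workflow can be sketched roughly like this. Everything here is a made-up stand-in: the result generator, the injected residual bug, and the output file name are all hypothetical, just to show the grid-of-small-multiples idea.

```python
# A minimal sketch of batch-plotting many case results for visual inspection.
# fake_result() stands in for a real 2D computation; a residual bug is injected
# into a few cases to mimic the "still 5 cases where it persists" situation.
import math
import os

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def fake_result(case_id):
    """Stand-in for one 2D result: a curve that is subtly wrong in a few cases."""
    xs = [i / 50 for i in range(101)]
    bug = 0.3 if case_id % 40 == 0 else 0.0  # residual offset bug in a handful of cases
    ys = [math.sin(2 * math.pi * x) + bug for x in xs]
    return xs, ys

rows, cols = 10, 10  # 100 cases on one sheet; scale up for 1000s
fig, axes = plt.subplots(rows, cols, figsize=(20, 20))
for case_id, ax in enumerate(axes.flat):
    xs, ys = fake_result(case_id)
    ax.plot(xs, ys)
    ax.set_ylim(-1.5, 1.5)
    ax.set_title(str(case_id), fontsize=6)
    ax.set_xticks([])
    ax.set_yticks([])
fig.savefig("cases_overview.png", dpi=72)
plt.close(fig)
```

Scanning one big sheet like this makes the handful of off-looking panels jump out much faster than opening plots one by one.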
I mean “mistakes” can be hard to define. IMHO there is an area of responsibility between the LLM, the LLM user, and the code itself.
Did it make a mistake because it didn’t follow instructions properly, or because it hallucinated some content?
Did it make a mistake because the prompt was unclear/open to interpretation or plain wrong?
Did it make a mistake because it lacked some context? Or because it had too much context and started getting confused?
Is not handling edge cases automatically when that was not requested a mistake?
I am not just trying to defend LLMs; in many cases they make obvious mistakes and simply don’t follow my arguably clear instructions. But sometimes it is not so clear cut. Maybe I didn’t link a relevant file (you could argue it should have looked for it), maybe my prompt just wasn’t that clear, etc.
Take the example of code, though this extends to many domains: it can sometimes produce near-perfect architecture and implementation if I give it enough detail about the technical specifics and pitfalls, turning an 8-hour coding job into 1 hour of review work.
On the other hand, it can be very wrong while acting certain it is right. Just yesterday Claude tried gaslighting me into accepting that the bug I was seeing came from a piece of code that already had strong guardrails, and it was adamant that the part I suspected could in no way cause the issue. It turns out I was right, but I was starting to doubt myself.
I think over time we will find better usage patterns for these machines. Even putting a model in a position to gaslight the user seems like a complete failure in the usage model. Not critiquing you at all on this, it's how these models are marketed and what all the tooling is built around. But they are incredibly useful and I think once we figure out how to use them better we can minimise these downsides and make ourselves much more productive without all the failures.
Of course that won't happen until the bubble pops - companies are racing to make themselves indispensable and to completely corner certain markets and to do so they need autonomous agents to replace people.
I mean, I do agree, and on iNat I can clearly see my house and the houses of a few other people in the neighborhood. However, in the state I live in you can easily look up the current owner of any given house, and since we bought ours, that means our name.
I guess it is different once you look at people renting, and you could also track a specific person’s posts to see when they are posting away from home, for example. But as far as revealing your home address goes, sadly there are many other ways to do that in a lot of cases.
Some of the scenes from the video remind me of Manifold Garden [1] - only 3D but a 3-torus [2] and you can change the direction of gravity, i.e. what is up and down. And also visually beautiful.
The additional code he wrote to make a 4D game work in the Unity game engine is available on GitHub and MIT licensed: https://github.com/HackerPoet/Engine4D
When I first saw the title, I assumed this game was implemented in the same engine. I believe there are a few already.
I am not sure, but technically you don’t have to bet on the assassination itself. You can bet on an event that would happen as a result of said assassination: X won’t get re-elected. Company Y’s CEO will change in 2027. This is artist Z’s last tour. Athlete K won’t participate in this event. Etc.
It's the inverse; here is how you would have to bet (unless you plan to do the hands-on assassination work yourself):
X will get re-elected. Company Y's CEO will not change in 2027. This is not artist Z's last tour. Athlete K will participate in this event. Etc.
Like I said elsewhere in this thread, the bet has to be lost if you want your target dead.
The issue is the risk of insider trading coupled with the bias toward disaster-centric, or at least event-centric, betting. If you have the means to create an “out of the ordinary” event, you have a strong incentive to make it happen and to bet on it. These must be controllable events, so not natural or complex systems. On the gentler side, it would be sports fixing, which has always existed. On the worse side, it would be causing war, making economic decisions that will impact many, betting on people’s deaths, and so on. These kinds of things are seemingly already happening to a certain degree.
I am curious about how much energy needs to be expended to contain the antimatter. Say the matter/antimatter is to be used for propulsion or energy generation: can we reach a threshold where we are actually energy positive?
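As a back-of-envelope framing of that question: annihilating a mass m of antimatter with an equal mass of matter releases E = 2mc². The containment power figure below is a made-up placeholder (real trap/cryogenics power draw is not something I'm asserting), just to show how a break-even storage time would be computed.

```python
# Back-of-envelope energy balance for antimatter storage vs. annihilation yield.
# The containment power is a hypothetical placeholder, not a real number.
c = 2.998e8            # speed of light, m/s
m_antimatter = 1e-6    # 1 milligram of antimatter, in kg
energy = 2 * m_antimatter * c**2   # joules released when annihilated with 1 mg of matter

containment_power = 10e3           # hypothetical 10 kW to run traps and cryogenics
breakeven_seconds = energy / containment_power

print(f"{energy:.3e} J released")  # ~1.8e11 J for a 1 mg / 1 mg annihilation
print(f"break-even storage time: {breakeven_seconds / 86400:.0f} days")
```

In other words, you are energy positive only if the total energy spent on containment over the storage period stays below the annihilation yield; the longer you must store it, the worse the balance gets.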
Yes, it feels like a full-time job just to try to keep up. And I’ve been in AI for close to 10 years, so I feel like I have to keep up at least minimally.
Another thing for me is that it has gotten a lot harder for small teams with few resources, let alone one person, to release anything that can really compete with what the big players put out.
Quite a few years back I was working on word2vec models / embeddings. With enough time and limited resources I was able, through careful data collection and preparation, to produce models that outperformed existing embeddings on our fairly generic data retrieval tasks. You could download models from Facebook (fastText) or other models available through gensim and similar tools, and they were often larger embeddings (e.g. 1000 dimensions vs 300 for mine), but they would really underperform. And when evaluating on the general benchmarks that existed back then, we were basically equivalent to the best models in English and French, if not a little better at times. Similarly, some colleagues later built a new architecture inspired by BERT after it came out that again outperformed any existing models we could find.
But these days I feel like there is not much I can do in NLP. Even to fine-tune or distill the larger models, you need a very beefy setup.
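The kind of retrieval task mentioned above usually comes down to cosine-similarity nearest-neighbour lookup over dense embeddings. A minimal sketch, with tiny 4-dimensional made-up vectors standing in for real 300-dimensional word2vec output (the document names and query are hypothetical):

```python
# Embedding-based retrieval sketch: rank documents by cosine similarity to a
# query vector. The 4-d vectors are toy stand-ins for real 300-d embeddings.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# hypothetical document embeddings (e.g. averaged word vectors per document)
docs = {
    "cat_care":  [0.9, 0.1, 0.0, 0.2],
    "dog_care":  [0.5, 0.5, 0.3, 0.1],
    "tax_forms": [0.0, 0.1, 0.9, 0.7],
}
query = [0.85, 0.2, 0.05, 0.15]  # embedding of a pet-related query, say

# rank documents from most to least similar to the query
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])  # the most similar document
```

Notably, nothing in this evaluation depends on embedding size, which is why a well-trained 300-d model can beat a generic 1000-d one on a specific domain.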