Using larger contexts often costs more in the APIs or consumes more of your quota, but this is becoming less of a problem as models adopt more clever attention mechanisms rather than full attention on every layer.
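To make the cost difference concrete, here's a minimal numpy sketch comparing how many token pairs a full causal attention layer touches versus a sliding-window layer. The window size and sequence length are arbitrary stand-ins, and real models interleave local and full-attention layers in patterns this doesn't capture:

    import numpy as np

    def full_mask(n):
        # Every token attends to itself and all earlier tokens: ~n^2/2 pairs.
        return np.tril(np.ones((n, n), dtype=bool))

    def window_mask(n, w):
        # Each token only attends to the last w tokens: ~n*w pairs.
        m = full_mask(n)
        for i in range(n):
            m[i, : max(0, i - w + 1)] = False
        return m

    n, w = 1024, 128
    print("full attention pairs:    ", full_mask(n).sum())
    print("windowed attention pairs:", window_mask(n, w).sum())

Windowed layers scale linearly with context length instead of quadratically, which is why replacing even some of the full-attention layers cuts the cost of long sessions.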
This is also something of a non-issue, because as context grows and attention gets diluted, the models perform worse. It'll cost Anthropic more to run your 900k-token session, yes, but it's in your interest not to have a 900k session in the first place.
Fil-C is much slower; there's no free lunch. If you want a language to be both fast and memory safe, you need to add restrictions that allow proper static analysis of the code.
MiniMax M2.7, MiMo-V2-Pro, GLM-5, GLM5-turbo, Kimi K2.5, DeepSeek V3.2, Step 3.5 Flash (this last one is particularly cheap while still being powerful).
It doesn't use an extra model (so it supports every language that works with Whisper out of the box and uses less memory); it works by applying Dynamic Time Warping to the cross-attention weights.
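For the curious, a minimal sketch of the idea (not the actual implementation): treat the cross-attention matrix as a token-vs-audio-frame similarity, negate it into a cost matrix, and run monotonic DTW to find which frames each text token aligns to. The shapes, the 20 ms frame duration, and the random "attention" matrix below are all illustrative stand-ins:

    import numpy as np

    def dtw_path(cost):
        """Monotonic DTW: cheapest path from (0, 0) to (n-1, m-1)."""
        n, m = cost.shape
        acc = np.full((n + 1, m + 1), np.inf)
        acc[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                acc[i, j] = cost[i - 1, j - 1] + min(
                    acc[i - 1, j],      # advance token
                    acc[i, j - 1],      # advance frame
                    acc[i - 1, j - 1],  # advance both
                )
        # Backtrack from the end to recover the alignment path.
        path, i, j = [], n, m
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        return path[::-1]

    # Pretend cross-attention: one row per text token, one column per frame.
    rng = np.random.default_rng(0)
    attn = rng.random((5, 40))    # hypothetical attention weights
    cost = -attn                  # high attention = low alignment cost
    path = dtw_path(cost)

    # A token's time span = first/last frame it aligns to (20 ms assumed).
    frame_sec = 0.02
    for tok in range(attn.shape[0]):
        frames = [f for t, f in path if t == tok]
        print(f"token {tok}: {frames[0] * frame_sec:.2f}s - {frames[-1] * frame_sec:.2f}s")

Because the path is forced to be monotonic, tokens can't align to frames out of order, which is what makes the resulting word timestamps coherent even when individual attention heads are noisy.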
The main problem, I think, was that it was extremely slow.