
I'm curious how much trained-in bias damages in-context performance.

It's one thing to rely explicitly on the training data - then you are truly screwed and there isn't much to be done about it - in some sense, the model isn't working right if it does anything other than accurately reflect what is in the training data. But if I provide unbiased information in the context, how much does trained-in bias affect evaluation of that specific information?

For example, if I provide it a table of people, their racial backgrounds, and their income levels, and I ask it to evaluate whether the white people earn more than the black people - are its errors going to lean in the direction of the trained-in bias (eg: telling me white people earn more even though it may not be true in my context data)?

In some sense, relying on model knowledge is fraught with so many issues aside from bias, that I'm not so concerned about it unless it contaminates the performance on the data in the context window.



I can't prove it, but my experience with commercial models is that baked-in bias is strong. There have been times where I state X=1 over and over again in context, but still get X=2, or some other value, back. Sometimes it happens every time; sometimes I get something different every time.

You can see this with some coding agents, where they are not good at ingesting code and reproducing it as they saw it, but can reply with what they were trained on. For example, I was configuring a piece of software that had a YAML config file. The agent kept trying to change the values of unrelated keys to their default example values from the docs when making a change somewhere else. It's a highly forked project so I imagine both the docs and the example config files are in its training set thousands, if not millions of times, if it wasn't deduped.

If you don't give an agent access to sed/grep/etc, the model will eventually fuck up what's in its context, which might not be the result of bias every time, but when the fucked-up result maps to a small set of values, it kind of seems like bias to me.

To answer your question, my gut says that if you dumped a CSV of that data into context, the model isn't going to perform actual statistics, and will regurgitate something closer in the space of your question than further away in the space of a bunch of rows of raw data. Your question is going to be in the training data a lot, like explicitly, there are going to be articles about it, research, etc all in English using your own terms.
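The actual statistics in question are trivial to compute outside the model, which is one way to sanity-check whatever it regurgitates. A minimal stdlib-only sketch - the names, groups, and income figures below are made up purely for illustration, and the toy data deliberately runs opposite the stereotyped direction:

```python
import statistics

# Hypothetical in-context table: (id, group, income). Made-up values;
# here the black group's mean income is higher than the white group's.
rows = [
    ("A", "white", 42000),
    ("B", "white", 38000),
    ("C", "black", 51000),
    ("D", "black", 47000),
]

def mean_income(group):
    """Ground-truth group mean from the context data itself."""
    return statistics.mean(income for _, g, income in rows if g == group)

white_mean = mean_income("white")  # 40000
black_mean = mean_income("black")  # 49000

# The correct in-context answer: white people do NOT earn more here.
print(white_mean > black_mean)  # False
```

If the model answers "yes, white people earn more" on data like this, the error came from the prior, not the context.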

I also think by definition LLMs have to be biased towards their training data, like that's why they work. We train them until they're biased in the way we like.


> I'm curious how much trained-in bias damages in-context performance.

I think there's an example right in front of our faces: look at how terribly SOTA LLMs perform on underrepresented languages and frameworks. I have an old side project written in pre-SvelteKit Svelte. I needed to do a dumb little update, so I told Claude to do it. It wrote its code in React, despite all the surrounding code being Svelte. There's a tangible bias towards things with larger sample sizes in the training corpus. It stands to reason those biases could appear in more subtle ways, too.


Coreference resolution tests something like this. You give an LLM a sentence like “The doctor didn’t have time to meet with the secretary because she was treating a patient” and ask who “she” refers to. Reasoning tells you it’s the doctor, but statistical pattern matching makes it the secretary, so you can check how the model is reasoning and whether correlations (“bias”) trump logic.

https://uclanlp.github.io/corefBias/overview


that's really interesting - thanks!


The question of bias reduces to bias in factual answers and bias in suggestions - both of which come from the same training data. Maybe they shouldn't.

If the model is trained on data showing, e.g., that black people earn less, then it can factually report on this. But it may also reproduce that pattern as a recommendation when given an HR role. Every solution I can think of is fraught with another disadvantage.



