Why _would_ you expect anything different to happen at a massive company?
It's not like the team who writes the drivers is likely to know of the team working on optimizing compilers, profilers, or anything at all, really.
My experience has been that especially in companies working in diverse disciplines across disparate codebases, very little is shared. A team of 8 in a tiny company is just as likely to make the same mistakes as a team of 8 in a bigger company. At large companies with more unified codebases and disciplines, maybe one person or team has added some process which helps identify egregious performance issues at some point in the past. But such shared process or tooling would be really hard to build at a company like Intel, where one team makes open-source Linux drivers while another makes highly specialized RTL design software, for example.
>Why _would_ you expect anything different to happen at a massive company?
Because a massive company has enough money around to put the processes in place and hire skilled people to do both deep[0] testing and system[1] testing.
[1] The definition of "system testing" I'm using: "Testing to assess the value of the system to people who matter." Those include stakeholders, application developers, end users, etc.
Maybe they did profile it and this fix is the result. Or maybe Vulkan raytracing on Linux for an unreleased GPU is lower priority and they just recently got around to noticing it.
Massive companies are more prone to silly errors like this.
Source: I work for a similar massive company. You would not believe the number of issues similar to this. This one is getting attention because it happened in open-source code.
Does the company have people whose job description includes looking for deeper problems such as this one?
I don't know what your position or political standing in the company is, but I assume that with the tech job market the way it is, if you still work there you care about the company to some degree. So perhaps bringing this issue up with (more) senior management is the way to go.
And if they say there is no budget, or that it would take a bureaucratic nightmare to make space for it in the budget, ask them what the budget is for dealing with PR disasters such as this one.
That's really ignorant given that Intel has thousands of software engineers supporting hundreds of open-source projects you use daily, including Linux, where Intel has consistently been a top-ten contributor for years.
This mistake could easily have been made in other vendors' Linux GPU drivers too; in the end those don't have nearly the same priority (and in turn resources) as the Windows GPU drivers. It's a really easy mistake to make, and also a very easy one to find. And I don't know if anyone even cared about ray tracing with Intel integrated graphics on Linux desktops (and in turn nobody profiled it deeply); ray tracing is generally something you're much less likely to do on an integrated GPU.
And sure, their software department(s?) probably have a lot of potential for improvement; they have likely been hampered by the same internal structures which led to Intel faceplanting somewhat hard recently.
Even so, the very first thing anybody learns about GPU programming is to use the VRAM on the card whenever possible, and to minimize transfers back and forth between VRAM and main memory. This is a super basic mistake that should have been caught by some kind of test suite, at least.
Intel's high-level software teams are okay, and their hardware teams are great, but their firmware teams are a bit of a garbage fire. I assume that nobody really wants to work on firmware, and the organization does not encourage it.
I'm not sure this is something you'd easily find through profiling. The change was making a memory allocation use GPU memory rather than system memory. Allocating system memory probably isn't noticeably slower than allocating GPU memory, so the line that's at fault wouldn't show up when profiling. Instead, memory access in GPU-side raytracing code is just a bit slower when accessing the allocated memory.
So you would have to profile GPU-side code, which is probably really hard; and you'd have to find slow memory accesses, not slow code or slow algorithms, which is even harder. And those memory accesses may be spread out, so that each instruction which uses the slow memory won't stand out; the effect may only be noticeable in aggregate.
People working at big companies are ALWAYS worried about releasing lots of code to fulfill some monthly or quarterly goal. The idea that they have time to profile, improve, or check results is inconsistent with reality. When you see real code produced at big companies, it is barely good enough to satisfy the requirements, forget about any sense of high quality.
Not just Intel but programmers in general have got to demand better tools and use the tools they have. This is an obvious problem if you look for it. Profiling needs to be on every programmer's checklist.
That could be a GPU memory leak in an application, no? When an application allocates GPU memory, that's taken from main system memory on integrated chips, and the Intel driver would be responsible for that.