Hacker News | bjackman's comments

You are always gonna have some downtime in a homelab setup I think. Unless you go all in with k8s I think the best you can do is "system reboots at 4AM, hopefully all the users are asleep".

(Probably a lot of the services I run don't even really support HA properly in a k8s system with replicas. E.g. taking global exclusive DB locks for the lifetime of their process)


> You are always gonna have some downtime in a homelab setup I think. Unless you go all in with k8s I think the best you can do is "system reboots at 4AM, hopefully all the users are asleep".

Huh, why? I have a homelab, I don't have any downtime except when I need to restart services after changing something, or upgrading stuff, but that happens what, once every month in total, maybe once every 6 months or so per service?

I use systemd units + NixOS for 99% of the stuff. Not sure why you'd need Kubernetes at all here: it only serves to complicate, not simplify, especially as a way to avoid downtime; those are two very orthogonal things.


> I don't have any downtime except when I need to restart services

So... you have downtime then.

(Also, you should be rebooting regularly to get kernel security fixes).

> not sure why you'd need Kubernetes at all here

To get HA, which is what we are talking about.

> only serves to complicate

Yes, high-availability systems are complex. This is why I am saying it's not really feasible for a homelabber; unless you're a k8s enthusiast, I think the right approach is to tolerate downtime.


> So... you have downtime then.

5 seconds of downtime as you change from port N to port N+1 is hardly "downtime" in the traditional sense.

> To get HA, which is what we are talking about.

Again, not related to Kubernetes at all; you can do it more easily with shell scripts, and HA !== orchestration layer.


I run my stuff in a local k8s cluster and you are correct, most stuff runs as replica 1. DBs actually don't, because CNPG and the mariadb operator make HA setups very easy. That being said, the downtime is still lower than on a traditional server.

I don't think RPi is the gold standard nor is Chinese production that strongly correlated with poor SW support?

Raspberry Pi usually requires customisation from the distro. This is mitigated by the fact that many distros have done that customisation, but the platform itself is not well designed for SW support.

Meanwhile, many Allwinner and Rockchip platforms have great mainline support. Qualcomm is apparently moving in the right direction, but historically there have been lots of Qualcomm SBCs where the software support is just a BSP tarball pinned to a fixed Linux kernel.

So yeah I do agree with your conclusion but it's not as simple as "RPi has the best software support and don't buy Chinese". You have to look into it on a case by case basis.


If your benchmarks are fast enough to run in pre-commit you might not need a time series analysis. Maybe you can just run an intensive A/B test between HEAD and HEAD^.

You can't just set a threshold because your environment will drift. But if you figure out the number of iterations needed to achieve statistical significance for the magnitude of changes you're trying to catch, then you might be able to just run a before/after and then do a bootstrap [0] comparison to evaluate the probability of a change.

[0] https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
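To make that concrete, here's a toy sketch of the bootstrap comparison (pure stdlib; the function name and the timing numbers are made up for illustration): resample each set of timings with replacement many times and count how often the "after" mean comes out worse.

```python
import random

def bootstrap_prob_slower(before, after, iters=10_000, seed=0):
    """Estimate the probability that `after` is genuinely slower than
    `before`: resample each timing set with replacement and compare
    the resampled means."""
    rng = random.Random(seed)
    slower = 0
    for _ in range(iters):
        mean_b = sum(rng.choices(before, k=len(before))) / len(before)
        mean_a = sum(rng.choices(after, k=len(after))) / len(after)
        if mean_a > mean_b:
            slower += 1
    return slower / iters

# Timings (ms) for HEAD^ and HEAD; a clear ~10% regression.
before = [100, 102, 98, 101, 99, 103, 100, 97]
after = [110, 112, 108, 111, 109, 113, 110, 107]
print(bootstrap_prob_slower(before, after))  # → 1.0
```

With real data the two distributions will overlap, so you'd pick a decision threshold (say, flag if the probability exceeds 0.95) and tune the iteration count using the significance calculation described above.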


If you've had the problem it solves you don't really need an explanation beyond "Change Detection for Continuous Performance Engineering" I think.

Basically if I'm reading it correctly the problem is you want to automate detection of performance regressions. You can't afford to do continuous A/B tests. So instead you run your benchmarks continuously at HEAD producing a time series of scores.

This does the statistical analysis to identify if your scores are degrading. When they degrade, it gives you a statistical analysis of the location and magnitude of the change (so something like "mean score dropped by 5% at p=0.05 between commits X and Y").

Basically if anyone has ever proposed "performance tests" ("we'll run the benchmark and fail CI if it scores less than X!") you usually need to be pretty skeptical (it's normally impossible to find an X high enough to detect issues but low enough to avoid constant flakes), but with fancy tools like this you can say "no to performance tests, but here's a way to do perf analysis in CI".
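To illustrate the idea (real tools use fancier statistics, e.g. e-divisive with significance testing; this is just a naive sketch with made-up scores): scan the score series for the split point that best separates "before" from "after".

```python
def find_changepoint(scores):
    """Naive change-point scan: try every split and return the index
    where the difference between the mean before the split and the
    mean after it is largest, along with that difference."""
    best_idx, best_delta = None, 0.0
    for i in range(1, len(scores)):
        left = sum(scores[:i]) / i
        right = sum(scores[i:]) / (len(scores) - i)
        if abs(right - left) > abs(best_delta):
            best_idx, best_delta = i, right - left
    return best_idx, best_delta

# Scores from benchmark runs at successive HEADs: ~5% drop after run 6.
scores = [200, 201, 199, 202, 200, 198, 190, 189, 191, 190, 188, 190]
idx, delta = find_changepoint(scores)
print(idx, round(delta, 1))  # → 6 -10.3
```

A real implementation also has to decide whether the best split is statistically significant rather than noise, which is exactly the part where the threshold approach falls down.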

IME it's still tricky to get these things working nicely, it always requires a bit of tuning and you are gonna be a bit out of your depth with the maths (if you understood the inferential statistics properly you would already have written a tool like this yourself). But they're fundamentally a good idea if you really really care about perf IMO.


Does a cache help with inference workloads anyway?

I don't know much about it but my mental model is that for transformers you need random access to billions of parameters.


It's streaming access, and no, not as far as I'm aware. APUs have always been hilariously bottlenecked on memory bandwidth as soon as the task actually needs to pull in data. The only exception I know of is the PS5, because it uses GDDR instead of desktop memory.
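The bottleneck is easy to see with a back-of-envelope calculation: generating one token streams essentially every weight through the memory bus once, so single-stream decode speed is capped at bandwidth divided by model size. A sketch with rough, illustrative bandwidth figures (~100 GB/s for a dual-channel DDR5 APU, ~448 GB/s for PS5-class GDDR6):

```python
def max_tokens_per_sec(params_billion, bytes_per_param, bandwidth_gb_s):
    """Upper bound on single-stream decode speed when every generated
    token must stream all weights from memory once."""
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb

# 7B model at 8-bit quantisation (1 byte/param):
print(round(max_tokens_per_sec(7, 1, 100), 1))  # → 14.3 (DDR5 APU)
print(round(max_tokens_per_sec(7, 1, 448), 1))  # → 64.0 (GDDR6)
```

A cache doesn't help because the weights are far bigger than any cache and every token touches all of them; there's no reuse to exploit within a single decode step.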

I have had so many "why don't you just" conversations with academics about this. I know the "why don't you just" guy is such an annoying person to talk to, but I still don't really understand why they don't just.

This article pointed to a few cases where people tried to do the thing, i.e. the pledge taken by individual researchers, and the requirements placed by certain funding channels, and those sound like a solid attempt to do the thing. This shows that people care and are somewhat willing to organise about it.

But the thing I don't understand is why this can't happen at the department level. If you're an influential figure at a top-5 department in your field, you're friends with your counterparts at the other 4. You see them in person every year. You all hate $journal. Why don't you club together and say "why don't we all have a moratorium on publishing in $journal for our departments?"

No temptation for individual research groups to violate the pledge. No dependence on individual funding channels to influence the policy. Just, suddenly, $journal isn't the top publication in that field any more?

I'm sure there are lots of varied reasons why this is difficult but fundamentally it seems like the obvious approach?


> If you're an influential figure at a top-5 department in your field ... you all hate $journal.

That's the problem: they don't hate these journals, they love them. Generally speaking, they're old people who became influential by publishing in these journals. Their reputation and influence was built on a pile of Science and Nature papers. Their presentations all include prominent text indicating which figures came from luxury journals. If Science and Nature lose their prestige, so do they (or at least that's what they think).

This was very apparent when eLife changed their publishing model. There was a big outpouring of rage from older scientists who had published in eLife when it was a more standard "high impact" journal. Lots of "you're ruining your reputation and therefore mine".


Maybe I am underestimating the gap in status between the "influential figures" I imagine and the people I actually know.

I see: my friend has 10-15 years of experience in their field, they have enjoyed success and basically got the equivalent of a steady stream of promotions.

I map this onto my big tech/startup experience. I mentally model them as: they are "on top of the pile" of people that still do technical work. Everyone who still has the ability to boss them around, is a manager/institutional politician type figure who wouldn't interfere in such decisions as which journal to publish in.

But probably this mapping is wrong.

Also, I probably have a poor model of what agency and independence looks like in academia. In my big tech world, I have a pretty detailed model in my head of what things I can and can't influence. I don't have this model for academia which is gonna inevitably lead to a lot of "why don't you just".

Same thing happens to me when I moan about work to my friends. They say "I thought you were the tech lead, can't you just decree a change?" and I kinda mumble "er yeah but it doesn't really work like that". So here I'm probably doing that in reverse.


It has been known to happen.

For example, spearheaded by Knuth, the community effectively abandoned the Journal of Algorithms and replaced it with ACM Transactions on Algorithms.

However, it's difficult. A big factor is that professors feel obligated towards their students, who need to get jobs. Even if the subfield can shift to everybody publishing in a new journal, non-specialists making hiring decisions may not update for a few years, which hurts students in the job market.


I think the call for top-down policy makes sense b/c otherwise this is like every other tragedy of the commons situation. Each of those top-level researchers also has to think, "my department has junior faculty trying to build their publications list for tenure, we have post-docs and grad-students trying to get a high-impact publication to help them land a faculty job, we have research program X which is kind of in a race with a program at that other school lower down in the top 20. If we close off opportunities with the top journals, we put all of those at a competitive disadvantage."

For the grad students especially, there’d be a career advancement incentive to still publish in the top journals. The professors might still want to publish in them just out of familiarity (with a little career incentive as well, although less pronounced than the grad students).

I think it’d be a big ask from someone whose role doesn’t typically cover that sort of decision.


There are hundreds of reputable research universities around the world. Top-5 departments can't meaningfully change the culture of a field on their own. Top-100 perhaps could, but the coordination problem is much bigger on that level.

Grant funding reporting requirements. It would be easy to say "self-publish for free via the institutional library", but the NIH would not like that use of their money.

I like the author's idea:

> So the solution here is straightforward: every government grant should stipulate that the research it supports can’t be published in a for-profit journal. That’s it! If the public paid for it, it shouldn’t be paywalled.

The article then acknowledges this isn't a magic solution to all the problems discussed, but it's so simple and makes so much sense as a first step.

I'm no expert here and there are probably unintended consequences or other ways to game that system for profit, but even if so wouldn't that still be a better starting point?


I think that's also a good proposal, and I don't think it conflicts with the "prestigious departments stop publishing in $journal" idea at all. Probably we want both.

Only difference is that the author is writing for a wide audience and his best angle to change the world is probably to influence the thinking of future policymakers. While I am just an annoying "why don't you just" guy, my "audience" is just the friends I happen to have in prestigious research groups.

Adam M also probably has lots of friends in prestigious research groups (IIUC, although he complains a lot about academia he was quite successful within it, at least on its own terms). And the fact that he chooses to advocate government policy changes instead of what I'm proposing is probably a good indication that he knows something I don't about the motivations of influential academics.


Imagine being a scientist and reading “if you take this grant, you cannot publish your results in any of the most prominent journals in your field.” Sounds good?

But IIUC there are entire fields where basically the whole US ecosystem is funded by federal grants. So if this policy gets enacted those journals are no longer prominent.

(Maybe you'd need an exception for fields where the centre of mass for funding is well outside of the US, though).


The result is that open access journals would very rapidly, perhaps instantly, become prominent.

I explain here (https://news.ycombinator.com/item?id=47250811) but tl;dr it's because Universities need this system to get money and to give money. Nobody has yet proposed a solution which solves the money/prestige problem. With no money there's no research.

How do you imagine a secret designation would work..?

I’m not sure what you’re referring to. It’s not (typically, as far as we know) a secret designation. We know of other companies designated as supply chain risks: Huawei, ZTE, and Kaspersky are the first ones that come to mind.

> there are at least a dozen companies that provide non-Anthropic/non-OpenAI models in the cloud, many of which are dirt cheap because of how fast and good open weights are now.

Oh yeah, seems obvious now you said it, but this is a great point.

I'm constantly thinking "I need to get into local models but I dread spending all that time and money without having any idea if the end result would be useful".

But obviously the answer is to start playing with open models in the cloud!


Well, they are doing that because of the nature of matrix multiplication. Specifically, LLM costs scale with the square of the length of a single input, let's call it N, but only linearly in the number of batched inputs, M:

O(M * N^2 * d)

d is a constant related to the network you're running. Batching, btw, is the reason many tools like Ollama require you to set the context length before serving requests.

Having many more inputs is way cheaper than having longer inputs. In fact, this is the reason we went for LLMs in the first place: batching ("serving many customers") is exactly what you do during training, and it's what allows training to proceed quickly. GPUs came in because taking 10k triangles and then doing almost the exact same calculation batched 1920*1080 times on them is exactly what happens behind the eyes of Lara Croft.

And this is simplified, because a vector input (i.e. M=1) is the worst case for the hardware, so they just don't do it (and certainly not in published benchmark results). Often even older chips are hardwired to work with M set to 8 (and these days 24 or 32) for every calculation. So until you hit ~20 customers/requests at the same time, extra batched inputs are almost entirely free in practice.

Hence: the optimization of subagents. Let's say you need an LLM to process 1 million words (let's say 1 word = 1 token for simplicity):

O(1 million words in one go) ~ 1e12 or 1 trillion operations

O(1000 times 1000 words) ~ 1e9 or 1 billion operations

O(10000 times 100 words) ~ 1e8 or 100 million operations

O(100000 times 10 words) ~ 1e7 or 10 million operations

O(one word at a time) ~ 1e6 or 1 million operations

Of course, to an extent this last way of doing things is the long known case of a recurrent neural network. Very difficult to train, but if you get it working, it speeds away like professor Snape confronted with a bar of soap (to steal a Harry Potter joke)
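The arithmetic above follows directly from the cost model (dropping the constant d); a quick sketch to check it:

```python
def attention_cost(num_chunks, words_per_chunk):
    """Self-attention cost up to the constant d: each of `num_chunks`
    batched inputs costs words_per_chunk ** 2 operations."""
    return num_chunks * words_per_chunk ** 2

total = 1_000_000  # 1 million words, 1 word = 1 token
for chunks in (1, 1_000, 10_000, 100_000, 1_000_000):
    # prints 1e12, 1e9, 1e8, 1e7, 1e6 as in the list above
    print(chunks, attention_cost(chunks, total // chunks))
```

Each factor-of-10 increase in the number of chunks cuts total cost by 10, because the per-chunk N^2 term shrinks by 100 while the chunk count only grows by 10.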


I agree but I still have that itch to have my own local model—so it's not always about cost. A hobby?

(Besides, a hopped-up Mac would never go to waste in my home if it turns out the local LLM thing was not worth the cost.)


Nobody is saying it makes "financial sense", it's about control.

I have always taken plenty of care to try and avoid becoming dependent on big tech for my lifestyle. Succeeded in some areas, failed in others.

But now AI is a part of so many things I do and I'm concerned about it. I'm dependent on Android but I know with a bit of focus I have a clear route to escape it. Ditto with GMail. But I don't actually know what I'd do tomorrow if Gemini stopped serving my needs.

I think for those of us that _can_ afford the hardware it is probably a good investment to start learning and exploring.

One particular thing I'm concerned about is that right now I use AI exclusively through the clients Google picked for me, coz it makes financial sense. (You don't seem to get free bubble money if you buy tokens via API billing, only consumer accounts). This makes me a bit of a sheep and it feels bad. There's so much innovation happening and basically I only benefit from it in the ways Google chooses.

(Admittedly I don't need local models to fix that particular issue, maybe I should just start paying the actual cost for tokens).


It’s a luxury for the wealthy to be honest. At least for now. These prices are ridiculous

Just use an open weight model like GLM-5 behind an aggregator (OpenRouter, NanoGPT) then. That is a commodity market, right now.

No, it's bizarre that this isn't normal.

The law is an expression of our desire that our industry doesn't exploit forced labour. The fact that this mostly only counts when the forced labour takes place in our own country is a weird historical detail, long outdated by globalisation.

Either you think that forced labour in Malaysia is OK, in which case this seems bizarre, or you think it's not OK, in which case we need a way for the law to discourage forced labour in Malaysia. The only way it can do that is through the supply chain.


"Either you think that forced labour in Malaysia is OK in which case this seems bizarre"

It would be an interesting poll to see what the populace actually thinks about this statement...


I think revealed preferences are more useful than a poll would be.

"Do you think forced labour in Malaysia is OK" - nobody answers yes to this.

"Are you willing to make sacrifices, such as imposing liability on local business, in order to discourage forced labour in Malaysia?" - this is the question.

This pattern applies to a lot of stuff. All politicians claim to have a solution to the housing crisis. But most "solutions" are suspiciously absent of downsides. If nothing you propose involves sacrificing anything or creating any losers I conclude you don't actually care about the housing crisis.

You can use this on your managers too. "What are we gonna do about the tech debt?" If the answer doesn't involve delaying features then you should interpret it as "nothing".

