
I'm having trouble seeing how this is actually different from a normal generational collector to begin with in practice (as opposed to implementation details).

The differences I read into it are:

- There is one heap rather than separate young and tenured heaps, but tenured objects are marked as such.

- Rather than moving an object from one heap to another, the write barrier tags an object written into a tenured object as tenured itself (and presumably all of its children). Not all collectors move objects to the tenured heap eagerly (as opposed to lazily at collection time), but I'm pretty sure it has been done.

- 'Young generation' collections happen at a specific time: when a goroutine terminates. They work more or less like a quick version of a semispace copying collector, in that they just drop all objects not marked reachable.
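A toy sketch of the tenure-on-share write barrier the second bullet describes. All names here (`Object`, `owner`, `writeBarrier`, `markTenured`) are made up for illustration; the real runtime would track this with per-span metadata, not struct fields.

```go
package main

import "fmt"

// Object is a stand-in for a heap object, tagged with the goroutine
// that allocated it and whether it has been tenured.
type Object struct {
	owner    int // goroutine that allocated this object
	tenured  bool
	children []*Object
}

// markTenured transitively tenures an object and everything reachable
// from it, so a goroutine-exit sweep won't free any of it.
func markTenured(o *Object) {
	if o.tenured {
		return
	}
	o.tenured = true
	for _, c := range o.children {
		markTenured(c)
	}
}

// writeBarrier runs when goroutine g stores a pointer to child into dst.
// If dst is tenured or owned by another goroutine, child escapes g's
// local heap and must be tenured along with its subgraph.
func writeBarrier(g int, dst, child *Object) {
	dst.children = append(dst.children, child)
	if dst.tenured || dst.owner != g {
		markTenured(child)
	}
}

func main() {
	a := &Object{owner: 1}                     // allocated by goroutine 1
	shared := &Object{owner: 2, tenured: true} // already visible to others
	writeBarrier(1, shared, a)                 // store a into a tenured object
	fmt.Println(a.tenured)                     // true: a survives goroutine 1's exit
}
```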

So there is interesting stuff in there, and it seems like a very appropriate approach for an event-driven environment, but it still seems like a generational collector to me.

Also, I'm confused how the separation of the goroutines is maintained in this heap. When a goroutine advances its current pointer, what happens if it bumps into another goroutine's initial pointer?



> - There is one heap rather than separate young and tenured heaps, but tenured objects are marked as such.

That's a big difference. One of the largest advantages, if not the largest, of a standard generational GC in practice is bump allocation in the nursery. If I understand correctly, because this collector doesn't unconditionally tenure surviving objects (it only tenures objects that cross thread boundaries), it loses fast bump allocation: free space in the nursery can become noncontiguous.
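For reference, here is why nursery allocation in a classic generational GC is so cheap: with a contiguous nursery, "allocate" is just a bounds check plus a pointer increment. This is an illustrative sketch, not the Go runtime's actual allocator.

```go
package main

import "fmt"

// nursery models a contiguous young-generation region with a bump cursor.
type nursery struct {
	buf  []byte
	next int
}

// alloc reserves n bytes by bumping the cursor; it returns the offset of
// the reservation, or -1 when the nursery is exhausted (which is what
// would trigger a minor collection in a real generational GC).
func (h *nursery) alloc(n int) int {
	if h.next+n > len(h.buf) {
		return -1
	}
	off := h.next
	h.next += n
	return off
}

func main() {
	h := &nursery{buf: make([]byte, 64)}
	fmt.Println(h.alloc(16)) // 0
	fmt.Println(h.alloc(16)) // 16
	fmt.Println(h.alloc(40)) // -1: out of space, minor GC would run here
}
```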


Right, I was actually going to comment that the allocation scheme in this algorithm seems painfully slow compared to the kind of allocation you can get out of a separate nursery where you're just allocating the next free space. I forgot by the time I finished the post.

First-fit seems like it could be quite slow here.

The implication seems to be that they can at least avoid locking for allocation (and deallocation and promotion) as long as they find a fit within the reserved region for the goroutine, but I'm still confused how they maintain this separation safely.
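To make the cost concrete, here is a minimal first-fit allocator over a fragmented region: every allocation scans a free list instead of bumping a pointer. Purely illustrative; the names and layout are not from the actual design.

```go
package main

import "fmt"

// span is a hole in a fragmented heap region: a byte offset and a length.
type span struct{ off, len int }

// firstFit walks the free list and carves n bytes out of the first span
// large enough, returning the offset (or -1) and the updated free list.
// Note the O(free list length) scan that bump allocation avoids.
func firstFit(free []span, n int) (int, []span) {
	for i, s := range free {
		if s.len >= n {
			off := s.off
			if s.len == n {
				free = append(free[:i], free[i+1:]...) // exact fit: drop the span
			} else {
				free[i] = span{s.off + n, s.len - n} // shrink the span
			}
			return off, free
		}
	}
	return -1, free
}

func main() {
	// Fragmented free list: holes at offset 0 (8 bytes) and 32 (24 bytes).
	free := []span{{0, 8}, {32, 24}}
	off, free := firstFit(free, 16) // skips the too-small 8-byte hole
	fmt.Println(off)                // 32
	off, _ = firstFit(free, 8)      // exact fit in the first hole
	fmt.Println(off)                // 0
}
```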


They like their (mostly-)concurrent collections, and concurrent and compacting together seem harder to do (think e.g. Azul) than just concurrent or just compacting. And of course without compacting, a slower allocator that can use already-fragmented memory becomes attractive.

Don't know how much of a factor it is in this design, but might also just be expensive to introduce heap compaction to Go now, after going this far with heap objects having stable pointers. Might be a lot of random spots in the code that still assume that.

Not that there aren't benefits to compacting, including the bump allocation, less bloat as programs run a long time, and help with locality. Just, describing where Go's at.

> I'm still confused how they maintain this separation safely

Don't know about this design, but in other places they have some goroutine-local data structures and then do the expensive thread coordination stuff after they run out of resources. Seems plausible that could be the general approach here too.


I read the other day a comment that I thought was pretty interesting regarding compacting vs non-compacting. The idea was that compacting took the amortized cost of searching for free space and shoved it all into a single time span. Allocations are then just simple fast increments.

EDIT: Found the post.

> In the CLR allocation is blazing fast: just increment the heap head pointer. You don't have to take care of problems traditional allocators face: find a big enough fitting free hole. This has a cost: when objects are destroyed, the free memory on the heap becomes fragmented. This needs to be compacted, and this is where the GC pause is needed, as the references for moved objects need to be updated. Basically the cost of finding a free hole in memory at allocation was moved here.

> This was done intentionally, as studies found that most real world applications produce a load where most objects are short lived: this cheap allocation is needed, and a multigenerational GC can take care of them cheaply after destroyed (as for most of the time you only scan and free the youngest generation for unreachable objects, limiting the GC pause time).

https://news.ycombinator.com/item?id=11943816


Plus, Go has by now acquired a culture of heavily optimising code to "not allocate", so allocation speed is less of an issue for it than for runtimes like Java.


It's not that those libraries "don't allocate"; they work with stack allocation and caller-provided buffers. You can reuse the provided buffers / output objects, but a lot of request/response code doesn't, so it's just pushing the cost higher up the stack (application vs. libraries). Which is itself a good argument for a "transactional" GC.
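A sketch of the "provided buffers" style being described: instead of the library allocating per call, the handler draws a reusable buffer from a pool, so steady-state request handling does no new heap allocation. `handle` and `bufPool` are hypothetical names; `sync.Pool` is one common way to do this in Go.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out reusable byte slices so request handling doesn't
// allocate a fresh buffer per call.
var bufPool = sync.Pool{
	New: func() any { return make([]byte, 0, 4096) },
}

// handle builds a response into a pooled buffer and returns its length;
// a real handler would write the buffer to a connection before returning
// it to the pool.
func handle(payload string) int {
	buf := bufPool.Get().([]byte)[:0] // reuse capacity, reset length
	buf = append(buf, "resp:"...)
	buf = append(buf, payload...)
	n := len(buf)
	bufPool.Put(buf) // return the buffer for the next request
	return n
}

func main() {
	fmt.Println(handle("hello")) // 10, i.e. len("resp:hello")
	fmt.Println(handle("hi"))    // 7, i.e. len("resp:hi")
}
```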



