Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
We iterate so that you can recurse (wingolog.org)
62 points by Tomte on Dec 13, 2022 | hide | past | favorite | 3 comments


As I wrote on Twitter, neither depth first nor breadth first is likely to give you a cache-friendly layout post-GC. Allocation order would not be bad (and would require a mark-sweep-compact GC instead), but I feel like a GC should be able to do even better than that. It should basically be able to construct a near-optimal layout, perhaps even cache oblivious. After all, it is inspecting the whole connectivity graph and moving objects.


I'm pretty sure optimal layout in the general case is NP-hard, even if you have a perfect predictor for future access patterns.

It's been a while since I've read the Garbage Collection Handbook, but it cites several studies showing some workloads are better served by depth-first traversal for compacting GC and other workloads are better served by breadth-first traversal. In general, you want to place an object near the field references it frequently de-references.

The problem with depth-first traversal is that an object ends up near its first child and its first grand-child and its first great-grandchild. The problem with breadth-first traversal is that objects reached early wind up near their children (good), but there's no space to put the grandchildren near the children (bad). You might have a sampling profiler show you which field accesses are causing pipeline stalls, and when traversing the object graph, use a double-ended queue for the grey list. Push the frequently stalling references at the front of the queue and push all of the others to the rear of the queue. Though, you'd need to keep long enough history to avoid alternating between good layouts and poor layouts every-other GC compaction.


There are "hierarchical" schemes which attempt to clump graphs together. SBCL has a special case of trying to lay down lists sequentially.

Anything with a read barrier might also be able to see which slots are hotter, and thus prioritise copying those first; there was one paper from the 90s showing profiling helped get a better layout.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: