OK, here are some patches I found to speed it up a little, but overall I couldn’t see much improvement, except that past about 4-6 threads it seems to get choked up on the global locks in allocs and reallocs… dang…
(You can see this by running gdb against the running process:
gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p
with the process ID after -p, or by using some other profiler.)
At the same time, it did get to about 10% idle CPU on the 8-core box I was using for it. Maybe that’s good?
I guess there’s always multi-process if you need to use 100% of the cores.
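For what it’s worth, here is a rough sketch of what the multi-process route could look like at the C level on a POSIX system. It is only an illustration, not anything taken from the patches: `do_work` and `run_workers` are made-up names, and `do_work` is a placeholder for whatever each process would really do. Each child gets its own address space, so each one would have its own heap and its own allocator lock.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for the real per-process work.
 * Returns 0 on success. */
static int do_work(int worker_id) {
    (void)worker_id;
    return 0;
}

/* Fork `nworkers` children, run do_work() in each one, then wait for
 * all of them. Returns how many children exited with status 0. */
int run_workers(int nworkers) {
    int started = 0;
    for (int i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            break;                      /* fork failed: stop spawning */
        }
        if (pid == 0) {
            /* Child: do the work and exit without returning to the caller. */
            _exit(do_work(i) == 0 ? 0 : 1);
        }
        started++;                      /* parent: one more child running */
    }

    int succeeded = 0;
    for (int i = 0; i < started; i++) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            succeeded++;
        }
    }
    return succeeded;
}
```

On Linux you could call it as `run_workers((int)sysconf(_SC_NPROCESSORS_ONLN))` to get one worker per online core (`_SC_NPROCESSORS_ONLN` is a common extension rather than strict POSIX). The catch is that the processes no longer share memory, so any results have to come back through pipes, files, or similar.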
Or maybe somebody could dive in and make bdwgc much faster somehow (more aggressive thread-local structures?)… or use a different GC altogether… :)
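To make the thread-local idea a bit more concrete, here is a toy sketch of the general technique. This is not bdwgc code and says nothing about how bdwgc is actually structured; every name in it (`tl_alloc`, `refill`, `pool_lock`, `CHUNK_SIZE`) is invented for illustration. Each thread carves small blocks out of its own chunk with no locking, and only takes the global lock when its chunk runs out. Memory is never freed here, so it only shows the locking pattern.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096  /* bytes fetched per refill (arbitrary choice) */

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_acquisitions = 0;  /* guarded by pool_lock */

/* Per-thread bump region: carving from it needs no lock. */
static _Thread_local char *tl_cur = NULL;
static _Thread_local size_t tl_left = 0;

/* Slow path: take the global lock and fetch a fresh chunk.
 * malloc() stands in for a shared heap here. */
static int refill(void) {
    pthread_mutex_lock(&pool_lock);
    lock_acquisitions++;
    char *chunk = malloc(CHUNK_SIZE);
    pthread_mutex_unlock(&pool_lock);
    if (chunk == NULL) {
        return -1;
    }
    tl_cur = chunk;      /* any leftover tail of the old chunk is abandoned */
    tl_left = CHUNK_SIZE;
    return 0;
}

/* Fast path: bump-allocate from the calling thread's own chunk.
 * Returns NULL for requests larger than a chunk or on out-of-memory. */
void *tl_alloc(size_t n) {
    if (n == 0) {
        n = 1;
    }
    n = (n + 15) & ~(size_t)15;          /* round up to a multiple of 16 */
    if (n > CHUNK_SIZE) {
        return NULL;
    }
    if (n > tl_left && refill() != 0) {
        return NULL;
    }
    void *p = tl_cur;
    tl_cur += n;
    tl_left -= n;
    return p;
}
```

With 16-byte requests, a thread takes the lock once per 256 allocations instead of once per allocation, which is the kind of contention reduction the idea is after. A real collector would also have to track these chunks for marking and sweeping, which this sketch ignores completely.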