The One Lab

xz-utils and the case of the limited memory

Sat Mar 6, 2010 10:57:00 -0700

For all of you out there who know me, you probably know that my laptop is pretty dear to me -- I was pretty fussy when choosing it. In the end, I settled on a Lenovo ThinkPad X200 because it was fast, light, solid, and most importantly, field serviceable. I've had the thing for about three years now, and I can say that it really is one of the toughest laptops I've ever owned.

Unfortunately, it's broke right now.

Yeah, you read that right -- it's broke. No, the hard drive didn't break or anything. Believe it or not, it was the silly squirrel fan used for the CPU heatsink that finally gave up the ghost. The entire system is built inside a magnesium cage, so it's pretty hard to hurt this thing, but apparently that doesn't help ball bearings. =o)

So despite my babying, it still managed to get dropped on the floor recently and the bearings inside the fan died. As a result, I've had to truck out my first laptop until the replacement part arrives: a Sony Vaio Picturebook PCG-C1VN. It's pretty anemic, being based upon the original Transmeta Crusoe TM5600, but it works in a pinch, and its rather pathetic lack of resources is actually what prompted this post.

See, I decided to try my hand at installing Gentoo on the little guy because it compiles down so small and fast. Naturally, since I like living a little dangerously, I added ~x86 to the ACCEPT_KEYWORDS in make.conf and started re-emerging the world. Shortly thereafter, though, it ran headlong into a brick wall: xz-utils was complaining about a lack of memory!

I thought this a bit odd -- I mean, I knew the machine was limited in RAM (it has about 128MB total) but I had supplied a 1GB swap partition in addition to the main install, so it should have been fine, right? Well, apparently the devs for xz-utils thought it best to second-guess the kernel in terms of memory limits and management. They actually took great pains to add a memory limiter inside xz-utils and set the default limit to about 40% of the physical RAM it detects, so swap doesn't count toward it. As a result, if unxz runs into a situation where it needs more than that limit, the application dies -- just completely gives up.
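To put numbers on that, here's a rough Python sketch of what a limiter like this boils down to. It is not the actual xz-utils code (which is C); the function names, the constant name, and the 65 MiB decoder requirement in the usage note are all made up for illustration, and the only things taken from the man page are the 40% default and the fact that it is measured against physical RAM.

```python
import os

MIB = 1024 * 1024
DEFAULT_LIMIT_PERCENT = 40  # default from the man page; the name is hypothetical


def physical_ram_bytes():
    """Return total physical RAM in bytes, or None if it can't be detected."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing or doesn't know these names on this platform
        return None


def memory_limit(total_ram, percent=DEFAULT_LIMIT_PERCENT):
    """Self-imposed limit: a percentage of physical RAM, swap not included."""
    return total_ram * percent // 100


def may_decompress(needed, total_ram, percent=DEFAULT_LIMIT_PERCENT):
    """True if the decoder's memory requirement fits under the limit."""
    return needed <= memory_limit(total_ram, percent)
```

On a 128MB box, memory_limit(128 * MIB) works out to a bit over 51MB, so may_decompress(65 * MIB, 128 * MIB) comes back False no matter how much swap is sitting idle. Pass percent=100 and the same call comes back True.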

So here's a thought that flickered across my neurons: you (the programs) live and die in userspace -- a place carefully monitored and controlled by the kernel. Why the hell would you ever want to limit yourself on top of the kernel's existing limitations using a hand-rolled algorithm that is by its very nature going to be non-portable and buggy? So I started digging around for more info and came across this choice paragraph in the man page for unxz:

To prevent uncomfortable surprises caused by huge memory usage, xz
has a built-in memory usage limiter. The default limit is 40 % of
total physical RAM. While operating systems provide ways to limit
the memory usage of processes, relying on it wasn't deemed to be
flexible enough.

"Uncomfortable surprises"? Do they mean surprises like unxz failing to decompress an archive because it's artificially limited by a broken algorithm? Further, after digging around with Google, I can't find any mention or discussion as to why "...[the OS's limits weren't] deemed to be flexible enough." WTF? If xz-utils is killed by the kernel for using too much memory, well, guess what? It dies in nearly the same way it has been artificially written to die! Even more to the point, without such an algorithm, the program would be able to use swap space as necessary to compensate for the low memory. In fact, if you manually override the default memory limit by setting --memory=100%, unxz has no problems whatsoever. So unless they've written an entirely different method of malloc(3)-ing memory, they're never going to trigger the Linux kernel's OOM killer. Maybe on Windows they might run into a problem, but then according to xz-utils' webpage, it was written for POSIX operating systems first, with Windows as an afterthought.

So, ranting aside, I took the time out to hunt through Gentoo's bug system to see if there were any upcoming patches to fix the problem. Sadly, most of the developers don't seem to be interested in actually fixing this broken behavior. Quoting from a reply from Peter Volkov on 2010-02-09 in bug #303975:

Actually I don't see anything extraordinary here. Please, read man
unxz... [ED: skipping quoted man page] ...So this is documented
behavior...

After that, the bug was marked as RESOLVED with a resolution of INVALID. So I have one last question: when was it that broken behavior became acceptable if it was documented? This is the second such oddball bug "resolution" I've found in Gentoo since I started using it -- the first being the broken implementation of the games group and how that affects nethack, angband and other roguelikes based upon them (which, incidentally, has been left open for just about three years now).

I've since posted my response to the bug along with a followup patch that fixes the issue by setting the default memory limit to 100%. Unfortunately, though, it really is just that: a patch. If unxz and its related utilities ever need more memory than what they can detect as present on the system, they'll still bail out rather than let malloc(3) and friends tell them when they're out of RAM. I didn't have a whole lot of time to attempt to excise or #ifdef out the entire algorithm because it's actually repeated throughout the codebase multiple times. Worse still, instead of coding the 40% using a #define or a const somewhere in the codebase, they have the number 40 repeated over and over (I counted at least three occurrences in the code I looked through) -- at the very least they should have made this a #define changeable via autoconf's configure script.

With any luck, the request to remove or at least disable this code will make it upstream eventually. If not, well, I guess we're all the worse off. Until then, let it be known that you can set XZ_OPT to --memory=100% (the same override mentioned above) to work around the related portage emerge issue.
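If you drive your builds from a script, the workaround might look something like the Python sketch below. The helper name is hypothetical, and the --memory spelling is the one used earlier in this post for the xz-utils version I have installed; check xz --help on yours before trusting it. Note that it overwrites anything already in XZ_OPT.

```python
import os


def env_with_xz_limit(limit="100%", base=None):
    """Return a copy of the environment with XZ_OPT raising xz's memory limit.

    Hypothetical helper: it replaces any existing XZ_OPT value and leaves
    the original mapping untouched.
    """
    env = dict(os.environ if base is None else base)
    env["XZ_OPT"] = f"--memory={limit}"
    return env
```

Pass the result as the env argument when launching the build, for example subprocess.run(["emerge", "--update", "world"], env=env_with_xz_limit()), and every xz or unxz that portage spawns inherits the raised limit.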
