I've always been puzzled by how little attention HN pays to the demoscene. The coders behind these little gems are truly expert hackers in their domains, getting the most out of a limited amount of resources with some pretty clever algorithms.
Pretty neat that they could fit all of that into such a small package. Even as compilers keep advancing, I wonder if they'll ever hit a "wall", in the sense that they could become extremely efficient and still run up against Kolmogorov complexity:
If you've never heard of it, the Kolmogorov complexity of a piece of data is roughly the length of the shortest program that produces it; it captures the 'information' inherent in the data. You could render a plot of a function like y = x^2 + 5 as a bitmap, but the bitmap would take far more data than the ASCII string that stores the actual equation (and so is a less efficient representation).
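As a toy illustration (my own sketch, not from any demo): rendering that curve as pixels costs orders of magnitude more bytes than storing the expression, and even a general-purpose compressor, which gives a computable upper bound on the bitmap's Kolmogorov complexity, can't get anywhere near the equation's size:

```python
import zlib

WIDTH, HEIGHT = 256, 256

def render(width=WIDTH, height=HEIGHT):
    """Render y = x^2 + 5 as a bitmap, one byte per pixel."""
    pixels = bytearray(width * height)
    for x in range(width):
        y = x * x + 5
        if y < height:
            pixels[y * width + x] = 1
    return bytes(pixels)

bitmap = render()
generator = "y = x^2 + 5"          # the ASCII "equation" form of the same picture
packed = zlib.compress(bitmap, 9)  # computable upper bound on the bitmap's complexity

print(len(bitmap))     # 65536 bytes of raw pixels
print(len(packed))     # far smaller once the redundancy is squeezed out...
print(len(generator))  # ...but still nowhere near the 11-byte equation
```

The general compressor only exploits statistical redundancy; the equation exploits knowledge of *how* the data was generated, which is exactly the kind of insight a size-coding demoscener has and a compiler doesn't.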
I haven't checked out the demo code, but I imagine that they could be doing some self-modifying code or using part of their image as an executable to save space. I just don't think compilers will ever be able to get up to that kind of efficiency, unless someone finds a new and unique way of writing them (genetic algorithms, perhaps?).
There's no real motivation for compiler writers to chase such a niche area. Most programming these days is about gluing lots of code together, which requires a richer set of shared idioms than was historically common.
For example, people using Python, Ruby and JavaScript rely on those languages' built-in dictionaries, arrays, array comprehensions, closure support, etc., just as part of the interface between modules. Similarly, Java depends on the standard collections and their corresponding interfaces; much .NET code these days depends on extensive reflection and dynamic runtime code generation (directly or indirectly via Reflection.Emit, perhaps through LINQ).
A certain amount of this infrastructure needs to be present and, more importantly, taken for granted, for high-level libraries to be built and used freely without worrying about conflicting idioms. Contrast that with the historical situation in C, where the lingua franca is more often than not text files and perhaps a struct of callbacks. Every module has its own ideas about containers, and ideas like LINQ don't get much of a look-in because they require too much infrastructure.
So, if one is building a graphical program for .NET, one is probably going to be using the Bitmap class at some point, or a specialized version (texture etc.) for DirectX. Compilers will try and focus on the runtime execution profile of the agglomeration of different modules, and make the hot path run as fast as possible, possibly inlining and optimizing across module boundaries. The hot path might add up to less than 4KB of code (though that's smaller than the CPU cache, so further size reduction probably wouldn't be worth the effort in this specific case). But the starting-out executable probably isn't 4KB in size, and the high-level libraries it's linking against will certainly take it well beyond 4KB. Those high-level libraries have nice rich info for debugging, IntelliSense support and object-oriented development.
So, compilers do focus on efficiency, and can focus that efficiency in ways that humans would find difficult and tedious, through cross-module interprocedural optimization. Separation of concerns is not just good engineering practice; it's also a matter of economics and specialization of labour. A human trying to optimize the critical path needs to know a lot about all the layers in the stack, while a compiler works best when it does the work of optimizing across those layers, so long as the individual layers are reasonably well-designed and efficient to begin with.
Another point: none of the demos run on my machine; they all either give a blank screen or crash on startup. Perhaps I have the wrong aspect ratio, perhaps I have too many monitors, perhaps I have a piece of hardware (GeForce 8800 GTX) they haven't tested fully. It's pretty trivial for code to be the smallest and fastest in the world if it doesn't need to work reliably.
Since the discussion is about toolsets and compilers, it's worth noting that 4kb intros on the Windows platform are nowadays built using a "compressing linker" called Crinkler.
Instead of compressing the binary code of the original executable as data, which then gets decompressed when the real executable is run (in that area, kkrunchy is the state of the art on Windows), Crinkler produces a compressed executable directly, at link time.
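The first approach can be sketched as a toy analogy in Python (this is only the pack-then-stub idea, not how kkrunchy or Crinkler actually work internally): take a finished "executable", compress it, and prepend a small decompression stub that unpacks and runs the original at startup.

```python
import zlib

# Stand-in for the finished intro's code.
payload = b"print('hello, 4k world')"

# kkrunchy-style: compress the finished program as data...
packed = zlib.compress(payload, 9)

# ...and prepend a decompression stub; at run time the stub unpacks the
# original code into memory and jumps to it (here: exec).
stub_source = f"import zlib; exec(zlib.decompress({packed!r}))"

exec(stub_source)  # prints: hello, 4k world
```

For a payload this tiny the compression overhead dominates, but at realistic 4kb sizes the stub plus compressed data comes out far smaller than the original. Crinkler's trick is different: by compressing at link time it can model individual sections and symbols and reorder them for a better ratio, instead of treating the finished binary as an opaque blob.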
This comment seems a little shortsighted. Where do you stop? Is it okay if you write your own interface to the hardware drivers, bypassing DirectX/OpenGL? Or do you also have to write your own graphics and sound drivers? Your own operating system? Design your own hardware? There's no such thing as an "all-in" app.
All of those things have certainly been done on the demoscene. In the demoscene, size-limited competitions have implicit conventions, unwritten rules if you will, to ensure that everybody competes on fair terms. For 4kb intros it is currently accepted (if not encouraged) to use the DirectX/OpenGL APIs. It is also considered acceptable to use the font-writing APIs, and some groups exploit this to generate vector graphics out of obscure font symbols.
For sound, you're expected either to write your own software synthesizer (or collaborate with somebody who has written one), or to use the MIDI APIs. For 1kb intros, MIDI API sounds are expected. There was a big controversy in 2008, when the 4kb intro Texas (http://pouet.net/prod.php?which=51448) won the 4kb competition at NVScene: many people considered its use of samples from MP3s included with the Windows operating system cheating, since it went outside the bounds of what is considered acceptable use of OS resources.
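At its absolute simplest, the "write your own synthesizer" route is just an oscillator and an envelope rendered to PCM samples. This toy sketch is my own and far simpler than what real 4kb intros ship (no filters, no sequencing), but it shows the core idea:

```python
import math
import struct

SAMPLE_RATE = 44100

def render_note(freq_hz, seconds, decay=4.0):
    """Return raw 16-bit little-endian PCM for a decaying sine tone."""
    n = int(SAMPLE_RATE * seconds)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        amp = math.exp(-decay * t)                     # simple volume envelope
        s = amp * math.sin(2 * math.pi * freq_hz * t)  # sine oscillator
        samples.append(int(s * 32767))
    return struct.pack(f"<{n}h", *samples)

pcm = render_note(440.0, 0.5)  # half a second of A4
print(len(pcm))                # 44100 bytes: 22050 samples * 2 bytes each
```

A real intro synth generates its whole soundtrack this way from a few bytes of note data, which is why it beats shipping any recorded audio on size.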
So there are some implicit conventions about what you can and cannot do, and if you want to compete fairly and impress your fellow sceners, you should follow them.
Of course, most demoscene parties feature wild demo competitions, where pretty much anything goes. Demosceners always reward hard work and clever hacks, so if you want to impress, create a demo on your self-built hardware and enter it in that competition. You'll have a fair chance at one of the prizes.
Usually, they go with whatever resources are available to the OS they're using. C64 demosceners statically link; Windows demosceners dynamically link, because they can.