> But that's sort of my point; today's optimisation can become tomorrow's performance regression or security bug, or be outdone by an update from the OS vendor.
> It's very rare that a program's performance is being held back by a standard library function (speaking of libc here). I remain highly skeptical that the real issue is the implementation of standard library functions rather than the algorithm.
In the last year, we encountered a bug with Sun Studio 12 on Solaris x64 where memcpy wasn't automatically inlined, forcing a full function call every time it was invoked. That was a major performance hit, and it forced us to switch to an internal implementation (which is normally worse on Solaris). IIRC, we didn't have much luck getting a patch out of Oracle for the issue.
So no, this really isn't true. In an ideal world, it would be.
How about "use the standard library function unless it's proven guilty"? Sure, compiling with Sun Studio 12 on Solaris x86 may be a loss, but what about SPARC? Or GCC on Linux?
SPARC is pretty much a lost cause for us. I believe we have had better performance with the system memcpy than with anything we've written.
GCC on Linux was all over the place, depending on the Red Hat Enterprise Linux version. IIRC, on RHEL 4 and earlier our internal code worked better, but on 5 and 6 the included memcpy is generally better.
If you're writing code that has serious performance requirements, experimentation is key. There's absolutely no guarantee that the system-provided function will be better than a hand-rolled one.
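To make "experimentation" concrete, here is a minimal sketch of the kind of harness I mean. It is not our actual code: `naive_memcpy`, `copy_fn`, and `time_copy` are hypothetical names made up for illustration, and the byte loop is only a stand-in for an internal implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical hand-rolled copy, standing in for an "internal implementation". */
static void *naive_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Same shape as memcpy, so either function can be passed in. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Returns the CPU seconds spent performing `iters` copies of `n` bytes. */
static double time_copy(copy_fn fn, void *dst, const void *src,
                        size_t n, long iters)
{
    volatile unsigned char sink = 0;
    clock_t start;
    long i;

    assert(n > 0);
    start = clock();
    for (i = 0; i < iters; i++) {
        fn(dst, src, n);
        /* Read one copied byte so the copy cannot be discarded outright. */
        sink ^= ((unsigned char *)dst)[(size_t)i % n];
    }
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call it as `time_copy(memcpy, dst, src, n, iters)` and `time_copy(naive_memcpy, dst, src, n, iters)` with the buffer sizes your program really uses, and compare the two numbers. Two caveats: going through a function pointer prevents inlining, so this measures the out-of-line call cost rather than what an inlined memcpy would give you, and some optimisers recognise a byte loop like this and replace it with a call to memcpy anyway. Check the generated code before trusting the results.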
If you are forced to write your own implementation, I would rather switch to it completely. IMO it's better to be bold and get problems detected through wide adoption of a function than to confine it to an edge case where problems can go unnoticed.
Whether it was actually a bug is unclear from your description. By default, Sun Studio intentionally doesn't inline functions defined in system header files unless specifically requested.
It's also at the discretion of the compiler whether to permit some functions to be inlined. The compiler man page outlines this caveat, and mentions that inlining standard library functions is discouraged as it can cause errno to become unreliable.
Finally, there's also the question of whether (again) a bad algorithm was being used, as opposed to the fault lying with a standard library function. Yes, it's possible there was a performance pathology with your particular use case, but there's almost always a better way to resolve an issue like that than hand-rolling a standard library function, which inevitably causes unexpected issues.