The tl;dr: >we don’t use Elastic Block Storage (EBS), which is the main comp...

pilif · on April 25, 2011

This carries even more weight when you consider that EBS was broken across multiple availability zones, which means that, had they used EBS, their first point would have been invalidated.

mikeryan · on April 25, 2011

More importantly, SmugMug was smart enough, when moving to the cloud, to realize which components were the most failure-prone and to stay away from those.

Not using EBS wasn't luck; it was a conscious decision.

SoftwareMaven · on April 25, 2011

They didn't reject it because of concerns about availability; they rejected it because of run-time performance concerns. I don't think you can argue that those concerns even imply anything about availability, much less have some kind of causal relationship.

SmugMug got lucky in their choice. If performance had been consistent with EBS, they would have used it and most likely gone down like so many others.

onethumb · on April 25, 2011

Not true. Our primary decision was based on unpredictable latency, but the fact that we didn't/don't trust EBS played a huge role. EBS mucks up our basic availability scenario: systems are no longer individual, disposable, replaceable units. I'm sorry if that wasn't clear from the blog post. I'll go re-read that part and update.