SmugMug and Amazon S3



When online photo site SmugMug initially contacted me, it was in the context of some of the pieces that I had written about competitor Flickr and about some of the issues associated with protecting photographers’ works online. In a nutshell, relative to Flickr, SmugMug has opted for less of an open community orientation and more for ways to store and display photos with a rather granular set of access controls. (See some discussion by CEO and “Chief Geek” Don MacAskill here.)

These are important topics that I’ll be discussing further in due course, but today I’m going to focus on SmugMug’s physical infrastructure. During our conversation last week, President Chris MacAskill made some points about using Amazon’s Simple Storage Service (S3) that may not be widely appreciated. (S3 is Amazon’s “Storage-as-a-Service” offering, which users pay for based on the amount of storage space used and data transferred. Like Amazon’s EC2 compute service, it falls roughly into the “Hardware-as-a-Service” concept.)

SmugMug was one of the earliest S3 users. As Chris tells the story, SmugMug was buying a “mindblowing” number of Xserves from Apple. The Silicon Valley-based company was running out of power and space–the usual story.

However, Chris raised another point that bears mention. They were having to buy all this gear up-front, in advance of the revenues (i.e. user subscriptions) that it would hopefully generate. This was difficult from a cash flow perspective–especially for a company that wasn’t VC-funded. But the reality is actually worse. Not only were the expenses up-front, but they were capital expenses. From an accounting perspective, this means that only the depreciation on the systems hits the P&L in a given year, not the full purchase price. Result? You may look profitable, but cash flow is tight and you could end up effectively “pre-paying” taxes.
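To make the accounting point concrete, here is a minimal sketch of the gap between cash flow and reported profit when a purchase is capitalized. Every figure in it (the purchase cost, the three-year straight-line schedule, the revenue) is a hypothetical number chosen for illustration, not anything SmugMug disclosed.

```python
# All figures are hypothetical and chosen only to illustrate the mechanism.

def straight_line_schedule(purchase_cost, useful_life_years):
    """Return the depreciation expense recognized in each year."""
    annual = purchase_cost / useful_life_years
    return [annual] * useful_life_years

purchase_cost = 300_000     # hypothetical server purchase, paid in cash in year 1
useful_life_years = 3       # hypothetical depreciation period
revenue_year1 = 250_000     # hypothetical subscription revenue in year 1

expense = straight_line_schedule(purchase_cost, useful_life_years)

# Cash view: the whole purchase leaves the bank account in year 1.
cash_flow_year1 = revenue_year1 - purchase_cost

# P&L view: only the first year's depreciation counts as an expense.
reported_profit_year1 = revenue_year1 - expense[0]

print(f"Year 1 cash flow:       {cash_flow_year1:,.0f}")
print(f"Year 1 reported profit: {reported_profit_year1:,.0f}")
```

With these made-up numbers the company is 50,000 short on cash while showing a 150,000 profit, and it is that profit which gets taxed.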

Then Amazon called him out of the blue after a conference and told him about S3. At Amazon’s initial target of 50 cents per GB, it was intriguing. When Amazon ended up pricing their offer at 15 cents, Chris says that their “jaws dropped.”

Initially SmugMug used Amazon S3 for backup while keeping all their primary storage in-house. At the beginning, they weren’t thrilled with uptime, but say they weren’t disappointed either. More troubling was that Amazon wasn’t very transparent about when they were out and for how long. (This latter point seems to remain a bigger issue with Amazon outages than the outages themselves.) However, over time, SmugMug started seeing better uptime from Amazon than they could deliver in-house. They now have over 400 TB of photo and video storage on S3, and can add as much as 1 TB on busy days.

Now that they’ve switched much of their primary storage to S3 as well, there’s another economic point worth making. Were SmugMug to host all this storage in-house, they’d actually have to buy more like 1.2 PB, because they’d need enough to support any growth spurts, and they’d need enough for backup as well as primary storage. With Amazon S3, you effectively get backup for “free.” (Of course, that assumes you trust Amazon not to lose data but, as far as I know, there has been no data loss associated with any Amazon outages.)
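As a rough sketch of how roughly 400 TB of stored data can turn into something like 1.2 PB of purchased capacity: the split used here (one full backup copy plus 50 percent headroom for growth) is my own illustrative assumption, not a breakdown stated above, and the function name is made up for the example.

```python
def in_house_capacity_tb(primary_tb, backup_copies, headroom_fraction):
    """Raw capacity to buy: primary data, its backup copies, plus growth headroom."""
    stored_tb = primary_tb * (1 + backup_copies)
    return stored_tb * (1 + headroom_fraction)

primary_tb = 400           # figure quoted in the article
backup_copies = 1          # assumption: one full backup copy
headroom_fraction = 0.5    # assumption: 50 percent spare capacity for growth spurts

needed_tb = in_house_capacity_tb(primary_tb, backup_copies, headroom_fraction)
print(f"In-house capacity to buy: {needed_tb:,.0f} TB")
```

Other splits would land in the same neighborhood. The point is that backup and headroom multiply the amount you buy, while with a pay-per-use service you are billed only for what is actually stored.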

SmugMug is also a heavy user of Amazon’s Elastic Compute Cloud (EC2), even though the service is still in beta. One of the most appealing features of EC2, according to Chris, is that it lets them handle load spikes without paying for the capacity all the time. For example, loads go way up after a three-day holiday weekend when people upload all their pictures on Tuesday.

All that said, the company does maintain some of its own servers. It does this, in part, to provide a sort of cache for “hot” photos. (Chris estimates that 10 percent of the photos on the site get 90 percent of the traffic.) Related is the fact that they run their MySQL database servers in-house, where they’ll be physically close to the hot photos. Amazon’s recently announced SimpleDB could potentially offer an alternative, but it’s missing some features that SmugMug’s software, as currently written, requires. (See some technical discussion here.)
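The general idea of a local cache for hot items in front of remote storage can be sketched as a small least-recently-used (LRU) read-through cache. This is a generic illustration, not SmugMug's actual design; the class name, the `fake_remote_store` function, and the tiny capacity are all hypothetical.

```python
from collections import OrderedDict

class HotPhotoCache:
    """Small LRU read-through cache in front of a slower backing store."""

    def __init__(self, fetch_from_store, capacity):
        self._fetch = fetch_from_store   # hypothetical callable: photo_id -> bytes
        self._capacity = capacity
        self._items = OrderedDict()      # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, photo_id):
        if photo_id in self._items:
            self._items.move_to_end(photo_id)   # mark as most recently used
            self.hits += 1
            return self._items[photo_id]
        # Miss: fetch from the remote store and keep a local copy.
        self.misses += 1
        data = self._fetch(photo_id)
        self._items[photo_id] = data
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)     # evict the least recently used
        return data

def fake_remote_store(photo_id):
    """Stand-in for a remote fetch (for example, from S3)."""
    return f"photo-bytes-{photo_id}".encode()

cache = HotPhotoCache(fake_remote_store, capacity=2)
for photo_id in ["a", "b", "a", "a", "c", "a"]:
    cache.get(photo_id)
print(cache.hits, cache.misses)
```

When a small share of photos draws most of the traffic, even a modest cache like this serves the bulk of requests locally, and only the long tail goes out to remote storage.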

I suspect that we’ll see these hybrid architectures–even at aggressive Cloud Computing adopters–a lot. You sometimes need that little bit of customization or specialization that you can’t get from a service that has to be relatively standardized. That said, SmugMug is an aggressive adopter and gives us some good insights into what can be gained by making the infrastructure largely someone else’s problem.

Gordon Haff is a Principal IT Advisor with Illuminata, Inc. and has over 20 years of IT industry experience. He blogs about what’s happening with enterprise servers and datacenters, “Yotta-scale” computing, and related software and device trends as part of the CNET Blog Network.
