I was asked today, “what’s this whole ‘Microsoft Azure’ thing about?” I haven’t done anything with Azure, EC2, or Google App Engine, but why let my ignorance stop me from blogging about it? So here’s a brief summary of what I understand about these cloud technologies – I find the whole idea fascinating and am looking for an excuse to do more with one of these implementations. Consider this a plea for my friends, the Elegant Code Peanut Gallery, to help bring understanding to us all.
The tl;dr version
Azure involves a classic tradeoff: being able to scale very effectively and not having to make major capital investments, vs. having to pay more to your cloud computing environment for the convenience of it. It’s suitable for some applications and businesses, but certainly not all. To make the most of it, you should (or, must) architect your application around the whole cloud-notion to begin with. How much, and how bad it’ll hurt, depends on the platform and your application.
The Long Winded Version
Let’s say you’re a big online retailer, selling books online (as an example I just made up). For 3 months of the year, your servers are completely slammed with orders – you need lots of bandwidth, lots of CPU, everything. I’m talking about renting out a big data center for $BigMoney a month, hiring SysAdmins, NetAdmins, all of that. The other 9 months of the year, it’s boring. You could just have a server under your desk, and hire some high school kid to keep it running.
This is why Amazon’s EC2 was invented (and why Microsoft is competing with them with Azure). Rather than buying racks of servers, hiring staff to keep them running, and so on – all for stuff that you won’t need for most of the year – you deploy your application to Azure. When times are busy, you spin up additional servers easily. When the rush is over, just throw those extra instances away.
Your startup gets mentioned in the WSJ? Add servers to deal with the traffic, so you don’t look stupid with “Server Unavailable” errors. Any sort of business model that is built around ‘bursts’ makes a lot of sense for this.
The tradeoff: Cost. You pay for CPU time, storage space, transactions, bandwidth. Each of these is a really small number, like “$0.15 / GB stored / month” and “$0.01 / 10k transactions”, but they can add up quickly – a Death of a Thousand Cuts. Plus, with EC2 and Azure, the CPU time is not “cpu time doing something useful,” it is “amount of hours the virtual server was running.” So if your app spends most of its time doing nothing…
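To see how the small numbers pile up, here is a back-of-the-envelope sketch. The storage and transaction rates are the ones quoted above; the hourly instance rate of $0.12 is purely my assumption for illustration, as are the function and parameter names, so plug in whatever your provider actually charges.

```python
def monthly_cost(instances, hours_running, gb_stored, transactions,
                 hourly_rate=0.12,        # ASSUMED $/instance-hour, not a real price
                 storage_rate=0.15,       # $/GB stored/month (quoted above)
                 txn_rate_per_10k=0.01):  # $/10k transactions (quoted above)
    """Rough monthly bill in dollars. Bandwidth is left out for simplicity."""
    # Compute is billed on hours the instance was running,
    # whether or not it was doing anything useful.
    compute = instances * hours_running * hourly_rate
    storage = gb_stored * storage_rate
    txns = transactions / 10_000 * txn_rate_per_10k
    return round(compute + storage + txns, 2)


# One mostly idle instance left on for a 30-day month,
# 10 GB of data, a million storage transactions.
print(monthly_cost(instances=1, hours_running=24 * 30,
                   gb_stored=10, transactions=1_000_000))
```

Under those assumptions nearly the whole bill is the instance sitting there running, which is the "doing nothing" problem in a nutshell.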
For a small app with little traffic, this could add up to $100s / month, instead of just paying $VeryLittle/month for really lousy shared hosting, or a little more for less-lousy virtual hosting. And as I understand it, all of these services are “No Refunds” so if you get hit with a big wave of traffic, or forget to turn a server off when you were done with it, or whatever, … too bad, pay up!
It’s not all bad though – Google App Engine is free for 500 MB of storage and 5 million page views, and when you’re ready for more you can set up a quota and daily budget so you don’t get any nasty surprises. And while Azure is in CTP, it’s without charge (with a few modest quotas applied). So, you can start writing your Facebook-killer app right now, and deal with the costs later 🙂
Application Design
With Google App Engine and Amazon EC2, and mostly with Microsoft Azure, you must design your applications for cloud operation up front. It’s not exactly trivial to take an app written for a more traditional web server/database structure and deploy it to the cloud: you should be using the cloud’s data access APIs, not storing anything on the server itself, and have ways to get your data to and from the cloud (racking up transaction fees in the process). And don’t forget that your data is out there in hostile territory – how are you going to keep it safe? Plus your choice of cloud dictates what languages and platforms you can choose from.
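Here is roughly what "not storing anything on the server itself" might look like in code. This is only a sketch: `BlobStore`, `InMemoryStore`, `save_order`, and `load_order` are names I made up, and the in-memory class is a stand-in for whatever storage API your cloud actually provides, not any real SDK.

```python
from typing import Dict, Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical interface the app codes against instead of local disk."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryStore:
    """Stand-in for a real cloud store; counts operations you would be billed for."""
    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self.transactions = 0

    def put(self, key: str, data: bytes) -> None:
        self.transactions += 1
        self._data[key] = data

    def get(self, key: str) -> Optional[bytes]:
        self.transactions += 1
        return self._data.get(key)


def save_order(store: BlobStore, order_id: str, body: str) -> None:
    # Nothing is written to the server's own disk, so the instance stays disposable.
    store.put(f"orders/{order_id}", body.encode("utf-8"))


def load_order(store: BlobStore, order_id: str) -> Optional[str]:
    raw = store.get(f"orders/{order_id}")
    return raw.decode("utf-8") if raw is not None else None


store = InMemoryStore()
save_order(store, "42", "two books")
print(load_order(store, "42"), store.transactions)
```

Because the handlers only ever talk to the store, any instance can serve any request, and throwing an instance away loses nothing. The transaction counter is there as a reminder that every one of those round trips shows up on the bill.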
I said “mostly” because with Azure you can pay a little extra to get access to “SQL Azure” if you need relational database storage. Which I think is great, although you’d have to balance those costs against the effort of using a more transaction-based storage model.
So there you go, a tiny bit of understanding packed into a dangerous form of leaping conclusions. 🙂