Andrew’s Website
Cloud Computing's Green Benefits
2008-04-22 20:44:40 google amazon

In honor of Earth Day, I'm going to talk a little bit about why these new utility computing services (or "cloud computing") are good for the environment (and business too).

One of the tenets of cloud computing is that you use what you need and you pay for what you use. Amazon S3, Amazon SimpleDB, and Google App Engine all charge based on storage, bandwidth, and CPU time. Additionally, these services run on shared infrastructure, so you don't have separate physical boxes serving your traffic. This allows Amazon and Google to run at high utilization, knowing that, statistically, not all of their users will be hammering the service at once. For a detailed discussion of these statistics, read Power Provisioning for a Warehouse-sized Computer [PDF]. I guess you could say the same thing about Dreamhost (or any other shared server provider), since they cram many people onto a box and run at high utilization, but they don't provide the same level of scalability as Amazon or Google (you are limited to a single box).

Why does utilization matter? If you can do the same amount of work on fewer computers by running them at higher utilization, you save the environmental cost of building those computers. Additionally, computers use a lot of energy while idling. Even though consolidating increases the power drawn by a single computer, net power decreases as you take other systems offline. For a concrete example, imagine you have two computers that each use 100W idling and 200W at full load, and assume power scales linearly with load in between. If each machine is one-third utilized, it takes about 267W to perform your work (about 133W for each computer). However, consolidating this onto a single machine running at two-thirds utilization results in a total draw of about 167W, a savings of roughly 37%. Now… imagine running a datacenter at 80% or more load.
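The two-computer example can be checked with a few lines of Python. This is only a sketch: the linear relationship between load and power is an assumption made for illustration (real machines are not perfectly linear), and the function and variable names are made up for this example.

```python
def machine_power(utilization, idle_watts=100.0, full_watts=200.0):
    """Power draw of one machine, ASSUMING power rises linearly with load."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_watts + (full_watts - idle_watts) * utilization


# Two machines, each one-third utilized.
separate_watts = 2 * machine_power(1 / 3)

# The same work consolidated onto one machine at two-thirds utilization.
consolidated_watts = machine_power(2 / 3)

savings = 1 - consolidated_watts / separate_watts

print(f"separate:     {separate_watts:.1f} W")
print(f"consolidated: {consolidated_watts:.1f} W")
print(f"savings:      {savings:.1%}")
```

The savings come entirely from switching off the second machine's 100W idle draw. The work itself costs the same either way under this model.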

In the second paragraph I didn't mention Amazon's EC2. EC2 differs from S3 or GAE because you do pay for what you don't use. EC2 charges based on how long you have a computer under your control, not by the utilization of that computer. However, this is still a much more granular level of control than colocation, since you can scale your fleet up or down as needed to meet load. When you return a machine, someone else is free to grab it. The end result is high utilization, because people won't hold on to a machine that is sitting idle, and will instead shrink their fleet to match their load.

In my opinion, this makes Hadoop the killer app for EC2. Users can spin up a cluster, run their job at full bore, and then return the computers to "the cloud". Companies such as Powerset and the NY Times have used EC2 for this very reason. This is a triumphant example of the market's ability to reduce energy consumption and resource usage (in the form of unneeded computers), because it directly translates into monetary savings. No heavy-handed government mandates required.

Let me use an example of my own usage of S3 for backups. All the numbers will be based on those in the power provisioning paper mentioned above. I have 20 GB of data backed up in S3 and I want it to be available for instant access. This necessitates a constantly spinning hard disk. If I were to do it myself, I could just add another hard disk to my computer and stick a copy on that. Power usage of the hard disk: 12W. This is cheap and easy, but doesn't give me offsite backup in case of a disaster.

For the storage server, using the numbers from the paper, we have 200W base power plus 12W for each disk. Giving the server eight 1 TB disks results in a total draw of 296W and 8192 GB of disk space. Let's double that to take into account battery backup and air conditioning in the datacenter, so 592W. Since the server is shared with others (we are aiming for high utilization, remember?), I'll have to figure out my usage as a proportion of the total.

Before I do that, I should take durability into account. Distributed file systems store multiple copies of each file because hard disks and computers constantly fail in large systems. Judging by GFS and HDFS, the industry standard seems to be three copies. Three copies results in 60 GB dedicated to my backups. Adding this all up results in the following equation: 592W * (60 GB / 8192 GB) ≈ 4.34W. As you can see, even with all the additional overhead and infrastructure, using S3 saves nearly two-thirds of the power versus having an extra spinning hard disk in my own machine. Plus, my apartment is no longer a single point of failure.
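Here is the same back-of-the-envelope calculation as a small Python sketch, so the inputs are easy to change. The constants are the figures used in this post. The 1024 GB per disk is inferred from 8192 GB across eight disks, and the function name is made up for illustration.

```python
SERVER_BASE_WATTS = 200.0   # base power of the storage server
DISK_WATTS = 12.0           # power per spinning disk
DISKS_PER_SERVER = 8
DISK_CAPACITY_GB = 1024.0   # inferred: 8192 GB total / 8 disks
OVERHEAD_FACTOR = 2.0       # doubling for battery backup and air conditioning
REPLICAS = 3                # three copies, as in GFS and HDFS


def shared_storage_watts(data_gb):
    """My proportional share of a fully shared storage server's power."""
    server_watts = SERVER_BASE_WATTS + DISK_WATTS * DISKS_PER_SERVER  # 296 W
    facility_watts = server_watts * OVERHEAD_FACTOR                   # 592 W
    capacity_gb = DISKS_PER_SERVER * DISK_CAPACITY_GB                 # 8192 GB
    stored_gb = data_gb * REPLICAS                                    # 60 GB for 20 GB of data
    return facility_watts * stored_gb / capacity_gb


local_watts = DISK_WATTS                 # one extra disk in my own machine
cloud_watts = shared_storage_watts(20)   # 20 GB of backups
savings = 1 - cloud_watts / local_watts

print(f"local disk:   {local_watts:.2f} W")
print(f"shared share: {cloud_watts:.2f} W")
print(f"savings:      {savings:.0%}")
```

Note that the sketch assumes the server's disks are completely full, which is the high-utilization best case. A half-empty server would double my share.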

Hopefully this article has provided you a new look at utility computing. Next time the media tries to make a fuss over the growing power usage of datacenters, think back to this and realize they might actually be saving energy. Happy Earth Day, Andrew.

Announcing A BigTable Web Service
2008-04-14 22:54:59 google

Last week, rumors started to surface that Google would be releasing BigTable as a web service for developers. While open source clones are being created, so far only Googlers have had access to the real BigTable. This announcement would also have been the biggest competition to Amazon's Web Services to date. The actual launch was App Engine: hosted middleware on Google's platform. App Engine runs on BigTable, but doesn't expose all the nitty-gritty details of BigTable. This makes it simpler but less flexible, and despite all the comparisons to AWS online, the two offerings are quite different.

I registered the night App Engine was released and soon got my invite. I then came up with the crazy idea to offer BigTable as a web service using App Engine. It would be an infinitely scalable database running in Google's datacenters. I spent my weekend learning Python and hacking together an implementation. Now I'm happy to present the BigTable Web Service. It models the API of HBase, a BigTable clone. Now you can have a simulated BigTable running atop App Engine, which itself provides an abstraction on top of the real BigTable.

The site describes the API of BigTable and gives examples of how to call it. I've also included a Python client for writing software against it. You must register for an account and create tables using the site, but everything after that is done through pseudo-RESTful service calls. I'm allowing free, unlimited access to the service… up to the limits imposed by Google.

Third Annual Fucking Hills Race
2008-02-28 21:36:03

Last Sunday was .83's third annual Fucking Hills Race on Bainbridge Island. I met my hobo friends under the Alaskan Way Viaduct at the ungodly hour of 8:45. I registered with Derrickito and got my paramilitary-styled pirate flag made with scraps from OR's dumpster. After registering, I ate the breakfast of champions: half a donut and a swig of vodka. We swarmed the ferry, paid our dues, and lined up to board. We were the first on, but unfortunately the cars disembarked before us.

On the ferry I had a cup of coffee and a sausage, egg, and cheese muffin. My heart is still thanking me. I almost got stuck in a swarm of squids while finding .83's seats. Fortunately the spandex allowed me to slip right past. Apparently the Chilly Hilly was scheduled for the same day? Odd. I also ate some stale Clif Shot Bloks for dessert.

We gathered in a pullout after getting off the ferry. Derrick said we couldn't start until we finished a bottle of Evan Williams. Ben the Angry Hippy–owner of said bottle–complained and lifted the requirement. And then we were off! I wore the same outfit as last year: a pair of blue jeans and my Bob Marley jacket. Fortunately the weather was nicer, so my pants didn't get soaking wet during the journey.

Knowing the approximate route made the ride go faster because I didn't always think the end was right around the corner. Even though I was out of shape, I was still able to ride at a fairly quick pace. I don't know where all the energy came from.

The advantage of being a second year rider is that I knew where the finish line was located. I placed about 25th. Fortunately, they had plenty of chili this year. I had two bowls of that and then went for beer. While waiting for people to roll in, I ended up drinking four Mongoose IPAs. My friend Kevin eventually rolled in sporting his Amazon.com jersey. I selected a cool Cadence t-shirt for my prize, which I had been eyeing for a while.

While waiting for the ferry, a few of us went to a coffee shop and I had a piece of coffee cake. I had a corn dog on the ferry back, in true .83 fashion. I eventually made it home–sore–and took a shower.

This year's FHR was great. They had plenty of chili, I got a cool prize because I didn't stop to drink a beer on the way, and the weather was amazing. Thanks to Derrickito for organizing, our sponsors for the great prizes, btah and Kalen for manning the finish line, and anyone else I'm forgetting. See you next year!

I Lost My Virginity Last Night
2008-02-03 19:01:29 personal

More than a decade after first watching the movie, I finally made it to a live midnight performance of The Rocky Horror Picture Show. The show was put on by The Vicarious Theater Company at the Admiral theatre in West Seattle. The price was right at only $5. My friend Martin organized the outing and brought the requisite rice, newspapers, and (Amazon.com branded) playing cards.

The devirginizing was mild and painless. They were overwhelmed by first timers and so–unfortunately–didn't have enough supplies for all of us to partake in the food-based rituals.

The audience skewed younger than I expected; I was probably older than more than half of them. Still, many of the youngsters and first timers had impressive costumes. The audience was very enthusiastic with feedback and everyone partook in the Time Warp. Some of the quips were quite clever, but the part about Brad being an asshole got old. I'd definitely go again; I just wish the location weren't so inconvenient.

Is Google Afraid of the Big Bad Microsoft?
2008-02-03 18:35:27 google microsoft yahoo

Earlier today Google posted its opinions about Microsoft's bid for Yahoo on its blog. I must say, I'm disappointed. In the post, Google not-so-subtly asks for government intervention in the takeover. Google claims they want this in the name of openness and innovation, but what about freedom? Specifically, the freedom from government intervention in private business. The software industry is healthy and prosperous with minimal government contact. Allowing government interference would result in corruption, cronyism, and inefficiency, just like in every industry it touches.

Regarding the purchase, neither company is a majority player in search or advertising (likely the main reasons for the acquisition), and they wouldn't be even combined. Contrast this with Google and DoubleClick at the time of that purchase. Demanding the right to buy DoubleClick and then asking the government to restrict Microsoft's purchase of Yahoo smacks of hypocrisy. Last year Microsoft tried to prevent the DoubleClick acquisition, but I expect this sort of hypocrisy from Microsoft. Now it seems Google is trying to pull the same thing. You have let me down, Google. I thought better of you.

Anyway, I don't see what they are worried about. Google should be happy about the acquisition; the merger will more than likely result in a clusterfuck that drives more users away from Yahoo properties. I know that if Microsoft starts fucking with the Yahoo services I use, I'll jump ship.

This work is licensed under a Creative Commons Attribution 2.5 License.