Issue #2: Cloud Economics, FTW or WTF?
Hello, and good morning. I am Manu, and this is the Digital Atmosphere. Welcome to Issue #2.
There was a time, not too long ago, when I used to listen to The NPR Politics Podcast. It was relevant at the time — any changes in the USA’s domestic policies, especially around immigration, could directly affect my life. Times changed, I moved back to India, and that isn’t the case anymore. However, at the end of every show, the podcast had a section called “Can’t Let it Go”. The hosts would go around the table and reveal one thing — a news item, an observation about the world, or a general happening in their lives — that had dominated their thought cycles that week, to the extent that they kept thinking about it, constantly — they couldn’t let that thought go! The more I learn about the public clouds and their associated economics, the longer my own list of things I cannot let go of gets. Let’s talk about at least one of those. I am sure more will come up in due time.
Let me state this up front: cloud economics is fascinating. Remember the time when you wanted to assemble a desktop rig? What did you do? You went to the Nehru Place computer market, picked up and paid for every component on your list — $15 for 8 GB of DDR3 DRAM, $145¹ for that Intel Core i5, etc., etc. And once you had paid for those, you owned them. They were yours to do whatever with. Want to plug that DIMM into your desktop? Sure! That’s what it’s for. Create a modern art piece with all those components you just bought? Why not! Break them all down with a hammer? I don’t see why you’d want to do that, but that’s valid too!
Cloud economics turns this fairly simple model of paying $s to own stuff on its head. Probably with good reason too, but I am not sure yet. Once you create that VM in the cloud, you are a tenant rather than an owner. I mean, technically, you own the VM, but the physical server that it runs on co-hosts many other VMs, all of which are (quite literally) called tenants. And tenants pay rent.
So, how can one charge rent for a VM? There are two ways of doing it. The first is where you are given pre-packaged configurations to choose from, each with a fixed set of resources (e.g. x vCPUs, y GB of DRAM, z GB of HDD-based storage, etc.) and an associated, predictable cost. The other is where you can create your own VM with whatever resources you see fit, and then pay for them according to capacity and usage: the à la carte model. Both have their pros and cons, and both have been adopted by providers. Linode (at least so far) offers only pre-packaged plans, whereas everyone else, especially the big 3 public cloud providers, offers a combination of pre-packaged and à la carte, with a focus on the latter. After all, having lots of choices is better, right?
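To make the difference concrete, here is a minimal sketch of how the two models might price the same machine. The plan names and rates are illustrative assumptions (loosely derived from the approximate per-component rates quoted later in this post), not any provider’s actual price list.

```python
# Hypothetical rates, for illustration only.
# Pre-packaged: each plan bundles resources behind a single hourly rate.
PREPACKAGED_PLANS = {
    "small":  {"vcpus": 1, "ram_gb": 2, "rate_per_hour": 0.07},
    "medium": {"vcpus": 2, "ram_gb": 8, "rate_per_hour": 0.14},
}

# A la carte: every component is metered separately, per unit per hour.
VCPU_RATE = 0.05   # $ per vCPU-hour (assumed)
RAM_RATE = 0.005   # $ per GB-hour (assumed)

def prepackaged_cost(plan: str, hours: float) -> float:
    """Rent for a fixed plan: one predictable rate, no itemization."""
    return PREPACKAGED_PLANS[plan]["rate_per_hour"] * hours

def a_la_carte_cost(vcpus: int, ram_gb: float, hours: float) -> float:
    """Rent for a custom VM: the sum of its parts, metered over time."""
    return (vcpus * VCPU_RATE + ram_gb * RAM_RATE) * hours

# The same 2 vCPU / 8 GB machine, for a 30-day month (720 hours):
fixed = prepackaged_cost("medium", 720)   # 0.14 * 720
custom = a_la_carte_cost(2, 8, 720)       # (0.10 + 0.04) * 720
```

With the rates chosen to match, the two come out the same here; the difference is that the à la carte bill is an itemized sum that changes the moment you change any one component.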
Given the number of component options to choose from, the Cloud Economists at large public cloud providers have decided that the à la carte model is better for the customer, and maybe for them as well. And there is some merit to the customer side of that argument. Consider cases where customers need to temporarily spin up a large number of low-resource VMs (which are not on the pre-packaged list), or large, custom ones with lots of resources. It makes sense for the customer to go à la carte in these cases, since the requirements are specific and will vary with the use case. And once the cloud providers start handing out custom machines, which are spun up for fixed amounts of time, there needs to be a mechanism to price them too. So, a notion of time has to be combined with component capacity (e.g. GBs of DRAM) or count (e.g. number of provisioned vCPUs) to create billable entities. If you instantiate a custom VM, you pay for the sum of the VM’s parts over the time for which you use the machine. This has created some very interesting billing granularities.
In your VM, DRAM gets charged by the GB-hour. It’s the cost of having 1 GB of DRAM at your disposal, attached to your VM, for one hour. So, if this DRAM is billed at $0.005 / GB-hour, the VM has 8 GB of DRAM and you run it for an hour, you pay $0.04 (0.005 x 8). If you run the machine for an entire day, that’ll cost you $0.96 (0.04 x 24). Keep this thing running 24x7 for a month, and you’ll rack up a rent of $28.8. And that’s just for DRAM. The rent for a single vCPU is typically an order of magnitude higher than DRAM (e.g. $0.05 / vCPU-hour)². So if the same VM has 2 vCPUs, both running 24x7 for a month, that adds up to $72. Add some super-cheap storage, static IP addresses, etc., and you can easily rack up a $100 bill, per month, every month, just for keeping this one machine running. Irrespective of its actual utilization.
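The arithmetic above fits in a few lines of Python (using the same approximate rates and the same example VM: 8 GB of DRAM, 2 vCPUs, a 30-day month):

```python
RAM_RATE = 0.005    # $ per GB-hour (approximate, per the footnote)
VCPU_RATE = 0.05    # $ per vCPU-hour, roughly an order of magnitude higher

HOURS_PER_MONTH = 24 * 30   # a 30-day month, running 24x7

ram_monthly = 8 * RAM_RATE * HOURS_PER_MONTH     # 8 GB of DRAM
vcpu_monthly = 2 * VCPU_RATE * HOURS_PER_MONTH   # 2 vCPUs
total = ram_monthly + vcpu_monthly               # before storage, IPs, etc.
```

That lands at $28.8 for DRAM, $72 for the vCPUs, and $100.8 combined, before a single byte of storage or network traffic has been billed.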
And BTW, this is the rent for a single VM in a given geographical location. The exact same VM in a different location (i.e. a different datacenter) with higher real-estate and/or electricity prices can cost 100% more. I am not kidding.
Oh, and when you are being charged à la carte for your VM, you are really being charged separately for each component. Let me illustrate:
Want your storage on an SSD rather than traditional spinning drives? Of course you can! For a price.
Oh, are you running a server that creates lots of network ingress and egress traffic? Good for you, but we will make sure to add all those GBs to the bill. Guess that is good for us as well!
Did you hear that AMD came up with a new line of EPYC CPU servers? We did too. We bought a bunch of them and upgraded a portion of our fleet to the latest EPYC architecture. And now you can migrate your VMs to get better performance on those machines. For an increased vCPU-hour rate, of course.
For large public cloud providers, the number of options you can choose from for every component is mind-boggling. It’s like selecting a tea-leaf brand in a superstore. Unless you know a lot more about the tea industry than the average consumer, you will probably pick a leaf that 1) you have heard of (or that came recommended), and 2) fits your budget. Only once you buy the pack, go home and actually make some tea do you figure out whether you like the taste. It’s a somewhat similar process when creating VMs. Most people aren’t provisioning VMs that have the exact configuration their workload needs. They provision the VM with an informed guesstimate of the peak requirements of the workload and then add a little sprinkle of resources on top, just to be sure. And there are multiple reasons for doing that. First, the people tasked with creating cloud-based infrastructure may not have a very good sense of workload requirements, especially if the workload is a rapidly changing one. Second, they get an option to go back and revise machine requirements if the guesstimate was incorrect. Third, even with a good idea about workload requirements, it’s insanely difficult to benchmark a VM’s performance accurately, especially if it’s colocated, which the ones on public clouds are. Finally, like everyone else, cloud infrastructure creators (who in many cases are developers themselves) would rather be safe than sorry.
So, if people are over-provisioning their VMs, then the à la carte pricing model works out well for cloud providers. And maybe for the customer as well, if they are careful about provisioning VMs and running them only when they are needed. That might be possible for VMs running periodic batch jobs; those VMs need to be on only for fixed durations. However, if the VMs are serving traffic to live users, then they have to be kept on 24x7, racking up the rent.
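The gap between those two usage patterns is easy to put a number on. A rough sketch, reusing the approximate per-component rates from earlier for the same 2 vCPU / 8 GB example VM (the two-hour nightly batch window is an assumption for illustration):

```python
# Hourly rate for the example VM: 2 vCPUs + 8 GB of DRAM,
# at the approximate rates quoted in the footnote.
hourly = 2 * 0.05 + 8 * 0.005    # $0.14 / hour

always_on = hourly * 24 * 30     # serving live traffic, 24x7, for 30 days
nightly_batch = hourly * 2 * 30  # the same VM, on for 2 hours a day

savings = 1 - nightly_batch / always_on   # fraction of rent avoided
```

Running the machine only for the batch window cuts the bill from about $100 to under $10 a month, a savings of over 90% — which is exactly why the à la carte meter rewards careful scheduling and quietly punishes the always-on default.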
And you know what the other neat thing about rents is? They go up as the landlords see fit.
¹ Yes, I know that no one pays for anything in $s at the Nehru Place market.
² These numbers will vary across cloud providers and geographical locations. For the purposes of this post, I have picked approximate values from the GCP pricing page.