Public vs. Private Cloud - Follow Up from the Fight!
You may have read a recent post by my colleague Chris Bunch covering his thoughts on public vs. private cloud computing. If not, I suggest giving it a read before diving into my post.
Chris’ article led to some discussion (great!) through a few channels. One comment questioned the viability of larger data sets in the public cloud, given the cost of data transfer in and out (specifically for AWS in this instance, at a few terabytes per month).
I’d like to counter this viewpoint, as I believe public cloud (and notably AWS) still to be the winner of the fight even in this scenario. I’ll outline my thoughts below.
The case to stay private
Private cloud typically runs in a datacenter/colo provider, and I do not know anyone who builds their own datacenter anymore. The cost of connecting from this colo provider to the customer's sites still stands. However, if you are running your own datacenter, then perhaps the argument that public cloud provides overly expensive bandwidth is valid and you should indeed continue to use private cloud (as per the original article, think "investment cycle").
The case for public
1) Let's talk numbers rather than perception: Even if one agreed (and I don’t) that the cost of data transfer in AWS is "excessive", there is clear data showing that the five-year TCO of developing, deploying, and managing an app in AWS delivers a 72% saving when compared with deploying the same app on-premises or in hosted environments. This data is based on 2013 AWS pricing - things have gotten cheaper since then.
2) Every service provider, big or small, including Google and Amazon, has to pay tier-1 transit providers (it’s essentially a cartel!) in order to transit traffic to the Internet. Furthermore, serious investment in private circuits, engineering and hardware also goes into creating peering connections with smaller Internet Service Providers - this rather lengthy but excellent article explains how this works. To summarise, the charge for data transfer is to cover the cost of delivering a service across a delivery medium/network not wholly controlled by Amazon or Google - the Internet. The situation is the same as that of Royal Mail and other couriers: they may have control of their own fleet of vehicles, but they still have to pay the road tax for these vehicles in order to deliver your parcel. Data transfer is certainly not an area where cloud providers are looking to make money; in fact, you will find the cost of data transfer is very similar across the many cloud providers out there.
3) All cloud providers only charge for data transfer out (egress), while data transfer in (ingress) is completely free, which is incredible. If you think about data analytics use cases and big data applications, the amount of data you ingest into the cloud for analysis is significantly more than the results returned. Similarly, if you consider using the cloud for offsite backup and long-term archiving, you pay nothing to transport the backup data - imagine not having to pay Iron Mountain to take your tapes offsite. As we will see below, traditional hosting providers charge for pure bandwidth in both directions, and therefore such use cases are impossible to realise economically in those traditional hosting environments.
4) Depending on volume of traffic and the enterprise’s current WAN topology, AWS Direct Connect offers a genuinely cost-effective option. Because traffic flows directly between your network and Amazon's, the data transfer costs are significantly reduced to reflect that - $0.03 versus $0.09 per GB.
5) Let us consider the traditional colocation or datacenter hosting providers and how they charge for bandwidth before we label cloud data transfer as costly. Here are the typical options:
- Fixed bandwidth: you predict your bandwidth requirements, you buy a committed information rate (CIR) in Mb/s and that's it. Might cost you less, yes, but consider there is no scaling, no flexibility and no elasticity.
- Burstable bandwidth: you pay a fixed price upfront for the cost of an interface (say 1Gb/s), then you commit to a minimum throughput, say 100Mb/s, and if you exceed that you pay for each additional 1Mb/s chunk (based on the 95th percentile) at an excessive rate - typically £10 per 1Mb/s per month. This can be very wasteful and very expensive overall if not considered carefully.
- (FREE): nothing in life is free. If you see "free", it means the service provider is recovering the cost of bandwidth elsewhere. Typically this is what you see if you rent a dedicated physical or virtual server, not if you are renting colocation space.
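To make the burstable option concrete, here is a rough Python sketch of how a 95th-percentile overage charge could be worked out. The 100Mb/s commit and the £10 per 1Mb/s rate come from the example above; the function names, the sampling scheme and the rounding up to whole 1Mb/s chunks are my own assumptions for illustration, and real providers differ in the details (the upfront interface fee is left out entirely).

```python
import math


def percentile_95(samples_mbps):
    """Return the 95th-percentile sample: sort ascending, ignore the top 5%.

    Assumes evenly spaced throughput samples in Mb/s (hypothetical input,
    e.g. one reading every five minutes over the billing month).
    """
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Integer ceiling of 0.95 * n, minus one for zero-based indexing.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


def burst_overage_charge(samples_mbps, commit_mbps=100, rate_per_mbps=10.0):
    """Monthly overage in pounds for usage above the committed rate.

    commit_mbps and rate_per_mbps default to the figures quoted in the post;
    rounding the excess up to whole 1Mb/s chunks is an assumption.
    """
    billable = percentile_95(samples_mbps)
    excess_chunks = max(0, math.ceil(billable - commit_mbps))
    return excess_chunks * rate_per_mbps


if __name__ == "__main__":
    # Made-up month: mostly quiet, a sustained busy period, a few short spikes.
    samples = [80] * 90 + [150] * 6 + [900] * 4
    print(burst_overage_charge(samples))
```

Note how the short spikes to 900Mb/s are forgiven by the percentile, while the sustained busy period at 150Mb/s is what you end up paying for.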
6) Not all bandwidth is created equal. Similar to the argument made in the original article about security, smaller colocation and hosting providers cannot compete with the likes of AWS and Google when it comes to recruiting, innovating and operating networks at scale. This is a particularly hard one to measure, as all service providers are constantly balancing customer demand with the available transit bandwidth and aggregate capacity on their network, and in most cases they overcommit in order to control cost and engineering effort. So, what good is a cheap data transfer rate if it doesn't work when you need it?
7) What is even more interesting is that the cost of data transfer, just like everything else in the cloud, is going down. Amazon recently cut data transfer costs by 6-43%, depending on your region. Economies of scale also apply to bandwidth and data transfer costs, so how are the traditionally smaller hosting providers ever going to be able to reduce your bandwidth costs?
8) Finally, the cost of PEOPLE is typically overlooked in these discussions. I intend to write a future blog post about what AWS call "The undifferentiated heavy lifting of IT", which I did for a good part of seven years of my life. To give you an example, one of my responsibilities was to get rid of cardboard boxes, plastic foam and other packaging in the skip outside the datacenter. Practically everything from racks to disks came in this oversized packaging, and it was the engineering team’s responsibility to dispose of it responsibly. Just think how much time I spent doing that over seven years of working there, and what else I could have been doing instead. Oh yes, and don't get me started on the daily 30-minute walk across all the data rooms to visually check for alarms or flashing lights "just in case" the monitoring system didn't pick these up.
From afar it may appear that public cloud providers are overcharging for bandwidth. Look beneath the surface, though, and abundant bandwidth and data transfer costs turn out to be compelling reasons why you SHOULD use public cloud. Do feel free to comment below with your thoughts.
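Coming back to the numbers in points 3 and 4 for a moment, the "few terabytes per month" scenario can be put into pounds-and-pence terms with a few lines of Python. This is only a back-of-the-envelope sketch: the $0.09 and $0.03 per GB rates and the free ingress are the figures quoted above, while the 5 TB out / 50 TB in workload, the 1 TB = 1000 GB conversion and the function name are assumptions of mine. It also ignores tiered pricing and any port or circuit charges for a private connection.

```python
from decimal import Decimal

# Per-GB rates as quoted in the post (USD).
INTERNET_EGRESS_PER_GB = Decimal("0.09")
DIRECT_CONNECT_EGRESS_PER_GB = Decimal("0.03")
INGRESS_PER_GB = Decimal("0")  # data transfer in is free

GB_PER_TB = 1000  # assumption: decimal terabytes


def monthly_transfer_cost(egress_tb, ingress_tb, egress_rate_per_gb):
    """Flat-rate monthly data transfer cost in USD (no tiers, no port fees)."""
    egress_cost = Decimal(egress_tb) * GB_PER_TB * egress_rate_per_gb
    ingress_cost = Decimal(ingress_tb) * GB_PER_TB * INGRESS_PER_GB
    return egress_cost + ingress_cost


if __name__ == "__main__":
    # Hypothetical analytics workload: 50 TB ingested, 5 TB of results out.
    over_internet = monthly_transfer_cost(5, 50, INTERNET_EGRESS_PER_GB)
    over_direct = monthly_transfer_cost(5, 50, DIRECT_CONNECT_EGRESS_PER_GB)
    print(f"Internet egress:       ${over_internet}")
    print(f"Direct Connect egress: ${over_direct}")
```

Under these assumptions the 50 TB going in costs nothing, and only the 5 TB coming out is charged, at a third of the rate over the private link.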