Storage Appliances in the Cloud
AWS announced the general availability of Elastic File System (EFS) earlier this year, meaning we are a little closer to finally having a managed file storage service within AWS. However, its release is limited to a handful of regions and a limited set of features.
Enterprise File Services in the cloud
Azure has a similarly limited offering today with Azure Files. We are all aware of how quickly both cloud providers mature their products, but in the meantime what alternatives do we have for providing an enterprise-ready file system within our favourite cloud platforms?
Luckily there are some powerful alternatives to the vendors’ own offerings that can help customers today, but it can be a confusing landscape to navigate. Let’s review some of these alternatives, and provide a comparative overview of each.
Your file system storage options
Where possible, it is always advisable to design loosely coupled, stateless infrastructure using object storage offerings such as S3 or Azure Blob storage. However, many traditional applications and database vendors such as SAP or Oracle require NFS or CIFS to provide a shared file system. If you are looking for multi-instance read and write access to the same data set, such as a Windows file share, then object storage will not suffice, and again the CIFS/NFS protocols are required.
Outside the native AWS and Azure offerings, there are generally three approaches to providing this:
- Connecting External Storage to the cloud – Connecting an externally hosted storage array into AWS/Azure via Direct Connect or ExpressRoute
- Marketplace Appliances – Use a virtual storage appliance. These can typically be found in the Azure or AWS Marketplaces
- Homegrown – Create a file service using open-source or commercial alternatives such as GlusterFS, Microsoft DFS-R, or NFS + DRBD
Connecting external storage to the cloud
Many of the Direct Connect and ExpressRoute enabled colocation / hosting providers have excellent low latency connectivity between their data centres and the key cloud locations for Azure and AWS. For example, in some AWS Direct Connect hosting locations you can expect latencies of around 1-2 ms RTT, which is comparable to EBS SSD storage within AWS itself. This low latency makes it possible to mount CIFS/NFS volumes from Network Attached Storage (NAS) located in another data centre on instances hosted in the cloud, with little impact on performance.
So why would we do this?
Well, there are a few reasons:
- You may have already purchased a high performance/capacity storage array, are only in year one of your three-year support contract, and wish to get a better ROI from it
- You want access to richer storage features, such as de-dupe, compression, backup, encryption and dedicated disks, than are available in AWS or Azure
- You want access to cloud economies but also want to leverage the storage efficiencies of on-premise storage arrays
- You want control of your data, or have a data compliance mandate that prevents you from putting protected data into the cloud, i.e. it has to stay on-premise
Many storage arrays that support IP-based storage protocols such as CIFS or NFS can be used to present storage to cloud instances in AWS or Azure. Some storage vendors, like NetApp, have simplified this for customers and produced reference architectures for connecting into AWS & Azure using their storage arrays from on-premise locations. These reference architectures take into account the routing, switching and security requirements associated with integrating your storage infrastructure with AWS or Azure.
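Before mounting NFS volumes across a Direct Connect or ExpressRoute link, it can be worth checking that the round-trip latency really is in the low single-digit milliseconds. The following is a minimal Python sketch, not a vendor tool: it times TCP handshakes to the storage endpoint, which takes roughly one network round trip each. The function names are our own, and port 2049 is assumed only because it is the standard NFS port.

```python
import socket
import statistics
import time


def tcp_connect_rtt_ms(host, port=2049, samples=5, timeout=2.0):
    """Time `samples` TCP handshakes to host:port and return them in ms."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed,
        # so the elapsed time approximates one network round trip.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        results.append(elapsed * 1000.0)
    return results


def summarize(rtts_ms):
    """Return (min, median, max) of the measured connect times."""
    return min(rtts_ms), statistics.median(rtts_ms), max(rtts_ms)
```

Calling `summarize(tcp_connect_rtt_ms("nas.example.internal"))` from a cloud instance (the hostname is a placeholder) gives a rough view of the link. Connect times include some operating system overhead, so treat the figures as an upper bound on network RTT rather than a precise measurement.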
The speed and ease with which you can connect into AWS and Azure has also opened up a market for Storage-as-a-Service (STaaS) type offerings, which provide a managed version of what is described above. Cloudreach partners with Zadara, who offer managed storage (Zadara Storage Clouds are built in data centres adjacent to Amazon & Azure regions) connected into your VPC or Virtual Network at low latency, allowing externally hosted storage to be shared across multiple cloud instances.
A comparison of managed file storage services for AWS and Azure
Zadara has a rich list of enterprise features that are offered as part of its managed service. When compared with AWS EFS or Azure Files, Zadara offers not only encryption, snapshots, integrated backup and full protocol support, but also access to dedicated disks for high performance legacy workloads that still demand a shared file system. Zadara also offer on-premise storage appliances that allow on-premise file systems to be replicated into their data centres for presentation to AWS and Azure virtual instances.
High availability file services using Zadara and AWS
Marketplace appliances
Virtual Filer appliances can be deployed directly into Azure and AWS from their respective marketplaces, and leverage the existing storage services of the cloud providers, which can include block and object storage. These appliances then abstract the native storage offerings of the cloud provider behind standard network share protocols, providing enterprise-grade NAS features that are yet to be made available in AWS and Azure. For example, you can:
- Protect your data with snapshot copies and cloning
- Replicate across availability zones or regions, or with some appliances you can replicate from your onsite arrays into AWS & Azure using the same replication engine
- Reduce your storage consumption with data deduplication, compression, and thin provisioning efficiency features
- Leverage CIFS, NFS, and iSCSI storage protocols
An example of a marketplace appliance (SoftNAS) within AWS
There are also numerous Total Cost of Ownership (TCO) comparisons available online covering the commercial benefits of Virtual Filer Appliances vs. EFS (for example), with stark examples of cost savings. SoftNAS are an industry leader in this area and claim over 50% cost savings vs. using AWS EFS. Other alternatives include NetApp Cloud ONTAP, which can be connected to on-premise NetApp storage arrays, allowing for full control over the movement of your data.
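If you want to sanity-check a TCO claim against your own figures, the arithmetic is simple enough to sketch. The Python below is illustrative only: the cost model (instance cost plus provisioned block storage plus an optional license fee, compared with a per-GiB managed service) is our own simplification, and every price passed in is an input you supply rather than a real AWS, Azure or SoftNAS rate.

```python
def appliance_monthly_cost(instance_cost, provisioned_gib,
                           block_price_per_gib, license_cost=0.0):
    """Monthly cost of a virtual filer: compute + block storage + license."""
    return instance_cost + provisioned_gib * block_price_per_gib + license_cost


def managed_monthly_cost(stored_gib, price_per_gib):
    """Monthly cost of a managed file service billed per GiB stored."""
    return stored_gib * price_per_gib


def saving_percent(baseline, alternative):
    """Percentage saved by choosing `alternative` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline cost must be positive")
    return (baseline - alternative) / baseline * 100.0


# Placeholder figures purely for illustration; substitute current list prices.
managed = managed_monthly_cost(stored_gib=8000, price_per_gib=0.25)
appliance = appliance_monthly_cost(instance_cost=400, provisioned_gib=8000,
                                   block_price_per_gib=0.125, license_cost=200)
print(f"saving vs managed service: {saving_percent(managed, appliance):.1f}%")
```

A model like this leaves out operational effort, data transfer charges and high availability overheads (for example a second appliance instance), all of which can move the result significantly in either direction.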
Home-grown solutions
Unlike marketplace appliances, home-grown solutions can take more effort and time to implement, as you are responsible for both deploying and supporting the solution. However, there can be significant cost advantages in doing so, as you can often avoid paying license or support fees.
GlusterFS is a popular open source network-attached file system that can aggregate memory and storage and present them as pooled resources, or volumes, to an application over the network using NFS. GlusterFS scales out in building blocks and uses the concept of "bricks", which in a cloud deployment are simply storage directories on EC2 or virtual instances that have virtual disks attached. As you require more storage or I/O, you add additional "bricks", which are presented together as a logical volume that is then mountable.
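To make the scale-out model concrete, here is a minimal Python sketch of how usable capacity might grow as bricks are added. It is a simplified model and not part of GlusterFS: the function name `usable_capacity_gib` and the brick sizes are our own illustrative assumptions. It assumes a distributed (optionally replicated) volume in which consecutive bricks form a replica set, and it ignores file system overhead.

```python
def usable_capacity_gib(brick_sizes_gib, replica=1):
    """Estimate usable capacity of a distributed(-replicated) volume.

    Bricks are grouped into consecutive sets of `replica` bricks. Each set
    stores one copy of its data per brick, so it contributes only the size
    of its smallest brick to the usable total.
    """
    if replica < 1:
        raise ValueError("replica count must be at least 1")
    if len(brick_sizes_gib) % replica != 0:
        raise ValueError("number of bricks must be a multiple of replica")
    total = 0
    for i in range(0, len(brick_sizes_gib), replica):
        replica_set = brick_sizes_gib[i:i + replica]
        total += min(replica_set)
    return total


# Four hypothetical 500 GiB bricks, each a disk on a separate instance.
bricks = [500, 500, 500, 500]
print(usable_capacity_gib(bricks))             # distributed only
print(usable_capacity_gib(bricks, replica=2))  # two copies of every file
```

With replication enabled, bricks need to be added in multiples of the replica count, which is worth factoring into capacity planning: each expansion step means launching that many new instances and disks.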
If you are an invested Microsoft customer and require a multi-master replication engine that can synchronize files and folders between multiple servers and locations, then Distributed File System Replication (DFS-R) for Windows Server 2012 R2 is a powerful solution. When combined with DFS-Namespaces, you can group files and folders across many servers and present them using one or more local or global namespaces, allowing you to force users to use particular locations when accessing files.
For a more cost-effective equivalent to DFS-R, you can configure virtual Linux instances as NFS servers and use Distributed Replicated Block Device (DRBD) to provide block level replication for your file system.
All of these home-grown solutions require manual configuration, and inevitably manual support too, which may not suit your needs.
Comparing popular Marketplace Appliances and home-grown solutions
As we have seen, there are many approaches to providing shared file services within AWS & Azure: using the native products from the cloud providers themselves, connecting in external storage solutions, using marketplace "off-the-shelf" appliances, or building and customising a solution yourself. As an organisation, you need to fully review the spectrum of differences between each approach, taking into account the cost, time to value, configuration and operational implications. One of the core benefits of cloud is that it allows you to perform quick technology evaluations and "fail fast". Use this to your advantage and ensure that the chosen end solution will meet your desired goals.
Cloudreach has developed extensive experience in designing and implementing the options compared here, using our expert knowledge of Azure and AWS to support a number of use cases. Please get in touch if you require our help.