To make informed decisions consistently, start with a set of problems to solve and a crisp understanding of the situation at hand. This requires clear visibility into all the variables involved, a list of available options, and a plan to execute the chosen strategy.
This article will help set the basis for understanding the TCO optimization problem in a multi-cloud environment.
Understanding the differences between cloud providers
AWS has grown from a couple of services in 2006 to more than 200 today. It launched years ahead of its competitors, and the original AWS offerings clearly influenced the rest. The first services from AWS in 2006 were S3 and EC2. Azure later followed with Virtual Machines and Azure Blob Storage, and Google Cloud Platform with Compute Engine and Google Cloud Storage.
Eventually, the service offerings became more specialized. All providers came up with their own set of strengths – Azure created specific cloud versions of the Microsoft products (such as SQL Server), and Google innovated with Kubernetes (which was implemented by all providers).
There has also been a race in areas like Machine Learning, Artificial Intelligence, Data Mining, and Data Lakes. Each provider has been implementing their visions and presenting significant differences in the implementations. Some of them implemented managed services. Others implemented PaaS. In some cases, pre-configured environments were used (with the end-user getting full control and the use of underlying services such as Compute Engine).
The Big 3
For the three biggest services (PaaS SQL databases, virtual machines, and storage), there is a 1:1 mapping across the three major cloud providers, although each service comes in various flavors.
For instance, Google Compute Engine allows flexible selection of compute and memory capacity, in contrast to the fixed compute-to-memory ratios offered by AWS or Azure.
In the case of SQL Server, license management differs widely. Some providers allow a hybrid model (the equivalent of “bring your own license”) for enterprise agreements, while others do not.
Finally, for Object Storage, there are plenty of replication options (single availability zone, replication within AZ, region, global, reduced, infrequent access, etc.) that are not available across all providers.
Why should these differences matter, and what does this have to do with the TCO? Why should you care about standardizing the vocabulary?
There are two main reasons:
- It’s essential to compare the same service offering from the user’s point of view from one cloud to another. This gives a clear understanding of the situation. What are you spending on each cloud provider, and on what?
- To know the options and compare them, it’s necessary to ensure that the offerings are the same. In some cases, it means that you’ll need an extra complementary service in order to equalize them.
It’s also interesting to note that as the services get more specialized for each provider, a 1:1 comparison won’t be possible. It would make better sense to compare the overall spend in a specific area such as Machine Learning – Training, whose service models could differ for each.
A good solution is to map each provider's set of services to a canonical, cloud-agnostic name. Each cloud provider delivers its usage report as files with different column names.
Using these two columns for each provider covers most of the cases described:
|Cloud Vendor||Column 1||Column 2|
|Cloud Vendor||Column 1||Column 2||Canonical|
|Azure||Microsoft.Cdn||Content Delivery Network||CDN|
More complex mappings are also possible, such as the Virtual Machine (VM) service, where savings instruments have to be taken into account.
|Cloud Vendor||Column 1||Column 2||Canonical|
|Azure||Microsoft.Compute||Virtual Machines Licenses||VM|
For fine-grained control, add a third column and implement additional filters (for instance, to identify when a Savings Plan covers a Lambda function).
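The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual Cloudwiry implementation: the two Azure rows come from the tables above, while the `Rule` type, the `canonical_name` function, the `UNMAPPED` fallback, and the AWS rows (including their column values) are hypothetical and only show how an optional third column could refine a match.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    vendor: str
    column_1: str
    column_2: str
    canonical: str
    column_3: Optional[str] = None  # optional extra filter; None matches anything


RULES = [
    # Rows taken from the tables in this article.
    Rule("Azure", "Microsoft.Cdn", "Content Delivery Network", "CDN"),
    Rule("Azure", "Microsoft.Compute", "Virtual Machines Licenses", "VM"),
    # Hypothetical rows: the column values are made up for illustration.
    Rule("AWS", "SavingsPlan", "Compute", "VM"),
    Rule("AWS", "SavingsPlan", "Compute", "Serverless", column_3="Lambda"),
]


def canonical_name(vendor, column_1, column_2, column_3=None):
    """Return the cloud-agnostic name for one usage-report line."""
    # Rules with a third-column filter are more specific, so try them first.
    for rule in sorted(RULES, key=lambda r: r.column_3 is None):
        if (rule.vendor, rule.column_1, rule.column_2) != (vendor, column_1, column_2):
            continue
        if rule.column_3 is None or rule.column_3 == column_3:
            return rule.canonical
    return "UNMAPPED"  # hypothetical fallback for lines without a rule


print(canonical_name("Azure", "Microsoft.Cdn", "Content Delivery Network"))
print(canonical_name("AWS", "SavingsPlan", "Compute", "Lambda"))
```

Keeping the rules as plain data makes it easy to review unmapped lines and add new rows without touching the lookup logic.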
With a mapping table like the one above, you can represent costs per cloud per canonical service and compare them across providers.
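As a rough sketch of that aggregation, the Python snippet below sums costs per canonical service and per cloud. It assumes each usage line has already been reduced to a `(vendor, canonical_service, cost)` tuple; the function name and the sample figures are invented for illustration.

```python
from collections import defaultdict


def cost_by_service_and_cloud(line_items):
    """Sum costs into {canonical_service: {vendor: total_cost}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for vendor, canonical, cost in line_items:
        totals[canonical][vendor] += cost
    # Convert to plain dicts so the result is easy to print or serialize.
    return {service: dict(by_vendor) for service, by_vendor in totals.items()}


# Made-up sample data, already mapped to canonical service names.
line_items = [
    ("Azure", "CDN", 120.0),
    ("AWS", "CDN", 95.5),
    ("Azure", "VM", 800.0),
    ("Azure", "VM", 200.0),
    ("GCP", "VM", 640.0),
]

report = cost_by_service_and_cloud(line_items)
for service, by_vendor in report.items():
    print(service, by_vendor)
```

The nested result reads directly as one row per canonical service with one value per cloud provider.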
Once the unit costs per cloud provider are identified, you can score the price of each one and shift the workload to the most cost-effective option with a multi-cloud-aware load balancing mechanism.
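The scoring step could look like the sketch below. It covers only the price comparison, not the load-balancing mechanism itself, and it assumes that usage for a canonical service can be expressed in a comparable unit across providers (here, hypothetically, vCPU-hours). The function names and figures are illustrative.

```python
def unit_costs(spend):
    """spend maps vendor -> (total_cost, usage_units); returns cost per unit."""
    return {
        vendor: cost / units
        for vendor, (cost, units) in spend.items()
        if units > 0  # skip providers with no usage to avoid dividing by zero
    }


def relative_scores(spend):
    """Score each provider from 0 to 1, where 1.0 is the cheapest unit cost."""
    costs = unit_costs(spend)
    best = min(costs.values())
    return {vendor: best / cost for vendor, cost in costs.items()}


def cheapest_provider(spend):
    """Return the vendor with the lowest unit cost."""
    costs = unit_costs(spend)
    if not costs:
        raise ValueError("no provider has usage to compare")
    return min(costs, key=costs.get)


# Made-up figures for the canonical "VM" service: (total cost, vCPU-hours).
vm_spend = {
    "AWS": (1000.0, 20000.0),
    "Azure": (900.0, 15000.0),
    "GCP": (640.0, 16000.0),
}

print(unit_costs(vm_spend))
print(relative_scores(vm_spend))
print(cheapest_provider(vm_spend))
```

In practice, price is rarely the only input; factors such as data transfer costs and committed-use discounts would need to be folded into the unit cost before it drives placement decisions.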
Finally, and especially for debugging purposes, it’s necessary to keep the original name and switch quickly from one naming convention to the other. At Cloudwiry, we have implemented a dashboard where it’s possible to have a consolidated stacked view based on the canonical services and the value per cloud provider.