Cloud Economics with Kubernetes and Containers
I was recently asked how to estimate the cost of running workloads in a Kubernetes environment. The economics of moving from on-premises to the cloud are well defined. However, estimating the cost of running orchestrated containers is not. In this article, I am going to explain how to estimate this cost and how to set up baselines that help analyze it. Let’s start with a baseline. The baseline can be built on a couple of factors: Availability and Resource Allocation.
Availability identifies how long an application or service needs to be available based on demand. The required availability is determined by the importance of the data.
Example Baseline for Availability: This is different from the availability of the Kubernetes master nodes, which is covered by the support the Service Provider offers. In the context of this article, the Service Provider acts as a Platform as a Service (PaaS). The application provided to the customer will need to be highly available. That includes the front end, the back-end API, and the database.
Resource allocation identifies the resources that are required by the application or service. In Kubernetes, you declare these resources in the workload specifications. However, what the application provider specifies and what the owner of the PaaS allows are two different things. A baseline MUST be created by the PaaS owner to accurately identify costs. For example, if a customer states that they need 4 CPU cores and 6000 GB of RAM for a single Angular application, this is very likely an overestimation. Instead, there should be a basic threshold in place for each kind of workload, such as Angular applications, Spring Boot applications, etc.
Example Baseline for Resource Allocation: For this example, we will determine that a Spring Boot application serving the Application Programming Interface (API) will require 2 CPU cores and 4 GB of memory. Of course, these estimates are based on the Limits, not the Requests. In Kubernetes, Limits are the upper thresholds a container may consume, and Requests are the amount reserved for the container when it is scheduled, which is where the application starts. Since we are estimating a three-tier stack, we will estimate the other two components as well. The Angular application will be the front end. This will need 1 CPU core with 1 GB of memory.
The PaaS baseline for Kubernetes instance types will be a t4g.xlarge running a Linux operating system with 100 GB of storage. This will run around $71.54 a month and $858.48 a year.
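To make the idea of a PaaS-owned threshold concrete, here is a minimal sketch in Python. The baseline numbers come from the example above; the type labels (`spring-boot-api`, `angular-frontend`) and the function name are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceBaseline:
    cpu_cores: float  # Kubernetes CPU limit, in cores
    memory_gb: float  # Kubernetes memory limit, in GB


# Baselines from the example above, keyed by hypothetical workload labels.
BASELINES = {
    "spring-boot-api": ResourceBaseline(cpu_cores=2, memory_gb=4),
    "angular-frontend": ResourceBaseline(cpu_cores=1, memory_gb=1),
}


def exceeds_baseline(app_type: str, cpu_cores: float, memory_gb: float) -> bool:
    """Return True when a customer's requested limits exceed the PaaS baseline."""
    baseline = BASELINES[app_type]
    return cpu_cores > baseline.cpu_cores or memory_gb > baseline.memory_gb


# The overestimated Angular request from the text is flagged.
print(exceeds_baseline("angular-frontend", cpu_cores=4, memory_gb=6000))  # True
# A request that matches the Spring Boot baseline is accepted.
print(exceeds_baseline("spring-boot-api", cpu_cores=2, memory_gb=4))      # False
```

A check like this could run during onboarding so that an oversized request starts a conversation before it becomes a cost.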
Resource Allocations — Database
Databases are usually where most of the costs come from. Most Cloud Service Providers (CSP) do not charge for data coming into the cloud, so we are good on that part. Since we are working with Kubernetes in the cloud, this usually leads to a more cloud-native approach.
For an RDS instance running Oracle with around 50 TB of data, the cost is around $33,777.84. As the size of the data decreases, so does the cost. For instance, removing 40 TB of data (leaving 10 TB) took the cost down to $14,936.24, and 1 TB of data took it down to $10,696.88. If we move to a MySQL database with 50 TB of data, the cost is $16,888.92. For a PostgreSQL RDS instance with 50 TB of data, the cost is $17,110.84. Of course, MSSQL databases are more expensive: this database came in at around $45,884.52. Note that all of the costs above are per month!
So let’s ask the first baseline questions for our customers.
1. What are the data requirements of the customer? The more data, the higher the costs.
2. What type of database is required?
These are two important questions. An average application, we can assume, may run on 2 TB of data. We also need to determine the core count, because all the estimates above were done on a 40-core system with 160 GB of memory. This was a db.m4.10xlarge. That leads to the next question.
3. What are the transaction latency requirements?
Most applications do not need that much power unless a large number of consumers are hitting the application and querying the database. This does not cover Machine Learning workloads, which need a different level of analysis. So let’s create our baseline as the owner of the PaaS.
1. Databases MUST be either an RDS instance or a containerized DB.
2. Set the price of the database against a standard instance type: db.m4.2xlarge, which has 8 cores and 32 GB of memory.
3. Standardize the database on only a few options to start with: PostgreSQL, MSSQL, and Oracle. The cost analysis will be based upon a 1 TB size limit. PostgreSQL will be $1,301.32 per month. MySQL will run $1,257.52. Oracle will run $1,257.52. Last but not least, the oddball: MSSQL runs at $7,056.64.
So we can make another decision as the PaaS owner. Let’s create two main categories: non-MSSQL and MSSQL database applications. Now let’s create our baseline cost to run these environments. We will take the largest of the non-MSSQL costs, which is 1,301.32 x 12 (months), or 15,615.84 a year. Great, so we will start with that.
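The two categories and their annual figures can be reproduced with a short Python sketch. The monthly prices are the 1 TB figures quoted above; the dictionary and function names are made up for illustration, and `Decimal` is used only to keep the cents exact.

```python
from decimal import Decimal

# Monthly RDS cost at 1 TB on the standard instance type, as quoted in the text.
MONTHLY_RDS_COST_1TB = {
    "PostgreSQL": Decimal("1301.32"),
    "MySQL": Decimal("1257.52"),
    "Oracle": Decimal("1257.52"),
    "MSSQL": Decimal("7056.64"),
}


def annual_baseline(engines):
    """Annualize the most expensive engine in a category (12 monthly bills)."""
    return max(MONTHLY_RDS_COST_1TB[engine] for engine in engines) * 12


non_mssql = [engine for engine in MONTHLY_RDS_COST_1TB if engine != "MSSQL"]

print(annual_baseline(non_mssql))  # 15615.84
print(annual_baseline(["MSSQL"]))  # 84679.68
```

Taking the largest price in each category keeps the baseline conservative: any engine in the category fits under it.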
A database that is suitable for containerization will follow a different process. These databases will be associated with the cost of a specific instance type, here the m4.2xlarge. The cost of this instance is $0.40 per hour, which comes down to $288 a month and $3,456 a year. Wow! RDS comes out of the box with multiple services available, so if you are willing to spend time doing that grunt work yourself, the costs can be reduced. However, we have another cost to remember: storage. Containers still need a StorageClass and a type of storage allocation. We will start a baseline metric at 1 TB of Elastic Block Store (EBS). This currently runs at $158.09 a month and $1,897.08 annually. For this to work with a database, you will need to remember snapshots, as these are another cost. One snapshot will change the cost of the EBS volume from $158.09 to $705.59. For a database, snapshots will be needed. However, they can be owned by a team so that EBS snapshots do not get out of control. Let’s baseline a recurring EBS snapshot at 1 per instance.
For this example, our onboarding application will utilize a PostgreSQL RDS instance with high availability. This means that two instances will need to be created. The annual estimated cost is 15,615.84 x 2 = 31,231.68. Of course, this does not include read replicas and backup methods.
So as you can see, the real cost lies in how the database is going to be handled. Databases must be controlled by the PaaS team to help monitor and establish baselines.
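As a rough sketch, the containerized option can be added up like this in Python. The hourly rate and EBS figures are the ones quoted above; the 30-day month (720 hours) is an assumption that matches the $288 figure, and the constant names are hypothetical.

```python
from decimal import Decimal

HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, which yields the $288 above

M4_2XLARGE_HOURLY = Decimal("0.40")              # instance cost per hour, from the text
EBS_1TB_MONTHLY = Decimal("158.09")              # 1 TB of EBS without snapshots
EBS_1TB_WITH_SNAPSHOT_MONTHLY = Decimal("705.59")  # same volume with one snapshot

instance_monthly = M4_2XLARGE_HOURLY * HOURS_PER_MONTH
containerized_db_monthly = instance_monthly + EBS_1TB_WITH_SNAPSHOT_MONTHLY
containerized_db_annual = containerized_db_monthly * 12

print(instance_monthly)          # 288.00
print(containerized_db_monthly)  # 993.59
print(containerized_db_annual)   # 11923.08
```

Under these assumptions the containerized database with one snapshot comes to 11,923.08 a year for a single instance, against 15,615.84 for the non-MSSQL RDS baseline. That gap is the price of the services RDS provides out of the box.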
Summary of Baselines
So we established some baselines for an onboarding application running a three-tier stack.
Application Instance Costs: $858.48 per instance per year, and the customer needs 3 * 2 for a total of 6 instances. Total estimated cost = $5,150.88 annually to run all the applications.
Database Costs: The customer has decided to go with an RDS package running PostgreSQL. The estimated annual cost of running two RDS instances will be $31,231.68.
This will give a total annual infrastructure cost of $36,382.56. For an enterprise-level application, this is pretty amazing!
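The summary arithmetic can be checked with a few lines of Python. The two annual prices are the baselines established earlier; the function name and its parameters are hypothetical.

```python
from decimal import Decimal

INSTANCE_ANNUAL = Decimal("858.48")        # one t4g.xlarge per year, from the text
RDS_POSTGRES_ANNUAL = Decimal("15615.84")  # one PostgreSQL RDS instance per year


def onboarding_annual_cost(app_instances: int, rds_instances: int) -> Decimal:
    """Annual infrastructure cost for one onboarded application stack."""
    return INSTANCE_ANNUAL * app_instances + RDS_POSTGRES_ANNUAL * rds_instances


# 6 application instances plus 2 RDS instances for high availability.
print(onboarding_annual_cost(app_instances=6, rds_instances=2))  # 36382.56
```

Keeping the instance counts as parameters makes it easy to re-run the estimate when the baseline changes.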
Cost of Owner of PaaS
Now, what is the cost to run Kubernetes? A High Availability cluster usually consists of 3 master nodes, although there can be more. We will start with 3 master nodes as our base. Also, remember that those master nodes are only for the Kubernetes engineers to monitor and make changes via the API. They can also be used by a GitOps approach with tools such as Flux and Argo CD. Either way, the owner needs to purchase these nodes. They do not have to be as robust as the nodes provided to the customer. The baseline instance type for the master nodes will be a t4g.xlarge with 100 GB of storage. A single instance will cost $71.54 per month, or $858.48 annually. The annual cost of three will be $2,575.44. Over time, the baseline for how many clusters and master nodes are needed will change.
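Because the number of clusters and master nodes is expected to change, it may help to keep the owner's cost as a small function. This is a sketch in Python; the monthly price is the t4g.xlarge figure from the text, and the function and parameter names are hypothetical.

```python
from decimal import Decimal

MASTER_NODE_MONTHLY = Decimal("71.54")  # t4g.xlarge with 100 GB, from the text


def control_plane_annual_cost(clusters: int = 1, masters_per_cluster: int = 3) -> Decimal:
    """Annual cost of the master nodes the PaaS owner has to purchase."""
    return MASTER_NODE_MONTHLY * 12 * masters_per_cluster * clusters


print(control_plane_annual_cost())            # 2575.44
print(control_plane_annual_cost(clusters=2))  # 5150.88
```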
Things to Consider:
- Clients being brought on may be charged an upfront cost even though the bill is monthly. This gives the provider the option to either establish a solid baseline by asking for the cost upfront or allocate the cost monthly.
- The baselines will continually evolve. They should be monitored and changed incrementally as needed.
What is the customer paying for:
- Allocated instance types where containers reside (including the database, if containerized).
- Allocated Databases (RDS) (if required)
- EBS storage.
- Possibly a small extra percentage for things like network, latency, transfers, etc. for running the Kubernetes environment.
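The items above could be combined into a customer bill along these lines. This is only a sketch: the article does not set the extra percentage, so the 5% rate used here is a placeholder assumption, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def customer_annual_bill(instances: Decimal, databases: Decimal, ebs: Decimal,
                         overhead_rate: Decimal = Decimal("0.05")) -> Decimal:
    """Sum the billed items and add a small percentage for network, transfers, etc.

    overhead_rate is a placeholder assumption; the PaaS owner sets the real value.
    """
    subtotal = instances + databases + ebs
    total = subtotal * (1 + overhead_rate)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Figures from the summary: 6 instances and 2 PostgreSQL RDS instances, no separate EBS.
bill = customer_annual_bill(Decimal("5150.88"), Decimal("31231.68"), Decimal("0"))
print(bill)  # 38201.69
```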
Agile Cloud Economic Process
- Assess your current baseline
- Identify Improvements
- Test Hypothesis with a Successful Outcome
- Re-baseline based on a new assessment
Assess your current baseline
If you notice above, the first thing that happened was identifying the PaaS thresholds and resource allocations. Estimating cost becomes a nightmare if the resources are continuously fluctuating. It’s hard to estimate the cost of a person shopping in a mall grabbing random items.
Identify Improvements
Metrics are a good indicator of whether or not the correct resources were allocated in the baseline. Improvements could identify a need to increase the cost, for example when services are 90% utilized. The same goes for decreasing the cost. By changing an instance type to a larger instance type, more containers will be able to share the resources, and the metrics may show that the cost decreases.
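A utilization check of this kind might look like the following Python sketch. The 90% upper threshold comes from the text; the 30% lower threshold, the function name, and the returned labels are assumptions made for the example.

```python
def assess_utilization(used: float, allocated: float,
                       high: float = 0.90, low: float = 0.30) -> str:
    """Suggest a baseline adjustment from observed usage.

    high reflects the 90% figure in the text; low is a hypothetical threshold.
    """
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    utilization = used / allocated
    if utilization >= high:
        return "increase"
    if utilization <= low:
        return "decrease"
    return "keep"


print(assess_utilization(used=3.7, allocated=4))  # increase
print(assess_utilization(used=1.0, allocated=4))  # decrease
print(assess_utilization(used=2.0, allocated=4))  # keep
```

The inputs would come from whatever metrics the team already collects, for example CPU cores used against the CPU limit in the baseline.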
Test Hypothesis with a Successful Outcome
Do NOT make changes without a projected success. For instance, if an instance type was increased in order to reduce cost, then reduced cost is our metric for success. However, if we see that larger instances are being spun up and our costs increase, then the test failed and we should go back to the original baseline for evaluation.
Re-baseline based on a new assessment
Hopefully, the team is Agile. That way, the cost baselines can change along with the improvements and provide a more accurate cost analysis.
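The success criterion can be written down before the change is made. Here is a minimal Python sketch of that decision. The function name and the observed costs in the usage lines are hypothetical; the baseline figure is six t4g.xlarge instances at 71.54 a month.

```python
def evaluate_change(baseline_monthly: float, observed_monthly: float) -> str:
    """Decide what to do after a cost experiment.

    The hypothesis is that the change reduces cost, so success means the
    observed cost is lower than the baseline.
    """
    if observed_monthly < baseline_monthly:
        return "re-baseline"
    return "revert"


baseline = 429.24  # six instances at 71.54 per month

print(evaluate_change(baseline, observed_monthly=400.00))  # re-baseline
print(evaluate_change(baseline, observed_monthly=500.00))  # revert
```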