Cloud cost optimisation is not limited to buying savings plans or fishing for extra discounts; it is also closely linked to your code quality and overall architecture. It reflects how disciplined and frugal your team really is.
There are many ways to lower your AWS bills, and all of them should be part of your overall AWS cost optimisation strategy. I like to group these measures into two major categories – organic and inorganic.
Organic Measures: Techniques that reduce your AWS bills by reducing your real usage of AWS services and resources. These measures usually take more time to implement, but they lower your bills by improving the overall health of your infrastructure and application code.
Inorganic Measures: Techniques centred solely on pricing and discounts. These include measures like purchasing savings plans and reservations, negotiating pricing with AWS, and so on.
Should you go all organic?
No. In fact, it is often better to start with inorganic measures, like buying a savings plan. Organic measures usually require more planning and effort, while inorganic measures can give you some savings almost immediately. In this Notion page, I discuss how to draft a holistic strategy to lower your AWS bills.
In this article, I discuss various ways to organically reduce your AWS bills. These are general recommendations that can be applied across services. Let's dive right in!
Five ways to bring down your AWS bills
🗑️ Delete unused resources
Unused VMs, Elastic IP addresses, disks, old snapshots, and databases can add up to a decent chunk of your AWS bill. Even a stopped instance costs you money for its attached EBS volumes. I have seen accounts spending over $500 a month on unused Elastic IPs alone. It is pretty normal for unused resources to make up 10%–15% of your AWS bill.
This should be the first step in your cost optimisation journey, because every deleted resource is one less thing to worry about while planning other cost-reduction activities.
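As a starting point, a short script can list candidates for deletion. Here is a minimal sketch using boto3 that surfaces unattached Elastic IPs and unattached EBS volumes; the region is an assumption, and you would still review each resource before deleting anything.

```python
# Sketch: surface common "forgotten" resources in one region.
# Assumes credentials are configured; region is an example value.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Elastic IPs that are not associated with any instance or network interface
for address in ec2.describe_addresses()["Addresses"]:
    if "AssociationId" not in address:
        print(f"Unattached Elastic IP: {address['PublicIp']}")

# EBS volumes in the 'available' state, i.e. not attached to any instance
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}]):
    for volume in page["Volumes"]:
        print(f"Unattached EBS volume: {volume['VolumeId']} ({volume['Size']} GiB)")
```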
↔️ Right Sizing
Right-sizing simply means selecting, or migrating to, the cheapest instance type that can run your workload efficiently, wherever possible. For instance, if an application is running on an m6g.xlarge (16 GB RAM, 4 vCPU) but can also run on a c6g.xlarge (8 GB RAM, 4 vCPU), it is better to switch to the smaller instance type and save costs.
Right-sizing is a bit tricky for stateful services (like databases) but very easy for stateless ones. It requires no changes to your codebase and can therefore usually be done pretty quickly by your DevOps team.
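If you have Cost Explorer enabled, AWS will even generate right-sizing recommendations for you. The sketch below pulls them via the Cost Explorer API with boto3; treat the printed fields as a starting point and inspect the full response for your account.

```python
# Sketch: fetch AWS's own right-sizing recommendations for EC2.
# Assumes Cost Explorer is enabled on the account.
import boto3

ce = boto3.client("ce", region_name="us-east-1")  # Cost Explorer endpoint lives in us-east-1

response = ce.get_rightsizing_recommendation(Service="AmazonEC2")
for rec in response.get("RightsizingRecommendations", []):
    instance = rec.get("CurrentInstance", {})
    # RightsizingType is e.g. MODIFY or TERMINATE
    print(rec.get("RightsizingType"), instance.get("ResourceId"))
```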
❄️ Spot Instances
When you spin up an EC2 instance, you can choose to either get an on-demand instance or a spot instance.
EC2 has a lot of unused capacity at any given time – instances that AWS owns but has not allocated to any customer. Availability is determined by a combination of region, instance type, current demand, and several other factors, so at any given moment an m6g.large might be available while a c6g.large is not.
Spot instances are the cheapest (up to 90% off on-demand pricing) and most flexible, but slightly inconvenient, way to optimise your AWS bills. There is no commitment required, but AWS can reclaim your instances when it needs the capacity back for on-demand customers. AWS provides a two-minute notification before reclaiming a spot instance, allowing a graceful shutdown of workloads.
If your applications are stateless and fault tolerant, you can run your workloads extremely cheaply on spot instances. They are, however, not a good fit for databases, caches, and other inherently stateful services.
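The practical part is reacting to the two-minute notice. Below is a minimal sketch that polls the instance metadata service (IMDSv2) for a spot interruption notice; `drain_and_shutdown()` is a hypothetical hook for your own graceful-shutdown logic.

```python
# Sketch: poll IMDSv2 for the spot interruption notice and shut down gracefully.
import time
import urllib.error
import urllib.request

METADATA = "http://169.254.169.254/latest"

def imds_token() -> str:
    # IMDSv2 requires a session token before reading metadata
    req = urllib.request.Request(
        f"{METADATA}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )
    return urllib.request.urlopen(req).read().decode()

def interruption_pending() -> bool:
    req = urllib.request.Request(
        f"{METADATA}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": imds_token()},
    )
    try:
        urllib.request.urlopen(req)
        return True   # 200 response: a reclaim is scheduled
    except urllib.error.HTTPError:
        return False  # 404: no interruption notice yet

while not interruption_pending():
    time.sleep(5)
drain_and_shutdown()  # hypothetical: stop accepting work, flush state, exit
```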
✤ Auto Scaling
Traditionally, infrastructure has been thought of as something that is always running. One important advantage of being in the cloud is the ability to turn VMs off and on dynamically as your needs change.
Implementing a sound autoscaling strategy for your workloads is a very effective way to reduce costs quickly, and you can couple autoscaling with spot instances to slash your bills even further.
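As an illustration, here is a minimal sketch that attaches a target-tracking scaling policy to an existing Auto Scaling group using boto3. The group name `web-asg` and the 50% CPU target are assumptions; pick values that match your workload.

```python
# Sketch: keep average CPU of an Auto Scaling group around 50%,
# letting AWS add and remove instances with traffic.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",          # assumed existing group
    PolicyName="cpu-target-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```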
🐰 Application Performance
Improving application performance means optimising your code or architecture to reduce the time taken to process a request. It can involve things like optimising code, caching, adding DB indices, rate limiting, optimising UX to avoid unnecessary calls, and so on.
This needs much more involvement from your application team and a full SDLC cycle, but it should always be on your long-term roadmap.
Improving your application's performance directly affects the amount of resources it consumes: the more inefficient your code is, the more resources it demands and the higher your cloud costs become.
An example: let's assume a request takes 1 second to complete and is called at an average throughput of 300 requests/min. That request therefore needs 300 seconds of CPU time every 60 seconds, which means you need at least 5 cores running continuously just to serve this one endpoint. Optimising the request to take 500 ms instead of 1 second theoretically halves the number of cores required and lets you run on a smaller machine.
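If you want to play with the numbers, here is a tiny helper that captures the arithmetic above: CPU-seconds demanded per second of wall-clock time is the minimum number of busy cores.

```python
# Back-of-the-envelope: minimum cores needed for a single request type.
def cores_needed(latency_seconds: float, requests_per_minute: float) -> float:
    cpu_seconds_per_minute = latency_seconds * requests_per_minute
    return cpu_seconds_per_minute / 60

print(cores_needed(1.0, 300))  # 5.0 cores
print(cores_needed(0.5, 300))  # 2.5 cores -> roughly half the compute
```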
There can be similar examples focused on memory consumption, but the calculation stays the same. In many large applications, I have seen the top 10–20 requests or queries (in Datadog, New Relic, or any other APM) consume more than 90% of the resources. Focus on these requests first and optimise them for memory, CPU, or throughput.
In a nutshell, it is important to have both a long-term and a short-term strategy for optimising your AWS costs. Short-term measures give you instant savings but sweep many real issues under the rug. The organic measures discussed here take more effort to implement, but they put your AWS account on a path to better health.