AWS cost optimisation and Terraform IaC migration
2025 · SaaS / HR Tech · 5 weeks

A pre-Series-A SaaS startup was spending $14,000/month on AWS with no visibility into where the money was going: no Terraform, no tagging, no cost allocation. Everything had been provisioned manually over two years. I audited the entire estate, removed unused resources, right-sized everything that was over-provisioned, introduced Reserved Instances, a Compute Savings Plan, and Spot for the workloads that suited them, and migrated the infrastructure to Terraform. The monthly bill dropped to $8,100, a $5,900/month (42%) reduction.

$14,000 → $8,100

Monthly AWS bill

$5,900 (42%)

Monthly saving

~$70,800

Annual saving

0% → 85%

Terraform IaC coverage

The challenge

The startup had two engineers who had been provisioning AWS resources manually through the console since the company was founded. Two years later, nobody had a clear picture of what was running or why. The only signal was the monthly AWS bill, which had been climbing consistently as the product grew.

There was no Terraform, no CloudFormation, no IaC of any kind. Resources had no consistent tagging, making cost allocation impossible. When I ran an inventory, I found 23 EC2 instances across 4 regions, 8 RDS instances (including 3 in regions where no application was running), 12 load balancers, and hundreds of forgotten EBS snapshots from instances that no longer existed.

The founder's concern wasn't just the cost — it was that a senior engineer would leave and take institutional knowledge about the infrastructure with them. They had already had one close call when a developer left and it took three days to figure out what a mysteriously named EC2 instance was doing.

The approach

01

Full inventory and immediate quick wins

I spent day one building a complete inventory of all AWS resources across all regions using AWS Config and a set of CLI scripts. That surfaced $800/month of obviously unused resources: EC2 instances that had been stopped for 6+ months (and their attached EBS volumes), 3 load balancers with no targets, 2 RDS instances in eu-west-2 that the application hadn't used in 4 months, and 400 GB of orphaned EBS snapshots. Each was removed after a 24-hour confirmation window. First-week saving: $800/month.
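The filtering rules described above can be expressed in a few lines. This is an illustrative sketch, not the scripts used in the engagement: the record shapes (`id`, `state`, `stopped_at`, `target_count`, `volume_id`) and the function name are hypothetical, and a real inventory would be fed in from AWS Config or CLI output.

```python
from datetime import datetime, timedelta, timezone


def find_cleanup_candidates(instances, load_balancers, snapshots,
                            live_volume_ids, now, stopped_days=180):
    """Return resource IDs that match the 'obviously unused' rules.

    All input records are plain dicts with hypothetical keys; nothing here
    calls AWS, and nothing is deleted. The output is a review list only.
    """
    cutoff = now - timedelta(days=stopped_days)

    # EC2 instances stopped for roughly six months or longer.
    stopped = [i["id"] for i in instances
               if i["state"] == "stopped" and i["stopped_at"] <= cutoff]

    # Load balancers with no registered targets.
    idle_lbs = [lb["name"] for lb in load_balancers
                if lb["target_count"] == 0]

    # Snapshots whose source volume no longer exists.
    orphaned = [s["id"] for s in snapshots
                if s["volume_id"] not in live_volume_ids]

    return {
        "stopped_instances": stopped,
        "idle_load_balancers": idle_lbs,
        "orphaned_snapshots": orphaned,
    }
```

Keeping the output as a candidate list rather than deleting anything directly leaves room for a confirmation window before removal.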

02

Right-sizing analysis

Using CloudWatch metrics, I profiled CPU, memory, and I/O utilisation for every running EC2 and RDS instance over a 30-day window (EC2 does not publish memory metrics by default, so those require the CloudWatch agent). Production application servers were running at 15–25% CPU on m5.2xlarge instances, so I moved them to m5.xlarge for a 1-week trial, which showed no performance impact. Development and staging instances, running at about 5% CPU, moved from m5.xlarge to t3.medium. RDS production moved from db.r5.large to a smaller instance class, based on connection count and query patterns. Right-sizing saving: $2,100/month.
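The reasoning behind the m5.2xlarge to m5.xlarge move can be sketched as a simple rule: each step down the m5 ladder halves the vCPU count, so the same load should show roughly double the CPU utilisation. The sketch below assumes that linear scaling and a 60% post-resize CPU ceiling; both are illustrative assumptions, not figures from the engagement, and a trial period is still needed to confirm the result.

```python
import math

# Each step up this ladder doubles the vCPU count (2, 4, 8, 16).
M5_LADDER = ["m5.large", "m5.xlarge", "m5.2xlarge", "m5.4xlarge"]


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_size(current, cpu_samples, target_cpu=60.0):
    """Step down while the projected p95 CPU stays at or below target_cpu.

    Assumes CPU utilisation doubles with each halving of vCPUs, which is
    only an approximation; target_cpu is a hypothetical headroom threshold.
    Returns (recommended_type, projected_p95_cpu).
    """
    index = M5_LADDER.index(current)
    projected = percentile(cpu_samples, 95)
    while index > 0 and projected * 2 <= target_cpu:
        index -= 1
        projected *= 2
    return M5_LADDER[index], projected
```

With samples in the 15–25% range on an m5.2xlarge, this rule stops at m5.xlarge: one step down projects to about 50% CPU, while a second step would project to about 100%.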

03

Reserved Instances and Savings Plans

For the production database and the two production application servers that had been running continuously for over a year, I purchased 1-year Reserved Instances. For the remaining baseline EC2 compute, I set up a Compute Savings Plan covering 60% of the baseline spend. A Compute Savings Plan is more flexible than RIs for variable workloads: if the instance type needs to change, the discount still applies. Commitment saving: $1,400/month.
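Savings Plans are purchased as an hourly dollar commitment measured at the discounted rate, so turning "cover 60% of baseline spend" into a purchase takes a small calculation. The sketch below shows that arithmetic; the $3,000/month baseline and 25% discount in the usage note are hypothetical placeholders, not figures from this engagement, and the real discount depends on term, payment option, and instance mix.

```python
HOURS_PER_MONTH = 730  # average hours in a month, used for rough estimates


def hourly_commitment(baseline_on_demand_monthly, coverage, discount):
    """Hourly commitment ($/hour) needed to cover a share of baseline spend.

    baseline_on_demand_monthly: steady on-demand compute spend in $/month
    coverage: fraction of that baseline to cover (0.6 for 60%)
    discount: assumed Savings Plan discount versus on-demand (hypothetical)
    """
    covered_on_demand = baseline_on_demand_monthly * coverage
    return covered_on_demand * (1 - discount) / HOURS_PER_MONTH


def monthly_saving(baseline_on_demand_monthly, coverage, discount):
    """Estimated monthly saving on the covered share of the baseline."""
    return baseline_on_demand_monthly * coverage * discount
```

For example, with a hypothetical $3,000/month baseline, 60% coverage, and a 25% discount, `hourly_commitment(3000, 0.6, 0.25)` comes to roughly $1.85/hour and `monthly_saving(3000, 0.6, 0.25)` to about $450/month. Covering only part of the baseline leaves headroom in case usage shrinks during the term.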

04

Spot for batch and CI workloads

The company ran nightly ETL jobs and its self-hosted GitHub Actions CI runners on On-Demand EC2. I moved these to Spot, using a Spot Fleet configuration with multiple instance types to reduce interruption risk. CI runners on Spot dropped from $600/month to $180/month. ETL jobs moved to a mixed Spot/On-Demand fleet (70/30) with checkpointing, so a Spot interruption restarts the job from the last checkpoint rather than from scratch.
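The checkpointing idea can be sketched as follows: record progress after each completed batch, and on startup resume from the recorded position. This is a minimal illustration rather than the engagement's ETL code; the function names, the JSON layout, and the `interrupt_after` parameter (which only simulates an interruption) are all hypothetical.

```python
import json
import os


def load_checkpoint(path):
    """Return the index of the next batch to process (0 if no checkpoint)."""
    try:
        with open(path) as handle:
            return json.load(handle)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint via a temp file and an atomic rename."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump({"next_index": next_index}, handle)
    os.replace(tmp_path, path)


def run_job(batches, process, checkpoint_path, interrupt_after=None):
    """Process batches in order, resuming from the last checkpoint.

    interrupt_after simulates a Spot interruption after that many batches
    in this run. Returns True if the job finished, False if interrupted.
    """
    start = load_checkpoint(checkpoint_path)
    done_this_run = 0
    for index in range(start, len(batches)):
        if interrupt_after is not None and done_this_run == interrupt_after:
            return False
        process(batches[index])
        save_checkpoint(checkpoint_path, index + 1)
        done_this_run += 1
    return True
```

On a real Spot fleet the checkpoint has to live on storage that survives the instance, such as S3 or a shared file system, and each batch should be safe to re-run, since an interruption can land between processing a batch and saving its checkpoint.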

05

Terraform migration

With the cost reductions validated, I began codifying the remaining infrastructure in Terraform. I started with the lowest-risk resources (security groups, IAM roles, S3 buckets) and used terraform import to bring existing resources under management. After each import, I confirmed that terraform plan showed no pending changes before moving on to the next resource. For new resources (load balancers, target groups, RDS parameter groups), I wrote Terraform from scratch. After 3 weeks, 85% of the estate was under Terraform control, with remote state in S3 and DynamoDB locking.
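The import-then-verify loop can be scripted around two Terraform CLI commands: `terraform import ADDRESS ID`, and `terraform plan -detailed-exitcode`, which exits with 0 when there are no changes, 1 on error, and 2 when changes are pending. The wrapper below is an illustrative sketch, not the tooling used in the engagement; the function names and the resource address in the usage note are hypothetical, and it assumes the matching resource block is already written in the configuration.

```python
import subprocess


def import_command(address, resource_id):
    """Build the CLI call that attaches an existing resource to state."""
    return ["terraform", "import", address, resource_id]


def plan_command():
    """Build a non-interactive plan call that reports drift via exit code."""
    return ["terraform", "plan", "-detailed-exitcode", "-input=false"]


def interpret_plan_exit_code(code):
    """Map the -detailed-exitcode result to a status string."""
    if code == 0:
        return "clean"      # configuration matches real infrastructure
    if code == 2:
        return "changes"    # plan would modify something: fix the HCL first
    raise RuntimeError(f"terraform plan failed with exit code {code}")


def import_and_verify(address, resource_id, run=subprocess.run):
    """Import one resource, then check that the plan shows no changes.

    `run` is injectable so the flow can be exercised without Terraform.
    """
    run(import_command(address, resource_id), check=True)
    result = run(plan_command(), check=False)
    return interpret_plan_exit_code(result.returncode)
```

A call such as `import_and_verify("aws_security_group.web", "sg-0123456789abcdef0")` would return "clean" only when the written configuration matches the live resource. A "changes" result means the HCL should be adjusted before importing the next resource.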

Results

$5,900

Monthly savings (42% reduction)

$70,800

Projected annual saving

85%

Infrastructure under Terraform (was 0%)

3 weeks

Payback period on engagement cost

$800

Monthly saving from unused resources (identified on day 1)
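The headline figures all follow from the two bill amounts. A short sketch of the arithmetic, with a hypothetical function name:

```python
def summarise_saving(monthly_before, monthly_after):
    """Derive the headline saving figures from before/after monthly bills."""
    monthly_saving = monthly_before - monthly_after
    return {
        "monthly_saving": monthly_saving,
        "reduction_pct": round(100 * monthly_saving / monthly_before),
        "annual_saving": monthly_saving * 12,
    }


# Bills of $14,000 and $8,100 give a $5,900/month saving,
# a 42% reduction, and $70,800 over twelve months.
summary = summarise_saving(14_000, 8_100)
```

The annual figure is a projection that assumes the monthly bill stays at the new level for a full year.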
