By Inioluwa Shittu
Have you ever managed a company’s cloud budget for a month or so? Perhaps, while keeping a successful operation running, the money ran out before the end of the month, and the overrun rebounded on the company because it led to further expense. If so, you’re not alone.
You aren’t the only one who has felt that cold sweat trickle down your back on realising the monthly cloud bill has exceeded the budget. Yes, the company is scaling, the user base is growing, and the services are expanding, but it all feels like a runaway train, because every new feature and every successful deployment seems to add another layer of cost.
The cost often feels confusing because the cause isn’t obvious, but closer scrutiny reveals that managing cloud infrastructure isn’t just about uptime, performance, and feature delivery; it’s also deeply about fiscal responsibility.
This fiscal responsibility helps tackle ‘cloud sprawl’. The adoption of cloud computing has transformed how we build and deploy applications, enabling scalability and innovation. Resources can be provisioned with a few clicks or a single line of code. This accelerates development; however, it also creates a hidden, often overlooked challenge – runaway costs – which is what we mean by ‘cloud sprawl’.
Cloud sprawl is the uncontrolled, unmanaged growth of cloud resources and services within an organisation. It happens when departments provision cloud services without proper coordination, and the result is increased cost.
This article isn’t a theoretical tour of cloud economics; it takes you pragmatically through strategies, personal experiences, and lessons learned from years of building and managing cloud-native infrastructure at scale, to equip you with knowledge and insights that free up your budget for innovation and growth.
Anatomy of Cloud Sprawl
A significant portion of cloud spending is wasted due to cloud sprawl. For instance, a 2023 Flexera survey of global cloud decision-makers reported an estimated 28% of public cloud spending as waste. According to CloudClevr, 30% or more of cloud budgets are wasted, and these cost challenges stem not from the complexity of the system but from simple human error.
Take, for example, a Cloud Development Environment (CDE) designed for short-term testing and set to shut down automatically after a few hours to conserve resources. A seemingly harmless misconfiguration – a forgotten script exiting with status zero instead of actually performing the shutdown – results in a substantial bill for idle resources that had been running unnecessarily for weeks. This is the ‘zombie resource’ phenomenon, one of the drivers of cloud sprawl. Even a small oversight can cause cloud costs to spiral into a major financial drain.
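To make that failure mode concrete, here is a minimal sketch of the kind of auto-shutdown job such a CDE might rely on, written with boto3; the tag name, lifetime, and filters are purely illustrative assumptions. The lesson from the incident above is that if a job like this swallows failures and still exits with status zero, the scheduler keeps reporting success while the zombie instances keep billing.

```python
# Minimal sketch of a CDE auto-shutdown job (hypothetical tag name and lifetime).
from datetime import datetime, timedelta, timezone

import boto3

MAX_AGE = timedelta(hours=4)  # assumed lifetime for short-term test environments

ec2 = boto3.client("ec2")
cutoff = datetime.now(timezone.utc) - MAX_AGE

resp = ec2.describe_instances(
    Filters=[
        {"Name": "tag:environment", "Values": ["cde"]},      # hypothetical tag
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)

# Collect running CDE instances older than the allowed lifetime.
stale = [
    i["InstanceId"]
    for r in resp["Reservations"]
    for i in r["Instances"]
    if i["LaunchTime"] < cutoff
]

if stale:
    ec2.stop_instances(InstanceIds=stale)
    print(f"Stopped {len(stale)} stale CDE instance(s): {stale}")

# Let any exception from the calls above propagate: an unhandled error exits
# non-zero, so the scheduler surfaces the failure instead of reporting success.
```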
Some of the common patterns behind this financial drain are:
Zombie Resources: As stated earlier, these are forgotten resources – unattached storage volumes, load balancers left running after a project is decommissioned, and the like. They consume compute, storage, and network capacity without providing any business value, quietly accruing charges.
Over-provisioning: Sometimes engineers, in a bid to play it safe, select larger virtual machine (VM) sizes, a more powerful database, or higher-capacity storage than the workload actually demands. The extra headroom is a safety net, but it comes at a premium whether it is used or not.
Unoptimised Data Transfer: This is often the silent killer on cloud bills. Data moving out of a cloud region or to the internet – egress – can escalate quickly. Unoptimised data access patterns, cross-region replication, or inefficient content delivery can lead to surprisingly high charges.
Developer Autonomy Without Guardrails: Allowing development teams to provision their own infrastructure is the basis of DevOps speed; however, without proper guardrails, cost awareness, and automated policies, this freedom can inadvertently result in unchecked spending and resource drain.
Together, these patterns show the need for a disciplined approach to cloud financial management. It is a shared engineering responsibility, not just a finance department’s problem, and a continuous engineering challenge that requires architectural efficiency from the design and planning phases onwards.
Gain Deep Insight into Your Cloud Spend
The first step in establishing a disciplined approach to your cloud finances and controlling your costs is achieving complete transparency into every dollar spent. Without clear insight, any improvement effort will be ineffective.
You can start by using the native cloud provider tools, such as AWS Cost Explorer, Azure Cost Management, and Google Cloud’s Billing reports. These platforms provide elaborate dashboards and reports that enable you to understand the trends in your costs, identify anomalies, and pinpoint the root causes of your expenditures.
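The same data is also available programmatically. The snippet below is a minimal sketch using the AWS Cost Explorer API via boto3 to break a month’s spend down by service; the date range is illustrative, and the call assumes Cost Explorer is enabled on the account.

```python
# Minimal sketch: pull one month's cost per AWS service with the Cost Explorer API.
import boto3

ce = boto3.client("ce")

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-05-01", "End": "2024-06-01"},  # illustrative dates
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

# Print each service and its cost for the period.
for group in resp["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{service}: ${amount:,.2f}")
```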
Native cloud provider tools are resourceful; however, beyond these tools, it’s also important to tag your resources. This involves applying metadata such as ‘department’, ‘project’, ‘environment’, or ‘application-owner’ to every cloud resource – virtual machines, databases, storage buckets, and network interfaces alike.
These tags enable accurate cost allocation, allowing you to determine correctly which teams or projects are responsible for specific expenditure.
This level of insight is crucial for facilitating accountability and identifying the specific areas that require improvement.
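As a simple illustration, here is a hedged sketch of applying those tags to an EC2 instance with boto3; the instance ID and tag values are placeholders. Note that on AWS the tag keys must also be activated as cost allocation tags in the Billing console before they appear in cost reports.

```python
# Minimal sketch: apply cost-allocation tags to an EC2 instance.
import boto3

ec2 = boto3.client("ec2")

ec2.create_tags(
    Resources=["i-0123456789abcdef0"],  # placeholder instance ID
    Tags=[
        {"Key": "department", "Value": "platform"},          # placeholder values
        {"Key": "project", "Value": "checkout-api"},
        {"Key": "environment", "Value": "production"},
        {"Key": "application-owner", "Value": "jane.doe@example.com"},
    ],
)
```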
Additionally, if your organisation uses multiple cloud providers, compiling your billing data into a single view provides a comprehensive perspective, vital for identifying all trends and possible cross-cloud efficiencies that might otherwise be missed.
For more advanced analytics, automation, and cross-cloud visibility, consider investing in third-party cost management tools such as Kubecost, which specialises in Kubernetes environments, along with DoiT International, Flexera, and CloudBolt. These platforms often provide deeper insights and automated recommendations beyond what native cloud tools offer.
Optimise Your Resources
After gaining a clear view of your cloud spend, the next step is to eliminate waste and ensure that your resources are aligned with your actual operational needs. This is the fastest, most direct path to immediate savings.
You can start by identifying and removing idle resources. Look for virtual machines that are powered on but show minimal CPU, memory, or network activity. Delete unattached storage volumes, such as AWS EBS volumes or Azure managed disks, that are no longer connected to any active instance. Additionally, remove unnecessary snapshots and backup copies that consume valuable storage space.
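A small script can surface the most common culprits. The sketch below, assuming boto3 and AWS, lists EBS volumes in the ‘available’ (unattached) state; the delete call is left commented out so nothing is removed before it has been reviewed.

```python
# Minimal sketch: list unattached ("available") EBS volumes quietly accruing charges.
import boto3

ec2 = boto3.client("ec2")

paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}]):
    for vol in page["Volumes"]:
        print(f'{vol["VolumeId"]}: {vol["Size"]} GiB, created {vol["CreateTime"]:%Y-%m-%d}')
        # After review (and snapshotting anything you might need again):
        # ec2.delete_volume(VolumeId=vol["VolumeId"])
```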
Eliminating waste is vital; however, right-sizing your resources is equally important. This means ensuring that instances such as VMs and databases are scaled to their actual workloads. Continually monitor CPU, memory, and disk I/O usage to identify resources that are either over-provisioned – you’re paying for capacity you don’t use – or under-provisioned, which can cause performance bottlenecks and user dissatisfaction.
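Utilisation data for right-sizing can be pulled from the provider’s monitoring service. The following sketch queries CloudWatch for an instance’s average CPU utilisation over the last 14 days; the instance ID is a placeholder, and memory or disk metrics would need the CloudWatch agent or an equivalent.

```python
# Minimal sketch: 14-day average CPU utilisation for one instance, as a
# starting signal for right-sizing decisions.
from datetime import datetime, timedelta, timezone

import boto3

cw = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

resp = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    StartTime=now - timedelta(days=14),
    EndTime=now,
    Period=86400,            # one datapoint per day
    Statistics=["Average"],
)

datapoints = resp["Datapoints"]
if datapoints:
    avg = sum(dp["Average"] for dp in datapoints) / len(datapoints)
    print(f"14-day average CPU: {avg:.1f}%")
```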
As mentioned earlier, an often-overlooked source of meaningful savings is reducing data transfer costs. Egress charges arise from data flowing out of your provider’s network and can be surprisingly expensive. To address this, design your architectures so that resources that communicate frequently are co-located in the same region or availability zone, minimising cross-region and cross-zone data transfer charges.
Architecture for Cost Efficiency
Here, prevention is cheaper than cure. You don’t achieve the biggest savings in cloud computing by reacting to monthly bills; you achieve them by incorporating efficiency into your architecture from the beginning, during the design and planning phase.
This is about making intentional decisions that set a cost management foundation for your infrastructure.
DevOps architecture combines various functional principles and tools to design, build, and manage infrastructure and applications throughout the software delivery process. It creates an environment that fosters collaboration, automation, and continuous improvement.
DevOps architecture helps organisations speed up their time to market, improve the standard of their products, and, importantly, enhance their customer satisfaction.
It’s important to embrace serverless computing where applicable. Services like AWS Lambda, Azure Functions, or Google Cloud Functions allow you to run code without provisioning or managing servers: you pay only for the compute time consumed when your function is invoked, completely removing the cost of idle servers. This model is especially cost-effective for intermittent workloads.
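For illustration, a minimal AWS Lambda handler in Python looks like the sketch below; the event fields are hypothetical, but the handler signature is the standard one for the Python runtime.

```python
# Minimal sketch of an AWS Lambda handler (Python runtime). You are billed
# only for the time this function runs per invocation; there is no idle
# server to pay for between invocations.
import json


def lambda_handler(event, context):
    # 'event' carries the trigger payload (API Gateway request, S3 event, etc.)
    name = event.get("name", "world")  # hypothetical field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```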
Another architectural approach is containerisation, typically managed by Kubernetes platforms such as Amazon EKS, Azure AKS, and Google GKE. Containers improve resource utilisation by allowing multiple applications to share fewer underlying virtual machines more efficiently, which translates directly into lower compute costs.
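Those utilisation gains depend on workloads declaring sensible CPU and memory requests so the scheduler can pack them tightly. As a hedged sketch, assuming the official kubernetes Python client and access to a cluster, the script below flags containers that declare no requests at all:

```python
# Minimal sketch: flag containers with no CPU/memory requests. Without
# requests, the scheduler cannot bin-pack workloads efficiently, which
# undermines the utilisation gains described above.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces().items:
    for c in pod.spec.containers:
        requests = (c.resources.requests or {}) if c.resources else {}
        if "cpu" not in requests or "memory" not in requests:
            print(f"{pod.metadata.namespace}/{pod.metadata.name} "
                  f"container '{c.name}' is missing resource requests")
```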
While the principles of cloud cost optimisation are universally applicable, businesses operating in Africa may find that certain aspects require particular emphasis. Firstly, bandwidth costs can be a significant factor: in many parts of Africa, internet bandwidth is still relatively expensive compared to other regions, which makes optimising data transfer – particularly egress – all the more important.
Secondly, beyond AWS, Azure, and GCP – the global hyperscalers that are continually expanding their presence – it’s worth exploring local or regional cloud providers. These providers might offer competitive pricing for specific services, deliver tangible benefits for local users, or address particular data residency and compliance concerns more directly within the continent.
In conclusion, by employing these strategies and fostering an active, cost-conscious culture, companies both within Africa and globally can unlock substantial savings in their cloud computing expenses. These savings are not just reduced costs, but valuable capital that can be reinvested in innovation, growth initiatives, and the sustained success of the business.
About Inioluwa Shittu
Inioluwa Shittu is a certified AWS DevOps engineer and solutions architect. His experience spans years of in-depth work with cloud computing strategies (IaaS) and with building, deploying, and maintaining cloud environments. He has experience in both Bash and Python scripting, with a focus on DevOps tools, CI/CD, and AWS cloud architecture.
He has been involved in the administration of environments using Windows, Ubuntu, Red Hat Linux, and CentOS, employing configuration management tools such as Chef and Ansible.
He’s certified by AWS as a Solutions Architect (Associate), which confirms his understanding of best practices for architecting enterprise solutions that are cost-efficient, highly available, and scalable.