Why Cloud Budgets Don’t Stay in Check — And How to Make Sure Yours Do

January 28, 2022

Contributor: Lydia Leong

Each cause requires a unique course of action.

Cloud budget overruns don’t have a single cause. Instead, they come in a bright rainbow of jelly bean flavors, and each requires a different kind of response.

Ungoverned costs

The organization has no idea what it’s spending, really, much less where the money is going, other than the big bills (or often, many little credit card bills) that it pays each month. Reining in these expenses requires basic cost hygiene: Analyze your cloud bills, implement a cost management tool and ensure it’s useful by implementing tagging or partitioning discipline.
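
As a rough illustration of why tagging discipline matters, the sketch below groups bill line items by an owner tag and reports how much spend cannot be attributed to anyone. The row layout, the `team` tag key and the dollar figures are all assumptions made up for this example; real billing exports differ by provider.

```python
from collections import defaultdict

def spend_by_tag(line_items, tag_key="team"):
    """Sum cost per value of tag_key; items without the tag land in 'UNTAGGED'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "UNTAGGED")
        totals[owner] += item["cost"]
    return dict(totals)

def untagged_share(line_items, tag_key="team"):
    """Fraction of total spend that has no owner tag."""
    totals = spend_by_tag(line_items, tag_key)
    total = sum(totals.values())
    return totals.get("UNTAGGED", 0.0) / total if total else 0.0

# Hypothetical bill rows, invented for illustration only.
bill = [
    {"service": "compute", "cost": 1200.0, "tags": {"team": "payments"}},
    {"service": "storage", "cost": 300.0, "tags": {"team": "payments"}},
    {"service": "compute", "cost": 500.0, "tags": {}},
    {"service": "database", "cost": 1000.0, "tags": {"team": "search"}},
]

if __name__ == "__main__":
    print(spend_by_tag(bill))
    print(f"Untagged share of spend: {untagged_share(bill):.0%}")
```

If the untagged share is large, the cost management tool has little to work with, which is the point of the hygiene step above.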

Unanticipated usage

In this situation, the organization is the victim of its own cloud success — more and more unanticipated cloud projects start showing up, blowing out the original budget estimates for resources. Those cloud projects are delivering business value, and it doesn’t make sense to say “no” to them (and even if central IT says “no,” the costs can usually be allocated to a line-of-business budget). Nevertheless, this can cause a lot of organizational angst because central IT or the sourcing team didn’t anticipate the spending. Organizations should take this as an opportunity to shift budgeting processes for the digital future. Cloud chargeback will help support future decision making.

No commitments

The organization could secure discounts by using public discounting mechanisms, such as AWS Savings Plans and Azure Reserved Instances, as well as by making a contractual commitment in exchange for a negotiated discount. But because the organization feels it can’t perfectly predict its use and isn’t sure it will keep using everything it uses today, it commits to nothing, thereby ensuring that it spends grotesquely more than necessary. This is universally a terrible idea. Organizations that aren’t in the early pilot stage have long-term production applications and some predictability of usage; commit to the stuff you know you’re not killing off.
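
The arithmetic behind "commit to the stuff you know you’re not killing off" can be sketched as follows. This is a deliberately simplified model, not any provider's actual pricing: it assumes a single hourly rate, a flat discount on committed hours, and that committed hours are paid for whether or not they are used. The rate, discount and usage numbers are hypothetical.

```python
def monthly_cost(usage_hours, committed_hours, on_demand_rate, discount):
    """Cost when committed hours are billed at a discount (used or not)
    and any usage above the commitment is billed at the on-demand rate."""
    committed_cost = committed_hours * on_demand_rate * (1 - discount)
    overflow_hours = max(0.0, usage_hours - committed_hours)
    return committed_cost + overflow_hours * on_demand_rate

if __name__ == "__main__":
    rate, discount = 0.10, 0.30   # hypothetical: $0.10/hour, 30% discount
    usage = 10_000                # hypothetical hours actually consumed
    baseline = 7_000              # hours you are confident will persist

    no_commit = monthly_cost(usage, 0, rate, discount)
    with_commit = monthly_cost(usage, baseline, rate, discount)
    print(f"No commitment:       ${no_commit:,.2f}")
    print(f"Commit to baseline:  ${with_commit:,.2f}")

    # Under this model the commitment still breaks even if usage falls
    # to committed_hours * (1 - discount).
    print(f"Break-even usage:    {baseline * (1 - discount):,.0f} hours")
```

Under these assumptions, committing only to the confident baseline lowers the bill even though the remaining usage stays on demand, and usage would have to fall well below the baseline before the commitment cost more than paying on demand.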

Dev/Test waste

Developers are provisioning the biggest things they can get away with (or at least being overaggressive in their estimates of what they need), lots of abandoned resources are idling away, and Dev/Test infrastructure that sits unused outside of business hours isn’t being suspended. This is what cloud cost management tools are great for: identifying obvious waste so you can eliminate it, largely by shutting it down or suspending it, preferably through automation.
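
To make the business-hours point concrete, here is a minimal sketch of the decision an automated suspend/resume job might make, plus the hours it would save each week. The 08:00 to 18:00, Monday-to-Friday window is an assumption for illustration, and the code only computes the schedule; actually stopping and starting resources would go through your provider's own tooling.

```python
from datetime import datetime

WORKDAYS = (0, 1, 2, 3, 4)  # Monday=0 .. Friday=4, as in datetime.weekday()

def should_be_running(now, start_hour=8, end_hour=18, workdays=WORKDAYS):
    """True if a Dev/Test resource should be up at local time `now`."""
    return now.weekday() in workdays and start_hour <= now.hour < end_hour

def weekly_hours_saved(start_hour=8, end_hour=18, workdays=WORKDAYS):
    """Hours per week the resource is suspended instead of running 24x7."""
    running_hours = (end_hour - start_hour) * len(workdays)
    return 7 * 24 - running_hours

if __name__ == "__main__":
    saved = weekly_hours_saved()
    print(f"Suspended {saved} of {7 * 24} hours per week "
          f"({saved / (7 * 24):.0%} of always-on runtime)")
```

With the assumed window, a resource billed by the hour runs 50 hours a week instead of 168, which is why this kind of waste is usually the easiest to remove.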

Too much production headroom

Application teams haven’t implemented autoscaling for applications that can scale horizontally, or they’ve overestimated how much production headroom an application with variable usage needs (which may result in oversizing compute units or being overly aggressive with autoscaling). This requires you to implement autoscaling with some thoughtful tuning of parameters and to consider initiating a business value conversation about the costs and benefits of having higher application performance on a consistent basis.
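
The "thoughtful tuning of parameters" can be illustrated with a proportional scaling rule of the kind many autoscalers use: scale the instance count by the ratio of observed to target utilization, then clamp it. This is a sketch under stated assumptions, not any specific provider's algorithm; the function name, the percentage-based inputs and the limits are hypothetical.

```python
import math

def desired_instances(current, observed_pct, target_pct, min_n=1, max_n=20):
    """Proportional rule: ceil(current * observed / target), clamped to [min_n, max_n].

    observed_pct and target_pct are average utilization percentages (e.g. CPU).
    """
    if current <= 0 or target_pct <= 0:
        raise ValueError("current and target_pct must be positive")
    wanted = math.ceil(current * observed_pct / target_pct)
    return max(min_n, min(max_n, wanted))

if __name__ == "__main__":
    # Same load (4 instances at 90% utilization), two different targets.
    for target in (60, 45):
        n = desired_instances(current=4, observed_pct=90, target_pct=target)
        print(f"target {target}% -> {n} instances")
```

A lower target utilization buys more headroom and a bigger bill for the same load, which is the trade-off the business value conversation needs to settle.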

Wrongsizing production

Production environments are statically overprovisioned and therefore overly costly. On-premises, 30% utilization is common, but it’s all capital expenditures (capex), and as long as it’s within budget, no one has traditionally cared about the waste. However, in the cloud, you pay for that excess resource monthly, forcing you to confront the ongoing cost of the waste.

Anyone who tells you to “just” rightsize has never actually tried to do this. The problem is that applications that scale vertically typically can’t be easily rightsized. It’s likely difficult to impossible to do automatically because of complicated application installation. The application is fragile and may be mission-critical, so you must be cautious about maintenance downtime. And the application team, the only people who really understand how this thing works, is likely busy with other priorities.

If this is your situation, your cloud cost management tool may cause you to cry hopelessly because you see the waste and know that taking remediation actions is a complicated, cross-functional dance and involves delicate negotiation that leaves everyone wondering whether it wouldn’t have been easier to just keep paying a larger bill.

Suboptimal design and implementation

Architects are sometimes oblivious to cost when they design cloud solutions. They may make bad design choices, or changes in application features and behavior over time may have turned out to make a design choice unexpectedly expensive. Developers may write poorly performing code that consumes a lot of infrastructure resources or code that makes excessive (and, cumulatively, expensive) calls to cloud services. Your cloud cost management tools are unlikely to be of any use for detecting these situations. This needs to be addressed through performance engineering with attention paid to the business value of the time, effort and money necessary to do so. For many organizations, this may require bringing in a third-party expert to diagnose the problems and offer recommendations.

Notably, the answer to most of these issues is not to implement a cloud cost management tool. The challenges aren’t as simple as a lot of vendors (and talking heads) make them out to be.

In short:

  • There are seven main reasons that cloud budgets get out of control.
  • Once you understand the reason for overspending, you can take appropriate steps to rein in costs.
  • Despite what some vendors tell you, a cloud cost management tool will not solve all of your problems. 
