Looking for a high-level cost estimate for implementing a data platform, presumably Databricks or Snowflake. I understand it all depends on use cases, data volume, processing frequency, etc. Curious if anyone who has implemented or used either of those platforms at their company could shed light on cost?
Thanks a lot for sharing! Estimating DBU cost is a little challenging for us, as we don't have the tech and data details available at this point. On the other hand, we don't want to commit to usage that might exceed actual consumption, at least in year 1 when we would be in the crawl phase. For the same usage, do you have an idea whether pay-as-you-go or a contract might be cheaper? Thanks again.
You could consider starting with the pay-as-you-go model, and once you have a few months of actual costs to go on, you can decide whether to proceed with a contract. If you have a limited budget, you could also start with one small workload to get an idea of the cost, instead of moving a large workload or multiple small/medium ones.
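To make the pay-as-you-go vs. contract question concrete, here is a rough sketch of the comparison. It rests on assumptions that are not from this thread and that you would need to confirm with your account rep: the commitment is use-it-or-lose-it, the contract gives a flat percentage discount off list price, and any usage beyond the commitment is billed at that same discounted rate. All numbers are placeholders.

```python
def pay_as_you_go_cost(list_price_usage: float) -> float:
    """Pay-as-you-go: you pay list price for exactly what you use."""
    return list_price_usage


def committed_cost(list_price_usage: float, commitment: float, discount: float) -> float:
    """Contract cost under the ASSUMED terms described above.

    list_price_usage: what the year's usage would cost at list price
    commitment:       the amount you agreed to pay up front (assumed use-it-or-lose-it)
    discount:         assumed flat discount off list price, e.g. 0.2 for 20%
    """
    discounted_usage = list_price_usage * (1.0 - discount)
    # You always pay at least the commitment, even if you use less.
    return max(commitment, discounted_usage)


if __name__ == "__main__":
    commitment, discount = 100_000.0, 0.2  # hypothetical contract terms
    for usage in (50_000.0, 100_000.0, 150_000.0):
        payg = pay_as_you_go_cost(usage)
        contract = committed_cost(usage, commitment, discount)
        print(f"usage at list price {usage:>9,.0f}: pay-as-you-go {payg:>9,.0f}, contract {contract:>9,.0f}")
```

Under these assumptions the contract only comes out cheaper once your usage at list price exceeds the committed amount, which is why running pay-as-you-go for a few months first is a reasonable way to size a commitment.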

We chose and implemented Databricks as our data platform, signed a 3-year agreement with a preset annual consumption, and bought it via AWS Marketplace. You either pay as you go or sign a contract with them for 1 year, 3 years, etc., by estimating the DBUs and the service tier you need. For DBU cost estimates: decide what workload type you’ll use (job/ETL pipeline, interactive notebook, SQL warehouse), estimate the size of your cluster (VM size, number of nodes), and estimate the hours it will run (remember: it’s billed per second). The cluster size determines the DBUs consumed per hour, so: (DBUs consumed per hour) × (hours used) = DBUs. Multiply that by the price per DBU for your workload type and tier to get the cost.
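The estimate above can be sketched as a small calculation. This is only an illustration: the `Workload` class, the DBU-per-node ratings, and the price per DBU are made-up placeholders, so look up the real figures for your VM types, workload type, and tier. It also covers only the DBU portion of the bill; as far as I know, the underlying cloud VMs for classic clusters are charged separately by the cloud provider.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    dbu_per_node_hour: float  # placeholder: DBU rating of the chosen VM size
    nodes: int                # number of nodes in the cluster
    hours_per_month: float    # estimated runtime (billing is per second, so fractions are fine)
    usd_per_dbu: float        # placeholder: price per DBU for this workload type and tier

    def dbus_per_hour(self) -> float:
        # Cluster size determines DBUs consumed per hour.
        return self.dbu_per_node_hour * self.nodes

    def monthly_dbus(self) -> float:
        # (DBUs consumed per hour) x (hours used) = DBUs
        return self.dbus_per_hour() * self.hours_per_month

    def monthly_dbu_cost(self) -> float:
        return self.monthly_dbus() * self.usd_per_dbu


def total_monthly_dbu_cost(workloads: list[Workload]) -> float:
    return sum(w.monthly_dbu_cost() for w in workloads)


if __name__ == "__main__":
    # Hypothetical workloads; every number here is an assumption.
    workloads = [
        Workload("nightly_etl", dbu_per_node_hour=2.0, nodes=4, hours_per_month=60, usd_per_dbu=0.15),
        Workload("notebooks", dbu_per_node_hour=1.0, nodes=2, hours_per_month=120, usd_per_dbu=0.50),
    ]
    for w in workloads:
        print(f"{w.name}: {w.monthly_dbus():.0f} DBUs, ${w.monthly_dbu_cost():.2f}")
    print(f"total: ${total_monthly_dbu_cost(workloads):.2f}")
```

Even with rough guesses, running a few scenarios like this (low, expected, high usage) gives you a range to compare against a contract offer.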