A quick guide on how Lakesight computes Databricks costs.
Lakesight calculates costs as accurately as possible from public list prices and the data available through the Databricks REST API, but it will not replicate your bill to the penny. The goal is to provide a consistent cost signal that shows what costs the most and drives optimization decisions.
All costs are calculated the same way based on the sum of two components:
The underlying infrastructure cost, paid to the cloud provider.
Lakesight takes these prices from the Azure public retail pricing page. They are retail rates and vary by region.
Prices are fetched daily so that they always reflect the latest available rates.
Databricks' pricing for the service provided.
Pricing depends on the workspace tier, the cloud provider, and whether Photon is used, as described on the official Databricks pricing page.
Additionally, depending on the type of Virtual Machine selected at cluster creation, the DBU/h price will vary.
Every cost is derived the same way, from the cluster events returned by the Databricks REST API.
Cluster events such as CREATING, RUNNING, RESIZING, UPSIZE_COMPLETED and TERMINATING are tracked and used to define cost segments, each priced individually according to the number of running workers.
The driver node is considered always on from RUNNING to TERMINATING events and is priced for that segment.
VM cost = nodes × duration (h) × price/h
DBU cost = nodes × duration (h) × DBU/h × tier price
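The two formulas above can be written as a pair of small functions. This is a minimal sketch, not Lakesight's actual code: the function and parameter names are illustrative, and the example values (4 nodes, 30 minutes, Jobs on the Premium tier at $0.30 per DBU) are chosen only to show the arithmetic.

```python
def vm_cost(nodes: float, duration_hours: float, vm_price_per_hour: float) -> float:
    """VM cost = nodes x duration (h) x price/h."""
    return nodes * duration_hours * vm_price_per_hour


def dbu_cost(nodes: float, duration_hours: float, dbu_per_hour: float, dbu_price: float) -> float:
    """DBU cost = nodes x duration (h) x DBU/h x tier price."""
    return nodes * duration_hours * dbu_per_hour * dbu_price


# Illustrative segment: 4 nodes running for 30 minutes.
nodes = 4
duration_hours = 30 / 60

vm = vm_cost(nodes, duration_hours, vm_price_per_hour=0.856)
dbu = dbu_cost(nodes, duration_hours, dbu_per_hour=3, dbu_price=0.30)

print(f"VM cost:  ${vm:.3f}")   # 4 x 0.5 x 0.856
print(f"DBU cost: ${dbu:.3f}")  # 4 x 0.5 x 3 x 0.30
```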
Example segments, with a VM price of $0.856/h and a DBU rate of 3 DBU/h:
| Segment | Workers | Duration | VM cost | DBU cost |
|---|---|---|---|---|
| CREATING → RUNNING | 1 | 4m | $0.05 | $0.00 |
| CREATING → TERMINATING (driver) | 1 | 32m | $0.40 | $0.21 |
| RUNNING → RESIZING | 1 | 5m | $0.06 | $0.03 |
| RESIZING → UPSIZE_COMPLETED | 1–6 | 3m | $0.15 | $0.08 |
| UPSIZE_COMPLETED → RESIZING | 6 | 1m | $0.05 | $0.03 |
| RESIZING → RESIZING | 4–6 | 1m | $0.05 | $0.03 |
| RESIZING → RESIZING | 2–4 | 2m | $0.06 | $0.03 |
| RESIZING → RESIZING | 1–2 | 3m | $0.05 | $0.03 |
| RESIZING → UPSIZE_COMPLETED | 1–6 | 3m | $0.14 | $0.08 |
| UPSIZE_COMPLETED → RESIZING | 6 | 1m | $0.05 | $0.02 |
| RESIZING → RESIZING | 4–6 | 1m | $0.08 | $0.04 |
| RESIZING → UPSIZE_COMPLETED | 4–9 | 4m | $0.35 | $0.18 |
| UPSIZE_COMPLETED → RESIZING | 9 | 1m | $0.07 | $0.04 |
| RESIZING → RESIZING | 5–9 | 1m | $0.07 | $0.04 |
| RESIZING → TERMINATING | 5–10 | 2m | $0.23 | $0.12 |
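As a rough illustration of how a list of cluster events could be turned into priced segments, here is a minimal sketch. It rests on several assumptions that are not taken from Lakesight: the `Event` and `Segment` types and the `workers` field are hypothetical, the worker count is treated as constant within a segment (the table above shows ranges such as 1–6, so the real calculation is evidently finer grained), and no special case is made for segments with a zero DBU cost. The driver is added as one extra node from RUNNING to TERMINATING.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Event:
    time: datetime
    kind: str      # event type, e.g. "CREATING", "RUNNING", "RESIZING", "TERMINATING"
    workers: int   # hypothetical field: workers running once this event has occurred


@dataclass
class Segment:
    label: str
    nodes: int
    hours: float


def hours_between(start: Event, end: Event) -> float:
    return (end.time - start.time).total_seconds() / 3600


def build_segments(events: List[Event]) -> List[Segment]:
    events = sorted(events, key=lambda e: e.time)
    # Worker segments: one per pair of consecutive events,
    # using the worker count recorded on the segment's first event.
    segments = [
        Segment(f"{a.kind} → {b.kind}", a.workers, hours_between(a, b))
        for a, b in zip(events, events[1:])
    ]
    # Driver segment: a single node from RUNNING to TERMINATING.
    running = next((e for e in events if e.kind == "RUNNING"), None)
    terminating = next((e for e in events if e.kind == "TERMINATING"), None)
    if running is not None and terminating is not None:
        segments.append(
            Segment("RUNNING → TERMINATING (driver)", 1, hours_between(running, terminating))
        )
    return segments


def node_hours(segments: List[Segment]) -> float:
    return sum(s.nodes * s.hours for s in segments)


# Illustrative events, not taken from a real cluster.
t0 = datetime(2025, 1, 1, 12, 0)
events = [
    Event(t0, "CREATING", 1),
    Event(t0 + timedelta(minutes=4), "RUNNING", 1),
    Event(t0 + timedelta(minutes=10), "RESIZING", 3),
    Event(t0 + timedelta(minutes=34), "TERMINATING", 0),
]

segments = build_segments(events)
total_node_hours = node_hours(segments)
vm_total = total_node_hours * 0.856      # VM price per hour
dbu_total = total_node_hours * 3 * 0.30  # 3 DBU/h at an assumed $0.30 per DBU

for s in segments:
    print(f"{s.label}: {s.nodes} node(s) for {s.hours * 60:.0f} min")
print(f"VM ${vm_total:.2f}, DBU ${dbu_total:.2f}")
```

Keeping the driver as its own segment means resize events only ever change the worker count, which keeps the per-segment arithmetic identical to the two formulas above.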
Lakesight uses the prices documented on the official Databricks public pricing page.
| Cluster type | Tier | Price per DBU |
|---|---|---|
| Jobs | Premium | $0.30 |
| Jobs | Standard * | $0.15 |
| Interactive | Premium | $0.55 |
| Interactive | Standard * | $0.40 |
As documented on the Databricks public pricing page, enabling Photon multiplies the DBU emission rate by 2.5x for job clusters and by 2x for interactive clusters, compared with non-Photon clusters.
* The Standard tier will be deprecated on October 1, 2026.
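The per-DBU prices and Photon multipliers above can be combined into a single lookup. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and the base rate of 3 DBU/h is reused from the earlier example rather than taken from any specific VM type.

```python
# Price per DBU, keyed by (cluster type, workspace tier), as listed above.
DBU_PRICE = {
    ("jobs", "premium"): 0.30,
    ("jobs", "standard"): 0.15,
    ("interactive", "premium"): 0.55,
    ("interactive", "standard"): 0.40,
}

# Photon multiplies the DBU emission rate: 2.5x for jobs, 2x for interactive.
PHOTON_MULTIPLIER = {"jobs": 2.5, "interactive": 2.0}


def effective_dbu_rate(base_dbu_per_hour: float, cluster_type: str, photon: bool) -> float:
    """DBU/h emitted by one node, with the Photon multiplier applied if enabled."""
    multiplier = PHOTON_MULTIPLIER[cluster_type] if photon else 1.0
    return base_dbu_per_hour * multiplier


def dbu_cost_per_node_hour(base_dbu_per_hour: float, cluster_type: str, tier: str,
                           photon: bool = False) -> float:
    """Dollar cost of DBUs for one node running for one hour."""
    rate = effective_dbu_rate(base_dbu_per_hour, cluster_type, photon)
    return rate * DBU_PRICE[(cluster_type, tier)]


jobs_photon = dbu_cost_per_node_hour(3, "jobs", "premium", photon=True)
interactive_plain = dbu_cost_per_node_hour(3, "interactive", "standard")

print(f"Jobs, Premium, Photon:          ${jobs_photon:.2f} per node-hour")
print(f"Interactive, Standard, no Photon: ${interactive_plain:.2f} per node-hour")
```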
Job clusters are automatically created when a job is launched. One job equals one unique cluster created when the job starts and deleted when it ends or fails.
Lakesight calculates the cost of every job fetched through the Databricks REST API and displays the results in various charts to show which jobs cost the most.
A job cluster that has a series of cluster events but no TERMINATING event yet represents a job that is still running. Lakesight costs these clusters from the events available so far and highlights them in a dedicated section, giving a near real-time overview of running jobs.
Interactive clusters are clusters manually created from the Compute section of a Databricks workspace. Their ID remains the same for the lifetime of the cluster, until it is deleted. Lakesight groups usage into sessions, where each session represents the period from the moment the cluster is started until it is stopped. Costs are calculated per session, and Lakesight displays a timeline of all sessions for each interactive cluster.
An interactive cluster with a CREATING event and no TERMINATING event yet represents a running interactive cluster. Lakesight costs these sessions from the events that follow the last CREATING event and highlights them, giving an overview of running interactive clusters at any time.
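The session grouping described above can be sketched as follows. This is an assumption-laden illustration rather than Lakesight's implementation: events are modelled as plain `(timestamp, event type)` tuples, a session is taken to start at each CREATING event and end at the next TERMINATING event, and events that arrive outside any open session are ignored.

```python
from datetime import datetime


def split_sessions(events):
    """Group (time, kind) events into sessions, one per CREATING event."""
    sessions = []
    open_session = None
    for time, kind in sorted(events):
        if kind == "CREATING":
            # A new CREATING always starts a new session.
            open_session = [(time, kind)]
            sessions.append(open_session)
        elif open_session is not None:
            open_session.append((time, kind))
            if kind == "TERMINATING":
                open_session = None  # session closed
    return sessions


def running_session(sessions):
    """Return the last session if it has no TERMINATING event yet, else None."""
    if sessions and sessions[-1][-1][1] != "TERMINATING":
        return sessions[-1]
    return None


# Illustrative events for one interactive cluster.
events = [
    (datetime(2025, 1, 1, 9, 0), "CREATING"),
    (datetime(2025, 1, 1, 9, 5), "RUNNING"),
    (datetime(2025, 1, 1, 10, 0), "TERMINATING"),
    (datetime(2025, 1, 1, 14, 0), "CREATING"),
    (datetime(2025, 1, 1, 14, 4), "RUNNING"),
]

sessions = split_sessions(events)
current = running_session(sessions)

print(f"{len(sessions)} sessions found")
print("Cluster is running" if current else "Cluster is stopped")
```

Only the last session is ever treated as running, which mirrors the rule of costing from the events after the last CREATING event.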
Lakesight aims to provide the most accurate cost picture possible with the data available. However, there are known gaps:
Negotiated Databricks DBU pricing cannot be taken into account, so Lakesight may overestimate actual DBU costs for customers with a discount.
When clusters use instance pools, VMs kept warm in the pool consume resources even when no workload is running. This idle cost is not yet tracked.
Pre-purchased reserved VM capacity is priced lower than on-demand capacity. Lakesight uses on-demand rates.
Databricks serverless compute does not expose cluster-level events. Serverless runs are detected and flagged but cannot be costed.
Databricks SQL warehouse costs are not currently included.
We'd love to hear from you if you think something could be improved.