Measuring the Platform: Why Metrics Matter
An Internal Developer Platform without metrics is like a car without a dashboard: you can drive it, but you do not know how fast you are going, how much fuel you have, or whether the engine is about to overheat. Metrics are essential for understanding whether the platform is generating value, identifying bottlenecks, and guiding investment decisions.
In this article, we will explore the two most authoritative frameworks for measuring software delivery performance and developer experience: DORA metrics and the SPACE framework. We will see how to implement concrete dashboards and how to use data for continuous improvement.
What You'll Learn
- The 4 DORA metrics: deployment frequency, lead time, MTTR, change failure rate
- The SPACE framework: Satisfaction, Performance, Activity, Communication, Efficiency
- How to collect data from GitHub, GitLab, Kubernetes, and CI/CD
- Benchmarking: comparison with elite, high, medium, and low performers
- Practical Grafana dashboard implementation for DORA metrics
- Traps to avoid: vanity metrics, gaming, and correlation vs causation
The 4 DORA Metrics
The DORA (DevOps Research and Assessment) metrics are the result of years of research conducted by Nicole Forsgren, Jez Humble, and Gene Kim, documented in the book "Accelerate." These four metrics are widely recognized as the best available indicators of software delivery performance:
- Deployment Frequency (DF): how often the organization deploys to production. Elite performers deploy on-demand, multiple times per day
- Lead Time for Changes (LT): time from first code commit to production deploy. Elite performers have lead times under one hour
- Mean Time to Recovery (MTTR): average time to restore service after an incident. Elite performers recover in less than one hour
- Change Failure Rate (CFR): percentage of deployments causing a production failure. Elite performers have a CFR of 0-15%
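Taken together, the four metrics reduce to straightforward arithmetic over deployment and incident records. A minimal sketch, using invented in-memory records (real data would come from your CI/CD and incident tooling, and field names are illustrative):

```python
from datetime import datetime

# Illustrative production deploys: (commit_time, deploy_time, failed)
deploys = [
    (datetime(2024, 5, 1, 9), datetime(2024, 5, 1, 11), False),
    (datetime(2024, 5, 1, 10), datetime(2024, 5, 2, 10), True),
    (datetime(2024, 5, 3, 8), datetime(2024, 5, 3, 9), False),
    (datetime(2024, 5, 4, 8), datetime(2024, 5, 4, 12), False),
]
# Illustrative incidents: (opened, resolved)
incidents = [(datetime(2024, 5, 2, 10), datetime(2024, 5, 2, 13))]

days_observed = 4
# DF: deploys per day over the observation window
deployment_frequency = len(deploys) / days_observed
# LT: mean hours from commit to production deploy
lead_time_hours = sum(
    (d - c).total_seconds() / 3600 for c, d, _ in deploys
) / len(deploys)
# CFR: fraction of deploys that caused a failure
change_failure_rate = sum(failed for _, _, failed in deploys) / len(deploys)
# MTTR: mean hours from incident open to resolution
mttr_hours = sum(
    (r - o).total_seconds() / 3600 for o, r in incidents
) / len(incidents)
```

The hard part in practice is not the arithmetic but reliably joining commit, deploy, and incident events, which is what the instrumentation section below addresses.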
# DORA Metrics: benchmarking table
dora-benchmarks:
  deployment-frequency:
    elite: "On-demand (multiple deploys/day)"
    high: "Between once per day and once per week"
    medium: "Between once per week and once per month"
    low: "Between once per month and once every 6 months"
  lead-time-for-changes:
    elite: "Less than one hour"
    high: "Between one day and one week"
    medium: "Between one week and one month"
    low: "Between one month and six months"
  mean-time-to-recovery:
    elite: "Less than one hour"
    high: "Less than one day"
    medium: "Between one day and one week"
    low: "More than six months"
  change-failure-rate:
    elite: "0-15%"
    high: "16-30%"
    medium: "16-30%"
    low: "16-30%"

# Targets for our platform
platform-targets:
  current-state:
    deployment-frequency: "weekly"
    lead-time: "3 days"
    mttr: "4 hours"
    change-failure-rate: "22%"
  6-month-target:
    deployment-frequency: "daily"
    lead-time: "4 hours"
    mttr: "1 hour"
    change-failure-rate: "10%"
The SPACE Framework
While DORA metrics focus on delivery performance, the SPACE framework offers a broader view of developer experience. Proposed by Nicole Forsgren, Margaret-Anne Storey, and other researchers, SPACE measures five dimensions:
- Satisfaction and well-being: how satisfied developers are with their work and available tools
- Performance: the output and impact of developer work (code quality, service reliability)
- Activity: volume of activity (commits, PRs, reviews, deploys) as an engagement indicator
- Communication and collaboration: quality of communication and collaboration within and between teams
- Efficiency and flow: ability of developers to work without interruptions and blockers
The key aspect of SPACE is that no single metric tells the complete story. You need to combine metrics from at least 3 of the 5 dimensions for an accurate understanding of developer experience.
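One lightweight way to enforce the "at least 3 of 5" guidance is to map each collected metric to its SPACE dimension and check coverage before a dashboard goes live. A hypothetical sketch (metric names and the mapping are invented for illustration):

```python
# Hypothetical mapping from collected metrics to SPACE dimensions
SPACE_DIMENSIONS = {
    "satisfaction": {"nps_score", "tool_satisfaction"},
    "performance": {"change_failure_rate", "service_uptime"},
    "activity": {"deploys_per_day", "prs_merged"},
    "communication": {"review_turnaround_hours"},
    "efficiency": {"focus_time_hours", "build_wait_minutes"},
}

def dimensions_covered(selected_metrics):
    """Return the SPACE dimensions that a set of metrics touches."""
    return {
        dim for dim, metrics in SPACE_DIMENSIONS.items()
        if metrics & set(selected_metrics)
    }

chosen = ["nps_score", "deploys_per_day", "review_turnaround_hours"]
covered = dimensions_covered(chosen)
# SPACE guidance: combine metrics from at least three dimensions
is_balanced = len(covered) >= 3
```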
Beware of Vanity Metrics
Lines of code written, commit counts, and hours worked are vanity metrics that say nothing about real productivity. Worse, they incentivize the wrong behaviors (verbose code, artificially split commits, presence over productivity). Focus on metrics that measure outcomes, not output.
Data Collection and Instrumentation
Collecting DORA and SPACE metrics requires integration with multiple data sources:
- Git (GitHub/GitLab API): commit timestamps, PR creation/merge times, branch lifecycle
- CI/CD (GitHub Actions, GitLab CI): pipeline duration, success/failure rates, deployment timestamps
- Kubernetes API: deployment rollout status, pod restarts, resource utilization
- Incident management (PagerDuty, OpsGenie): incident creation, acknowledgment, resolution times
- Survey tools: developer satisfaction surveys (quarterly NPS, pulse surveys)
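Most of these sources expose ISO 8601 timestamps, so lead-time calculation is mostly a matter of joining records on time. A sketch using payloads shaped like the GitHub REST API's pull request and deployment responses (the inline JSON here is an invented stand-in for real API calls):

```python
import json
from datetime import datetime

# Invented payloads shaped like GitHub REST API responses; in practice
# these would be fetched from the pulls and deployments endpoints.
pr_json = json.loads("""
{"number": 101,
 "created_at": "2024-05-01T09:00:00Z",
 "merged_at": "2024-05-01T15:30:00Z"}
""")
deployment_json = json.loads("""
{"environment": "production",
 "created_at": "2024-05-01T16:00:00Z"}
""")

def parse_ts(s):
    # GitHub timestamps are ISO 8601 with a trailing "Z"
    return datetime.fromisoformat(s.replace("Z", "+00:00"))

# Lead time here: PR opened -> production deployment created
lead_time = (parse_ts(deployment_json["created_at"])
             - parse_ts(pr_json["created_at"]))
lead_time_hours = lead_time.total_seconds() / 3600
```

A collector service can emit values like this as Prometheus metrics, which is what the recording rules below assume.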
# Prometheus: rules for DORA metrics calculation
groups:
  - name: dora-metrics
    interval: 5m
    rules:
      # Deployment Frequency: deployments per day per service
      - record: dora:deployment_frequency:daily
        expr: |
          sum by (service) (
            increase(
              deployment_total{environment="production"}[24h]
            )
          )

      # Lead Time: average time from commit to deploy (in hours)
      - record: dora:lead_time_hours:avg
        expr: |
          avg by (service) (
            deployment_lead_time_seconds{environment="production"}
          ) / 3600

      # Change Failure Rate: % failed deployments
      - record: dora:change_failure_rate:ratio
        expr: |
          sum by (service) (
            increase(
              deployment_total{environment="production",status="failed"}[30d]
            )
          )
          /
          sum by (service) (
            increase(
              deployment_total{environment="production"}[30d]
            )
          )

      # MTTR: average recovery time (in minutes)
      - record: dora:mttr_minutes:avg
        expr: |
          avg by (service) (
            incident_resolution_time_seconds{severity=~"critical|high"}
          ) / 60

      # Alert if MTTR exceeds target
      - alert: HighMTTR
        expr: dora:mttr_minutes:avg > 60
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MTTR for {{ $labels.service }} exceeds 60 minutes"
Grafana Dashboard for DORA
A dedicated Grafana dashboard for DORA metrics provides real-time visibility into software delivery performance. Key visualizations include:
- Deployment Frequency: daily/weekly bar chart with trend line
- Lead Time: line chart with percentiles (p50, p90, p99) and targets
- MTTR: gauge showing current value vs target, with history
- Change Failure Rate: pie chart with percentage and monthly trend
- Overall DORA Score: traffic light indicating performance level (elite, high, medium, low)
The dashboard should be accessible to all teams, not just the platform team. Transparency around metrics incentivizes improvement and creates a data-driven culture.
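The "Overall DORA Score" traffic light needs a banding rule. A hypothetical sketch, with cut-offs simplified from the benchmark table earlier and the overall level taken as the weakest individual metric:

```python
LEVELS = ["elite", "high", "medium", "low"]

def band(value, upper_bounds):
    """Map a metric value to a band. upper_bounds are simplified,
    illustrative cut-offs for elite, high, and medium; anything
    beyond the last bound is low."""
    for level, bound in zip(LEVELS, upper_bounds):
        if value <= bound:
            return level
    return "low"

def dora_level(deploys_per_day, lead_time_hours, mttr_hours, cfr_pct):
    bands = [
        # days between deploys: <=1 elite, <=7 high, <=30 medium
        band(1 / max(deploys_per_day, 1e-9), [1, 7, 30]),
        band(lead_time_hours, [1, 24 * 7, 24 * 30]),
        band(mttr_hours, [1, 24, 24 * 7]),
        band(cfr_pct, [15, 30, 30]),
    ]
    # Traffic-light rule: overall level is the weakest individual metric
    return max(bands, key=LEVELS.index)

# The "current state" from the targets above: weekly deploys,
# 3-day lead time, 4-hour MTTR, 22% CFR
level = dora_level(1 / 7, 72, 4, 22)
```

Taking the worst band rather than an average avoids a misleadingly green light when one metric is badly lagging.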
Feedback Loops: From Metrics to Improvements
Metrics have value only if they guide concrete actions. The feedback cycle for continuous platform improvement includes:
- Collect: automatic collection of quantitative metrics and qualitative feedback
- Analyze: identification of trends, anomalies, and correlations
- Prioritize: rank improvements based on impact and effort
- Implement: execute prioritized improvements
- Measure: verify that improvements have the desired effect
# Feedback loop: quarterly review process
quarterly-platform-review:
  week-1-collect:
    - pull DORA metrics from Prometheus/Grafana
    - run developer satisfaction survey (NPS + open questions)
    - collect support ticket analytics
    - review incident postmortems
  week-2-analyze:
    - compare metrics with previous quarter
    - identify top 5 developer pain points from survey
    - correlate support tickets with platform areas
    - benchmark against industry standards
  week-3-prioritize:
    - impact-effort matrix for improvement initiatives
    - align with business priorities
    - estimate ROI for top initiatives
    - present to engineering leadership
  week-4-plan:
    - create platform roadmap for next quarter
    - define OKRs for platform team
    - communicate plan to all engineering teams
    - set metric targets for next quarter
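The impact-effort matrix in the prioritize step can be as simple as ranking initiatives by impact per unit of effort. A sketch with invented initiative names and scores:

```python
# Hypothetical initiatives scored 1-5 by the platform team during
# the week-3-prioritize step (names and scores are invented)
initiatives = [
    {"name": "self-service preview environments", "impact": 5, "effort": 3},
    {"name": "flaky-test quarantine", "impact": 4, "effort": 1},
    {"name": "golden-path template v2", "impact": 3, "effort": 4},
]

# Quick wins first: highest impact per unit of effort
ranked = sorted(initiatives, key=lambda i: i["impact"] / i["effort"],
                reverse=True)
roadmap = [i["name"] for i in ranked]
```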
Traps to Avoid
Metrics are powerful tools, but they can also be harmful if used incorrectly:
- Gaming: when metrics are used to evaluate individual performance, people manipulate them. DORA metrics should measure the system, not people
- Correlation vs causation: the fact that two metrics move together does not mean one causes the other
- Over-measurement: tracking too many metrics creates noise and confusion. Better to have 5 well-understood metrics than 50 ignored ones
- Metric fixation: optimizing a single metric at the expense of everything else (e.g., increasing deployment frequency without caring about quality)
Core Principle
Metrics should illuminate, not judge. Use them to understand where the platform can improve, not to evaluate individual developer productivity. A culture based on trust and transparency is the prerequisite for deriving value from metrics.