Queue Age SLA Hit Rate
A KPI for how often live queues stay within your agreed response-time threshold.
- Scope: KPI
- Built for practical day-to-day operations
- Time to apply: 20-45 minutes
- Updated: 2026-02-19
Definition
Queue Age SLA Hit Rate measures the share of monitored intervals where queue age remains at or below your SLA threshold.
Formula:
Queue Age SLA Hit Rate = (intervals within SLA / total monitored intervals) x 100
Why this KPI matters
Coverage metrics show staffing health. Queue-age SLA shows customer impact.
Use this KPI to validate that your operating model protects service promises, not only schedule compliance.
How to calculate it in 5 minutes
- Choose one queue-age SLA threshold per critical stream (for example, 20 minutes).
- Pull monitored intervals for the day or week (usually 15-minute blocks).
- Count the intervals where queue age is at or below the threshold.
- Divide that count by the total number of monitored intervals and multiply by 100.
Example:
- 64 monitored intervals
- 53 intervals within SLA
- Hit Rate = (53 / 64) x 100 = 82.8%
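The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `hit_rate`, the list-of-minutes input, and the sample queue ages are all assumptions made for the example.

```python
def hit_rate(queue_ages_min, sla_threshold_min):
    """Return the percentage of monitored intervals at or below the SLA threshold."""
    if not queue_ages_min:
        raise ValueError("no monitored intervals")
    within = sum(1 for age in queue_ages_min if age <= sla_threshold_min)
    return within / len(queue_ages_min) * 100


# Reproduce the worked example: 64 intervals, 53 of them within a 20-minute SLA.
# The individual ages (12 and 27 minutes) are hypothetical placeholders.
ages = [12] * 53 + [27] * 11
print(f"{hit_rate(ages, 20):.1f}%")  # 82.8%
```

An interval whose queue age equals the threshold counts as a hit, matching the "at or below" wording in the definition.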
Suggested operating bands
- 95-100%: Strong control. Maintain cadence and monitor early warning signals.
- 88-94%: Watch zone. One recurring window is likely under-protected.
- 80-87%: At risk. Rebalance rules are not fast enough for live pressure.
- <80%: Unstable service protection. Escalation model needs immediate tightening.
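If hit rate is computed automatically, the bands above can be applied with a small lookup. The sketch below is one possible reading: the function name `operating_band` is hypothetical, and it assumes that fractional values between the listed ranges (for example 94.5) fall into the lower band.

```python
def operating_band(hit_rate_pct):
    """Map a hit rate (0-100) to one of the suggested operating bands."""
    if not 0 <= hit_rate_pct <= 100:
        raise ValueError("hit rate must be between 0 and 100")
    # Assumption: each band starts at its lower bound, so 94.5 is "Watch zone".
    if hit_rate_pct >= 95:
        return "Strong control"
    if hit_rate_pct >= 88:
        return "Watch zone"
    if hit_rate_pct >= 80:
        return "At risk"
    return "Unstable service protection"


print(operating_band(82.8))  # At risk
```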
Segment cuts that matter
Break hit rate by:
- Hour window (opening, lunch overlap, late-day)
- Queue type (priority vs standard, channel, service line)
- Trigger type before misses (absence, handover drift, break overlap)
- Site or team
If one queue type drives most misses, fix ownership and rebalance for that stream first.
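A quick way to see which segment drives misses is to count them per segment value. The sketch below assumes a hypothetical miss log with one record per missed interval; the field names (`hour_window`, `queue_type`, `trigger`, `site`) and the sample rows are illustrative only.

```python
from collections import Counter

# Hypothetical miss log: one record per interval that breached the SLA.
misses = [
    {"hour_window": "opening", "queue_type": "priority", "trigger": "absence", "site": "A"},
    {"hour_window": "lunch overlap", "queue_type": "priority", "trigger": "break overlap", "site": "A"},
    {"hour_window": "lunch overlap", "queue_type": "standard", "trigger": "break overlap", "site": "B"},
    {"hour_window": "late-day", "queue_type": "priority", "trigger": "handover drift", "site": "A"},
]


def misses_by(records, key):
    """Count SLA misses per value of one segment field, largest first."""
    return Counter(record[key] for record in records).most_common()


print(misses_by(misses, "queue_type"))  # [('priority', 3), ('standard', 1)]
```

Running the same function over each field in turn shows whether misses concentrate in one stream, one window, or one trigger.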
Instrumentation notes
Track:
- Timestamped queue age per stream
- Active owner per stream
- Coverage floor status by role
- Trigger code for each SLA miss
Common logging failures:
- Averaging queue age over long windows and masking spikes
- Missing stream-level ownership during misses
- Not tagging miss causes for weekly analysis
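The sketch below shows one possible record shape for the tracked fields and illustrates the first logging failure: an hourly average can sit under the threshold while one interval inside the hour breaches it. The class name, field names, 20-minute threshold, and sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional

SLA_MIN = 20  # hypothetical SLA threshold in minutes


@dataclass
class IntervalRecord:
    timestamp: datetime          # start of the monitored interval
    stream: str                  # queue stream identifier
    queue_age_min: float         # queue age observed in this interval
    owner: Optional[str]         # active owner; None exposes an ownership gap
    coverage_floor_ok: bool      # whether the coverage floor was met
    trigger_code: Optional[str]  # cause tag, expected on every SLA miss


# Four hypothetical 15-minute intervals covering one hour of one stream.
records = [
    IntervalRecord(datetime(2026, 2, 19, 12, 0), "priority", 8, "alex", True, None),
    IntervalRecord(datetime(2026, 2, 19, 12, 15), "priority", 10, "alex", True, None),
    IntervalRecord(datetime(2026, 2, 19, 12, 30), "priority", 34, None, False, "break_overlap"),
    IntervalRecord(datetime(2026, 2, 19, 12, 45), "priority", 12, "alex", True, None),
]

hourly_average = mean(r.queue_age_min for r in records)
misses = [r for r in records if r.queue_age_min > SLA_MIN]
unowned_misses = [r for r in misses if r.owner is None]
untagged_misses = [r for r in misses if r.trigger_code is None]

print(hourly_average <= SLA_MIN)  # True: the hourly average hides the spike
print(len(misses))                # 1: the interval-level view still shows it
```

The `unowned_misses` and `untagged_misses` lists correspond to the second and third logging failures, so they can serve as simple data-quality checks before a weekly review.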
What to do when hit rate drops
- Identify the first repeated miss window in the day.
- Confirm owner assignment for that stream in that window.
- Add one pre-approved rebalance move that fires before the SLA is breached.
- Increase check cadence in high-risk windows.
- Review whether break or handover timing clusters with misses.
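The first step in the list above can be approximated from a miss log. The sketch below assumes that "repeated" means the same time-of-day window missed on at least two distinct days, and that windows are stored as zero-padded "HH:MM" strings so they sort chronologically; both are assumptions, as are the function name and the sample data.

```python
from collections import Counter


def first_repeated_miss_window(misses, min_days=2):
    """Return the earliest window that missed SLA on at least `min_days` distinct days.

    `misses` is an iterable of (day, window_start) pairs, where window_start
    is a zero-padded "HH:MM" string. Returns None if no window repeats.
    """
    # set() removes duplicate (day, window) pairs so each day counts once.
    days_per_window = Counter(window for _day, window in set(misses))
    repeated = [w for w, n in days_per_window.items() if n >= min_days]
    return min(repeated) if repeated else None


# Hypothetical week of misses.
week = [
    ("Mon", "12:15"), ("Mon", "16:30"),
    ("Tue", "12:15"),
    ("Wed", "09:00"), ("Wed", "16:30"),
]
print(first_repeated_miss_window(week))  # 12:15
```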
Weekly review questions
- Which queue stream contributed the most SLA misses this week?
- Are misses detection-late, decision-late, or execution-late?
- Which action recovered SLA fastest?
- What one rule change can increase hit rate by 5 percentage points next week?
Metric pairings
Use Queue Age SLA Hit Rate with:
- Coverage Floor Breach Rate to identify staffing-driven misses.
- Time to Coverage Recovery (MTTR-C) to assess correction speed after misses begin.
Read together:
- SLA down + breach rate low -> thresholds or routing may be weak, not raw staffing.
- SLA down + MTTR-C high -> response execution is too slow after detection.
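The two readings above can be encoded as a simple rule check. This sketch takes boolean inputs on purpose: what counts as "down", "low", or "high" depends on your own baselines, which the guide does not specify. The function name `read_together` is hypothetical.

```python
def read_together(sla_down, breach_rate_low, mttr_c_high):
    """Return the interpretations that apply to the current metric pairing."""
    findings = []
    if sla_down and breach_rate_low:
        findings.append("Thresholds or routing may be weak, not raw staffing.")
    if sla_down and mttr_c_high:
        findings.append("Response execution is too slow after detection.")
    return findings


# Hit rate fell, coverage floor breaches stayed low, recovery time was normal.
print(read_together(sla_down=True, breach_rate_low=True, mttr_c_high=False))
```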
Anti-gaming checks
- Do not inflate hit rate by widening SLA thresholds mid-period.
- Do not report only averaged queue age; keep interval-level misses visible.
- Do not exclude peak windows from monitored intervals.
Related guides
- Intraday Control Loop
- Real-Time Queue Rebalance Workflow
- Coverage Stability Score
- Time to Coverage Recovery (MTTR-C)
Where Soon helps
Soon helps teams spot queue-age drift early, assign owners instantly, and run fast rebalance actions to keep SLA performance stable.