MetricKit Rebuilt: State-Aware Telemetry in iOS 27

Blake Crosley June 09, 2026 14 min read

metrickit statereporting performance telemetry ios27 wwdc26

Apple’s WWDC 2026 demo app reported a scroll hitch rate of 15 milliseconds per second, averaged across an entire day of use. Split per tab, the same data told a completely different story: 1 ms/s on one tab, 71 ms/s on the other.¹ One screen was nearly flawless; the other was, in the session’s words, “experiencing critical interruptions.”¹ The blended number hid both facts. Session 222, “Meet the new MetricKit,” is the story of how iOS 27 closes that gap: a ground-up rebuild of the framework’s API surface, and a new companion framework, StateReporting, that turns whole-app field metrics into per-state metrics. Field telemetry can finally answer the question every performance engineer asks first: which screen is slow?

TL;DR

In iOS 27, MetricKit “has been rebuilt from the ground up with a contextually rich and expressive modern Swift-first API,” and every new capability in the session is exclusive to the new APIs.¹
The entry point is the MetricManager class. Apps await the metricReports and diagnosticReports async streams at launch, and both report types are Codable, so a JSONEncoder sends them straight to your analytics server.¹
Reports are structured: intervalEntries hold a full-day entry plus smaller breakdowns, organized into metric groups like .cpu, .memory, .display, and .gpu, down to individual values such as peakMemory.¹
New data in iOS 27: a Metal frame rate metric for render performance, memory exception diagnostics for memory-limit terminations, and a crash category that ties individual crash diagnostics back to your metric trends.¹
The headline feature is the StateReporting framework: report the state your app is in (active tab, experiment arm, view configuration) and MetricKit aggregates metrics per state, replacing one blended number with a per-screen breakdown.¹

Rebuilt from the ground up

Yonni, an engineer on the MetricKit team, introduces the iOS 27 rebuild starting at 1:23.

MetricKit’s job has not changed: it is “the collection piece” of the performance workflow, providing two kinds of data. Metrics tell you whether an area of performance is improving or worsening overall; diagnostics tell you which code path caused a problem.¹ What changed is everything about how you receive that data. The session states it plainly: in iOS 27 the framework “has been rebuilt from the ground up with a contextually rich and expressive modern Swift-first API,” and “all of the advances I’ll be discussing today are exclusive to this new set of APIs.”¹

The new entry point is the MetricManager class. Instead of registering a delegate and parsing payloads, you await reports through the metricReports property as an async stream. Two operational rules come straight from the session: do the setup at app startup “to avoid any data loss from delayed subscription,” and keep MetricManager alive “so that the streams can continue to deliver reports as subsequent data becomes ready.”¹ Apple recommends running the work in a detached task or a dedicated service class as soon as the app launches.¹

The session presents the code on slides, so the snippets below are illustrative call shapes that match its description; confirm exact signatures against Apple’s documentation before shipping.

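// Illustrative call shape based on session 222; verify against the docs.
import MetricKit

// Keep this instance alive for the app's lifetime so the stream keeps delivering.
let manager = MetricManager()

// Start at launch to avoid losing data to a delayed subscription.
Task.detached {
    for await report in manager.metricReports {
        // Encode and ship, or inspect specific groups.
    }
}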

Shipping a report to your server used to mean handling opaque payload data. Now MetricReport values are Codable: “Just create a JSONEncoder and encode the entire report.”¹ If you want a specific value instead of the whole document, the report is fully structured. You iterate through intervalEntries, which “includes a full-day aggregated entry and smaller breakdown windows when available,” typically a few hours each and present only when metrics exist for them.¹ Inside each interval, metrics are organized into metric groups, where “each group represents an aspect of the system, things like .cpu, .memory, .display, and .gpu.”¹ Filter down to the group you care about (the session’s example pulls memoryMetrics), then switch over the metric cases to reach an individual value such as peakMemory.¹
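The two consumption paths described above, encoding the whole report or walking down to one value, can be sketched with stand-in types. Everything named `Sample…` here is hypothetical and exists only to make the shape runnable; the real report types come from MetricKit and their exact cases and property names need to be confirmed in the documentation. The byte counts are made-up sample data.

```swift
import Foundation

// Stand-in types for illustration only; these are NOT the MetricKit types.
enum SampleMemoryMetric: Codable {
    case peakMemory(bytes: Int)
    case averageSuspendedMemory(bytes: Int)
}

struct SampleIntervalEntry: Codable {
    var label: String                      // "full-day" or a few-hour window
    var memoryMetrics: [SampleMemoryMetric]
}

struct SampleReport: Codable {
    var intervalEntries: [SampleIntervalEntry]
}

// Made-up values, chosen only so the example has something to read.
let sampleReport = SampleReport(intervalEntries: [
    SampleIntervalEntry(label: "full-day",
                        memoryMetrics: [.peakMemory(bytes: 412_000_000)]),
    SampleIntervalEntry(label: "09:00-12:00",
                        memoryMetrics: [.peakMemory(bytes: 398_000_000),
                                        .averageSuspendedMemory(bytes: 51_000_000)])
])

// Path 1: encode the entire report and ship the bytes to a server.
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
let payload = try encoder.encode(sampleReport)

// Path 2: walk the structure and pull out a single value per interval.
var peakBytesByInterval: [String: Int] = [:]
for entry in sampleReport.intervalEntries {
    for metric in entry.memoryMetrics {
        switch metric {
        case .peakMemory(let bytes):
            peakBytesByInterval[entry.label] = bytes
        case .averageSuspendedMemory:
            break   // not the value this pass is looking for
        }
    }
}
```

The point of the sketch is the division of labor: the encode path needs no knowledge of the report's contents, while the structured path lets you alert on one value without parsing JSON back out.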

The metric catalog grows in iOS 27 too. Alongside launch time histograms (the session’s example shows most launches landing between 510 and 540 milliseconds), hangs, animation metrics, and resource consumption like CPU, GPU, disk writes, and network transfers, MetricKit adds a Metal frame rate metric. The session calls frame rate “a key metric for game developers to understand render performance” and points to “Find and fix performance issues in your Metal game” for the optimization side.¹

Lean on MetricKit’s launch metric rather than instrumenting launch yourself. Apple measures launch time from the moment the user taps the app icon, which happens before your process even exists, to the first frame drawn.³ A hand-rolled timer can only start once your code runs, so it misses the pre-main window entirely and undercounts the launch your users actually feel. Read the histogram MetricKit ships instead of building a timer that cannot see the part that matters.

Diagnostics: backtraces, memory exceptions, and crash categories

Metrics tell you something regressed; diagnostics tell you where. When something goes wrong, like a crash or a hang, “the system captures a diagnostic on device” and a diagnostic report “packages up the details and delivers it immediately to your app through MetricKit.”¹ Many diagnostics include backtraces showing the exact call stack at the time of the event. In the session’s walkthrough, the symbolicated backtrace starts at thread start in system code, crosses into the app, and stops at the app’s submitReport() function, which marks the point of failure and the place to target a fix.¹

Crash diagnostics carry a backtrace, the termination reason, and an exception type. New in iOS 27, a termination category “indicates how each crash was accounted for in metrics,” so “if abnormal terminations are trending up, you can correlate those directly with individual diagnostics.”¹ The metric line on your dashboard and the individual crash reports behind it finally share a key.

iOS 27 also adds memory exception diagnostics: “when your app or extension is terminated for exceeding its memory limit, you get more insight on what happened.”¹ Extensions are explicitly in scope, which matters for anyone debugging widget or extension memory kills from afar.

Consumption mirrors the metrics side. You await diagnosticReports on your MetricManager instance, again from app launch in a detached task or service class, and DiagnosticReport values are Codable for the same encode-and-ship pipeline.¹ Because the reports are structured, you can switch on the diagnostic cases: the crash case yields the backtrace, the reason, and the category, while a hang case can route to different processing.¹

// Illustrative call shape based on session 222; verify against the docs.
for await report in manager.diagnosticReports {
    switch /* diagnostic case */ {
    case /* crash */: break  // backtrace, reason, category
    case /* hang */:  break  // handle separately
    default:          break
    }
}
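To see the routing compile end to end, here is a minimal sketch with a stand-in enum. `SampleDiagnostic`, its case payloads, and the category string "abnormal" are all assumptions made for illustration; the real diagnostic cases and the type of the termination category need to be checked against the documentation. The tally at the end shows the idea behind the shared key: count crash diagnostics per category so the total can be compared with the matching metric trend.

```swift
// Stand-in type for illustration only; this is NOT the MetricKit diagnostic type.
enum SampleDiagnostic {
    case crash(backtrace: [String], reason: String, category: String)
    case hang(durationMilliseconds: Int)
}

// Made-up diagnostics; the frame names echo the session's walkthrough.
let sampleDiagnostics: [SampleDiagnostic] = [
    .crash(backtrace: ["thread_start", "submitReport()"],
           reason: "placeholder reason", category: "abnormal"),
    .hang(durationMilliseconds: 900),
    .crash(backtrace: ["thread_start", "submitReport()"],
           reason: "placeholder reason", category: "abnormal")
]

var crashCountByCategory: [String: Int] = [:]
var hangCount = 0

for diagnostic in sampleDiagnostics {
    switch diagnostic {
    case .crash(_, _, let category):
        // Key crashes by category so they line up with the metric accounting.
        crashCountByCategory[category, default: 0] += 1
    case .hang:
        // Hangs take a separate processing path.
        hangCount += 1
    }
}
```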

StateReporting: from one blended number to per-screen truth

Everything above still describes whole-app telemetry, and whole-app telemetry has a ceiling. The session’s expense reporting app makes the problem concrete. The app organizes its features into a Reports tab and a Spending tab. Over a day, MetricKit reports 4.5 seconds of total hitch time across 5 minutes of scrolling: a hitch rate of 15 ms/s. But that number is “an average scroll hitch rate over all app usage, even if someone is going back and forth between the Reports tab and the Spending tab.”¹ You know the app hitches. You do not know where.
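The blended figure is plain division, which is worth writing down once because the per-state numbers later use the same formula. A minimal sketch, with a hypothetical helper name:

```swift
// Hitch rate: milliseconds of hitch time per second of scrolling.
// `hitchRate` is a helper defined here for illustration, not a MetricKit call.
func hitchRate(hitchSeconds: Double, scrollMinutes: Double) -> Double {
    (hitchSeconds * 1_000) / (scrollMinutes * 60)
}

// The demo's day of use: 4.5 s of hitches across 5 minutes of scrolling,
// which is 4,500 ms over 300 s.
let blendedRate = hitchRate(hitchSeconds: 4.5, scrollMinutes: 5)
print("\(blendedRate) ms/s")
```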

The new StateReporting framework removes the blend. States are “information you define that describes your app’s configuration or behavior, so that MetricKit can aggregate metrics as a function of those characteristics.”¹ As people move between tabs, the app reports each transition, and MetricKit intersects those states with metric and diagnostic data.¹

The payoff in the demo is the moment that justifies the rebuild. Instead of one blended 15 ms/s figure, metrics arrive per state: the Spending tab scrolled “incredibly smooth” at 1 ms/s, while the Reports tab “spiked to 71 ms/s.”¹ The session draws the conclusion the blended number could never support: “the Spending tab is performing great! But the Reports tab is experiencing critical interruptions, and that’s exactly where your optimization effort should focus.”¹ One number became a verdict and a work order.
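The three published rates also let you back out how the scrolling time must have been split, which shows why the blend sat so much closer to the good tab. This is an inference, not something the session states: it assumes all 5 minutes of scrolling happened in these two tabs and that the quoted rates are exact.

```swift
// Rates in ms/s and total scroll time, as quoted in the session.
let blended = 15.0
let spendingRate = 1.0
let reportsRate = 71.0
let totalScrollSeconds = 300.0

// Solve blended = reportsRate * share + spendingRate * (1 - share)
// for the share of scrolling time spent in the Reports tab.
let reportsShare = (blended - spendingRate) / (reportsRate - spendingRate)
let reportsSeconds = reportsShare * totalScrollSeconds
let spendingSeconds = totalScrollSeconds - reportsSeconds

// Forward check: per-tab hitch time should add back up to the day's total.
let totalHitchMilliseconds = reportsRate * reportsSeconds
                           + spendingRate * spendingSeconds
```

Under those assumptions the Reports tab accounts for about a fifth of the scrolling time and nearly all of the hitch time, which is exactly the pattern a time-weighted average hides.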

States follow a transition model, not bracketing. “There’s no start or end pairs - the app reports the condition it is in, at any given time,” and MetricKit tracks how long the app remains in each state.¹
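The transition model is easy to reason about in code: each reported state lasts until the next report arrives. The sketch below is a stand-in for the bookkeeping the session describes, not how MetricKit implements it; `Transition`, `timeInStates`, and the timestamps are hypothetical.

```swift
// One reported transition: "as of this moment, the app is in this state."
struct Transition {
    let time: Double    // seconds since the start of the observation window
    let state: String
}

// Sums the time spent in each state. A state ends when the next transition
// is reported, or at the end of the window for the last one.
func timeInStates(_ transitions: [Transition], windowEnd: Double) -> [String: Double] {
    let ordered = transitions.sorted { $0.time < $1.time }
    var totals: [String: Double] = [:]
    for (index, transition) in ordered.enumerated() {
        let nextTime = index + 1 < ordered.count ? ordered[index + 1].time : windowEnd
        totals[transition.state, default: 0] += max(0, nextTime - transition.time)
    }
    return totals
}

// No start/end pairs: three reports cover the whole 300-second window.
let secondsByState = timeInStates([
    Transition(time: 0, state: "Spending"),
    Transition(time: 120, state: "Reports"),
    Transition(time: 180, state: "Spending")
], windowEnd: 300)
```

Because there is nothing to close, a missed "end" call cannot leave a state dangling; the worst case is a late transition, which only shifts time between two neighboring states.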

Domains, metadata, and encoding by state

Each state is scoped to a domain, which “describes a function or area of an app.” A domain can hold only one active state at a time, and separate domains let multiple states be in flight simultaneously.¹ The session’s example is an A/B experiment: with an experimental change on, expenses are fetched from the database in small batches; off, larger batches. Placing the tab state and the batch-size state in separate domains means “MetricKit will deliver separate metrics for each tab and each batch size.”¹ Per-screen telemetry and experiment readouts from the same pipeline, in the field.
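A small model makes the domain rule concrete: a transition replaces the state in its own domain and leaves other domains alone, so a measurement can be attributed along both axes at once. This is a stand-in under stated assumptions, not the StateReporting API; `DomainTracker`, the reverse-DNS domain strings, and the rule that a measurement counts toward every domain's active state are my reading of the session's description.

```swift
// Stand-in model for illustration only; this is NOT the StateReporting API.
struct DomainTracker {
    // domain -> the single state currently active in that domain
    private(set) var activeStates: [String: String] = [:]
    // "domain=state" -> hitch milliseconds attributed to that state
    private(set) var hitchMilliseconds: [String: Double] = [:]

    mutating func transition(domain: String, to state: String) {
        activeStates[domain] = state    // replaces only this domain's state
    }

    // Attribute a measurement to the active state of every domain.
    mutating func record(hitchMilliseconds amount: Double) {
        for (domain, state) in activeStates {
            hitchMilliseconds["\(domain)=\(state)", default: 0] += amount
        }
    }
}

var tracker = DomainTracker()
tracker.transition(domain: "com.example.expenses.tab", to: "Reports")
tracker.transition(domain: "com.example.expenses.batch", to: "small")
tracker.record(hitchMilliseconds: 40)

// Switching tabs does not disturb the experiment domain.
tracker.transition(domain: "com.example.expenses.tab", to: "Spending")
tracker.record(hitchMilliseconds: 2)
```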

Adoption has three steps in the session: import the StateReporting framework, create a domain (“typically a reverse DNS string”) and register it when you set up your MetricManager instance, then report transitions as the app enters each state, like transitioning to a state identified by the string “Reports”.¹ For finer grain, you define your own struct with the ReportableMetadata macro, create a StateReporter with that metadata type, and report transitions with both the label and your custom type. The session’s ViewConfiguration example carries a listSize value and whether the list is sorted.¹ Again: the session shows this flow on slides without full signatures, so treat the shape as something to confirm in the documentation rather than syntax to copy.

On the receiving side, the report grows a second axis. Before any states are reported, the stateEntries property on your metric report is empty. After adoption, the report carries StateEntry values, each holding “metric values aggregated across the time spent in that individual state.”¹ For the server pipeline, you can group the encoded output by domain: set the encodingFormatKey key on your JSONEncoder’s userInfo property to byStateReportingDomain, and the encoded report presents both state entries and interval entries “grouped by each domain and state that exists in the report.”¹

Best practices, and where to start

The session closes with guidance that reads like hard-won schema-design advice. Domains should be narrowly scoped to one app area. State transitions “should represent stable, meaningful phases, not transient UI events.”¹ Design each state so that when a regression appears, the state alone gives you enough information to target the fix. And resist the urge to instrument everything: “Too many states can result in data that’s too granular and can actually make it harder to interpret the overall picture,” and upper limits on the number of states exist to minimize overhead (the session does not give a figure).¹ Before shipping, validate that reported states match your expectations with the Points of Interest instrument.¹

Cardinality is the same trap from the metadata side. Bucket fast-changing state values into coarse categories (small, medium, large) rather than reporting exact counts. At Apple’s WWDC 2026 performance lab, where the team fielded live questions on MetricKit adoption and the broader power workflow (captured in what Apple’s performance team said in the WWDC26 lab), they noted that recording “1,000 versus 1,001 items” adds cost without insight: the two values land in the same performance regime, so a distinct state for each one buys overhead and nothing else.⁴ Pick the boundaries that change behavior and collapse everything between them.
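Bucketing is a few lines of code once you have picked the boundaries. In this sketch the enum name and the thresholds (100 and 1,000) are hypothetical; the right cut points are wherever your app's behavior actually changes.

```swift
// Coarse categories for a fast-changing value such as a list's item count.
enum ListSizeBucket: String {
    case small, medium, large

    // Thresholds are illustrative assumptions, not values from the session.
    init(itemCount: Int) {
        switch itemCount {
        case ..<100:
            self = .small
        case 100..<1_000:
            self = .medium
        default:
            self = .large
        }
    }
}

// 1,000 and 1,001 items collapse into one state instead of two.
let bucketForThousand = ListSizeBucket(itemCount: 1_000)
let bucketForThousandOne = ListSizeBucket(itemCount: 1_001)
```

Reporting the bucket rather than the raw count keeps the number of distinct states fixed at three no matter how the underlying data grows.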

The collection side is only half the system. The session is direct that “analyzing metrics across all devices is a data science problem”: you stand up a server that ingests reports, aggregate along the dimensions you care about, establish a baseline, and monitor for movement in either direction.¹ The Codable reports and byStateReportingDomain encoding exist to feed exactly that pipeline.
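The baseline-and-monitor step can start very small. The sketch below is one naive way to do it on the server side, assuming a lower-is-better metric such as hitch rate; the function names, the 10% tolerance, and the sample values are all hypothetical, and a real pipeline would want something more robust than a mean.

```swift
enum Movement {
    case improved, regressed, steady
}

func mean(_ values: [Double]) -> Double {
    values.isEmpty ? 0 : values.reduce(0, +) / Double(values.count)
}

// Compares the latest value with the mean of earlier ones. Movement within
// `tolerance` (a fraction of the baseline) in either direction is "steady".
// Assumes lower values are better, as with hitch rate.
func classify(latest: Double, baseline: [Double], tolerance: Double) -> Movement {
    let base = mean(baseline)
    let delta = latest - base
    if delta > base * tolerance { return .regressed }
    if delta < -(base * tolerance) { return .improved }
    return .steady
}

// Made-up daily hitch rates (ms/s) for one state, then three possible new days.
let history = [14.0, 15.0, 16.0]
let afterRegression = classify(latest: 21, baseline: history, tolerance: 0.1)
let afterFix = classify(latest: 9, baseline: history, tolerance: 0.1)
let ordinaryDay = classify(latest: 15.5, baseline: history, tolerance: 0.1)
```

Watching for movement in both directions matters: an improvement you did not expect is as much a signal to investigate as a regression.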

For existing adopters, the closing instruction is explicit: “if you’re using the MXMetricManager API, migrate over to the new MetricManager API to take advantage of all these new capabilities.”¹ Apple’s documentation now formalizes that migration: MXMetricManager is marked deprecated as of 27.0, with the guidance “Use MetricManager instead.”² The same staged 27-boundary enforcement landed elsewhere this cycle, including Image Playground’s ImageCreator deprecation in iOS 27, where a deprecation warning gives way to a hard break at the public release. The new APIs are where every advance in the session lives, and the session presents them as “the future of the framework.”¹

FAQ

What actually changed in MetricKit in iOS 27?

The framework was rebuilt with a modern Swift-first API. The entry point is the new MetricManager class; metric and diagnostic reports arrive as awaitable async streams (metricReports, diagnosticReports); reports are Codable for direct JSON encoding; and the structure is navigable in code via intervalEntries and metric groups. iOS 27 also adds a Metal frame rate metric, memory exception diagnostics, a crash category that links crash diagnostics to metric accounting, and the StateReporting framework for per-state metrics.¹

How does StateReporting decide which metrics belong to which state?

Your app reports transitions: the state it is moving to, within a domain you define. MetricKit tracks how long the app remains in each state and aggregates metric values across the time spent there. There are no start/end pairs; the app simply reports the condition it is in at any given time. Each state then gets its own StateEntry in the metric report.¹

Can I track more than one dimension at once, like screen and experiment arm?

Yes. Each domain can hold one active state at a time, but separate domains run concurrently. The session’s expense app puts the active tab in one domain and a database batch-size experiment in another, and MetricKit delivers separate metrics for each tab and each batch size.¹

Should I report every UI event as a state?

No. The session recommends states that represent stable, meaningful phases rather than transient UI events, domains scoped narrowly to one app area, and restraint overall: too many states make the data harder to interpret, and the system imposes upper limits on the number of states to minimize overhead. Validate your states with the Points of Interest instrument before shipping.¹

Do I have to move off MXMetricManager?

The session’s guidance is to migrate from MXMetricManager to the new MetricManager API, because every new capability covered (async streams, Codable reports, state-aware metrics, the new metric and diagnostic types) is exclusive to the new API set.¹

MetricKit is the field half of a two-part story this year: Instruments shows you the hitch in the lab, and state-aware MetricKit tells you which screens hitch for real users. The lab side is covered in Instruments 27 and app responsiveness. The rendering work that actually fixes a 71 ms/s tab lives in SwiftUI performance and interop in iOS 27. And the reason blended averages mislead in the first place is the subject of the performance blind spot. The full series hub is the Apple Ecosystem Series.

References

1. Apple, WWDC 2026 session 222, Meet the new MetricKit. Source for the iOS 27 ground-up rebuild and Swift-first API framing, the MetricManager entry point and the metricReports / diagnosticReports async streams, Codable reports and JSONEncoder usage, intervalEntries and metric groups (.cpu, .memory, .display, .gpu, peakMemory), the Metal frame rate metric, memory exception diagnostics, the crash termination category, the submitReport() backtrace walkthrough, the StateReporting framework (domains, transition model, StateReporter, ReportableMetadata, stateEntries, byStateReportingDomain via encodingFormatKey), the expense-app demo numbers (15 ms/s blended; 1 ms/s Spending tab versus 71 ms/s Reports tab), the state best practices and Points of Interest validation, and the guidance to migrate from MXMetricManager to MetricManager.
2. Apple Developer Documentation: MXMetricManager. Marked deprecated as of 27.0, with the guidance “Use MetricManager instead.”
3. Apple Developer Documentation: Reducing your app’s launch time. Launch time is measured from the moment the user taps the app icon to the first frame drawn, before the app’s process exists.
4. Apple, WWDC 2026 performance group lab, session 8003. Paraphrased from a locally transcribed recording; no official transcript is published. The team advised bucketing fast-changing state metadata into coarse categories, noting that recording “1,000 versus 1,001 items” adds cost without insight.