Foundation Models Custom Adapters: When To Train One
Apple’s adapter docs define the path narrowly: use the base system model for most prompt engineering, guided generation, and tools; train a custom Adapter only when the task needs model-weight specialization and the team is comfortable with Python model training.14
Apple’s own documentation on the same type states, verbatim: “Adapters consume a large amount of storage space and isn’t recommended for most apps.”1 The earlier post in this cluster, Foundation Models Use Cases, covered the rails most apps should ride. This post is the third rail: the operational lifecycle of training, packaging, and shipping a custom adapter, and Apple’s explicit guidance about when not to take this path.
TL;DR
- Apple’s toolkit identifies the training method as LoRA: base weights stay frozen while adapter weights are trained.2
- Each adapter is bound to a single system model version. When Apple updates the base model, you retrain the adapter.3
- Apple’s recommendation, in their words: “Use the base system model for most prompt engineering, guided generation, and tools. If you need to specialize the model, train a custom Adapter… Use custom adapters only if you’re comfortable training foundation models in Python.”1
- Adapter files are large, should not ship in the main app bundle, and are delivered on demand through hosted asset packs and Background Assets.23
- Apple says to consider adapters only after prompt engineering or tool calling fails the task, or when an existing fine-tuned server LLM, subject-matter expertise, style/format/policy adherence, or latency target justifies the cost.2
What An Adapter Actually Is
SystemLanguageModel.Adapter is a struct, available on iOS, iPadOS, Mac Catalyst, macOS, and visionOS, all 26.0+.4 Apple’s description of the type:
“Use the base system model for most prompt engineering, guided generation, and tools. If you need to specialize the model, train a custom Adapter to alter the system model weights and optimize it for your custom task. Use custom adapters only if you’re comfortable training foundation models in Python.”4
The mechanism is documented. Apple’s adapter training guide states it directly:2
“The system model uses a parameter-efficient fine-tuning (PEFT) approach known as LoRA (Low-Rank Adaptation). In LoRA, the original model weights are frozen, and small trainable weight matrices called ‘adapters’ are embedded through the model’s network. During training, only adapter weights are updated, significantly reducing the number of parameters to train.”
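For intuition, here is the formulation from the LoRA paper itself (reference 5), not from Apple’s docs: the frozen weight matrix gets a trainable low-rank correction, so the trainable parameter count scales with the rank rather than with the full matrix size.
```latex
% LoRA forward pass (Hu et al., arXiv:2106.09685): W_0 is frozen;
% only the low-rank factors B and A are trained.
h = W_0 x + \Delta W x = W_0 x + B A x,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
```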
LoRA is a public technique with a published paper history.5 Apple’s documented surface is the toolkit, .fmadapter export, asset-pack bundling, Background Assets delivery, and runtime loading through SystemLanguageModel.Adapter.234
The Type’s API Surface
Apple lists this Adapter surface:4
- init(fileURL: URL) throws: “Creates an adapter from the file URL.”
- init(name: String) throws: “Creates an adapter downloaded from the background assets framework.”
- func compile() async throws: “Prepares an adapter before being used with a LanguageModelSession. You should call this if your adapter has a draft model.”
- var creatorDefinedMetadata: [String : Any]: “Values read from the creator defined field of the adapter’s metadata.”
- static func removeObsoleteAdapters() throws: “Remove all obsolete adapters that are no longer compatible with current system models.”
- static func compatibleAdapterIdentifiers(name: String) -> [String]: “Get all compatible adapter identifiers compatible with current system models.”
- enum AssetError: the error type for asset-related failures.
SystemLanguageModel has a paired initializer for adapters: convenience init(adapter: SystemLanguageModel.Adapter, guardrails: SystemLanguageModel.Guardrails), with the description “Creates the base version of the model with an adapter.”6
The deployment-only entitlement key is com.apple.developer.foundation-model-adapter: “A Boolean value that indicates whether the app can enable custom adapters for the Foundation Models framework.”6 You don’t need it for training or for local Xcode testing; you do need it before shipping to the App Store.3
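For reference, the key in an .entitlements file is a plain Boolean; a minimal sketch (the surrounding plist scaffolding is Xcode’s standard output):
```xml
<!-- App.entitlements fragment: the deployment gate for custom adapters -->
<key>com.apple.developer.foundation-model-adapter</key>
<true/>
```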
Apple’s “When To Consider An Adapter” Rubric
The toolkit page lays out concrete adoption signals:2
- “You have a dataset suitable for use with an LLM” or you already use a fine-tuned server-based LLM and want on-device parity.
- “You need the model to become a subject-matter expert.”
- “You need the model to adhere to a specific style, format, or policy.”
- “Prompt engineering isn’t achieving the required accuracy or consistency for your task.”
- “You want lower latency at inference. If your prompt-engineered solutions require lengthy prompts with examples for every call, an adapter specialized for that task offers minimal prompting.”
The same guide also lists the costs you take on:2
- A dataset of prompt and response pairs that demonstrate your target skill.
- A process for evaluating the quality of your adapters.
- A process to load your adapters into your app from a server.
And the storage tax: “Each adapter will take approximately 160 MB of storage space in your app. Like other big assets, adapters shouldn’t be part of your app’s main bundle because with multiple adapter versions your app will become too big for people to install.”2
The framework’s recommendation, restated by Apple in two separate places, is to default to prompt engineering plus tool calling and reach for adapters only when the rubric above clears.
Training, In Apple’s Shape
Apple says the toolkit contains Python sample code, model assets for a specific system model version, .fmadapter export utilities, and asset-pack bundling utilities.2
Dataset requirements: JSONL prompt/response pairs, roughly 100-1,000 samples for basic tasks and 5,000+ for complex tasks. The toolkit’s Schema.md covers guided generation and AI-safety fields.2
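For shape only, a hypothetical JSONL line with flat prompt/response fields; the authoritative field names, plus the guided-generation and AI-safety fields, live in the toolkit’s Schema.md, so treat this as illustration, not schema:
```json
{"prompt": "Rewrite as a calendar event title: pick up dry cleaning before 6pm", "response": "Pick Up Dry Cleaning (by 6 PM)"}
```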
Hardware requirements: “Mac with Apple silicon and at least 32GB memory, or Linux GPU machines.” Python 3.11 or later.2
Apple’s data-quality rule is simple: quality beats quantity.2
Training is invoked from the toolkit’s train_adapter entry point:
```bash
python -m examples.train_adapter \
  --train-data /path/to/train.jsonl \
  --eval-data /path/to/valid.jsonl \
  --epochs 5 \
  --learning-rate 1e-3 \
  --batch-size 4 \
  --checkpoint-dir /path/to/my_checkpoints/
```
Optionally, after training the adapter, you can train a matching draft model.2 A draft model is a smaller version of the system base model that enables speculative decoding, a published inference-acceleration technique.7 Apple’s framing: “If you choose not to train the draft model, speculative decoding will not be available for your adapter’s use case.”2
The Single-Version Constraint
The most operationally consequential fact about adapters is the binding to a specific system model version:3
“Each adapter is compatible with a single specific system model version. You must train a new adapter for every new base model version. A runtime error occurs if your app runs on a person’s device without a compatible adapter.”
As of May 4, 2026, Apple’s toolkit table lists Beta 0.1.0 and Beta 0.2.0 as removed, and 26.0.0 as the first full toolkit version. Apple’s cadence rule is one toolkit per system model update.2 Full statement: “A new toolkit will be released for every system model update. The system model is shared across iOS, macOS, and visionOS, and system model updates will occur as part of those platforms’ OS updates (though not every OS update will have a model update).”2
The operational implication: app teams shipping adapters subscribe to a model-update lifecycle that runs on Apple’s cadence. Each base model bump is a forcing function for retraining, re-evaluation, and republishing.
Packaging As Asset Packs
Apple’s rule: adapter files are too large for the app bundle; host them through App Store Connect or your server, then download the device-compatible adapter on demand.3
The toolkit produces .fmadapter packages and also bundles them as Background Assets asset packs. The ba-package command-line tool from Xcode 16 or later does the bundling work; the toolkit calls into it.3
Hosting choices:3
- Apple-Hosted, Managed. Apple hosts the asset; the OS manages the download lifecycle.
- Self-Hosted, Managed. You host on your server; the OS manages the download lifecycle.
- Self-Hosted, Unmanaged. You host and manage the lifecycle yourself.
The required Info.plist keys differ by hosting choice:3 Apple-Hosted Managed needs BAHasManagedAssetPacks, BAAppGroupID, and BAUsesAppleHosting; Self-Hosted Managed needs the first two; Self-Hosted Unmanaged needs none. Each path also has an asset-downloader extension target that Xcode generates.
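A hedged sketch of the Apple-Hosted, Managed fragment; the app group ID is a hypothetical placeholder:
```xml
<!-- Info.plist fragment for the Apple-Hosted, Managed path -->
<key>BAHasManagedAssetPacks</key>
<true/>
<key>BAAppGroupID</key>
<string>group.com.example.myapp</string> <!-- hypothetical app group -->
<key>BAUsesAppleHosting</key>
<true/>
```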
Choosing The Right Adapter At Runtime
When the asset-downloader extension’s BackgroundDownloadHandler.swift is generated, Xcode wires up a shouldDownload(_:) callback. Apple’s example body for adapter assets:3
```swift
func shouldDownload(_ assetPack: AssetPack) -> Bool {
    if assetPack.id.hasPrefix("mygameshader") {
        return true
    }
    return SystemLanguageModel.Adapter.isCompatible(assetPack)
}
```
Apple’s sample is the only documented runtime check. The expression SystemLanguageModel.Adapter.isCompatible(assetPack) is what the sample returns for adapter asset packs; treat the call as opaque beyond what the sample shows.3
Loading And Tracking Downloads
Once the asset is on the device, the load path is:3
```swift
// removeObsoleteAdapters() is declared `throws`, so the call needs `try`.
try SystemLanguageModel.Adapter.removeObsoleteAdapters()
let adapter = try SystemLanguageModel.Adapter(name: "myAdapter")
```
Construction kicks off a download if the device doesn’t have a compatible adapter cached. Apple’s note on UX: “Because adapters can have a large data size they can take some time to download, especially if a person is on Wi-Fi or a cell network. If a person doesn’t have a network connection, they aren’t able to use your adapter right away.”3
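Given that network dependency, a defensive load path makes sense. A minimal sketch, assuming an asset pack named "myAdapter" and falling back to the base model on any failure (the fallback policy is this post’s suggestion, not Apple’s):
```swift
import FoundationModels

// Load the named adapter if a compatible one is (or can be) on device;
// otherwise fall back to the unadapted base model.
func makeSession() async -> LanguageModelSession {
    do {
        try SystemLanguageModel.Adapter.removeObsoleteAdapters()
        let adapter = try SystemLanguageModel.Adapter(name: "myAdapter")
        try await adapter.compile() // fast no-op if a compiled draft model is cached
        let adaptedModel = SystemLanguageModel(adapter: adapter)
        return LanguageModelSession(model: adaptedModel)
    } catch {
        return LanguageModelSession(model: SystemLanguageModel.default)
    }
}
```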
The status sequence comes from AssetPackManager:3
```swift
// `name` is the adapter asset pack name passed to Adapter(name:).
let assetPackIDList = SystemLanguageModel.Adapter.compatibleAdapterIdentifiers(name: name)
if let assetPackID = assetPackIDList.first {
    let statusUpdates = AssetPackManager.shared.statusUpdates(forAssetPackWithID: assetPackID)
    for await status in statusUpdates {
        switch status {
        case .began(let assetPack): break                       // download started
        case .paused(let assetPack): break                      // download paused
        case .downloading(let assetPack, let progress): break   // update progress UI
        case .finished(let assetPack): break                    // adapter ready to load
        case .failed(let assetPack, let error): break           // surface the error
        @unknown default: break
        }
    }
}
```
The five documented DownloadStatusUpdate cases: .began, .paused, .downloading, .finished, .failed.3 The framework’s @unknown default branch is mandatory because Apple may add cases in future SDK versions.
After the status reaches .finished, the adapter is ready to plug into a session:
```swift
let adaptedModel = SystemLanguageModel(adapter: adapter)
let session = LanguageModelSession(model: adaptedModel)
```
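From there the session behaves like any other Foundation Models session. A one-line usage sketch, with a hypothetical prompt:
```swift
let response = try await session.respond(to: "Tag this note: pick up dry cleaning")
print(response.content)
```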
The Draft Model And Its Rate Limit
If your adapter shipped with a draft model, calling adapter.compile() prepares it for use. Apple’s documentation calls this out as a separate, computationally expensive step:3
“The first time a device downloads a new version of your adapter, a call to compile() fully compiles your draft model and saves it to the device. During subsequent launches of your app, a call to compile() checks for a saved compiled draft model and returns it immediately if it exists.”
There is a published rate limit:3
“Rate limiting protects device resources that are shared between all apps and processes. If the framework determines that a new compilation is necessary, it rate-limits the compilation process on all platforms, excluding macOS, to three draft model compilations per-app, per-day.”
The rate limit excludes macOS; on other platforms, new draft-model compilation is limited to three compilations per app per day.3 Apple recommends running compilation inside a Background Tasks-scheduled task so the work doesn’t block app launch.3
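A hedged sketch of that recommendation using BGTaskScheduler; the task identifier and adapter name are hypothetical, and the identifier must also appear under BGTaskSchedulerPermittedIdentifiers in Info.plist:
```swift
import BackgroundTasks
import FoundationModels

let compileTaskID = "com.example.myapp.compile-adapter" // hypothetical identifier

// Register once, early in the app lifecycle, before scheduling.
func registerAdapterCompileTask() {
    BGTaskScheduler.shared.register(forTaskWithIdentifier: compileTaskID, using: nil) { task in
        Task {
            do {
                let adapter = try SystemLanguageModel.Adapter(name: "myAdapter")
                try await adapter.compile() // full compile only for a new adapter version
                task.setTaskCompleted(success: true)
            } catch {
                task.setTaskCompleted(success: false)
            }
        }
    }
}

func scheduleAdapterCompileTask() {
    let request = BGProcessingTaskRequest(identifier: compileTaskID)
    request.requiresNetworkConnectivity = true // the adapter may still need to download
    try? BGTaskScheduler.shared.submit(request)
}
```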
Apple’s Xcode testing warning: launching through Xcode changes the app UUID, so full compilation runs every launch and can trip the rate limit.3
Testing And The Simulator Constraint
Adapter testing requires a physical device. Apple is explicit: “Testing adapters requires a physical device and isn’t supported on Simulator.”3
For local testing in Xcode, you initialize from a file URL rather than a name:3
```swift
let localURL = URL(filePath: "absolute/path/to/my_adapter.fmadapter")
let adapter = try SystemLanguageModel.Adapter(fileURL: localURL)
```
For publishing, Apple says to import adapter files only for local testing, then remove them before release and download adapters on demand.3
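One mechanical way to honor that rule is conditional compilation; a hedged sketch with a hypothetical path and adapter name:
```swift
#if DEBUG
// Local testing only: load the .fmadapter directly from disk.
let adapter = try SystemLanguageModel.Adapter(
    fileURL: URL(filePath: "/absolute/path/to/my_adapter.fmadapter"))
#else
// Release: load by name so Background Assets supplies the compatible adapter.
let adapter = try SystemLanguageModel.Adapter(name: "myAdapter")
#endif
```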
What This Path Costs You In Operational Terms
Putting the lifecycle together, an app shipping a custom adapter signs up for:
- Python training infrastructure. A Mac with Apple silicon and 32 GB of memory at minimum, or a Linux GPU machine.2
- A retraining cadence on Apple’s clock. Every system model update means a fresh adapter and a fresh toolkit version.3
- A serving stack. Either Apple-hosted asset packs through App Store Connect, or your own server running an asset-downloader integration.3
- Per-version adapters in storage. Multiple base model versions in the wild means multiple adapters hosted, with the device pulling the matching one.3
- An entitlement gate. The Account Holder of your Apple Developer Program membership requests it; you cannot ship without approval.2
- The 160 MB tax per adapter version. Not in the app bundle, but on the user’s device after download.2
- Physical-device testing. Simulator does not run adapters.3
That’s the operational cost. The benefit, when the adoption signals clear, is a specialized on-device model for the task, with minimal prompting and lower latency. For apps that already pay for fine-tuned server-side inference and want on-device parity, that’s the trade.
Takeaways
- Adapters are LoRA, by Apple’s documented technique. Frozen base weights, small trainable matrices through the model’s network, only the adapter weights update during training.2
- One adapter per system model version, no exceptions. Plan for retraining tied to OS releases.3
- The .fmadapter flow is end-to-end. Train in Python, package with the toolkit, host as a Background Assets asset pack, load by name in your app, compile the draft model in a background task.
- Apple itself recommends most apps don’t take this path. Two separate Apple documentation pages discourage it for most use cases.1 Read the rubric on the toolkit page; the answer is “no” by default.
- Test on hardware. Simulator does not run adapters. Build the test plan around physical-device sessions and the Background Tasks framework’s compilation slot.
The full Apple Ecosystem cluster: the decision rubric for built-in specialization; the Tool protocol at the core of the framework; the agentic workflow split between in-app and tooling LLMs; App Intents vs MCP for the broader routing question. The hub is at the Apple Ecosystem Series. For broader iOS-with-AI-agents context, see the iOS Agent Development guide.
FAQ
How do I know whether my app actually needs a custom adapter?
Apple recommends the base model for most prompt engineering, guided generation, and tools, and the toolkit guide says adapters have steep training and retraining requirements. The default answer is no unless Apple’s adapter signals apply.24
What does the entitlement actually gate?
Apple documents the entitlement as required when deploying adapters, not for training or local testing.23
How big are adapters and where do they live?
Apple’s documented number: “Each adapter will take approximately 160 MB of storage space in your app.”2 Adapters do not live in the app bundle. Apple routes them through Background Assets, hosted either on Apple servers (managed) or your own server (managed or unmanaged), and the device downloads the version matching its current system model.3
What happens when Apple updates the base model?
When Apple updates the base model, train a compatible adapter for that model version; otherwise the app can hit a runtime error on a device without a compatible adapter.23
What’s the relationship between adapters and .contentTagging?
.contentTagging is Apple-managed and built into the framework: an internally specialized use case, not the formal Adapter type a developer trains. The formal type is what this post covers; the use cases are covered in the companion post.
Do I have to train a draft model?
Apple documents only this consequence of skipping draft-model training: speculative decoding is unavailable for that adapter use case.2
References
1. Apple Developer, “SystemLanguageModel.Adapter”. Type description, recommendation against use for most apps, “Use custom adapters only if you’re comfortable training foundation models in Python.” Retrieved 2026-05-04.
2. Apple Developer, “Get started with Foundation Models adapter training”. Toolkit overview, LoRA mechanism, hardware requirements, dataset shape, training CLI, draft model option, toolkit version table, entitlement request flow. Retrieved 2026-05-04.
3. Apple Developer, “Loading and using a custom adapter with Foundation Models”. Asset pack hosting, Info.plist keys, shouldDownload(_:) example, status sequence, rate limiting, Simulator exclusion, single-version compatibility constraint. Retrieved 2026-05-04.
4. Apple Developer, “SystemLanguageModel.Adapter”. Struct API surface: init(fileURL:), init(name:), compile(), creatorDefinedMetadata, removeObsoleteAdapters(), compatibleAdapterIdentifiers(name:), AssetError. Retrieved 2026-05-04.
5. Hu et al., “LoRA: Low-Rank Adaptation of Large Language Models”, arXiv:2106.09685. The original LoRA paper describing the technique Apple’s toolkit uses.
6. Apple Developer, “SystemLanguageModel”. The init(adapter:guardrails:) convenience initializer and the com.apple.developer.foundation-model-adapter entitlement description. Retrieved 2026-05-04.
7. Leviathan et al., “Fast Inference from Transformers via Speculative Decoding”, arXiv:2211.17192, and Chen et al., “Accelerating Large Language Model Decoding with Speculative Sampling”, arXiv:2302.01318. The speculative decoding technique behind Apple’s draft model, cited directly by the adapter training guide.