Proofed
swift · apple-ai · build-in-public

How Apple Foundation Models actually work for grammar checking

Lucas Merritt · 3 min read

When I decided to build an offline grammar checker, the obvious question was: can Apple's on-device AI actually do this well enough?

I spent a few days reading the Foundation Models documentation and running experiments. Here's what I found — including where it works better than expected and where the limits are.


What the Foundation Models framework is

Apple ships a small language model on iPhone as part of Apple Intelligence. Starting with iOS 26, developers can access it directly through the FoundationModels framework.

The model lives on-device. No network call. No user data goes anywhere. From a privacy angle, it's exactly what I wanted.

The API is straightforward:

import FoundationModels

let session = LanguageModelSession()
let response = try await session.respond(to: "Fix the grammar: \(userText)")
let corrected = response.content // the model's reply, as a String

The harder part is getting structured, reliable output.


Structured output with @Generable

The framework supports a @Generable macro that lets you define Swift types the model will produce:

@Generable
struct GrammarReport {
    var errors: [GrammarError]
    var overallScore: Int
}

@Generable
struct GrammarError {
    var original: String
    var correction: String
    var explanation: String
    var errorType: String // "grammar", "spelling", "punctuation", "style"
}
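A free-form String for errorType invites drift: the model can invent categories the UI doesn't know about. The framework's @Guide macro can steer generation; a sketch, assuming the four categories above (the field descriptions are my own wording):

```swift
import FoundationModels

@Generable
struct GrammarError {
    @Guide(description: "The exact text span that contains the error")
    var original: String
    @Guide(description: "The corrected text")
    var correction: String
    @Guide(description: "A one-sentence explanation of the rule broken")
    var explanation: String
    @Guide(description: "The category of error",
           .anyOf(["grammar", "spelling", "punctuation", "style"]))
    var errorType: String
}
```

With the .anyOf guide, the model is constrained to one of the listed strings rather than asked nicely via a comment.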

You pass this to the session and it returns a populated struct:

let response = try await session.respond(
    to: prompt,
    generating: GrammarReport.self
)
let report = response.content // a fully populated GrammarReport

This is genuinely useful. Instead of parsing free-form text, you get typed data you can work with directly.
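As a concrete example of what typed data buys you, applying the corrections is a plain walk over the array. A self-contained sketch (plain structs here, mirroring the @Generable ones, so it stands alone; applyCorrections is my own helper name):

```swift
struct GrammarError {
    var original: String
    var correction: String
    var explanation: String
    var errorType: String
}

// Replace the first occurrence of each flagged span with its correction.
// First match only, so one fix doesn't clobber the same wording used
// correctly elsewhere in the text.
func applyCorrections(_ text: String, errors: [GrammarError]) -> String {
    var result = text
    for error in errors {
        if let range = result.range(of: error.original) {
            result.replaceSubrange(range, with: error.correction)
        }
    }
    return result
}
```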


What it handles well

After testing with real writing samples, the model is solid on:

  • Subject-verb agreement — catches most of these reliably
  • Article errors (a/an, the/missing) — consistently good
  • Comma splices — finds them well
  • Sentence fragments — reliable
  • Word choice (affect/effect, their/there/they're) — good

For typical writing — emails, messages, reports — it catches the things that matter.


Where it's limited

Token limit: The model handles roughly 800 words reliably. Beyond that, quality degrades. I cap the input at 800 words and show a warning at 700. Not ideal, but manageable for most use cases.
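The cap itself is simple to enforce. A sketch of my approach, assuming whitespace-delimited word counting (the 700/800 thresholds are the ones above; capInput is a hypothetical helper):

```swift
enum InputStatus {
    case ok
    case nearLimit(words: Int)   // show a warning at 700 words
    case truncated(words: Int)   // hard cap at 800 words
}

// Count words, warn near the limit, and truncate past it.
// Note: joining with a single space discards original whitespace,
// which is acceptable since the truncated tail isn't checked anyway.
func capInput(_ text: String, warnAt: Int = 700, capAt: Int = 800)
    -> (text: String, status: InputStatus) {
    let words = text.split(whereSeparator: \.isWhitespace)
    if words.count > capAt {
        return (words.prefix(capAt).joined(separator: " "),
                .truncated(words: words.count))
    }
    if words.count > warnAt {
        return (text, .nearLimit(words: words.count))
    }
    return (text, .ok)
}
```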

Complex style rules: Things like passive voice, nominalizations, sentence variety — the model catches some of these but not all. It's not trying to be a full style editor.

Speed: First inference takes 1-2 seconds while the model loads. After that, most checks complete in under a second. I prewarm the session on app launch to hide the startup latency.
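Prewarming is a one-liner on the session. A sketch, assuming a SwiftUI entry point — prewarm() is the framework's own call; the app structure and ContentView around it are illustrative:

```swift
import FoundationModels
import SwiftUI

@main
struct ProofedApp: App {
    // One session reused across checks. Creating the session is cheap;
    // loading the model weights on first inference is the slow part.
    @State private var session = LanguageModelSession()

    var body: some Scene {
        WindowGroup {
            ContentView()
                .task {
                    // Start loading the model before the first real request,
                    // hiding the 1-2 second startup cost behind app launch.
                    session.prewarm()
                }
        }
    }
}
```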


Spelling: UITextChecker as fallback

For spelling specifically, I'm using UITextChecker alongside Foundation Models. It's the same spell-checker that iOS uses system-wide — reliable, fast, offline.

The grammar model catches context-aware errors ("their" vs "there") while UITextChecker catches outright misspellings. I merge the results and deduplicate.
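A sketch of the spelling pass using UITextChecker's rangeOfMisspelledWord API; using the misspelled substring itself as the dedup key is my assumption about the merge, and misspelledWords is a hypothetical helper:

```swift
import UIKit

// Walk the text collecting every word the system spell-checker flags.
func misspelledWords(in text: String, language: String = "en_US") -> Set<String> {
    let checker = UITextChecker()
    let nsText = text as NSString
    var found = Set<String>()
    var searchStart = 0
    while searchStart < nsText.length {
        let range = checker.rangeOfMisspelledWord(
            in: text,
            range: NSRange(location: searchStart,
                           length: nsText.length - searchStart),
            startingAt: searchStart,
            wrap: false,
            language: language
        )
        guard range.location != NSNotFound else { break }
        found.insert(nsText.substring(with: range))
        searchStart = range.location + range.length
    }
    return found
}

// Merge sketch: drop the grammar model's "spelling" findings for tokens
// UITextChecker already flagged, so each misspelling surfaces once.
```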


One thing that surprised me

The explanations are actually useful. When the model identifies an error, it explains why — not just "grammar error" but "this sentence has two independent clauses joined by a comma without a coordinating conjunction." That's more useful than most grammar checkers I've tried.

I surface these explanations in the error popover so you can learn from corrections, not just accept them blindly.


Is it good enough?

For my target user — someone who wants a quick offline grammar check on a message or short document — yes. It's not going to replace a professional editor or catch every subtle style issue. But it catches the errors that matter, it's fast enough to feel responsive, and it does it all without your text leaving your phone.

That's the product. More in the next update.


Follow along on X @lucas_merr for build updates.

Building Proofed — an offline AI grammar checker for iPhone.