KotenOCR is an iOS app that performs OCR on classical Japanese texts using ONNX Runtime. It ships with 6 ONNX models totaling approximately 230MB on disk. After reaching 300 downloads, the crash rate was found to be 6.7% (20 crashes). No crash logs appeared in Xcode Organizer, so a different investigation approach was required.

Investigation Approach

Four parallel investigation tracks were pursued:

  • Memory and model size analysis
  • Image processing pipeline review
  • ONNX Runtime thread safety audit
  • Camera and UI lifecycle inspection

Root Causes

The investigation identified four issues, listed here in order of estimated severity.

1. All Models Loaded at Startup

All 6 ONNX models were loaded into memory at app launch. While the models total approximately 230MB on disk, ONNX Runtime’s in-memory representation expands to roughly 1.5-2.5x the on-disk size. This means the models were estimated to consume 350-550MB of RAM.
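The 350-550MB figure follows from simple arithmetic. A minimal sketch (the 1.5-2.5x expansion range is the estimate above, not a measured constant):

```swift
// Rough in-memory footprint estimate from on-disk model size,
// using the 1.5-2.5x expansion range cited above.
func estimatedFootprintMB(onDiskMB: Double) -> ClosedRange<Double> {
    (onDiskMB * 1.5)...(onDiskMB * 2.5)
}

let footprint = estimatedFootprintMB(onDiskMB: 230)
// 345.0...575.0, i.e. roughly the 350-550MB range stated above
```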

iOS enforces per-app memory limits through jetsam, a kernel-level memory watchdog that terminates processes exceeding their allocation. The relevant device RAM figures are:

Device              RAM    Risk Level
iPhone 8            2GB    CRITICAL — lowest RAM among iOS 16 supported devices
iPhone X / XR / XS  3GB    HIGH
iPhone 11           4GB    MEDIUM
iPhone 12+          4-6GB  LOW

On iPhone 8 and iPhone X, the per-app memory limit is estimated at 200-300MB. Loading all models at once exceeded this limit on its own.

2. Full-Resolution Image Preprocessing in DEIMDetector

DEIMDetector was creating a CGContext at the full input image resolution for square padding. For a 12MP photo (4032x3024 pixels), this produced a 4032x4032 pixel context consuming approximately 63MB of memory.
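The figure can be reproduced with back-of-the-envelope math: a standard 32-bit RGBA context costs 4 bytes per pixel, so a 4032x4032 context is 65,028,096 bytes, which is roughly 62-65MB depending on whether MB means 2^20 or 10^6 bytes, consistent with the ~63MB above. A sketch:

```swift
// Bytes consumed by a 32-bit (4 bytes per pixel) bitmap context
func contextBytes(width: Int, height: Int, bytesPerPixel: Int = 4) -> Int {
    width * height * bytesPerPixel
}

let fullRes = contextBytes(width: 4032, height: 4032)  // 65,028,096 bytes
```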

3. Unlimited Parallel Recognition Tasks

The recognition pipeline used withThrowingTaskGroup with up to 100 simultaneous tasks. Each task allocates memory during model inference, so memory usage scaled linearly with the number of concurrent tasks.

4. No Memory Warning Handling

The app did not observe UIApplication.didReceiveMemoryWarningNotification. This meant the last opportunity to release memory before jetsam termination was being missed entirely.

Fixes

Lazy Model Loading

Changed from loading all models at startup to loading only the models required by the currently selected mode. This reduced initial memory usage from approximately 350MB to approximately 150MB.

func loadModelsForMode(_ mode: OCRMode) {
    // Load only models required by the selected mode,
    // reusing any sessions that are already cached
    let requiredModels = mode.requiredModelKeys
    for key in requiredModels {
        if sessionCache[key] == nil {
            // try? keeps startup resilient; a failed load leaves the
            // cache entry nil so the session is retried on next use
            sessionCache[key] = try? createSession(for: key)
        }
    }
}

Image Preprocessing Optimization

Modified DEIMDetector preprocessing to resize the image before padding. Memory usage dropped from 63MB to approximately 2.4MB.

// Before: full-resolution square padding
// let paddedSize = max(image.width, image.height) // 4032

// After: resize first, then pad
let resized = resize(image, to: modelInputSize) // e.g. 640x640
let padded = pad(resized, to: targetSize)
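The resize step must preserve the aspect ratio before padding fills the remainder. A sketch of the fitting math (`fittedSize` is a hypothetical helper; `resize` and `pad` above are assumed to use something like it):

```swift
// Aspect-preserving size that fits inside a square model input.
func fittedSize(width: Int, height: Int, target: Int) -> (width: Int, height: Int) {
    let scale = Double(target) / Double(max(width, height))
    return (Int((Double(width) * scale).rounded()),
            Int((Double(height) * scale).rounded()))
}

// A 4032x3024 photo fitted into a 640x640 input becomes 640x480;
// padding then only has to fill the remaining 640x160 band.
```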

Bounded Parallel Processing

Limited withThrowingTaskGroup concurrency to a maximum of 4 simultaneous tasks.

try await withThrowingTaskGroup(of: RecognitionResult.self) { group in
    let maxConcurrent = 4
    for (index, region) in regions.enumerated() {
        if index >= maxConcurrent {
            // Wait for one in-flight task to finish before adding the
            // next, keeping at most maxConcurrent tasks alive
            _ = try await group.next()
        }
        group.addTask {
            try await self.recognizeText(in: region)
        }
    }
    // Drain the remaining in-flight tasks
    for try await _ in group { }
}

Memory Warning Handler

Added an observer for didReceiveMemoryWarningNotification that releases models belonging to inactive modes.

// Retain the returned token so the observer can be removed later
memoryWarningObserver = NotificationCenter.default.addObserver(
    forName: UIApplication.didReceiveMemoryWarningNotification,
    object: nil,
    queue: .main
) { [weak self] _ in
    self?.releaseInactiveModels()
}
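The body of `releaseInactiveModels()` is not shown above; one plausible implementation (sketched here with a minimal stand-in cache, since the real `sessionCache` and mode types are app-specific) evicts every cached session the active mode does not need:

```swift
// Hypothetical sketch: evict cached sessions not required by the
// active mode, freeing the memory held by their models.
final class ModelCache {
    var sessionCache: [String: String] = [:]  // key -> loaded session (placeholder type)
    var activeModeKeys: Set<String> = []

    func releaseInactiveModels() {
        for key in sessionCache.keys where !activeModeKeys.contains(key) {
            sessionCache[key] = nil  // dropping the reference releases the session
        }
    }
}
```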

Crash Monitoring with MetricKit

To address the absence of crash logs in Xcode Organizer, MetricKit was adopted as the crash monitoring solution. As an Apple-native framework, it reports jetsam terminations and memory usage metrics directly to the app. It was chosen over Firebase Crashlytics to avoid adding an external dependency.

import MetricKit

class MetricsManager: NSObject, MXMetricManagerSubscriber {
    override init() {
        super.init()
        // Subscribers must register to receive payloads
        MXMetricManager.shared.add(self)
    }

    deinit {
        MXMetricManager.shared.remove(self)
    }

    func didReceive(_ payloads: [MXMetricPayload]) {
        for payload in payloads {
            if let memoryMetrics = payload.memoryMetrics {
                log(memoryMetrics.peakMemoryUsage)
            }
        }
    }

    func didReceive(_ payloads: [MXDiagnosticPayload]) {
        for payload in payloads {
            if let crashDiagnostics = payload.crashDiagnostics {
                // Record crash diagnostics
            }
        }
    }
}

Takeaways

  • Testing on the oldest supported device (iPhone 8) is essential. Several of these issues never reproduced on the simulator or on newer hardware.
  • ONNX model memory footprint was found to be 1.5-2.5x the on-disk size. The 230MB model set consumed an estimated 350-550MB of RAM.
  • Parallel inference with Swift Concurrency’s withThrowingTaskGroup is effective for throughput but requires concurrency limits in memory-constrained mobile environments.
  • Handling didReceiveMemoryWarningNotification serves as the last opportunity to release memory before jetsam terminates the process.