TL;DR

  • Capture iPhone and iPad simulator screenshots in multiple languages using XCUITest
  • Generate marketing images with Python Pillow: gradient backgrounds, device frames, and text overlays
  • Record demo videos with xcrun simctl io recordVideo
  • Upload everything to App Store Connect via API
  • Run it all from a single shell script

Introduction

Preparing App Store screenshots involves a fair amount of repetitive work: two device sizes (iPhone 6.7-inch and iPad 12.9-inch), each in two languages, with at least three images per combination. That’s 12+ images minimum.

Every time you update your app, you have to retake screenshots, create marketing images in Figma or Photoshop, and manually upload them one by one to App Store Connect.

This article walks through building a pipeline that handles capture, image generation, and upload in a single command.

Architecture Overview

capture_screenshots.sh
├── Step 1: Prepare simulators (boot + add test images)
├── Step 2: Capture screenshots with XCUITest (JA/EN × iPhone/iPad)
├── Step 3: Resize to Apple-required dimensions with sips
├── Step 4: Generate marketing images with Pillow
├── Step 5: Record demo videos with xcrun simctl io
└── Step 6: Upload to App Store Connect via API

Step 1: Capturing Screenshots with XCUITest

Test Code Design

Create a UI test class with three key design decisions:

  1. Auto-load test images: Pass an image path via the TEST_IMAGE_PATH environment variable to bypass PHPicker
  2. Language switching: Forward the language set by xcodebuild -testLanguage to the app via -AppleLanguages
  3. Suppress dialogs: Use launch arguments to prevent review prompts and onboarding from appearing

import XCTest

final class ScreenshotTests: XCTestCase {

    private var app: XCUIApplication!
    private let screenshotDir = ProcessInfo.processInfo.environment["SCREENSHOT_DIR"]
        ?? "/tmp/myapp_screenshots"

    override func setUpWithError() throws {
        continueAfterFailure = false
        app = XCUIApplication()
        // Skip onboarding, suppress review dialogs
        app.launchArguments += ["-hasCompletedOnboarding", "YES"]

        // Mirror the test language into the app's language setting
        let preferredLang = Locale.preferredLanguages.first ?? "ja"
        let langCode = preferredLang.components(separatedBy: "-").first ?? "ja"
        app.launchArguments += ["-AppleLanguages", "(\(langCode))", "-AppleLocale", langCode]

        // Pass test image path via environment
        app.launchEnvironment["TEST_IMAGE_PATH"] = "/path/to/test_sample.jpg"

        try FileManager.default.createDirectory(
            atPath: screenshotDir,
            withIntermediateDirectories: true
        )
    }

    func testCaptureScreenshots() throws {
        app.launch()

        // Wait for processing to complete
        let backButton = app.buttons["back_button"]
        XCTAssertTrue(backButton.waitForExistence(timeout: 300))
        sleep(2)

        // Main screen screenshot
        saveScreenshot(name: "04_result")

        // Navigate to other screens and capture...
        // ...

        backButton.tap()
        sleep(1)
        saveScreenshot(name: "01_camera")
    }

    private func saveScreenshot(name: String) {
        let screenshot = app.windows.firstMatch.screenshot()
        let attachment = XCTAttachment(screenshot: screenshot)
        attachment.name = name
        attachment.lifetime = .keepAlways
        add(attachment)

        let path = "\(screenshotDir)/\(name).png"
        try? screenshot.pngRepresentation.write(to: URL(fileURLWithPath: path))
    }
}

The saveScreenshot method both attaches the image to the test results as an XCTAttachment and writes a PNG to a specified directory. The filename prefix (01_, 04_, etc.) controls priority when selecting screenshots for marketing images later.

Demo Video Test Method

For App Store preview videos, create a test method that simulates realistic user interaction:

func testDemoVideoFlow() throws {
    // Disable auto-loading to show the camera screen first
    app.launchEnvironment["TEST_IMAGE_PATH"] = ""
    app.launch()
    sleep(3)  // Show camera screen

    // Select a photo from the library
    let photoButton = app.buttons["photo_library_button"]
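    guard photoButton.waitForExistence(timeout: 5) else {
        // Fail loudly instead of silently recording an empty video
        XCTFail("photo_library_button not found")
        return
    }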
    photoButton.tap()
    sleep(2)

    // Tap the first image in PHPicker
    let firstImage = app.scrollViews.images.firstMatch
    if firstImage.waitForExistence(timeout: 5) {
        firstImage.tap()
    }

    // Show result, scroll, navigate back...
}

Record the screen with xcrun simctl io recordVideo while this test runs, and you get an App Store preview video.

Step 2: Running UI Tests from a Shell Script

Preparing Simulators

#!/bin/bash
set -euo pipefail

IPHONE_SIM="iPhone 16 Pro Max"
IPAD_SIM="iPad Pro 13-inch (M4)"
SCREENSHOT_DIR="/tmp/myapp_screenshots"

# Get simulator UDIDs
IPHONE_UDID=$(xcrun simctl list devices available \
    | grep "$IPHONE_SIM" | head -1 | grep -oE '[A-F0-9-]{36}')
IPAD_UDID=$(xcrun simctl list devices available \
    | grep "$IPAD_SIM" | head -1 | grep -oE '[A-F0-9-]{36}')

# Boot the simulator, wait until it has finished booting,
# then add the test image to the photo library
xcrun simctl boot "$IPHONE_UDID" 2>/dev/null || true
xcrun simctl bootstatus "$IPHONE_UDID" -b
xcrun simctl addmedia "$IPHONE_UDID" "path/to/test_sample.jpg"

Using xcrun simctl addmedia to pre-populate the simulator’s photo library lets you automate PHPicker image selection during demo video recording.
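
The grep-based UDID lookup above depends on the text layout of simctl's output and matches substrings, so "iPhone 16 Pro" would also match "iPhone 16 Pro Max". As an alternative sketch, simctl can emit JSON that is parsed with an exact name match. The helper names `find_udid` and `list_devices_json` and the sample UDID are illustrative and not part of the original pipeline.

```python
import json
import subprocess


def find_udid(simctl_json, device_name):
    """Return the UDID of the first available device whose name matches exactly."""
    data = json.loads(simctl_json)
    # "devices" maps runtime identifiers to lists of device records.
    for runtime in sorted(data.get("devices", {})):
        for device in data["devices"][runtime]:
            if device.get("name") == device_name and device.get("isAvailable", True):
                return device["udid"]
    return None


def list_devices_json():
    """Ask simctl for the device list as JSON (requires Xcode on macOS)."""
    return subprocess.run(
        ["xcrun", "simctl", "list", "--json", "devices", "available"],
        check=True, capture_output=True, text=True,
    ).stdout


# Sample data shaped like simctl's JSON output; the UDID is made up.
SAMPLE = json.dumps({
    "devices": {
        "com.apple.CoreSimulator.SimRuntime.iOS-18-0": [
            {"name": "iPhone 16 Pro Max",
             "udid": "11111111-2222-3333-4444-555555555555",
             "isAvailable": True, "state": "Shutdown"},
        ]
    }
})
print(find_udid(SAMPLE, "iPhone 16 Pro Max"))
```

On a Mac, `find_udid(list_devices_json(), "iPhone 16 Pro Max")` would replace the sample data.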

Capturing Screenshots per Language

capture_device() {
    local UDID="$1"
    local DEVICE_TYPE="$2"   # "iphone" or "ipad"
    local APP_LANG="$3"      # "ja" or "en" (avoid LANG, the shell's locale variable)
    local OUTPUT_DIR="$4"

    # -testRegion expects a region code; "JA" and "EN" are not regions
    local REGION
    case "$APP_LANG" in
        ja) REGION="JP" ;;
        *)  REGION="US" ;;
    esac

    echo "Capturing $DEVICE_TYPE screenshots ($APP_LANG)..."

    xcodebuild test \
        -project MyApp.xcodeproj \
        -scheme MyApp \
        -destination "platform=iOS Simulator,id=$UDID" \
        -only-testing:MyAppUITests/ScreenshotTests/testCaptureScreenshots \
        -testLanguage "$APP_LANG" \
        -testRegion "$REGION" \
        2>&1 | tail -20

    mkdir -p "$OUTPUT_DIR"
    mv "$SCREENSHOT_DIR"/*.png "$OUTPUT_DIR/" 2>/dev/null || true
}

# Capture all 4 combinations: JA/EN × iPhone/iPad
capture_device "$IPHONE_UDID" "iphone" "ja" "$SCREENSHOT_DIR/ja/iphone"
capture_device "$IPAD_UDID"   "ipad"   "ja" "$SCREENSHOT_DIR/ja/ipad"
capture_device "$IPHONE_UDID" "iphone" "en" "$SCREENSHOT_DIR/en/iphone"
capture_device "$IPAD_UDID"   "ipad"   "en" "$SCREENSHOT_DIR/en/ipad"

The -testLanguage flag sets the test runner’s locale. In the test code, Locale.preferredLanguages picks this up and forwards it to -AppleLanguages, so the app UI switches language automatically.
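
The language and region are two separate xcodebuild arguments, and -testRegion takes a region code such as JP or US, which is not always the upper-cased language code. A minimal sketch of assembling the argument list, assuming a hypothetical `REGION_FOR_LANGUAGE` table and helper name `xcodebuild_test_args`:

```python
# Hypothetical mapping from the app's language code to a region code.
REGION_FOR_LANGUAGE = {"ja": "JP", "en": "US"}


def xcodebuild_test_args(udid, language, test_id):
    """Build the xcodebuild argument list for one language/device combination."""
    region = REGION_FOR_LANGUAGE[language]  # KeyError flags an unmapped language early
    return [
        "xcodebuild", "test",
        "-project", "MyApp.xcodeproj",
        "-scheme", "MyApp",
        "-destination", f"platform=iOS Simulator,id={udid}",
        f"-only-testing:{test_id}",
        "-testLanguage", language,
        "-testRegion", region,
    ]


args = xcodebuild_test_args(
    "SIMULATOR-UDID", "ja",
    "MyAppUITests/ScreenshotTests/testCaptureScreenshots",
)
print(" ".join(args))
```

Passing the list to `subprocess.run(args, check=True)` avoids shell quoting issues with the destination string.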

Resizing to Apple-Required Dimensions

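App Store Connect accepts only specific pixel dimensions for each display size, and a simulator's native resolution does not necessarily match them. Use sips (built into macOS) to resize to the exact target; note that sips -z takes the height first, then the width: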

resize_screenshots() {
    local INPUT_DIR="$1"
    local OUTPUT_DIR="$2"
    local TARGET_W="$3"
    local TARGET_H="$4"

    mkdir -p "$OUTPUT_DIR"
    for f in "$INPUT_DIR"/*.png; do
        [ -f "$f" ] || continue
        sips -z "$TARGET_H" "$TARGET_W" "$f" \
            --out "$OUTPUT_DIR/$(basename "$f")" 2>/dev/null
    done
}

RESIZED_DIR="/tmp/myapp_resized"  # example location; adjust as needed

# iPhone 6.7": 1290x2796, iPad 12.9": 2048x2732
resize_screenshots "$SCREENSHOT_DIR/ja/iphone" "$RESIZED_DIR/ja/iphone" 1290 2796
resize_screenshots "$SCREENSHOT_DIR/ja/ipad"   "$RESIZED_DIR/ja/ipad"   2048 2732
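
Since uploads with the wrong dimensions are rejected, it can help to verify the resized files before going further. The following sketch reads the width and height straight from the PNG header using only the standard library. The helper names `png_size` and `has_required_size` are assumptions for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Read (width, height) from a PNG's IHDR chunk without decoding the image."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR",
    # then big-endian width and height (4 bytes each).
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def has_required_size(path, required):
    """True if the PNG at `path` has exactly the required (width, height)."""
    return png_size(path) == tuple(required)
```

For example, `has_required_size("resized/ja/iphone/04_result.png", (1290, 2796))` could gate the marketing-image step; the path here is only a placeholder.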

Step 3: Generating Marketing Images with Python Pillow

Rather than uploading raw screenshots, we composite them with gradient backgrounds, device frames, and promotional text.

Defining Themes

Define per-language themes with background colors and text:

IPHONE_SIZE = (1290, 2796)  # iPhone 6.7"
IPAD_SIZE = (2048, 2732)    # iPad 12.9"

THEMES_JA = [
    {
        "bg_top": (230, 81, 0),       # Orange
        "bg_bottom": (153, 51, 0),
        "title": "AIが手書き文字を読み取る",
        "subtitle": "写真を撮るだけ、テキストに変換",
    },
    {
        "bg_top": (41, 98, 255),      # Blue
        "bg_bottom": (0, 48, 135),
        "title": "高精度な文字認識",
        "subtitle": "最新のAIモデルを搭載",
    },
    {
        "bg_top": (123, 44, 191),     # Purple
        "bg_bottom": (66, 15, 120),
        "title": "完全オフラインで動作",
        "subtitle": "すべての処理をデバイス上で完結",
    },
]

THEMES_EN = [
    {
        "bg_top": (230, 81, 0),
        "bg_bottom": (153, 51, 0),
        "title": "AI-Powered Text Recognition",
        "subtitle": "Just snap a photo to get text",
    },
    # ...
]

Creating Gradient Backgrounds

import os  # used later by find_best_screenshots

from PIL import Image, ImageDraw, ImageFont

def create_gradient(size, color_top, color_bottom):
    """Create a vertical gradient image."""
    img = Image.new("RGB", size)
    draw = ImageDraw.Draw(img)
    w, h = size
    for y in range(h):
        ratio = y / h
        r = int(color_top[0] + (color_bottom[0] - color_top[0]) * ratio)
        g = int(color_top[1] + (color_bottom[1] - color_top[1]) * ratio)
        b = int(color_top[2] + (color_bottom[2] - color_top[2]) * ratio)
        draw.line([(0, y), (w, y)], fill=(r, g, b))
    return img
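
To make the per-row interpolation concrete, here is the same arithmetic as a standalone function, applied to the orange theme on a 2796-pixel-tall canvas. The name `lerp_color` is illustrative; the formula is the one used in create_gradient.

```python
def lerp_color(color_top, color_bottom, ratio):
    """Linearly interpolate each RGB channel; int() truncates like create_gradient."""
    return tuple(int(t + (b - t) * ratio) for t, b in zip(color_top, color_bottom))


orange_top, orange_bottom = (230, 81, 0), (153, 51, 0)
height = 2796
for y in (0, height // 2, height - 1):
    print(y, lerp_color(orange_top, orange_bottom, y / height))
```

The top row is exactly the top color and the middle row comes out as (191, 66, 0). Because the ratio is y / h, it never quite reaches 1.0, so the last row only reaches the bottom color through truncation.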

Adding Device Frames

Add rounded corners and a dark bezel to make the screenshot look like an actual device:

def add_rounded_corners(img, radius):
    """Apply rounded corners to an image."""
    mask = Image.new("L", img.size, 0)
    draw = ImageDraw.Draw(mask)
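    # The bounding box is inclusive, so the last pixel is at (width - 1, height - 1)
    draw.rounded_rectangle([(0, 0), (img.width - 1, img.height - 1)],
                           radius=radius, fill=255)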
    result = img.copy()
    result.putalpha(mask)
    return result

def add_device_frame(screenshot, corner_radius, is_ipad=False):
    """Add a device bezel around the screenshot."""
    bezel = int(corner_radius * 0.35)
    frame_radius = corner_radius + bezel

    frame_w = screenshot.width + bezel * 2
    frame_h = screenshot.height + bezel * 2

    frame = Image.new("RGBA", (frame_w, frame_h), (0, 0, 0, 0))
    frame_draw = ImageDraw.Draw(frame)

    # Dark bezel
    frame_draw.rounded_rectangle(
        [(0, 0), (frame_w - 1, frame_h - 1)],
        radius=frame_radius, fill=(30, 30, 30, 255)
    )
    # Inner edge highlight
    frame_draw.rounded_rectangle(
        [(bezel - 1, bezel - 1), (frame_w - bezel, frame_h - bezel)],
        radius=corner_radius + 1, fill=(50, 50, 50, 255)
    )
    # Place screenshot inside frame
    frame.paste(screenshot, (bezel, bezel), screenshot)
    return frame
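
The bezel thickness and outer radius are both derived from the screenshot's corner radius. A small worked example of that arithmetic, using a hypothetical helper `frame_geometry` that mirrors the calculations in add_device_frame without drawing anything:

```python
def frame_geometry(screenshot_w, screenshot_h, corner_radius):
    """Mirror the size calculations in add_device_frame (no drawing)."""
    bezel = int(corner_radius * 0.35)          # bezel is 35% of the corner radius
    return {
        "bezel": bezel,
        "frame_radius": corner_radius + bezel,  # outer corners stay concentric
        "frame_size": (screenshot_w + bezel * 2, screenshot_h + bezel * 2),
    }


# Example: a 1135x2460 scaled screenshot with a 56 px corner radius.
print(frame_geometry(1135, 2460, 56))
```

Making the outer radius equal to the inner radius plus the bezel keeps the bezel thickness uniform around the corners.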

Compositing the Marketing Image

Combine the gradient background, text, and framed screenshot. The key design choice is a “bleed layout” where the device extends below the visible canvas:

def create_marketing_image(screenshot_path, theme, output_size, lang="ja"):
    w, h = output_size
    is_ipad = w / h > 0.6

    # Device-specific parameters
    if is_ipad:
        title_font_pct = 0.055
        sub_font_pct = 0.030
        max_scale_w_pct = 0.82
    else:
        title_font_pct = 0.065
        sub_font_pct = 0.035
        max_scale_w_pct = 0.88

    # Language-specific font
    if lang == "en":
        font_bold_path = "/System/Library/Fonts/Helvetica.ttc"
    else:
        font_bold_path = "/System/Library/Fonts/ヒラギノ角ゴシック W6.ttc"

    # Gradient background
    bg = create_gradient(output_size, theme["bg_top"], theme["bg_bottom"])
    bg = bg.convert("RGBA")

    # Load font and calculate text position
    font_title = ImageFont.truetype(font_bold_path, int(w * title_font_pct))
    draw = ImageDraw.Draw(bg)
    title_bbox = draw.textbbox((0, 0), theme["title"], font=font_title)
    title_h = title_bbox[3] - title_bbox[1]
    title_y = int(h * 0.10)

    # Scale and position screenshot (bleed bottom)
    screenshot = Image.open(screenshot_path).convert("RGBA")
    ss_y = title_y + title_h + int(h * 0.05)
    bleed_fraction = 0.35  # Bottom 35% extends beyond canvas
    desired_visible_h = h - ss_y
    desired_total_h = desired_visible_h / (1.0 - bleed_fraction)
    scale_factor = desired_total_h / screenshot.height

    scale_w = min(int(screenshot.width * scale_factor), int(w * max_scale_w_pct))
    scale_h = int(screenshot.height * (scale_w / screenshot.width))
    screenshot = screenshot.resize((scale_w, scale_h), Image.LANCZOS)

    # Rounded corners + frame
    corner_radius = int(scale_w * 0.05)
    screenshot = add_rounded_corners(screenshot, corner_radius)
    framed = add_device_frame(screenshot, corner_radius, is_ipad=is_ipad)

    # Center horizontally
    ss_x = (w - framed.width) // 2
    bg.paste(framed, (ss_x, ss_y + int(h * 0.06)), framed)

    # Draw text on top
    title_w = title_bbox[2] - title_bbox[0]
    draw.text(((w - title_w) // 2, title_y), theme["title"],
              fill=(255, 255, 255), font=font_title)

    final = Image.new("RGB", output_size, (0, 0, 0))
    final.paste(bg, (0, 0), bg)
    return final
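
The bleed calculation is easier to follow with numbers plugged in. This sketch isolates the scaling arithmetic from create_marketing_image. The function name `bleed_layout` and the example value for the top offset (502) are assumptions for illustration.

```python
def bleed_layout(canvas_h, top_y, screenshot_w, screenshot_h, max_w,
                 bleed_fraction=0.35):
    """Return the scaled (width, height) of the screenshot for a bleed layout."""
    visible_h = canvas_h - top_y                     # space left below the text
    total_h = visible_h / (1.0 - bleed_fraction)     # height if 35% hangs off-canvas
    scale = total_h / screenshot_h
    scaled_w = min(int(screenshot_w * scale), max_w)  # the width cap may win
    scaled_h = int(screenshot_h * (scaled_w / screenshot_w))
    return scaled_w, scaled_h


# iPhone canvas 1290x2796, screenshot of the same size, width cap 88% of the canvas.
print(bleed_layout(2796, 502, 1290, 2796, max_w=int(1290 * 0.88)))
```

With these inputs the width cap is the binding constraint, so the device is narrower than the uncapped bleed calculation would make it, and less than 35% of it ends up below the canvas edge.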

Screenshot Priority

Control which screenshots get used via filename-prefix-based priority:

SCREENSHOT_PRIORITY = ["04_result", "05_translation", "06_settings", "02_crop", "03_confirm"]

def find_best_screenshots(input_dir, count=3):
    all_files = sorted([f for f in os.listdir(input_dir) if f.endswith(".png")])
    selected = []
    for prefix in SCREENSHOT_PRIORITY:
        for f in all_files:
            if f.startswith(prefix) and f not in selected:
                selected.append(f)
                break
        if len(selected) >= count:
            break
    return [os.path.join(input_dir, f) for f in selected]
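
As a quick illustration of the selection order, here is the same priority logic applied to an in-memory list of filenames instead of a directory. `select_by_priority` is a hypothetical variant written for this example.

```python
SCREENSHOT_PRIORITY = ["04_result", "05_translation", "06_settings", "02_crop", "03_confirm"]


def select_by_priority(filenames, count=3):
    """Pick up to `count` files, walking the priority list in order."""
    selected = []
    for prefix in SCREENSHOT_PRIORITY:
        # First file (alphabetically) that starts with this prefix, if any.
        match = next((f for f in sorted(filenames) if f.startswith(prefix)), None)
        if match is not None:
            selected.append(match)
        if len(selected) >= count:
            break
    return selected


captured = ["01_camera.png", "02_crop.png", "04_result.png", "06_settings.png"]
print(select_by_priority(captured))
```

Here 05_translation is missing, so the selection falls through to 06_settings and then 02_crop. 01_camera is never chosen because it is not in the priority list.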

Step 4: Recording Demo Videos

Run xcrun simctl io recordVideo in the background while executing a UI test to automatically create a demo video:

record_demo() {
    local UDID="$1"
    local DEVICE_TYPE="$2"
    local APP_LANG="$3"      # avoid LANG, the shell's locale variable
    local OUTPUT="videos/demo_${APP_LANG}_${DEVICE_TYPE}.mp4"

    # recordVideo does not create the output directory
    mkdir -p "$(dirname "$OUTPUT")"

    # Start recording in background
    xcrun simctl io "$UDID" recordVideo --codec h264 "$OUTPUT" &
    local RECORD_PID=$!
    sleep 2

    # Run demo test
    xcodebuild test \
        -project MyApp.xcodeproj \
        -scheme MyApp \
        -destination "platform=iOS Simulator,id=$UDID" \
        -only-testing:MyAppUITests/ScreenshotTests/testDemoVideoFlow \
        -testLanguage "$APP_LANG" \
        2>&1 | tail -10

    # Stop recording (SIGINT lets simctl finalize the file)
    kill -INT "$RECORD_PID" 2>/dev/null || true
    wait "$RECORD_PID" 2>/dev/null || true
}

record_demo "$IPHONE_UDID" "iphone" "ja"
record_demo "$IPHONE_UDID" "iphone" "en"

The sleep calls in the test method control the pacing of the video. Use longer sleeps on screens you want to highlight.
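
If the orchestration ever moves from bash to Python, the start/stop pattern can be expressed as a context manager: start the recorder, run the test inside the block, and send SIGINT on exit so the recorder can finish writing the file. This is a sketch under that assumption. `background_recording` and `record_video_command` are hypothetical helpers, and the kill fallback is an addition that the shell version does not have.

```python
import contextlib
import signal
import subprocess


@contextlib.contextmanager
def background_recording(command, stop_timeout=10):
    """Run `command` in the background and send SIGINT when the block exits."""
    proc = subprocess.Popen(command)
    try:
        yield proc
    finally:
        proc.send_signal(signal.SIGINT)      # same signal as pressing Ctrl+C
        try:
            proc.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()                      # last resort; output may be incomplete
            proc.wait()


def record_video_command(udid, output_path):
    """Argument list matching the recordVideo call used in record_demo."""
    return ["xcrun", "simctl", "io", udid, "recordVideo",
            "--codec", "h264", output_path]
```

Usage would look like `with background_recording(record_video_command(udid, "videos/demo.mp4")): subprocess.run(xcodebuild_args, check=True)`, where `xcodebuild_args` is whatever argument list runs the demo test.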

Step 5: Uploading to App Store Connect via API

Upload the generated marketing images automatically using JWT authentication with the App Store Connect API.

Authentication

import os
import time

import jwt  # PyJWT

KEY_ID = "YOUR_KEY_ID"
ISSUER_ID = "YOUR_ISSUER_ID"
KEY_PATH = "~/.private_keys/AuthKey_YOUR_KEY_ID.p8"

def generate_token():
    with open(os.path.expanduser(KEY_PATH), "r") as f:
        private_key = f.read()
    now = int(time.time())
    payload = {
        "iss": ISSUER_ID, "iat": now, "exp": now + 1200,
        "aud": "appstoreconnect-v1"
    }
    return jwt.encode(payload, private_key, algorithm="ES256",
                      headers={"kid": KEY_ID})
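
The token is sent as a Bearer header on every API call. The upload code in this article calls an api_request helper that is not shown. The following is only a sketch of what such a helper might look like, built on urllib from the standard library. The names `build_api_request` and `send_api_request` and the explicit `token` parameter are assumptions.

```python
import json
import urllib.request

API_BASE = "https://api.appstoreconnect.apple.com/v1"


def build_api_request(method, path, token, body=None):
    """Create a urllib Request with the JWT attached as a Bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(f"{API_BASE}/{path}", data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def send_api_request(method, path, token, body=None):
    """Send the request and decode the JSON response (empty body -> {})."""
    req = build_api_request(method, path, token, body)
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}
```

Because the token expires after 20 minutes at most, a long-running upload script should call generate_token() again rather than reuse one token indefinitely.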

Uploading Screenshots

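The App Store Connect API screenshot upload is a three-step process: reserve an upload slot, upload the binary in the chunks the API specifies, then commit with a checksum: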

import hashlib
import base64
import urllib.request

def upload_screenshot(screenshot_set_id, filepath, filename):
    with open(filepath, "rb") as f:
        file_data = f.read()
    filesize = len(file_data)
    # MD5 checksum as a hex string (the form commonly used for sourceFileChecksum)
    checksum = hashlib.md5(file_data).hexdigest()

    # 1. Reserve an upload slot
    result = api_request("POST", "appScreenshots", {
        "data": {
            "type": "appScreenshots",
            "attributes": {"fileName": filename, "fileSize": filesize},
            "relationships": {
                "appScreenshotSet": {
                    "data": {"type": "appScreenshotSets", "id": screenshot_set_id}
                }
            }
        }
    })
    screenshot_id = result["data"]["id"]
    upload_ops = result["data"]["attributes"]["uploadOperations"]

    # 2. Upload binary data (chunked)
    for op in upload_ops:
        chunk = file_data[op["offset"]:op["offset"] + op["length"]]
        req = urllib.request.Request(op["url"], data=chunk, method=op["method"])
        for h in op["requestHeaders"]:
            req.add_header(h["name"], h["value"])
        urllib.request.urlopen(req)

    # 3. Commit (signal upload completion)
    api_request("PATCH", f"appScreenshots/{screenshot_id}", {
        "data": {
            "type": "appScreenshots", "id": screenshot_id,
            "attributes": {"uploaded": True, "sourceFileChecksum": checksum}
        }
    })
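
The chunked upload in step 2 relies on each upload operation describing a byte range of the file. A small sketch of that slicing with made-up operations (real ones come from the uploadOperations attribute in the API response); `slice_chunks` and `covers_whole_file` are hypothetical helpers:

```python
def slice_chunks(file_data, upload_ops):
    """Cut the file into the byte ranges described by each upload operation."""
    return [file_data[op["offset"]:op["offset"] + op["length"]] for op in upload_ops]


def covers_whole_file(upload_ops, filesize):
    """Check that the operations tile the file with no gaps or overlaps."""
    position = 0
    for op in sorted(upload_ops, key=lambda o: o["offset"]):
        if op["offset"] != position:
            return False
        position += op["length"]
    return position == filesize


# Made-up operations covering a 10-byte payload in two parts.
data = bytes(range(10))
ops = [{"offset": 0, "length": 6}, {"offset": 6, "length": 4}]
print([len(c) for c in slice_chunks(data, ops)], covers_whole_file(ops, len(data)))
```

Checking coverage before sending anything is a cheap way to catch a mismatch between the reserved fileSize and the file actually read from disk.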

Per-Language Upload

Map App Store Connect localizations to screenshot subdirectories:

def main():
    app_id, version_id, version, state = get_app_and_version()
    locs = get_localizations(version_id)  # {"ja": "loc_id_1", "en-US": "loc_id_2"}

    for locale, loc_id in locs.items():
        lang = locale.split("-")[0]  # "en-US" -> "en"
        lang_dir = os.path.join(marketing_dir, lang)

        iphone_files = sorted([f for f in os.listdir(lang_dir)
                               if "iphone" in f and f.endswith(".png")])

        # Delete existing screenshots and upload new ones
        ss_set_id = delete_existing_screenshots(loc_id, "APP_IPHONE_67")
        if ss_set_id is None:
            ss_set_id = create_screenshot_set(loc_id, "APP_IPHONE_67")

        for f in iphone_files:
            upload_screenshot(ss_set_id, os.path.join(lang_dir, f), f)

Adding a --dry-run flag lets you preview what would happen without actually uploading – useful for pre-flight checks.
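
One way the flag might be wired up is with argparse, with the upload function passed in so that dry-run mode never touches the network. This is a sketch, not the article's actual script. `parse_args`, `process`, and the `upload` callback are illustrative names; `--dir` matches the option used by the wrapper script.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Upload marketing screenshots")
    parser.add_argument("--dir", required=True,
                        help="directory with one subfolder per language")
    parser.add_argument("--dry-run", action="store_true",
                        help="list planned uploads without calling the API")
    return parser.parse_args(argv)


def process(files_by_locale, dry_run, upload):
    """Call upload(locale, filename) per file, or only print it in dry-run mode."""
    planned = []
    for locale, filenames in files_by_locale.items():
        lang = locale.split("-")[0]          # "en-US" -> "en"
        for name in filenames:
            planned.append((locale, lang, name))
            if dry_run:
                print(f"[dry-run] {locale}: would upload {lang}/{name}")
            else:
                upload(locale, name)
    return planned


args = parse_args(["--dir", "marketing", "--dry-run"])
process({"en-US": ["iphone_01.png"]}, args.dry_run, upload=lambda loc, name: None)
```

Keeping the delete-and-recreate step behind the same flag matters most, since deleting existing screenshots is the part that cannot be undone.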

Step 6: Putting It All Together

#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"

DO_UPLOAD=false
[[ "${1:-}" == "--upload" ]] && DO_UPLOAD=true

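# Assumes the variables (IPHONE_UDID, IPAD_UDID, SCREENSHOT_DIR, RESIZED_DIR,
# MARKETING_DIR) and functions (capture_device, resize_screenshots, record_demo)
# from the earlier steps are defined above this point in the same script.

# 1. Prepare simulators
xcrun simctl boot "$IPHONE_UDID" 2>/dev/null || true
xcrun simctl addmedia "$IPHONE_UDID" "$PROJECT_DIR/TestResources/sample.jpg"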

# 2. Capture JA/EN screenshots
# (the loop variable is APP_LANG rather than LANG, so the locale environment
# variable seen by xcodebuild and python3 is left alone)
for APP_LANG in ja en; do
    capture_device "$IPHONE_UDID" "iphone" "$APP_LANG" "$SCREENSHOT_DIR/$APP_LANG/iphone"
    capture_device "$IPAD_UDID"   "ipad"   "$APP_LANG" "$SCREENSHOT_DIR/$APP_LANG/ipad"

    resize_screenshots "$SCREENSHOT_DIR/$APP_LANG/iphone" "$RESIZED_DIR/$APP_LANG/iphone" 1290 2796
    resize_screenshots "$SCREENSHOT_DIR/$APP_LANG/ipad"   "$RESIZED_DIR/$APP_LANG/ipad"   2048 2732
done

# 3. Generate marketing images
for APP_LANG in ja en; do
    python3 "$SCRIPT_DIR/generate_marketing_screenshots.py" \
        --input-iphone "$RESIZED_DIR/$APP_LANG/iphone" \
        --input-ipad "$RESIZED_DIR/$APP_LANG/ipad" \
        --output "$MARKETING_DIR" \
        --lang "$APP_LANG"
done

# 4. Record demo videos
record_demo "$IPHONE_UDID" "iphone" "ja"
record_demo "$IPHONE_UDID" "iphone" "en"

# 5. Upload (optional)
if $DO_UPLOAD; then
    python3 "$SCRIPT_DIR/upload_screenshots.py" --dir "$MARKETING_DIR"
fi

# 6. Clean up
xcrun simctl shutdown "$IPHONE_UDID" 2>/dev/null || true
xcrun simctl shutdown "$IPAD_UDID" 2>/dev/null || true

Usage:

# Capture + generate marketing images only
./scripts/capture_screenshots.sh

# Including upload to App Store Connect
./scripts/capture_screenshots.sh --upload

Tips and Gotchas

1. Separate Parameters for iPhone vs iPad

iPhone 6.7-inch (1290x2796, aspect ratio ~0.46) and iPad 12.9-inch (2048x2732, aspect ratio ~0.75) need different font sizes and scaling factors to look balanced. The code uses the w / h aspect ratio to distinguish between them.
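
The aspect-ratio check can be verified directly against the two target sizes. A minimal sketch, with `is_ipad` as an illustrative helper name and 0.6 as the threshold used in create_marketing_image:

```python
IPHONE_SIZE = (1290, 2796)  # iPhone 6.7"
IPAD_SIZE = (2048, 2732)    # iPad 12.9"


def is_ipad(output_size, threshold=0.6):
    """Classify a portrait canvas by width/height ratio."""
    w, h = output_size
    return w / h > threshold


for name, size in (("iPhone", IPHONE_SIZE), ("iPad", IPAD_SIZE)):
    print(name, round(size[0] / size[1], 2), is_ipad(size))
```

The threshold sits between the two ratios (about 0.46 and 0.75), so it only works for portrait canvases; a landscape size would need the ratio inverted.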

2. Using macOS System Fonts

To keep things working in CI without custom font installation, use macOS built-in fonts: Hiragino Sans (Japanese) and Helvetica (English).

FONT_BOLD = "/System/Library/Fonts/ヒラギノ角ゴシック W6.ttc"
FONT_BOLD_EN = "/System/Library/Fonts/Helvetica.ttc"
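
System font locations are not guaranteed to be identical on every macOS version or CI image, so it may be worth failing with a clear message when a font is missing instead of letting ImageFont.truetype raise deep inside the image code. A sketch, assuming a hypothetical helper `first_existing_font` that takes a list of candidate paths:

```python
import os


def first_existing_font(candidates):
    """Return the first font path that exists, or raise with the list tried."""
    for path in candidates:
        if os.path.exists(path):
            return path
    raise FileNotFoundError("No usable font found; tried: " + ", ".join(candidates))


FONT_CANDIDATES_JA = ["/System/Library/Fonts/ヒラギノ角ゴシック W6.ttc"]
FONT_CANDIDATES_EN = ["/System/Library/Fonts/Helvetica.ttc"]
```

Additional fallback paths can be appended to either list once you know where the fonts live on your build machines.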

3. PHPicker Automation is Flaky

Automating PHPicker from UI tests is unreliable. For screenshot capture, bypass it entirely by passing images via TEST_IMAGE_PATH. Only use PHPicker interaction for demo video recording where you want to show the natural user flow.

4. XcodeGen Integration

If you manage your project with project.yml, run xcodegen generate before tests to ensure the .xcodeproj is up to date.

Conclusion

With this automation pipeline, a single command handles everything when you update your app:

  • Screenshot capture across iPhone, iPad, Japanese, and English
  • Marketing image generation with gradient backgrounds and device frames
  • Demo video recording
  • Upload to App Store Connect

What used to take over an hour of manual work now runs unattended after a single script invocation. For multilingual apps, the time savings increase with each language you support.

The Pillow-based image generation lets you tweak theme colors and layouts entirely in code, producing reasonable marketing images without needing a designer for every release.