Important Usage Notice

The system described in this article may place load on external servers. Please exercise caution when using it.

  • Server load: Parallel requests place load on target servers
  • DoS risk: A large number of simultaneous accesses may be mistaken for a DoS attack
  • Recommended approach: Download images locally in advance and run only the OCR processing in parallel
  • Check terms of service: Always review the target server’s terms of service and obtain prior permission if necessary
  • Appropriate rate limiting: In production, conservative concurrency settings (around 5-10 parallel requests) are strongly recommended
  • Responsible use: Always be considerate of server administrators and other users

This article is a record of a technical proof of concept. We ask readers to use the system responsibly.
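
As one way to follow the rate-limiting advice above, the sketch below caps the number of in-flight requests with `asyncio.Semaphore`. The names `run_limited` and `ConcurrencyProbe` are hypothetical and not part of the system described here; the probe is a stand-in for a real HTTP call, and the default limit of 5 is the lower end of the range suggested above.

```python
import asyncio


async def run_limited(urls, fetch, max_concurrent=5):
    """Call fetch(url) for every URL with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(url):
        # Each worker waits here until one of the slots is free
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(worker(url) for url in urls))


class ConcurrencyProbe:
    """Hypothetical fake fetcher that records the peak number of concurrent calls."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def __call__(self, url):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # simulate network latency
        self.active -= 1
        return f"ok:{url}"
```

Passing a `ConcurrencyProbe` instance as `fetch` makes it possible to check that the limit holds before pointing the client at a real server.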

Background

This article presents a case study in building a scalable OCR processing system on Azure Container Apps with NDL Classical Japanese OCR Lite, developed by the National Diet Library (NDL) of Japan. It covers the design and implementation of a cloud-native architecture that provides pay-per-use billing and auto-scaling.

System Overview

Architecture

IIIF Images -> Azure Container Apps -> NDL Classical Japanese OCR -> TEI XML Output
                    |
              Auto-scaling
              (0-30 replicas)
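
The pipeline starts from IIIF images, but the article does not say how the image URLs are collected. A minimal sketch, assuming they come from a IIIF Presentation API 2.x manifest that has already been parsed into a dict, could look like this. The function name and the sample manifest are hypothetical, and Presentation API 3.0 manifests use a different structure (`items`), which this sketch does not handle.

```python
def image_urls_from_manifest(manifest):
    """Return one image URL per canvas from a IIIF Presentation 2.x manifest dict."""
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                if "@id" in resource:
                    urls.append(resource["@id"])
                    break  # take only the first image of each canvas
    return urls


# Hypothetical two-page manifest, reduced to the fields used above
sample_manifest = {
    "sequences": [{
        "canvases": [
            {"images": [{"resource": {"@id": "https://example.org/iiif/p1/full/full/0/default.jpg"}}]},
            {"images": [{"resource": {"@id": "https://example.org/iiif/p2/full/full/0/default.jpg"}}]},
        ]
    }]
}

page_urls = image_urls_from_manifest(sample_manifest)
```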

Key Components

  • OCR engine: NDL Classical Japanese OCR Lite (specialized for Japanese classical texts)
  • Infrastructure: Azure Container Apps (serverless containers)
  • API design: REST API (image URL -> OCR result)
  • Output format: TEI P5-compliant XML
  • Scaling: Automatic scaling based on demand

Features of NDL Classical Japanese OCR Lite

OCR Optimized for Japanese Classical Texts

  • Vertical text layout support: Vertical writing structure specific to classical texts
  • Reading order optimization: Right-to-left, top-to-bottom Japanese reading order
  • Classical character recognition: Support for cursive script (kuzushiji) and variant kana
  • Lightweight implementation: Cloud-deployable through Docker containerization

Why Azure Container Apps

Benefits of Serverless Containers

# Scaling configuration example
scale:
  minReplicas: 0      # Idle: zero cost
  maxReplicas: 30     # On demand: auto-expand
  cooldownPeriod: 300 # Scale down after 5 minutes

Cost Optimization

  • Pay-per-use billing: Charged only for actual usage
  • Zero replicas: No compute charges while the app is scaled to zero
  • Auto-scaling: Resource adjustment based on demand

System Implementation

Server-Side Implementation

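# Flask + NDL OCR integration
from flask import Flask, request
from flask_restx import Api, Resource
from simple_ocr_service import OCRService

app = Flask(__name__)
api = Api(app, doc='/docs/')

# Shared OCR service instance (constructor arguments, if any, are not shown here)
ocr_service = OCRService()

@api.route('/api/image')
class ImageOCR(Resource):
    def get(self):
        image_url = request.args.get('image_url')
        if not image_url:
            # Reject requests that omit the required query parameter
            return {'error': 'image_url is required'}, 400
        # Process image with NDL OCR
        result = ocr_service.process_single_image(image_url)
        return result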
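
Because the endpoint takes the image URL as a query parameter, a client has to percent-encode it so that its own `:`, `/`, and `?` characters are not misread as part of the API URL. The helper below is a minimal sketch using only the standard library; `build_ocr_request_url` is a hypothetical name, and the base URL is the placeholder used elsewhere in this article.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_ocr_request_url(api_base, image_url):
    """Build the GET URL for /api/image with a safely encoded image_url parameter."""
    query = urlencode({"image_url": image_url})
    return f"{api_base.rstrip('/')}/api/image?{query}"


request_url = build_ocr_request_url(
    "https://your-ocr-service.azurecontainerapps.io",
    "https://example.org/iiif/p1/full/full/0/default.jpg",
)

# Decoding the query string recovers the original image URL
decoded = parse_qs(urlsplit(request_url).query)["image_url"][0]
```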

Reading Order Algorithm

def sort_japanese_reading_order(lines):
    """Sort in Japanese classical text reading order"""
    return sorted(lines, key=lambda line: (
        -line["bbox"][0],  # x-coordinate descending (right to left)
        line["bbox"][1]    # y-coordinate ascending (top to bottom)
    ))
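
A short usage sketch shows the resulting order. The sample lines are hypothetical, and the bounding boxes are assumed to be `[x_min, y_min, x_max, y_max]` in pixels; the function above only relies on the first two entries.

```python
def sort_japanese_reading_order(lines):
    """Sort in Japanese classical text reading order"""
    return sorted(lines, key=lambda line: (
        -line["bbox"][0],  # x-coordinate descending (right to left)
        line["bbox"][1]    # y-coordinate ascending (top to bottom)
    ))


# Hypothetical OCR output for a page with two columns of vertical text
lines = [
    {"text": "left column", "bbox": [100, 50, 200, 900]},
    {"text": "right column, lower", "bbox": [500, 600, 600, 900]},
    {"text": "right column, upper", "bbox": [500, 50, 600, 500]},
]

ordered = [line["text"] for line in sort_japanese_reading_order(lines)]
# ordered == ["right column, upper", "right column, lower", "left column"]
```

Note that the y-coordinate only breaks ties when two x-coordinates are exactly equal, so segments of the same column whose left edges differ by a few pixels are ordered by x alone.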

TEI XML Output

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>桐壺</title>
      </titleStmt>
      <respStmt>
        <resp>Automated Transcription</resp>
        <name ref="https://github.com/ndl-lab/ndlkotenocr-lite">
          NDL古典籍OCR Lite
        </name>
      </respStmt>
    </fileDesc>
  </teiHeader>
  <facsimile>
    <surface xml:id="surface-1">
      <zone xml:id="zone-1-1" ulx="3391" uly="1141"
            lrx="3727" lry="2924" cert="0.799"/>
    </surface>
  </facsimile>
  <text>
    <body>
      <div type="transcription">
        <pb n="1" facs="#surface-1"/>
        <lb n="1.1" corresp="#zone-1-1" cert="high"/>
        いづれの御時にか
      </div>
    </body>
  </text>
</TEI>
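
The `facsimile` part of this output can be produced with the standard library alone. The following is a minimal sketch, not the system's actual generator: `build_facsimile` is a hypothetical name, and the input format (a `bbox` of `[ulx, uly, lrx, lry]` plus a `confidence` value per line) is an assumption.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_facsimile(page_no, lines):
    """Build a TEI <facsimile> element with one <zone> per OCR line."""
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    facsimile = ET.Element(f"{{{TEI_NS}}}facsimile")
    surface = ET.SubElement(facsimile, f"{{{TEI_NS}}}surface")
    surface.set(f"{{{XML_NS}}}id", f"surface-{page_no}")
    for index, line in enumerate(lines, start=1):
        ulx, uly, lrx, lry = line["bbox"]  # assumed order: upper-left, lower-right
        ET.SubElement(surface, f"{{{TEI_NS}}}zone", {
            f"{{{XML_NS}}}id": f"zone-{page_no}-{index}",
            "ulx": str(ulx), "uly": str(uly),
            "lrx": str(lrx), "lry": str(lry),
            "cert": f"{line['confidence']:.3f}",
        })
    return facsimile


# Hypothetical OCR line matching the zone shown above
facsimile = build_facsimile(1, [{"bbox": [3391, 1141, 3727, 2924], "confidence": 0.799}])
xml_text = ET.tostring(facsimile, encoding="unicode")
```

Building the tree with `ElementTree` rather than string formatting means attribute values are escaped automatically.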

Processing Results

Small-Scale Test (Kiritsubo)

  • Target: “Kiritsubo” held by the University of Tokyo
  • Pages: 32 pages
  • Processing time: Approximately 30 seconds
  • Success rate: 100%
  • Concurrency: 10 parallel
  • Cost: Approximately $0.05

Performance Characteristics

Processing time = ~1 second/page (with parallel processing)
Cost efficiency = $1.5-2.0/1000 pages
Scaling = 0 to 20 replicas in seconds
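
These figures can be cross-checked against the small-scale test with simple arithmetic. The sketch below uses only the numbers reported above (32 pages, about 30 seconds, about $0.05); since both inputs are approximate, the results are rough estimates. The function names are hypothetical.

```python
def cost_per_1000_pages(total_cost_usd, pages):
    """Extrapolate a measured cost to 1000 pages."""
    return total_cost_usd / pages * 1000


def seconds_per_page(total_seconds, pages):
    """Average wall-clock time per page for a parallel run."""
    return total_seconds / pages


estimated_cost = cost_per_1000_pages(0.05, 32)   # about 1.56 USD per 1000 pages
estimated_time = seconds_per_page(30, 32)        # about 0.94 seconds per page
```

Both values fall within the ranges listed above: roughly 1 second per page and $1.5-2.0 per 1000 pages.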

Technical Features

1. Cold Start Handling

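import asyncio

# ocr_request and HTTPError come from the HTTP client in use (not shown here)
async def process_with_retry(image_url, max_retries=3):
    """Automatic retry for cold starts"""
    for attempt in range(max_retries + 1):
        try:
            if attempt > 0:
                # Exponential backoff: wait 1s, 2s, 4s, ... before each retry
                wait_time = 2 ** (attempt - 1)
                await asyncio.sleep(wait_time)
            return await ocr_request(image_url)
        except (HTTPError, TimeoutError):
            if attempt == max_retries:
                raise  # Out of retries: re-raise with the original traceback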
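
To see the backoff schedule without calling a real service, the same pattern can be run against a fake request. The sketch below is a self-contained variant, not the article's client: `retry_with_backoff` and `FlakyRequest` are hypothetical names, only `TimeoutError` is retried, and the `sleep` parameter is added so that the waits can be recorded instead of actually slept.

```python
import asyncio


async def retry_with_backoff(request, max_retries=3, sleep=asyncio.sleep):
    """Await request(), retrying on TimeoutError with waits of 1s, 2s, 4s, ..."""
    for attempt in range(max_retries + 1):
        if attempt > 0:
            await sleep(2 ** (attempt - 1))
        try:
            return await request()
        except TimeoutError:
            if attempt == max_retries:
                raise  # retries exhausted: propagate the last error


class FlakyRequest:
    """Hypothetical request that times out a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("simulated cold start")
        return "ok"
```

With two simulated failures, the request succeeds on the third call after waits of 1 and 2 seconds; with the default `max_retries=3`, the longest possible total wait is 1 + 2 + 4 = 7 seconds.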

2. Externalized Configuration

# Configuration via environment variables
OCR_API_URL=https://your-ocr-service.azurecontainerapps.io
DEFAULT_MAX_CONCURRENT=10
DEFAULT_CONFIDENCE_THRESHOLD=0.3
DEFAULT_OUTPUT_FORMAT=xml
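
On the client side, these variables might be read into a typed configuration object as sketched below. The `OCRClientConfig` class and `load_config` function are hypothetical; the variable names and fallback values are the ones listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OCRClientConfig:
    api_url: str
    max_concurrent: int
    confidence_threshold: float
    output_format: str


def load_config(env=os.environ):
    """Read client settings from a mapping of environment variables."""
    return OCRClientConfig(
        api_url=env.get("OCR_API_URL", "https://your-ocr-service.azurecontainerapps.io"),
        max_concurrent=int(env.get("DEFAULT_MAX_CONCURRENT", "10")),
        confidence_threshold=float(env.get("DEFAULT_CONFIDENCE_THRESHOLD", "0.3")),
        output_format=env.get("DEFAULT_OUTPUT_FORMAT", "xml"),
    )


# Passing a plain dict instead of os.environ makes the loader easy to test
config = load_config({"DEFAULT_MAX_CONCURRENT": "5"})
```

Converting to `int` and `float` at load time means a malformed value fails at startup rather than in the middle of a batch.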

3. Swagger UI Integration

# Automatic API specification generation
api = Api(app,
    version='1.0',
    title='NDL Classical Japanese OCR API',
    description='OCR processing API specialized for Japanese classical texts',
    doc='/docs/'
)

Deployment

Azure Container Apps Deployment

# Create container app
az containerapp create \
  --name ocr-service \
  --resource-group rg-ocr \
  --environment container-env \
  --image registry.azurecr.io/ocr-app:latest \
  --target-port 80 \
  --ingress external \
  --min-replicas 0 \
  --max-replicas 30 \
  --cpu 2.0 \
  --memory 4Gi

Docker Configuration

FROM python:3.11-slim

# Place NDL OCR model
COPY model/ /app/model/
COPY config/ /app/config/

# Application setup
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 80

CMD ["gunicorn", "--bind", "0.0.0.0:80", "app:app"]

Operations and Monitoring

Performance Metrics

  • Response time: Average 2-3 seconds/image
  • Throughput: 10-15 images/second (with 20 replicas)
  • Success rate: Over 99%
  • Cost efficiency: $0 when idle, charged only during processing

Log Monitoring

# Check Container Apps logs
az containerapp logs show \
  --name ocr-service \
  --resource-group rg-ocr \
  --follow

Future Prospects

Technical Improvements

  • Image caching: Reduction of duplicate processing
  • Batch processing: Efficient bulk processing
  • GPU support: Acceleration of OCR processing
  • Enhanced metrics: Detailed performance analysis

Potential Applications

  • Digital archives: Use in libraries and museums
  • Research support: Digitization for humanities research
  • Education: Creating teaching materials from classical texts
  • Cultural preservation: Digital preservation of rare materials

Summary

By combining NDL Classical Japanese OCR Lite with Azure Container Apps, we built a classical text OCR system that achieves both cost efficiency and scalability. The serverless architecture enables pay-per-use billing and auto-scaling, making it a practical digital humanities tool.

Key Points

  • Cost optimization: Charged only during use
  • Auto-scaling: Resource adjustment based on demand
  • TEI P5-compliant: Standardized XML output
  • Classical text specialization: OCR optimized for Japanese classical texts
  • API design: Simple and extensible architecture

This system was developed as a technical proof of concept. In production use, please give adequate consideration to the load on target servers, apply appropriate rate limiting, and comply with terms of service.

References