Introduction
I run Cantaloupe, an IIIF-compliant image server, in a Docker environment with S3 as the image source. IIIF viewers (such as Mirador and OpenSeadragon) generate dozens to hundreds of simultaneous tile requests every time the user zooms or pans.
By reviewing the cache settings and tuning parameters, I was able to speed up tile delivery by up to 7.6x. In this article, I share the methods and results.
Environment
- Server: AWS EC2 (2 vCPU, 7.6GB RAM)
- Cantaloupe: `islandora/cantaloupe:2.0.10` (based on Cantaloupe 5.0.7)
- Image Source: Amazon S3 (`S3Source`)
- Test Image: 25167×12483px TIFF (512×512 tiles)
- Reverse Proxy: Traefik v3.2
- Setup: Docker Compose
Problem: Cache Is Disabled by Default
After investigating the default settings of the islandora/cantaloupe image, I found the following state:
| Cache Type | Default | Description |
|---|---|---|
| Derivative Cache (processed images) | Disabled | Image conversion runs on every request, even for identical ones |
| Source Cache (local copy of originals) | Enabled (FilesystemCache) | Keeps a local copy of original images fetched from S3 |
| Info Cache (image metadata) | Enabled (in-memory) | Stores image dimensions and tile information |
| Client Cache (HTTP headers) | Enabled (max-age 30 days) | Controls browser-side caching |
The biggest issue is that Derivative Cache is disabled. Even when an IIIF viewer re-requests the same tile, the full pipeline of S3 download → image conversion → response runs every time.
Benchmark Method
Simple Bulk Tile Test
As a basic performance measurement, I ran a bulk tile benchmark under the following conditions:
- Number of tiles: 91 tiles (all tiles at zoom level 4, scaleFactor=4)
- Concurrent connections: 10 (typical browser concurrency)
- Tool: Parallel requests using `curl` + `xargs -P`
```shell
# Send the pre-generated tile URLs with 10 parallel workers
xargs -a tile_urls.txt -P 10 -I {} \
  curl -s -o /dev/null -w "%{time_total}\n" "{}"
```
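For reproducibility, `tile_urls.txt` can be derived directly from the image dimensions. The sketch below enumerates the 91 level-4 tile regions (scaleFactor=4, so each tile covers 2048×2048 source pixels). `BASE` and `IDENTIFIER` are placeholders for your server and image ID, and the exact size syntax may vary by IIIF Image API version:

```shell
#!/bin/sh
# Generate tile_urls.txt: all 91 tiles at scaleFactor=4 for the test image.
# BASE and IDENTIFIER are placeholders -- substitute your own values.
BASE="http://localhost:8182/iiif/3"
IDENTIFIER="sample.tif"
WIDTH=25167; HEIGHT=12483
TILE=512; SCALE=4
STEP=$((TILE * SCALE))   # each tile covers 2048x2048 source pixels

: > tile_urls.txt
y=0
while [ "$y" -lt "$HEIGHT" ]; do
  x=0
  while [ "$x" -lt "$WIDTH" ]; do
    w=$((WIDTH - x));  [ "$w" -gt "$STEP" ] && w=$STEP
    h=$((HEIGHT - y)); [ "$h" -gt "$STEP" ] && h=$STEP
    # IIIF Image API path: /{region}/{size}/{rotation}/{quality}.{format}
    echo "$BASE/$IDENTIFIER/$x,$y,$w,$h/$(( (w + SCALE - 1) / SCALE )),/0/default.jpg" >> tile_urls.txt
    x=$((x + STEP))
  done
  y=$((y + STEP))
done
wc -l < tile_urls.txt   # 91
```

With a 2048px step, the 25167px width yields 13 columns and the 12483px height yields 7 rows: 13 × 7 = 91 tiles, matching the benchmark.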
Mirador Simulation
In addition to the simple bulk tile test, I ran a benchmark that simulates the actual usage flow of the IIIF viewer Mirador. When a user opens an image in Mirador, the following requests are generated in a short period:
| Phase | Description | Concurrent Connections |
|---|---|---|
| Phase 1 | Fetch info.json + thumbnail | 2 |
| Phase 2 | Load initial viewport tiles (28 tiles, scaleFactor=8) | 6 |
| Phase 3 | Zoom-in operation (50 tiles, scaleFactor=2) | 6 |
| Phase 4 | Multiple simultaneous users (3 users × different regions, 24 tiles total) | 18 |
The concurrent connection count matches Chrome’s default (6 connections per host).
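The result tables below report per-tile average latency and P95. One minimal way to derive those figures from the per-request times that `curl -w "%{time_total}\n"` emits is a small awk summary (the `summarize` helper and the sample latencies here are illustrative, not part of the original harness; wall-clock total is measured separately, e.g. with `time`, since requests run concurrently):

```shell
#!/bin/sh
# Summarize per-request latencies (seconds, one per line) into avg and P95.
summarize() {
  sort -n | awk '
    { t[NR] = $1; sum += $1 }
    END {
      i = int((NR * 95 + 99) / 100)          # nearest-rank P95 index
      printf "count=%d avg=%.3fs p95=%.3fs\n", NR, sum / NR, t[i]
    }'
}

# Demo on five fixed latencies; in practice, pipe the benchmark output:
#   xargs -a tile_urls.txt -P 10 -I {} \
#     curl -s -o /dev/null -w "%{time_total}\n" "{}" | summarize
summarize <<EOF
0.120
0.150
0.140
0.248
0.156
EOF
```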
Step 1: Enabling Cache
Changes
I added the following to .env:
```shell
# Derivative Cache (cache for processed images)
CANTALOUPE_CACHE_SERVER_DERIVATIVE_ENABLED=true
CANTALOUPE_CACHE_SERVER_DERIVATIVE=FilesystemCache
CANTALOUPE_CACHE_SERVER_DERIVATIVE_TTL_SECONDS=2592000
# Source Cache (local cache for original images)
CANTALOUPE_CACHE_SERVER_SOURCE=FilesystemCache
CANTALOUPE_CACHE_SERVER_SOURCE_TTL_SECONDS=2592000
# Cache Worker (automatic cleanup of expired cache, 24-hour interval)
CANTALOUPE_CACHE_SERVER_WORKER_ENABLED=true
CANTALOUPE_CACHE_SERVER_WORKER_INTERVAL=86400
# FilesystemCache storage path
CANTALOUPE_FILESYSTEMCACHE_PATHNAME=/data
```
In docker-compose.prod.yml, I added cache persistence and increased memory allocation:
```yaml
services:
  cantaloupe:
    deploy:
      resources:
        limits:
          memory: 2G        # Increased from 1G to 2G
        reservations:
          memory: 1G        # Increased from 512M to 1G
    volumes:
      - cantaloupe_cache:/data  # Persist cache
volumes:
  cantaloupe_cache:
```
Results
| Scenario | Total Time (91 tiles) | Avg/Tile | P95 |
|---|---|---|---|
| Before change (no cache) | 12,240ms | 1.277s | 2.769s |
| After change, first access (cache write) | 38,557ms | 4.132s | 10.420s |
| After change, subsequent access (cache hit) | 1,991ms | 0.156s | 0.248s |
The first access is slower due to cache write overhead, but subsequent accesses are approximately 6x faster.
Step 2: Optimizing S3 Chunking, Processor, and JVM
Changes
I added the following to .env:
```shell
# S3 chunking optimization (read buffer from S3)
CANTALOUPE_S3SOURCE_CHUNKING_ENABLED=true
CANTALOUPE_S3SOURCE_CHUNKING_CHUNK_SIZE=2M # 512K → 2M
CANTALOUPE_S3SOURCE_CHUNKING_CACHE_ENABLED=true
CANTALOUPE_S3SOURCE_CHUNKING_CACHE_MAX_SIZE=50M # 5M → 50M
# Switch TIF processing to TurboJpegProcessor (faster via native library)
CANTALOUPE_PROCESSOR_MANUALSELECTIONSTRATEGY_TIF=TurboJpegProcessor
CANTALOUPE_PROCESSOR_SELECTION_STRATEGY=ManualSelectionStrategy
# JVM heap tuning
JAVA_OPTS=-Xmx1280m -Xms512m -XX:+UseG1GC
```
The rationale for each setting is as follows:
| Setting | Before | After | Purpose |
|---|---|---|---|
| Chunk size | 512KB | 2MB | Reduce the number of S3 requests by approximately 1/4 |
| Chunking cache | 5MB | 50MB | Keep source data in memory to avoid re-downloads |
| TIF processor | Java2dProcessor | TurboJpegProcessor | Faster JPEG output via native library |
| JVM GC | Default | G1GC + 1280MB heap | Reduce GC frequency and improve stability |
Results
| Scenario | Total Time (91 tiles) | Avg/Tile | P95 |
|---|---|---|---|
| Cold (first access, no cache at all) | 3,338ms | 0.321s | 1.004s |
| Semi-warm (no disk cache, memory cache available) | 1,602ms | 0.114s | 0.192s |
| Warm (cache hit) | 1,896ms | 0.140s | 0.229s |
Compared to the cold access time in Step 1 (38.6 seconds), cold access became approximately 11.5x faster.
Overall Comparison
| Phase | Cold | Warm |
|---|---|---|
| Initial state (cache disabled) | 12,240ms | — (same every time) |
| Step 1: Cache enabled | 38,557ms | 1,991ms |
| Step 2: + Tuning | 3,338ms | 1,602ms (semi-warm) |
Final improvement:
- Warm access: approximately 7.6x faster (12.2s → 1.6s)
- Cold access from Step 1 to Step 2: approximately 11.5x faster (38.6s → 3.3s)
- Average per tile: approximately 9x faster (1.28s → 0.14s)
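As a sanity check, the quoted ratios can be recomputed from the raw table values:

```shell
#!/bin/sh
# Recompute the speedup ratios from the measured values in the tables above.
awk 'BEGIN {
  printf "warm:     %.2fx (12240 ms / 1602 ms)\n", 12240 / 1602
  printf "cold:     %.2fx (38557 ms / 3338 ms)\n", 38557 / 3338
  printf "per tile: %.2fx (1.277 s / 0.140 s)\n", 1.277 / 0.140
}'
```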
Resource Usage
Even after optimization, there was no significant change in CPU or memory usage.
| State | CPU | Memory |
|---|---|---|
| Idle | 0.1% | 656MB / 2GB (32%) |
| Under load (91 tiles concurrently) | 5% | 657MB / 2GB (32%) |
Since FilesystemCache is disk-based, it has the advantage of not increasing memory consumption.
Caveats
Cache Disk Space
FilesystemCache has no size limit setting. It is managed via TTL (30 days) and Cache Worker (automatic cleanup at 24-hour intervals), but with a large number of images, it may consume significant disk space. It is recommended to periodically check usage with `docker exec <container> du -sh /data`.
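To get a rough feel for per-image cache growth, the maximum number of cached tiles can be computed from the image dimensions. The sketch below assumes scale factors 1–16 and an average of 50KB per cached JPEG tile; both are illustrative assumptions, not measured values:

```shell
#!/bin/sh
# Estimate derivative-cache entries for the 25167x12483 test image:
# tiles per zoom level = ceil(width/scale/512) * ceil(height/scale/512).
# The 50KB-per-tile figure is an assumption, not a measurement.
awk 'BEGIN {
  w = 25167; h = 12483; tile = 512; total = 0
  for (s = 1; s <= 16; s *= 2) {
    cols = int((w / s + tile - 1) / tile)
    rows = int((h / s + tile - 1) / tile)
    total += cols * rows
  }
  printf "tiles=%d  ~%.0f MB at 50KB/tile\n", total, total * 50 / 1024
}'
```

Note that the scaleFactor=4 level contributes exactly the 91 tiles used in the benchmark, and scaleFactor=8 the 28 tiles of the Mirador Phase 2 viewport.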
Cache Persistence
If you do not configure volumes in docker-compose.yml, all cache will be lost when the container restarts. Make sure to set up a named volume (cantaloupe_cache:/data).
Future Work
Adding an Nginx Reverse Proxy Cache
Traefik v3.2, which I currently use, does not have native HTTP response caching. By adding an Nginx reverse proxy cache in front of Cantaloupe, responses can be served from cache before reaching Cantaloupe.
Client → Traefik → Nginx (cache) → Cantaloupe → S3
This means that on cache hits, no load reaches the Cantaloupe process at all, enabling further speed improvements and higher concurrency. This is expected to be particularly effective for public collections where access to the same images is concentrated.
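A minimal sketch of such a proxy cache, assuming Cantaloupe is reachable as `cantaloupe:8182` on the Compose network (the zone name, sizes, and TTLs are illustrative and should be adjusted per deployment):

```nginx
# Tile cache in front of Cantaloupe; matches the 30-day TTL used above.
proxy_cache_path /var/cache/nginx/iiif levels=1:2 keys_zone=iiif_cache:50m
                 max_size=5g inactive=30d use_temp_path=off;

server {
    listen 80;

    location /iiif/ {
        proxy_pass http://cantaloupe:8182;
        proxy_cache iiif_cache;
        proxy_cache_valid 200 30d;
        proxy_cache_use_stale error timeout updating;
        proxy_cache_lock on;   # collapse concurrent misses for the same tile
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```

With `proxy_cache_lock`, a burst of identical tile requests on a cold cache results in a single upstream fetch, which suits the IIIF access pattern well.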
Conclusion
In the default Cantaloupe configuration, Derivative Cache is disabled. When using S3 as a source, this means every request triggers a download and image conversion, which is highly inefficient. The following two-step optimization achieved significant performance improvements:
- Enabling cache (Derivative Cache + Source Cache + Cache Worker)
- Parameter tuning (increased S3 chunking + TurboJpegProcessor + JVM heap adjustment)
In terms of the actual experience in an IIIF viewer, the initial display takes about 3 seconds, and zoom/pan operations on subsequent accesses respond almost instantly.