Introduction

FromThePage is a web platform for crowdsourced transcription of historical documents. It helps institutions manage the work of turning handwritten manuscripts and printed materials into text with the help of volunteers.

Adopted by libraries, museums, and archives worldwide, FromThePage has become a widely used tool for document digitization in Digital Humanities (DH).

Key Features of FromThePage

Crowdsourced Transcription

The defining feature of FromThePage is the ability to conduct large-scale transcription projects through crowdsourcing.

  • Transcription Workflow: Staged workflow of transcription, review, and approval
  • Rich Text Editor: Support for Markdown and Wiki markup
  • Version Control: Complete edit history retention
  • Volunteer Management: Contribution tracking, badge system
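
The staged workflow above can be pictured as a small state machine. The sketch below is illustrative only: the status names, the transition table, and the `advance` helper are assumptions made for this example, not FromThePage's internal model or API.

```python
from enum import Enum


class PageStatus(Enum):
    """Hypothetical page states mirroring the three stages named above."""
    TRANSCRIBED = "transcribed"
    IN_REVIEW = "in_review"
    APPROVED = "approved"


# Assumed transitions: pages move forward one stage at a time,
# and a reviewer may send a page back for more transcription.
TRANSITIONS = {
    PageStatus.TRANSCRIBED: {PageStatus.IN_REVIEW},
    PageStatus.IN_REVIEW: {PageStatus.APPROVED, PageStatus.TRANSCRIBED},
    PageStatus.APPROVED: set(),
}


def advance(current: PageStatus, target: PageStatus) -> PageStatus:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Modeling the stages explicitly makes it impossible for a page to jump from transcription straight to approval without a review step.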

IIIF Support

FromThePage supports IIIF (International Image Interoperability Framework) both for importing images and for publishing transcriptions.

Example IIIF manifest URL for import (example.org is a placeholder):

    https://example.org/iiif/manifest/12345

  • IIIF Manifest Import: Directly import images from external IIIF-compatible repositories
  • Mirador Integration: Seamless integration with the Mirador IIIF image viewer
  • IIIF Content Search API: Publish transcribed text via IIIF Search API
  • IIIF Manifest Export: Output manifests containing transcribed text
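
To make the manifest import more concrete, here is a minimal sketch of what a tool has to read from a manifest before it can show page images. It assumes the IIIF Presentation API 2.x layout (sequences, canvases, images); the `canvas_images` helper and the sample manifest are invented for illustration and are not part of FromThePage.

```python
import json


def canvas_images(manifest: dict) -> list[tuple[str, str]]:
    """Return (canvas label, image URL) pairs from a Presentation 2.x manifest."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                pairs.append((canvas.get("label", ""), image["resource"]["@id"]))
    return pairs


# Tiny hand-written manifest; every URL is a placeholder.
SAMPLE = json.loads("""
{
  "@id": "https://example.org/iiif/manifest/12345",
  "@type": "sc:Manifest",
  "sequences": [{
    "canvases": [{
      "@id": "https://example.org/iiif/canvas/1",
      "label": "Page 1",
      "images": [{"resource": {"@id": "https://example.org/iiif/image/1/full/full/0/default.jpg"}}]
    }]
  }]
}
""")

print(canvas_images(SAMPLE))
```

Manifests that follow Presentation API 3.0 use a different structure (`items` instead of `sequences`), so a real importer would need to handle both.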

Structured Data Input

Beyond simple transcription, FromThePage supports structured data entry.

| Feature | Description |
| --- | --- |
| Field-based Transcription | Collect structured data with custom forms |
| Metadata Entry | Record document metadata by field |
| Spreadsheet Transcription | Specialized for tabular data transcription |
| Markup Transcription | Transcription with TEI-XML and other markup |
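
One practical benefit of field-based and spreadsheet transcription is that the results can be checked like any other tabular data. The sketch below assumes the data has been saved as CSV; the column names (`Page`, `Name`, `Date`) and the `rows_missing` helper are hypothetical and do not reflect a fixed FromThePage schema.

```python
import csv
import io

# Made-up sample resembling a tabular export; column names are assumptions.
SAMPLE_CSV = """Page,Name,Date
1,John Smith,1862-04-03
1,Mary Jones,
2,Ann Lee,1862-05-11
"""


def rows_missing(field: str, text: str) -> list[dict]:
    """Return the rows in which the given field was left empty."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if not (row.get(field) or "").strip()]


print(rows_missing("Date", SAMPLE_CSV))
```

A check like this can flag entries that need a second look before the data is used for analysis.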

OCR Integration

In addition to manual transcription, FromThePage includes OCR (Optical Character Recognition) integration.

  • OCR Pre-processing: Use OCR output as initial text for transcription
  • OCR Correction Workflow: Volunteers correct OCR results
  • HTR Support: Integration with Handwritten Text Recognition
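
When OCR output is used as the starting text, it can be useful to estimate how much volunteers had to change. The sketch below uses Python's standard `difflib` for a rough character-level measure; the `correction_ratio` helper is an illustration, not a FromThePage feature.

```python
from difflib import SequenceMatcher


def correction_ratio(ocr_text: str, corrected: str) -> float:
    """Return 1 minus difflib's similarity ratio; 0.0 means the texts are identical."""
    return 1.0 - SequenceMatcher(None, ocr_text, corrected).ratio()


# Invented example: the OCR engine misread one letter.
print(correction_ratio("Dcar Sir, I rcceived your letter.",
                       "Dear Sir, I received your letter."))
```

Pages with a high ratio are candidates for full manual transcription rather than OCR correction.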

Adoption Examples

Library of Congress

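The Library of Congress runs By the People (formerly Transcribe), a crowdsourcing program in which volunteers transcribe tens of thousands of historical documents, including presidential letters, Civil War records, and women’s suffrage movement documents. By the People runs on Concordia, the Library's own open-source platform, rather than on FromThePage, so it is better read as a comparable project than as an adoption.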

Smithsonian Institution

The Smithsonian Transcription Center runs volunteer transcription projects across museum collections, covering natural history specimen labels, diaries, and field notes. It operates on a platform built by the Smithsonian rather than on FromThePage, and is likewise a comparable project.

Other Institutions

  • Huntington Library: Medieval manuscript transcription
  • Texas State Library: Historical legal document transcription
  • Various university libraries: Special collections digitization

Technical Features

Export Formats

Transcription results can be exported in the following formats.

  • TEI-XML: Standard markup format for humanities texts
  • Plain Text: Simple text output
  • HTML: For web publishing
  • CSV: Tabular output for structured data
  • IIIF Manifest: For display in IIIF-compatible viewers
  • ALTO XML: Text with page layout information
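
As an illustration of working with a TEI-XML export, the sketch below pulls plain paragraph text out of a TEI document using only the Python standard library. The sample document is hand-written for this example; real exports will contain a header and richer markup, and the `paragraphs` helper is an assumption, not part of FromThePage.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>Dear <persName>Mary</persName>, I arrived safely.</p>
    <p>Write soon.</p>
  </body></text>
</TEI>"""


def paragraphs(tei_xml: str) -> list[str]:
    """Return the text of each <p> in the TEI body, with inline tags stripped."""
    root = ET.fromstring(tei_xml)
    return [
        " ".join("".join(p.itertext()).split())  # collapse runs of whitespace
        for p in root.findall(".//tei:body//tei:p", TEI_NS)
    ]


print(paragraphs(SAMPLE_TEI))
```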

API

# FromThePage API usage example
curl -H "Accept: application/json" \
  https://fromthepage.com/api/v1/collections

The RESTful API enables programmatic retrieval of transcription data.
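
The same request can be made from a script. The sketch below is a Python equivalent of the curl call using only the standard library. The endpoint is copied from the example above and has not been verified here; the response shape is not assumed, and the helper names are invented for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example; treat it as an assumption.
COLLECTIONS_URL = "https://fromthepage.com/api/v1/collections"


def build_request(url: str = COLLECTIONS_URL) -> urllib.request.Request:
    """Build a GET request that asks for a JSON response."""
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_json(url: str = COLLECTIONS_URL, timeout: float = 10.0):
    """Send the request and decode the body as JSON (requires network access)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_json()` performs the network request; check the current FromThePage API documentation for the exact paths and any authentication requirements before relying on it.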

Pricing Plans

FromThePage offers multiple plans, including a free tier.

| Plan | Features |
| --- | --- |
| Free | Basic transcription features, public projects |
| Institutional | Private projects, customization, support |
| Enterprise | Large-scale projects, SLA, dedicated environment |

For small-scale projects and individual research, the free plan provides sufficient functionality.

Conclusion

FromThePage is a platform built for crowdsourced transcription of historical documents. With IIIF integration, structured data input, OCR integration, and a range of export formats, it covers most of what a DH transcription project needs.

Its adoption by libraries, museums, and archives suggests that it can handle large-scale transcription projects. Because a free plan is available, we recommend starting with a small project to try it out.