
Comparing NDL Koten OCR-Lite and Cloud Vision API on a Jiaxing Tripitaka 'Mahāprajñāpāramitā Sūtra' — Observations across 105 Images
We applied two OCR engines, the National Diet Library of Japan's NDL Koten OCR-Lite and the Google Cloud Vision API's DOCUMENT_TEXT_DETECTION feature, to 105 IIIF images of fascicles 571–575 of the Mahāprajñāpāramitā Sūtra in the Jiaxing Tripitaka held by Yūrenja (formerly the Hōonzō of Zōjōji), and compared the error patterns in their outputs. NDL Koten OCR-Lite produced phantom kana lines on 12 pages; Cloud Vision read color charts, rulers, and shelf labels as if they were body text on all 105 images.
ocr, ndl-koten-ocr, google-vision-api, iiif
