Overview

I built a Gradio app that runs NDL Kotenseki OCR-Lite.

You can try it at the following URL.

https://huggingface.co/spaces/nakamura196/ndlkotenocr-lite

“NDL Kotenseki OCR-Lite” already ships with a desktop application, so it can be run without a web app such as this Gradio one.

The intended use cases for this web app are therefore access from smartphones and tablets, and integration with other tools through a web API.

Development Notes and Bug Fixes

Using Submodules

The original ndlkotenocr-lite repository is included as a Git submodule, declared in .gitmodules as follows.

[submodule "ndlkotenocr-lite"]
	path = ndlkotenocr-lite
	url = https://github.com/ndl-lab/ndlkotenocr-lite.git

The following script is executed during the build.

#!/bin/bash
# Initialize and update submodule
git submodule update --init --recursive
git submodule update --remote

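The first command checks out the commit pinned in the parent repository, and the second, with --remote, then moves the submodule to the latest commit on its tracked remote branch. This should allow the latest files from the original ndlkotenocr-lite to be used during the build.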

(There may be some misunderstandings on my part.)
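Since I am not fully sure about the submodule behavior, a small check can confirm that the submodule directory actually contains files before the app starts. The following is a minimal sketch, not part of the actual repository: the function name submodule_is_populated is hypothetical, and the demonstration uses a temporary directory rather than a real checkout.

```python
import tempfile
from pathlib import Path


def submodule_is_populated(repo_root: Path, name: str = "ndlkotenocr-lite") -> bool:
    """Return True if the submodule directory exists and is not empty."""
    submodule_dir = repo_root / name
    # A submodule that was never initialized is usually an empty (or missing) directory.
    return submodule_dir.is_dir() and any(submodule_dir.iterdir())


# Demonstration against a throwaway directory instead of a real checkout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ndlkotenocr-lite").mkdir()
    print(submodule_is_populated(root))  # directory exists but is still empty
    (root / "ndlkotenocr-lite" / "README.md").write_text("placeholder")
    print(submodule_is_populated(root))  # directory now has content
```

In a real build, calling submodule_is_populated(Path.cwd()) at startup would make a missing submodule fail early instead of surfacing later as an import error.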

Using Dockerfile

To use the submodule, I adopted a Dockerfile-based build.

Setting sdk to docker in the metadata block at the top of the Space's README.md makes the build run from the Dockerfile.

---
title: NDL Kotenseki OCR-Lite Gradio App
emoji: 👀
colorFrom: red
colorTo: blue
sdk: docker
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Using Gradio Version 4.44.1

Initially I used Gradio version 5.7.1, but the following error occurred when calling the API (see the API Usage section below).

ValueError: Could not fetch api info for https://nakamura196-ndlkotenocr-lite.hf.space/: {"detail":"Not Found"}

Switching to version 4.44.1 resolved this error.
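To notice an accidental upgrade, the installed Gradio version can be checked at startup. This is only a sketch: the helper names major_version and gradio_is_4x are hypothetical, and treating every 4.x release as compatible is an assumption, since only 4.44.1 was actually confirmed to work.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Extract the major component from a version string such as '4.44.1'."""
    return int(version.split(".")[0])


def gradio_is_4x(version: str) -> bool:
    """Assumption: any 4.x release behaves like the confirmed 4.44.1."""
    return major_version(version) == 4


# Look up the installed gradio package, if there is one.
try:
    installed = metadata.version("gradio")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("gradio is not installed")
elif not gradio_is_4x(installed):
    print(f"warning: gradio {installed} found; only 4.44.1 was confirmed to work")
else:
    print(f"gradio {installed} is in the 4.x series")
```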

API Usage

Below is an example using “The Tale of Genji” (University of Tokyo General Library).

from gradio_client import Client, handle_file

# Connect to the API exposed by the Space
client = Client("https://nakamura196-ndlkotenocr-lite.hf.space/")

# Pass an IIIF image URL to the /predict endpoint
result = client.predict(
    image_path=handle_file('https://iiif.dl.itc.u-tokyo.ac.jp/iiif/genji/TIFF/A00_6587/01/01_0004.tif/full/900,/0/default.jpg'),
    api_name="/predict",
)
print(result)
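The image URL passed to handle_file follows the IIIF Image API pattern {base}/{region}/{size}/{rotation}/{quality}.{format}, where a size of "900," requests a width of 900 pixels with the aspect ratio preserved. The following sketch rebuilds that URL so another width can be requested; the helper name iiif_image_url is hypothetical and is not part of gradio_client or the app.

```python
def iiif_image_url(base: str, width: int, region: str = "full",
                   rotation: int = 0, quality: str = "default",
                   fmt: str = "jpg") -> str:
    """Build an IIIF Image API URL: {base}/{region}/{size}/{rotation}/{quality}.{format}."""
    size = f"{width},"  # "w," scales to the given width and keeps the aspect ratio
    return f"{base.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Base URI of the image used in the example above
base = "https://iiif.dl.itc.u-tokyo.ac.jp/iiif/genji/TIFF/A00_6587/01/01_0004.tif"
print(iiif_image_url(base, 900))
```

A larger width gives the OCR more detail to work with but also increases the transfer and processing time.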

client.predict returns a tuple of four values: the path to the result image, the recognized text, the XML output, and the JSON data.

('/private/var/folders/z5/3p9s8m011dv0tjt7ch21jtmh0000gn/T/gradio/14190ea20e00e24f50c0ba0f1efd9f411747a6bb7f9f4067404d94c1fcb681f9/image.webp',
 'ใ„ใคใ‚Œใฎๅพกๆ™‚ใ‚ˆใ‚Šใ‹ๅฅณไปๆ›ด่กฃใ‚ใพใŸใ•ใตใ‚‰ใฒ็ตฆใ‘ใ‚‹\nไธญใซใ„ใจใ‚„ใ‚“ใ”ใจใชใใ‚ใฏใซใฏใ‚ใ‚‰ใฌใ‹ใ™ใใ‚Œใใจ\nใใ‚ใ็ตฆใตใ‚ใ‚Šใ‘ใ‚Šใฏใ—ใ‚ใ‚ˆใ‚Šๆˆ‘ใ„ใจๆ€ใฒ\nใ‚ใ‹ใ‚Š็ตฆใธใ‚‹ๅพกใ‹ใŸใ€ณใ€ตใ‚ใ–ใพใ—ใใ‚‚ใฎใซใ‚’ใจใ—\nใ‚ใใญใฟๆง˜ใŠใชใ—ใปใจใใ‚Œไธ‹ใ‚‰ใ†ใฎๆ›ด่กฃใŸใก\nใฏใพใ—ใฆใ‚„ใ™ใ‹ใ‚‰ใ™ๆœๅค•ใฎๅฎฎใคใ‹ใธใซใคใ‘ใฆใ‚‚\nไบบใฎๅฟƒใ‚’ใ†ใ”ใ‹ใ—ใ†ใ‚‰ใฟใ‚’ใŠใตใคใ‚‚ใ‚Šใซใ‚„ใ‚ใ‚Š\nใ‘ใ‚€ใ„ใจใ‚ใคใ—ใใชใ‚Š่กŒใ‚‚ใฎๅฟƒใปใใ‘ใซใ•ใจใ‹ใก\nใชใ‚‹ใ‚’ใ„ใ‚ˆใ€ณใ€ตใ‚ใ‹ใ™ๅ“€ใชใ‚‹ๆฃšใซใŠใปใ‚ใ—ใฆ\nไบบใฎใใ—ใ‚Šใ‚’ใ‚‚ใˆใฏใ‚ใ‹ใ‚‰ใ›็ตฆใฏใ™ไธ–ใฎใŸใ‚ใ—',
 '<?xml version="1.0" ?>\n<OCRDATASET>\n\t<PAGE IMAGENAME="default.jpeg" WIDTH="900" HEIGHT="676">\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="433" Y="169" WIDTH="29" HEIGHT="365" CONF="0.845" ORDER="0" STRING="ใ„ใคใ‚Œใฎๅพกๆ™‚ใ‚ˆใ‚Šใ‹ๅฅณไปๆ›ด่กฃใ‚ใพใŸใ•ใตใ‚‰ใฒ็ตฆใ‘ใ‚‹"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="401" Y="169" WIDTH="29" HEIGHT="364" CONF="0.814" ORDER="1" STRING="ไธญใซใ„ใจใ‚„ใ‚“ใ”ใจใชใใ‚ใฏใซใฏใ‚ใ‚‰ใฌใ‹ใ™ใใ‚Œใใจ"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="372" Y="163" WIDTH="27" HEIGHT="377" CONF="0.812" ORDER="2" STRING="ใใ‚ใ็ตฆใตใ‚ใ‚Šใ‘ใ‚Šใฏใ—ใ‚ใ‚ˆใ‚Šๆˆ‘ใ„ใจๆ€ใฒ"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="342" Y="162" WIDTH="29" HEIGHT="378" CONF="0.841" ORDER="3" STRING="ใ‚ใ‹ใ‚Š็ตฆใธใ‚‹ๅพกใ‹ใŸใ€ณใ€ตใ‚ใ–ใพใ—ใใ‚‚ใฎใซใ‚’ใจใ—"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="312" Y="169" WIDTH="27" HEIGHT="365" CONF="0.819" ORDER="4" STRING="ใ‚ใใญใฟๆง˜ใŠใชใ—ใปใจใใ‚Œไธ‹ใ‚‰ใ†ใฎๆ›ด่กฃใŸใก"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="279" Y="162" WIDTH="28" HEIGHT="379" CONF="0.835" ORDER="5" STRING="ใฏใพใ—ใฆใ‚„ใ™ใ‹ใ‚‰ใ™ๆœๅค•ใฎๅฎฎใคใ‹ใธใซใคใ‘ใฆใ‚‚"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="248" Y="162" WIDTH="31" HEIGHT="378" CONF="0.845" ORDER="6" STRING="ไบบใฎๅฟƒใ‚’ใ†ใ”ใ‹ใ—ใ†ใ‚‰ใฟใ‚’ใŠใตใคใ‚‚ใ‚Šใซใ‚„ใ‚ใ‚Š"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="220" Y="170" WIDTH="27" HEIGHT="362" CONF="0.842" ORDER="7" STRING="ใ‘ใ‚€ใ„ใจใ‚ใคใ—ใใชใ‚Š่กŒใ‚‚ใฎๅฟƒใปใใ‘ใซใ•ใจใ‹ใก"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="189" Y="162" WIDTH="28" HEIGHT="378" CONF="0.830" ORDER="8" STRING="ใชใ‚‹ใ‚’ใ„ใ‚ˆใ€ณใ€ตใ‚ใ‹ใ™ๅ“€ใชใ‚‹ๆฃšใซใŠใปใ‚ใ—ใฆ"/>\n\t\t<LINE TYPE="ๆœฌๆ–‡" X="158" Y="169" WIDTH="28" HEIGHT="363" CONF="0.844" ORDER="9" STRING="ไบบใฎใใ—ใ‚Šใ‚’ใ‚‚ใˆใฏใ‚ใ‹ใ‚‰ใ›็ตฆใฏใ™ไธ–ใฎใŸใ‚ใ—"/>\n\t</PAGE>\n</OCRDATASET>\n',
 {'contents': [[{'boundingBox': [[433, 169],
      [433, 534],
      [462, 169],
      [462, 534]],
     'id': 0,
     'isVertical': 'true',
     'text': 'ใ„ใคใ‚Œใฎๅพกๆ™‚ใ‚ˆใ‚Šใ‹ๅฅณไปๆ›ด่กฃใ‚ใพใŸใ•ใตใ‚‰ใฒ็ตฆใ‘ใ‚‹',
     'isTextline': 'true',
     'confidence': 0.845},
    {'boundingBox': [[401, 169], [401, 533], [430, 169], [430, 533]],
     'id': 1,
     'isVertical': 'true',
     'text': 'ไธญใซใ„ใจใ‚„ใ‚“ใ”ใจใชใใ‚ใฏใซใฏใ‚ใ‚‰ใฌใ‹ใ™ใใ‚Œใใจ',
     'isTextline': 'true',
     'confidence': 0.814},
    {'boundingBox': [[372, 163], [372, 540], [399, 163], [399, 540]],
     'id': 2,
     'isVertical': 'true',
     'text': 'ใใ‚ใ็ตฆใตใ‚ใ‚Šใ‘ใ‚Šใฏใ—ใ‚ใ‚ˆใ‚Šๆˆ‘ใ„ใจๆ€ใฒ',
     'isTextline': 'true',
     'confidence': 0.812},
    {'boundingBox': [[342, 162], [342, 540], [371, 162], [371, 540]],
     'id': 3,
     'isVertical': 'true',
     'text': 'ใ‚ใ‹ใ‚Š็ตฆใธใ‚‹ๅพกใ‹ใŸใ€ณใ€ตใ‚ใ–ใพใ—ใใ‚‚ใฎใซใ‚’ใจใ—',
     'isTextline': 'true',
     'confidence': 0.841},
    {'boundingBox': [[312, 169], [312, 534], [339, 169], [339, 534]],
     'id': 4,
     'isVertical': 'true',
     'text': 'ใ‚ใใญใฟๆง˜ใŠใชใ—ใปใจใใ‚Œไธ‹ใ‚‰ใ†ใฎๆ›ด่กฃใŸใก',
     'isTextline': 'true',
     'confidence': 0.819},
    {'boundingBox': [[279, 162], [279, 541], [307, 162], [307, 541]],
     'id': 5,
     'isVertical': 'true',
     'text': 'ใฏใพใ—ใฆใ‚„ใ™ใ‹ใ‚‰ใ™ๆœๅค•ใฎๅฎฎใคใ‹ใธใซใคใ‘ใฆใ‚‚',
     'isTextline': 'true',
     'confidence': 0.835},
    {'boundingBox': [[248, 162], [248, 540], [279, 162], [279, 540]],
     'id': 6,
     'isVertical': 'true',
     'text': 'ไบบใฎๅฟƒใ‚’ใ†ใ”ใ‹ใ—ใ†ใ‚‰ใฟใ‚’ใŠใตใคใ‚‚ใ‚Šใซใ‚„ใ‚ใ‚Š',
     'isTextline': 'true',
     'confidence': 0.845},
    {'boundingBox': [[220, 170], [220, 532], [247, 170], [247, 532]],
     'id': 7,
     'isVertical': 'true',
     'text': 'ใ‘ใ‚€ใ„ใจใ‚ใคใ—ใใชใ‚Š่กŒใ‚‚ใฎๅฟƒใปใใ‘ใซใ•ใจใ‹ใก',
     'isTextline': 'true',
     'confidence': 0.842},
    {'boundingBox': [[189, 162], [189, 540], [217, 162], [217, 540]],
     'id': 8,
     'isVertical': 'true',
     'text': 'ใชใ‚‹ใ‚’ใ„ใ‚ˆใ€ณใ€ตใ‚ใ‹ใ™ๅ“€ใชใ‚‹ๆฃšใซใŠใปใ‚ใ—ใฆ',
     'isTextline': 'true',
     'confidence': 0.83},
    {'boundingBox': [[158, 169], [158, 532], [186, 169], [186, 532]],
     'id': 9,
     'isVertical': 'true',
     'text': 'ไบบใฎใใ—ใ‚Šใ‚’ใ‚‚ใˆใฏใ‚ใ‹ใ‚‰ใ›็ตฆใฏใ™ไธ–ใฎใŸใ‚ใ—',
     'isTextline': 'true',
     'confidence': 0.844}]],
  'imginfo': {'img_width': 900,
   'img_height': 676,
   'img_path': 'default.jpeg',
   'img_name': 'default.jpeg'}})
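The XML and JSON outputs carry the same line geometry in two forms: the XML gives X, Y, WIDTH, and HEIGHT, while the JSON gives four corner points. The following is a minimal sketch of reading both with the standard library. The sample data is a shortened, hypothetical stand-in with placeholder strings, shaped like the output above, and the function names are my own.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample in the same shape as the XML output above.
sample_xml = (
    '<?xml version="1.0" ?>\n<OCRDATASET>\n'
    '\t<PAGE IMAGENAME="default.jpeg" WIDTH="900" HEIGHT="676">\n'
    '\t\t<LINE X="401" Y="169" WIDTH="29" HEIGHT="364" CONF="0.814" ORDER="1" STRING="line-b"/>\n'
    '\t\t<LINE X="433" Y="169" WIDTH="29" HEIGHT="365" CONF="0.845" ORDER="0" STRING="line-a"/>\n'
    '\t</PAGE>\n</OCRDATASET>\n'
)

# Hypothetical sample in the same shape as the JSON output above.
sample_json = {
    "contents": [[
        {"boundingBox": [[433, 169], [433, 534], [462, 169], [462, 534]],
         "id": 0, "text": "line-a", "confidence": 0.845},
    ]],
}


def lines_from_xml(xml_text):
    """Return the LINE elements as dicts, sorted by reading order."""
    root = ET.fromstring(xml_text)
    lines = []
    for el in root.iter("LINE"):
        lines.append({
            "order": int(el.get("ORDER")),
            "text": el.get("STRING"),
            "conf": float(el.get("CONF")),
            "box": tuple(int(el.get(k)) for k in ("X", "Y", "WIDTH", "HEIGHT")),
        })
    return sorted(lines, key=lambda line: line["order"])


def box_from_corners(corners):
    """Convert four [x, y] corner points into (x, y, width, height)."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


for line in lines_from_xml(sample_xml):
    print(line["order"], line["text"], line["box"])

# "contents" is a list of lists, so take the first inner list of lines.
first = sample_json["contents"][0][0]
print(box_from_corners(first["boundingBox"]))
```

With the real result, the same functions would be applied to result[2] (the XML string) and result[3] (the JSON data).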

Development

The repository includes a docker-compose.yml, so the app can also be run locally for development or deployed to a production environment outside of Hugging Face Spaces.

Summary

I am grateful that “NDL Kotenseki OCR-Lite” was released as OSS.

I am not very familiar with Docker-based development, so there may be inaccuracies, but I hope this serves as a helpful reference.