Overview
This is a memo about issues I encountered when running ndlocr_cli (the NDLOCR (ver.2.1) application repository) and the steps I took to resolve them.
Note that many of these issues were caused by my own configuration oversights or atypical usage, and are unlikely to occur in normal use. Please refer to this article if you run into similar problems.
Shared Memory Shortage
When running ndlocr_cli, the following error occurred.
Predicting: 0it [00:00, ?it/s]ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
DataLoader worker (pid(s) 3999) exited unexpectedly
ChatGPT's response was as follows.
The "Unexpected bus error encountered in worker" error typically occurs when shared memory is insufficient while using PyTorch's DataLoader. It is especially common when the dataset is large or many workers are used.
And the following instructions were given.
If you are using Docker or another virtual environment, you need to increase the shared memory size. When using Docker, set the
--shm-size option when starting the container. For example, set it as docker run --shm-size 2G ....
When I checked my Docker launch command, the --shm-size option was indeed missing. The following script specifies --shm-size=256m.
https://github.com/ndl-lab/ndlocr_cli/blob/master/docker/run_docker.sh
After adding this option, the shared memory shortage error was resolved.
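As a sketch, the change amounts to adding the option to the docker run invocation. The image name, container name, and other options below are placeholders, not the actual contents of run_docker.sh; the value 2g is an example and should be sized to your dataset and worker count.

```shell
# Hypothetical docker run invocation with shared memory enlarged.
# Only --shm-size is the point here; everything else is a placeholder.
docker run -d --rm \
  --name ocr_cli_container \
  --shm-size=2g \
  ocr-image:latest
```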
(Reference) Checking Current Shared Memory Size
This can be checked with the following command.
df -h /dev/shm
When the above error occurred, the size was 64M.
KeyError: 'STRING'
I encountered KeyError: 'STRING' several times. To address this, I made changes to the following two files.
https://github.com/ndl-lab/ndlocr_cli/blob/master/cli/core/inference.py#L681
Errors were occurring at the line_xml.attrib['STRING'] and elm.attrib['STRING'] accesses, so I added a guard like the following (shown for line_xml; the elm access needs the same treatment).
if 'STRING' not in line_xml.attrib:
    continue
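The guard above can be illustrated with a self-contained sketch. The XML fragment below only mimics the general shape of NDLOCR's output (LINE elements carrying the recognized text in a STRING attribute) and is not taken from the actual format specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: LINE elements normally carry a STRING attribute
# with the recognized text, but some may lack it, which is what triggers
# KeyError: 'STRING' in the unguarded code.
XML = """
<PAGE>
  <LINE STRING="first line" />
  <LINE />
  <LINE STRING="second line" />
</PAGE>
"""

def collect_strings(page):
    texts = []
    for line_xml in page.iter('LINE'):
        # Skip lines without a recognition result instead of raising.
        if 'STRING' not in line_xml.attrib:
            continue
        texts.append(line_xml.attrib['STRING'])
    return texts

page = ET.fromstring(XML)
print(collect_strings(page))  # ['first line', 'second line']
```

The same pattern (check membership in .attrib, or use .get('STRING')) applies wherever the attribute is read.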
Reference: Adding a Progress Bar
In some cases I wanted to display a progress bar during OCR processing. Modify the following section.
https://github.com/ndl-lab/ndlocr_cli/blob/master/cli/core/inference.py#L213
Specifically, add tqdm as follows.
from tqdm import tqdm
# for img_path in single_outputdir_data['img_list']:
for img_path in tqdm(single_outputdir_data['img_list']):
...
This allows you to check the current progress and estimated remaining time.
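As a self-contained sketch of the change above, the loop can be wrapped in tqdm like this. The image list and the per-image step are placeholders, not NDLOCR's actual inference code.

```python
# Wrap an iterable in tqdm to get a progress bar with an ETA.
try:
    from tqdm import tqdm
except ImportError:  # degrade gracefully if tqdm is not installed
    def tqdm(iterable, **kwargs):
        return iterable

img_list = ['page_001.jpg', 'page_002.jpg', 'page_003.jpg']  # placeholder

processed = []
for img_path in tqdm(img_list, desc='OCR'):
    # ... run inference on img_path here; we just record the name ...
    processed.append(img_path)

print(processed)
```

tqdm prints the bar to stderr, so it does not interfere with output that the script writes to stdout.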
Summary
When using ndlocr_cli in the standard way, the error handling described in this article is likely unnecessary, but I hope it serves as a useful reference if you encounter similar issues.