Introduction

The goals of PureImage are:

  1. decide whether an image is free of readable text (pure)
  2. make the image pure by blanking the text

First, PureImage classifies each image into one of three categories: (1) clean, (2) text, or (3) identification.

The second goal is under development. Recognized identification can be removed by enabling the de-identification option in PureImage.

  • It removes protected text from images by blanking it out.
  • It is designed for DICOM burned-in annotation de-identification.
  • It specializes in DICOM files but also works with many other image file formats.
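
The blanking step described above can be illustrated with a minimal sketch. It is not PureImage's actual implementation: the function name `blank_regions`, the box format, and the fill value are all hypothetical, and the image is assumed to be a grayscale pixel grid stored as a list of rows, with text bounding boxes already detected by some earlier step.

```python
from typing import List, Tuple

# Hypothetical box format: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def blank_regions(pixels: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of a grayscale image with every box overwritten by `fill`."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    result = [row[:] for row in pixels]  # copy so the input image stays untouched
    for top, left, bottom, right in boxes:
        # Clamp the box to the image bounds so oversized boxes are safe.
        top, bottom = max(0, top), min(height, bottom)
        left, right = max(0, left), min(width, right)
        for y in range(top, bottom):
            for x in range(left, right):
                result[y][x] = fill
    return result


# Toy 4x6 image in which every pixel has the value 9.
image = [[9] * 6 for _ in range(4)]
# One assumed text box covering rows 0-1 and columns 1-3.
cleaned = blank_regions(image, [(0, 1, 2, 4)])
```

Returning a copy rather than modifying the input keeps the original pixel data available for comparison or auditing; a real pipeline working on large DICOM frames might prefer in-place array slicing instead.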

Results

The false positive rate is below 4.00% in all cases, and is 1.81% for the mission-critical problem of detecting protected health information. The classifier's weighted average recall was 94.85%, the weighted average inverse recall (specificity) was 97.42%, and Cohen's kappa coefficient was 0.920.
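
For readers unfamiliar with these metrics, the following sketch shows how they are commonly computed from a multi-class confusion matrix. The function name `summary_metrics` and the toy matrix are illustrative assumptions only; the numbers are not PureImage's evaluation data. The sketch assumes that rows hold true labels, columns hold predicted labels, and that at least two classes occur in the data.

```python
from typing import Dict, List


def summary_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Weighted recall, weighted inverse recall and Cohen's kappa for a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    support = [sum(row) for row in cm]                                # true count per class
    predicted = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # predicted count per class

    weighted_recall = 0.0
    weighted_inverse_recall = 0.0
    for i in range(n):
        tp = cm[i][i]
        fn = support[i] - tp
        fp = predicted[i] - tp
        tn = total - tp - fn - fp
        weight = support[i] / total                     # weight each class by its share of samples
        weighted_recall += weight * tp / (tp + fn)
        weighted_inverse_recall += weight * tn / (tn + fp)

    observed = sum(cm[i][i] for i in range(n)) / total                      # observed agreement
    expected = sum(support[i] * predicted[i] for i in range(n)) / total**2  # chance agreement
    kappa = (observed - expected) / (1 - expected)

    return {
        "weighted_recall": weighted_recall,
        "weighted_inverse_recall": weighted_inverse_recall,
        "kappa": kappa,
    }


# Made-up 3-class matrix in the order clean, text, identification.
toy_cm = [
    [8, 1, 1],
    [1, 8, 1],
    [0, 2, 8],
]
metrics = summary_metrics(toy_cm)
```

Under this weighting, the weighted average recall equals overall accuracy, because each class recall is scaled by the class's share of the samples. The false positive rate for a class is one minus its inverse recall.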

The classification of burned-in text is highly configurable and can analyze images with noisy backgrounds from different modalities.