Introduction
PureImage has two goals:
- decide whether an image is free of readable text (pure)
- make the image pure.
First, PureImage classifies each image into one of three categories: (1) clean, (2) text, or (3) identification.
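The three categories and the purity decision can be modelled as shown here. This is a minimal sketch, not PureImage's actual API: the `Category` enum and the `is_pure` helper are hypothetical names, and the sketch assumes that only the clean category counts as pure.

```python
from enum import Enum


class Category(Enum):
    """Hypothetical labels mirroring the three categories named above."""
    CLEAN = 1
    TEXT = 2
    IDENTIFICATION = 3


def is_pure(category: Category) -> bool:
    # Assumption: an image is pure only when it was classified as clean.
    return category is Category.CLEAN
```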
The second goal is under development. Recognized identification is to be removed by enabling the de-identification option in PureImage.
- It removes protected text from images by blanking them.
- It is designed for DICOM burned-in annotation de-identification.
- It specializes in DICOM files but also works with many other image file formats.
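Blanking can be illustrated as overwriting rectangular regions of the pixel grid with a constant value. The sketch below is an assumption-laden illustration, not PureImage's implementation: `blank_regions` and the `Box` layout are hypothetical, the bounding boxes are assumed to come from an earlier text-detection step, and the image is a plain list of rows to keep the example dependency-free.

```python
from typing import List, Tuple

# Hypothetical box layout: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def blank_regions(pixels: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of `pixels` with every box overwritten by `fill`."""
    out = [row[:] for row in pixels]  # copy so the input stays untouched
    height = len(out)
    width = len(out[0]) if out else 0
    for top, left, bottom, right in boxes:
        # Clamp each box to the image bounds so stray coordinates cannot
        # raise an IndexError or wrap around via negative indices.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                out[y][x] = fill
    return out


image = [[9] * 6 for _ in range(4)]             # toy 4x6 image, all pixels 9
cleaned = blank_regions(image, [(1, 2, 3, 5)])  # blank rows 1-2, columns 2-4
```

Returning a copy rather than editing in place keeps the original pixels available for comparison or audit.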
Results
The false positive rate is below 4.00% in all cases and is 1.81% for the mission-critical problem of detecting protected health information. The classifier's weighted average recall was 94.85%, its weighted average inverse recall was 97.42%, and its Cohen's kappa coefficient was 0.920.
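For readers who want to see how such figures are derived, the sketch below computes them from a multi-class confusion matrix. It rests on several assumptions: "inverse recall" is taken to mean the true negative rate (specificity), averages are weighted by the number of true examples per class, every class has at least one example, and `summarize` is a hypothetical helper. The matrix is made-up toy data, not PureImage's evaluation data.

```python
from typing import Dict, List


def summarize(cm: List[List[int]]) -> Dict[str, object]:
    """cm[i][j] = number of images of true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(k)]                       # true counts
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    recall, inv_recall = [], []
    for c in range(k):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        recall.append(tp / (tp + fn))      # true positive rate
        inv_recall.append(tn / (tn + fp))  # true negative rate

    weights = [s / total for s in row_sums]
    observed = sum(cm[c][c] for c in range(k)) / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_sums[c] * col_sums[c] for c in range(k)) / total ** 2
    return {
        "false_positive_rate": [1 - s for s in inv_recall],
        "weighted_recall": sum(w * r for w, r in zip(weights, recall)),
        "weighted_inverse_recall": sum(w * s for w, s in zip(weights, inv_recall)),
        "kappa": (observed - expected) / (1 - expected),
    }


# Toy matrix for illustration only (rows: true class, columns: predicted class).
toy = [[8, 1, 1],
       [1, 8, 1],
       [0, 2, 8]]
metrics = summarize(toy)
```

The per-class false positive rate is one minus that class's inverse recall, so a low false positive rate and a high inverse recall are the same result stated two ways.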
The classification of burned-in text is highly configurable and can analyze images with a noisy background from different modalities.