Usage

AnonMed is a command-line Java application. Once you have downloaded and extracted AnonMed, you can run it in one of these ways:

  • through the anonmed command, when the deb package is installed (on Debian GNU/Linux);
  • through the prepared anonmed.sh script, when you use the ZIP package;
  • directly from the command line, following its syntax, if you want to change e.g. the output directory.

The most important part of the configuration is the de-identification profile. Its actual content depends on the goals you want to achieve: the required level of anonymization and the file format determine which de-identification profile you need for AnonMed. Examples are available on the profile page and in the downloaded AnonMed package; they are not comprehensive, but can be a useful starting point for writing your own rules.

Command Line Syntax

The same information is printed by the --help argument.

usage: java -jar anonmed-2024.02.jar [options] INPUT [INPUT...]

Options: -i <identFile> -p <profileFile> -s <seqFile> -u <dir>
        [-c <ctx>] [-d <file>] [--delete-original] [-h] 
        [-ks <dir>] [-o <dir>] [--overwrite] 
        [-pwd <pass>] [-r] [--strict-mode] 
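
As a hedged illustration of the synopsis above, the following shell sketch only prints a command line; it does not start AnonMed. It passes the four options that the synopsis shows without brackets, plus an explicit output directory. The helper name anonmed_command and all file and directory names (./identification.map, ./profile, ./sequence, ./uncertain, ./output, ./data) are placeholders chosen for this example, not defaults shipped with AnonMed.

```shell
# Print (do not run) an AnonMed command line for one input path.
# Every path below is a placeholder; replace it with your own files.
anonmed_command() {
    input=$1
    echo java -jar anonmed-2024.02.jar \
        -i ./identification.map \
        -p ./profile \
        -s ./sequence \
        -u ./uncertain \
        -o ./output \
        "$input"
}

# Dry run: shows the command that would be executed for ./data.
anonmed_command ./data
```

Once the printed command looks right, it can be pasted into the terminal and run as is. Paths containing spaces would need quoting when the command is run for real.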

Options

-c,--context ctx
AnonMed runs in the named context (e.g. date, data source).
-d,--data file
Files to de-identify/anonymize.
--delete-original
Delete the original file once it has been de-identified.
-h,--help
Print this message.
-i,--identification-mapping identFile
Identification mapping database file (required). The file (CSV format) maps private personal identifiers to a secure value (the application's unique sequence number).
-ks,--keystore dir
KeyStore directory.
-o,--output dir
Output directory.
--overwrite
Overwrite existing files. Be careful: an original or output file can be overwritten.
-p,--profiles profileFile
File with the profile descriptions.
-pwd,--password pass
Password to the key store.
-r,--rename-filename
Rename the file on output.
-s,--sequence-number seqFile
A file with the next available sequence number (required). It contains a single line with that number.
--strict-mode
Enables strict mode. Data (attributes, elements, tags, cells) not mentioned in any rule are automatically removed. Use the KEEP rule to retain data in strict mode when no other rule applies. This feature is experimental and is not supported, or not safe, for all file formats. Be careful: the result can be invalid or unusable.
-u,--uncertain dir
Uncertain output directory.
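
The sequence file passed with -s is described above as a single line holding the next available number. A minimal shell sketch of creating and checking such a file follows. The function names are made up for this example, and the starting value 1 is an assumption; the text does not say where numbering begins.

```shell
# Create the sequence file if it does not exist yet.
# ASSUMPTION: numbering starts at 1; the documentation does not state this.
init_sequence_file() {
    seq_file=$1
    [ -f "$seq_file" ] || printf '%s\n' 1 > "$seq_file"
}

# Print the stored number, or fail if the file is not exactly one line
# made of digits only.
read_sequence() {
    seq_file=$1
    # grep -c '' counts lines, including a last line without a newline.
    [ "$(grep -c '' "$seq_file")" -eq 1 ] || return 1
    value=$(cat "$seq_file")
    case $value in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$value"
}
```

For example, init_sequence_file ./sequence followed by read_sequence ./sequence prints the number that would be handed out next. An existing file is never overwritten by init_sequence_file, so numbers that are already in use are not reset.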

AnonMed on Debian GNU/Linux

The anonmed command is available once the anonmed_2024.02-1_all.deb package is installed.

Configuration is defined at both the system and the user level. The system configuration is in the /etc/mre/anonmed/ directory.

  • /etc/mre/anonmed/profile.d/ is a directory with profiles separated by
  • /etc/mre/anonmed/profile the files from profile.d/ concatenated into one; this profile (if it exists) is used by default.
  • /etc/mre/anonmed/sequence the sequence number that will be assigned as the next anonymous ID.
  • /etc/mre/anonmed/identification.map the map between each anonymous ID and the person's original ID. Keep this file secret and back it up regularly.
  • ~/.mre/anonmed/input a file containing the absolute path to the directory with the input data to be de-identified.
  • ~/.mre/anonmed/output a file containing the absolute path to the directory where the de-identified output files are stored.
  • ~/.mre/anonmed/uncertain a file containing the absolute path to the directory where uncertain files are stored. An uncertain file may still contain some personal information.

The per-user configuration is in the ~/.mre/anonmed/ directory.

  • ~/.mre/log/ log output, one file per day (e.g. ~/.mre/log/YYYYMMDD-anonmed.log, where YYYY is the year, MM the month and DD the day).
  • ~/.mre/anonmed/profile concatenated profiles. This profile (if it exists) is used by default.
  • ~/.mre/anonmed/sequence same as /etc/mre/anonmed/sequence.
  • ~/.mre/anonmed/identification.map same as /etc/mre/anonmed/identification.map.
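  • ~/.mre/anonmed/input a file containing the absolute path to the directory with the input data to be de-identified.
  • ~/.mre/anonmed/output a file containing the absolute path to the directory where the de-identified output files are stored.
  • ~/.mre/anonmed/uncertain a file containing the absolute path to the directory where uncertain files are stored.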
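
The daily log naming from the list above can be reproduced in the shell, for instance to open today's log. This is a small sketch; the function name log_path_for is hypothetical, and only the YYYYMMDD-anonmed.log pattern comes from the text.

```shell
# Print the log path for a given day in YYYYMMDD form (default: today).
log_path_for() {
    day=${1:-$(date +%Y%m%d)}
    printf '%s\n' "$HOME/.mre/log/${day}-anonmed.log"
}
```

A call such as tail -n 50 "$(log_path_for)" would then show the most recent entries of today's log, assuming AnonMed has already written one.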

The best practice in production is to keep your configuration in the per-user files. AnonMed prefers the per-user configuration files; a system-wide file is used only when the corresponding user file is missing.
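
The lookup order described above (user file first, system-wide file as fallback) can be sketched in the shell to check which file would apply on a given machine. This is an illustration of the described precedence, not the code AnonMed itself uses; the function name resolve_config and the overridable USER_DIR and SYSTEM_DIR variables are assumptions made for this example.

```shell
# Directories from the text; overridable only to make the sketch easy to try.
USER_DIR=${USER_DIR:-$HOME/.mre/anonmed}
SYSTEM_DIR=${SYSTEM_DIR:-/etc/mre/anonmed}

# Print the path of the configuration file that takes effect for a name
# such as "profile" or "sequence"; fail if neither location has it.
resolve_config() {
    name=$1
    if [ -e "$USER_DIR/$name" ]; then
        printf '%s\n' "$USER_DIR/$name"      # per-user file wins
    elif [ -e "$SYSTEM_DIR/$name" ]; then
        printf '%s\n' "$SYSTEM_DIR/$name"    # system-wide fallback
    else
        return 1
    fi
}
```

Running resolve_config profile, for example, prints the per-user profile path when that file exists and the /etc/mre/anonmed/ one otherwise.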

AnonMed from the ZIP file

The anonmed.sh script is available as soon as you extract the anonmed_2024.02.zip archive.

First, execute the ./anonmed.sh script without arguments. It will produce output like this:

/tmp/anonmed-2024.02$ ./anonmed.sh
Created <<<./data/>>> directory for your input files.

Please, place your data to the ./data/ directory and
re-run the command ./anonmed.sh again to de-identify them.
De-identified files will be created in the ./output/ directory.

Note: This information is shown only when the ./data/ directory is missing.

The output tells you that the ./data/ directory was created for the input files.

Next, copy your files to the ./data/ directory and run ./anonmed.sh again. It de-identifies the input files and writes the output files to the ./output/ directory.

Note: When it starts, AnonMed prints the list of input files and all rules from the profile file. Then the de-identification progress is shown.

Beware of damaging or losing the original data: without the --output argument, AnonMed rewrites the input files when --overwrite is set.
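
One way to guard against that kind of loss is to copy the input directory before any run that may overwrite files. The sketch below is a generic shell precaution, not an AnonMed feature; the function name backup_inputs and the backup location are assumptions.

```shell
# Copy the input directory to a fresh backup location before a risky run.
# Refuses to touch an existing backup so earlier copies are never clobbered.
backup_inputs() {
    src=$1
    backup=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ ! -e "$backup" ] || { echo "backup already exists: $backup" >&2; return 1; }
    cp -R "$src" "$backup"
}
```

A typical use would be backup_inputs ./data "./data.bak-$(date +%Y%m%d)" && ./anonmed.sh, so that de-identification only starts when the copy succeeded.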