MetaMed Examples

All examples below use a script file metamed.sh with the following content:

#!/bin/sh

java -Xms512m -Xmx1024m -jar metamed-* "$@"

You can change the maximum memory limit used by MetaMed with the -Xmx option of the Java Virtual Machine. The example option -Xmx1024m sets the limit to 1024 MB. The maximum amount of memory you can use depends on your machine and operating system.
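If you change the limit often, a small variant of the wrapper script could read it from an environment variable instead of hard-coding it. This is only a sketch; the METAMED_XMX variable is our assumption, not a MetaMed feature:

```shell
# Sketch of a configurable heap limit for the wrapper script.
# METAMED_XMX is a hypothetical variable, not part of MetaMed itself.
XMX="${METAMED_XMX:-1024m}"   # fall back to 1024 MB when the variable is unset
echo "-Xms512m -Xmx$XMX"      # these options would be passed to java
```

Running the wrapper as `METAMED_XMX=2048m ./metamed.sh ...` would then raise the limit to 2048 MB for that invocation only.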

The file connection.properties is used in some examples. Its content is:

url=jdbc:virtuoso://localhost:1111
user=username
password=password
#graph=http://mre.kiv.zcu.cz/dataset/medical

The meaning of the url, user, and password properties is self-explanatory. If you uncomment the graph property and give it a value, that graph becomes MetaMed's default graph. Using the --graph argument you can still set a different graph, which MetaMed will then use instead of the one in the connection.properties file.
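The precedence rule can be illustrated with a short shell sketch. This only mimics the behaviour described above (with the graph property uncommented); it is not MetaMed's own code:

```shell
# Sketch of the graph-selection rule (not MetaMed code): a --graph argument
# wins; otherwise the graph property from connection.properties is the default.
cat > /tmp/connection.properties <<'EOF'
url=jdbc:virtuoso://localhost:1111
user=username
password=password
graph=http://mre.kiv.zcu.cz/dataset/medical
EOF
default_graph=$(sed -n 's/^graph=//p' /tmp/connection.properties)
cli_graph=""                          # would come from a --graph argument
graph="${cli_graph:-$default_graph}"  # empty CLI value falls back to default
echo "$graph"
```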

Example 1 — Print the graph size

Connect to the OpenLink Virtuoso database and print the size of the graph named http://mre.zcu.cz/dataset/imaging.

./metamed.sh --virt connection.properties \
  --graph http://mre.zcu.cz/dataset/imaging --size

Example 2 — Extract meta data

Extract all meta data from files in the ./data/ directory into the graph graphName, which is backed by the file system directory ./rdfStore/.

./metamed.sh --graph-file-system rdfStore/ --graph graphName \
  --input-data data/

All extracted meta data will be automatically backed from the in-memory RDF graph to the RDF/XML file rdfStore/graphName.xml.

Example 3 — Concatenate/Merge RDF data files

Import RDF files in N-TRIPLE format into the RDF store. All input files have to be in the same RDF format.

./metamed.sh --graph-file-system rdfStore/ --graph graphName \
  --import file1.nt file2.nt file3.nt -if N-TRIPLE

MetaMed will import all files into the in-memory RDF graph. The in-memory RDF graph with all imported triples will be automatically backed to the RDF/XML file rdfStore/graphName.xml.
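For reference, N-TRIPLE is a line-based RDF serialisation: each line holds exactly one triple, terminated by a period. A minimal file1.nt could look like this (the subject and predicate below are only illustrative, not taken from a real MetaMed dataset):

```
<http://mre.zcu.cz/dataset/imaging/file1> <http://purl.org/dc/elements/1.1/format> "DICOM" .
```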

Example 4 — Query RDF graph with SPARQL

Query the RDF graph graphName stored in the file system (./rdfStore/ directory) with a query from the file query.sparql.

./metamed.sh --graph-file-system rdfStore/ --graph graphName \
  --query query.sparql --output outputDirectory/ --output-format csv

The query output will be stored in the local file ./outputDirectory/query.sparql.csv in CSV format.
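The output file name follows the pattern <query file name>.<output format> inside the output directory, which can be sketched as follows (a reconstruction of the naming convention, not MetaMed's own code):

```shell
# Reconstructs the query output file name used above (a sketch of the
# naming convention, not MetaMed code).
query_file="query.sparql"
output_dir="outputDirectory"
format="csv"
out_file="$output_dir/$(basename "$query_file").$format"
echo "$out_file"
```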

Example 5 — Export RDF graph

Export the RDF graph graphName stored in the file system (./rdfStore/ directory) into the file export.ttl using the TURTLE serialisation format.

./metamed.sh --graph-file-system rdfStore/ --graph graphName \
  --export export.ttl --export-format TURTLE

Note: When you use a file-system-backed graph, expect the graph to be stored to a file twice: (1) during the export and (2) before the application exits, because the graph is backed on the file system. This is significant with a large graph.

Example 6 — Extract, import, query and export

Use the RDF graph graphName stored in the file system (./rdfStore/ directory) and perform the following operations. The size of the graph is printed before and after each operation.

  1. First, extract meta data from files in directory ./data/.
  2. Import all files (file1.nt, file2.nt, file3.nt) into the graph (merge files in memory).
  3. Process SPARQL from the file (query.sparql) and output results into the ./outputDirectory/query.sparql.csv (CSV format).
  4. Export the whole in-memory RDF graph into the export.ttl file using the TURTLE serialisation format (done after query processing).

./metamed.sh --graph-file-system rdfStore/ --graph graphName \
  --data data/ --import file1.nt file2.nt file3.nt -if N-TRIPLE \
  --query query.sparql --output outputDirectory/ --output-format csv \
  --export export.ttl --export-format TURTLE --size

Note: When you use a file-system-backed graph, expect the graph to be stored to a file twice: (1) during the export to the export.ttl file and (2) before the application exits, because the graph is backed on the file system. This can consume significant time and resources with a large graph.