Three levels of reproducibility: Docker, Galaxy, Linked Data

[Originally posted on LinkedIn]

I have just stumbled upon this thread on why one should use Galaxy. One of the reasons posted is reproducibility, but Galaxy only solves one level of reproducibility: “functional reproducibility” (what I did with the data). There are at least two other levels, one “below” Galaxy and another “above” it:

  • Below: computational environment. Operating system, library dependencies, binaries.
  • Above: semantics. What the data means.

To be completely reproducible, an analysis has to be reproducible at all three levels:

  1. Computational: Docker.
  2. Functional: Galaxy.
  3. Semantics: URIs, RDF, SPARQL, OWL.

How to do it is described in our GigaScience paper, “Enhanced reproducibility of SADI Web Service Workflows with Galaxy and Docker” 🙂

Just to emphasize and clarify, the three levels, from top to bottom, are:
3. Semantics: what the data means.
2. Functional: what I did with the data.
1. Computational: how I did it.
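
To make the three levels a bit more concrete, here is a minimal Python sketch of a run record that notes something at each level and reports which levels are still undocumented. It is only an illustration, not the method from the paper: the `RunRecord` class, its field names, the image name, the step names and the example.org URI are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Hypothetical record of one analysis run, split by reproducibility level."""

    # 1. Computational: how I did it (for example, a Docker image reference).
    docker_image: str = ""
    # 2. Functional: what I did with the data (for example, ordered workflow steps).
    workflow_steps: list = field(default_factory=list)
    # 3. Semantics: what the data means (for example, term -> URI of the concept).
    term_uris: dict = field(default_factory=dict)

    def missing_levels(self):
        """Return the names of the levels this record leaves undocumented."""
        missing = []
        if not self.docker_image:
            missing.append("computational")
        if not self.workflow_steps:
            missing.append("functional")
        if not self.term_uris:
            missing.append("semantics")
        return missing


# A run documented only at the functional level, as a bare workflow would be.
workflow_only = RunRecord(workflow_steps=["fetch_proteins", "annotate"])

# The same run documented at all three levels (all values are made up).
full = RunRecord(
    docker_image="example/sadi-galaxy:1.0",
    workflow_steps=["fetch_proteins", "annotate"],
    term_uris={"protein": "http://example.org/vocab#Protein"},
)

print(workflow_only.missing_levels())
print(full.missing_levels())
```

The point of the sketch is simply that a workflow on its own fills in one of the three slots; the environment and the meaning of the data have to be recorded separately.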
