[Originally posted on LinkedIn]
I have just stumbled upon this thread on why one should use Galaxy (https://www.biostars.org/p/50034/). One of the reasons given is reproducibility, but Galaxy only solves one level of reproducibility: “functional reproducibility” (what I did with the data). There are at least two other levels, one “below” Galaxy and another “above” it:
- Below: the computational environment (operating system, library dependencies, binaries).
- Above: semantics (what the data means).
To be completely reproducible, an analysis has to be reproducible at all three levels:
- Computational: Docker.
- Functional: Galaxy.
- Semantics: URIs, RDF, SPARQL, OWL.
How to do this is described in our GigaScience paper, “Enhanced reproducibility of SADI Web Service Workflows with Galaxy and Docker” 🙂 (http://www.gigasciencejournal.com/content/4/1/59)