Tag Archives: Semantic Web

GigaScience┬┤s Impact Factor

Even though the editors of GigaScience don’t like Impact Factors (and I agree with them), GigaScience has received a very high Impact Factor, 7.46. I’m quite happy since we published a paper in GigaScience last year, Enhanced reproducibility of SADI web service workflows with Galaxy and Docker.

Tagged , , , , , , , , , , ,

Transforming CSV data to RDF with Grafter

Part of my work is to develop pipelines to transform already existing Open Data (Usually CSVs in some data portal, like CKAN) into RDF and hopefully Linked Data. If I have to do the transformation myself, interactively, I normally use Google Refine with the RDF plugin. However, what I need now is a batch pipeline that I can plug into a bigger Java platform.

Therefore, I’m looking at Grafter. Even though I have never programmed in Clojure (or any other functional language whatsoever!), Grafter’s approach seems very sensible and intuitive. Additionally, I have always wanted to use Tawny-OWL, so probably it will be easier if I learn a bit of Clojure with Grafter first. Coming from Java/Perl/Python, the functional approach felt a bit weird in the beggining, but it actually makes more sense when defining pipelines to process data.

I have gone through the Grafter guide using Leiningen in Ubuntu 14.04. So far so good (I had to install Leiningen manually though, since Ubuntu’s Leiningen package was very outdated). In order to run the Grafter example in Eclipse (Mars), or any other Clojure program, one needs to install first the CounterClockWise plugin. Note that if you want to also use GitHub, like me, there is bug that prevents┬áthe project from being properly cloned, when you choose the “New project wizard”: I cloned with the General project wizard, copied the files from another Grafter project, and surprisingly it worked (trying to convert the project to Leiningen/Clojure didn’t work!).

My progress converting data obtained in Gipuzkoa Irekia to RDF can be seen at GitHub. Also, I’m aiming at adding Data Cube SPARQL constraints as Clojure test, here.

 

Tagged , , , , ,

The executable paper

This [1] is how all the papers should look like: an easy to execute workflow with all the datasets included, in order to be able to reproduce its science. It would avoid situations like the one described in this video.

[1] Dynamics of mitochondrial heteroplasmy in three families investigated via a repeatable re-sequencing study.

Tagged , , , ,