Software simplified

Nature, Vol. 546, No. 7656. (29 May 2017), pp. 173-174


Containerization technology takes the hassle out of setting up software and can boost the reproducibility of data-driven research. [Excerpt] [...] Containers are essentially lightweight, configurable virtual machines — simulated versions of an operating system and its hardware, which allow software developers to share their computational environments. Researchers use them to distribute complicated scientific software systems, thereby allowing others to execute the software under the same conditions that its original developers used. In doing so, containers can remove one source of variability in ...


Gotchas in writing Dockerfile



[Excerpt: Why do we need to use Dockerfile?] A Dockerfile is not yet another shell. It has a special mission: automating the creation of Docker images. Once you write the build instructions into a Dockerfile, you can build the same image with just the docker build command. A Dockerfile is also useful for telling others what job the container does. Your teammates can tell what the container is supposed to do just by reading the Dockerfile. They don’t need to know login to the ...


An introduction to Docker for reproducible research, with examples from the R environment

ACM SIGOPS Operating Systems Review, Vol. 49, No. 1. (2 Oct 2014), pp. 71-79


As computational work becomes more and more integral to many aspects of scientific research, computational reproducibility has become an issue of increasing importance to computer systems researchers and domain scientists alike. Though computational reproducibility seems more straightforward than replicating physical experiments, the complex and rapidly changing nature of computer environments makes being able to reproduce and extend such work a serious challenge. In this paper, I explore common reasons that code developed for one research project cannot be successfully executed or extended by subsequent researchers. I review current ...


Using Docker to support reproducible research



Reproducible research is a growing movement among scientists, but the tools for creating sustainable software to support the computational side of research are still in their infancy and are typically only being used by scientists with expertise in computer programming and system administration. Docker is a new platform developed for the DevOps community that enables the easy creation and management of consistent computational environments. This article describes how we have applied it to computational science and suggests that it could ...

