$$%% examples \newcommand{\exGraph}{\graph_{\mathrm{ex}}} \newcommand{\exOnto}{\onto_{\mathrm{ex}}} \newcommand{\exMappings}{\mappings_{\mathrm{ex}}} \newcommand{\exExtensions}{\extensions_{\mathrm{ex}}} \newcommand{\exRule}{r_{\mathrm{ex}}} \newcommand{\RDFSrules}{\rules_{\mathrm{RDFS}}} %% RDF \newcommand{\triple}[3]{(#1, #2, #3)} \newcommand{\tuple}[1]{\langle #1 \rangle} \newcommand{\subject}{\mathtt{s}} \newcommand{\prop}{\mathtt{p}} \newcommand{\object}{\mathtt{o}} \newcommand{\blank}{\_{:}b} \newcommand{\blankn}[1]{\_{:}#1} \newcommand{\irin}[1]{{:}\mathrm{#1}} \newcommand{\class}{\mathtt{c}} \newcommand{\nsrdf}{\mathrm{rdf{:}}} \newcommand{\nsrdfs}{\mathrm{rdfs{:}}} \newcommand{\rdftype}{\mathrm{rdf{:}type}} \newcommand{\rdfLiteral}{\mathrm{rdf{:}Literal}} \newcommand{\rdfssubClassOf}{\mathrm{rdfs{:}subClassOf}} \newcommand{\rdfssubPropertyOf}{\mathrm{rdfs{:}subPropertyOf}} \newcommand{\rdfsdomain}{\mathrm{rdfs{:}domain}} \newcommand{\rdfsrange}{\mathrm{rdfs{:}range}} \newcommand{\rdfsClass}{\mathrm{rdfs{:}Class}} \newcommand{\rdfProperty}{\mathrm{rdf{:}Property}} \newcommand{\xsdint}{\mathrm{xsd{:}int}} %% \newcommand{\type}{\tau} \newcommand{\subclass}{\prec_{sc}} \newcommand{\subproperty}{\prec_{sp}} \newcommand{\domain}{\hookleftarrow_{d}} \newcommand{\range}{\hookrightarrow_{r}} \newcommand{\rdfentailment}{\vdash_{^\mathrm{RDF}}} \newcommand{\RDFS}[1]{\mathrm{RDFS}(#1)} \newcommand{\aka}{a.k.a.~} \newcommand{\etc}{etc} \newcommand{\wrt}{w.r.t.~} \newcommand{\st}{s.t.~} \newcommand{\ie}{i.e.,~} \newcommand{\eg}{e.g.,~} \newcommand{\graph}{G} \newcommand{\rules}{\mathcal{R}} \newcommand{\sources}{\mathcal{S}} \newcommand{\views}{\mathcal{V}} \newcommand{\extensions}{\mathcal{E}} \newcommand{\onto}{\mathcal{O}} \newcommand{\mappings}{\mathcal{M}} \newcommand{\modelsrdf}{\models_\rules} \newcommand{\bgp}{P} \newcommand{\Bl}[1]{\mathrm{Bl}(#1)} \newcommand{\Val}[1]{\mathrm{Val}(#1)} \newcommand{\Var}[1]{\mathrm{Var(#1)}} \newcommand{\ext}[1]{\mathrm{ext}(#1)} \newcommand{\cert}{\mathrm{cert}} \newcommand{\ans}{\mathrm{ans}} \newcommand{\query}{\leftarrow} \newcommand{\body}[1]{\textrm{body}(#1)} \newcommand{\head}[1]{\textrm{head}(#1)} \newcommand{\cs}{\mathrm{cs}} \newcommand{\lcs}{\mathrm{lcs}} \newcommand{\cl}{\mathrm{cl}} \newcommand{\lua}{\mathrm{lua}} \newcommand{\lur}{\mathrm{lur}} \newtheorem{lemma}{Lemma} \newtheorem{definition}{Definition} \newtheorem{problem}{Problem} \newtheorem{property}{Property} \newtheorem{corollary}{Corollary} \newtheorem{example}{Example} \newtheorem{theorem}{Theorem} \newcommand{\URIs}{\mathscr U} \newcommand{\IRIs}{\mathscr I} \newcommand{\BNodes}{\mathscr B} \newcommand{\Literals}{\mathscr L} \newcommand{\Variables}{\mathscr V} % DB \newcommand{\CQ}{\ensuremath{\mathtt{CQ}}\xspace} \newcommand{\UCQ}{\ensuremath{\mathtt{UCQ}}\xspace} \newcommand{\SQL}{\ensuremath{\mathtt{SQL}}\xspace} \newcommand{\rel}[1]{\mathsf{#1}} % Cost model \newcommand{\cans}[1]{|#1|_t} \newcommand{\cref}[1]{|#1|_r} \newcommand{\db}{\mathtt{db}} % DL \newcommand{\cn}{\ensuremath{N_{C}}\xspace} \newcommand{\rn}{\ensuremath{N_{R}}\xspace} \newcommand{\inds}{\ensuremath{N_{I}}\xspace} \newcommand{\ainds}{\ensuremath{\mathrm{Ind}}\xspace} \newcommand{\funct}{\mathit{funct} \ } \newcommand{\KB}{\mathcal{K}\xspace} \newcommand{\dlr}{DL-Lite$_{\mathcal{R}}$\xspace} % Logics \newcommand{\FOL}{\ensuremath{\mathtt{FOL}}\xspace} \newcommand{\datalog}{\ensuremath{\mathtt{Datalog}}\xspace} \newcommand{\dllite}{DL-Lite\xspace} \newcommand{\true}{\mathrm{true}} \newcommand{\false}{\mathrm{false}} \newcommand{\dis}{\mathtt{dis}} \newcommand{\vars}[1]{\ensuremath{\mathrm{vars}(#1)}} %\newcommand{\terms}[1]{\ensuremath{\mathrm{terms}(#1)}} %math \renewcommand{\phi}{\varphi} \newcommand\eqdef{\stackrel{\mathclap{\normalfont\mbox{def}}}{=}} \newcommand\restr[2]{#1_{|#2}} \newcommand{\ontoBody}[1]{\mathrm{body}_\onto(#1)} %proof of the rewriting theorem \newcommand{\rdfGraph}{\graph^{\mappings}_{\extensions}} \newcommand\systemGraph{\graph^{\mappings \cup \mappings^{\text{STD}}_\onto}_{\extensions \cup \extensions_\onto}} \newcommand\viewsGraph{\graph^{\mappings^{\rules,\onto} \cup \mappings^{\text{STD}}_\onto}_{\extensions \cup \extensions_\onto}} \newcommand{\standMappings}{\mappings^{\text{STD}}_\onto} \newcommand{\reminder}[1]{[\vadjust{\vbox to0pt{\vss\hbox to0pt{\hss{\Large $\Longrightarrow$}}}}{{\textsf{\small #1}}}]} %\newcommand{\FG}[1]{\textcolor{blue}{\reminder{FG:~#1}}} \newcommand{\extVersion}{false} \newcommand{\printIfExtVersion}[2] { \ifthenelse{\equal{\extVersion}{true}}{#1}{} \ifthenelse{\equal{\extVersion}{false}}{#2}{} } \newcommand{\bda}{\true} \newcommand{\ifBDA}[2]% {% \ifthenelse{\equal{\bda}{true}}{#1}{}% \ifthenelse{\equal{\bda}{false}}{#2}{}% } %%% Local Variables: %%% TeX-master: "paper" %%% End: $$

Reproducibility

Juliana Freire has done a presentation about reproducibility: Using Provenance to Support Computational Reproducibility. She explains the main topics in reproducibility in data science.

Open Source

The first thing to ensure to done reproducible experiment is to be abled to share all resources need for doing the experiment. You need at least that your code and each of its dependencies are under licences allowing share.

Code

Code depositories are the key for publishing source code (git) or binairies (maven).

Environment

Environment also has to be reproducible and sharable. To do so, we need to:

  1. identify the environment parts that involve into the experiment
  2. make it reproducible

Most of the time, environment reproducibility comes with a virtual machine:

  • Docker container
  • ReproZip allows to pack your research project by selecting depencies and rerun the experiment in a virtual machine that contains the same environement.

Workflows

Workflow Trails

  • Vistrails is a graphical development tools for visualisation machine learning tasks, that keeps a trace of modification you have done of your workflow.

Literal Programming

  • org mode using Babel in Emacs
  • jupiter notebook for python
  • Observable for javascript