Reproducible Scientific Computing

I had to look it up, but I wholly approve of the i...

2015-08-10T11:33:19.185-07:00

I had to look it up, but I wholly approve of the idea!

In a similar way, the Go language is very particular about source code dependencies. As in most languages, you can't use a library unless you do "import X". But just as importantly, you aren't allowed to "import X" unless you actually use something from X! This cuts down on the problem of the ever-growing set of include statements found in many C and C++ programs.

I'm reminded of 'implicit none' from F...

2015-08-10T10:54:32.159-07:00

I'm reminded of 'implicit none' from FORTRAN - scary...

Well, first a conference steering committee needs ...

2015-08-10T09:10:31.266-07:00

Well, first a conference steering committee needs to decide to mandate artifact evaluation. I am optimistic that this will happen sooner or later in a venue where common artifacts are valued but unusual hardware is not required. For example, this would be much easier to accomplish in ICSE or KDD where the concern is the algorithms and the output, as opposed to HPDC or SC, where the concern is usually the performance on unusual hardware.

If you really want to make it stick, the PC could require artifact evaluation to happen *before* technical evaluation. For example, require brief instructions on how to start a VM, check out the code, download the data sets, and run an example. If that can't be done, then return without review...

Gordon, I think you hit upon something really impo...

2015-08-10T09:03:58.281-07:00

Gordon, I think you hit upon something really important: the preservation tools need to be part of the work from the beginning. If you only start to think about preservation at the end, once the publication is made, then everyone has moved on and there will be little interest (time, funding) to do preservation tasks.

It follows that the preservation tools have to be used every single time, and they had better not put up roadblocks to the work that you actually want to do.

I like the approach of interposing on the grid submission system, which is an example of "preserving the mess" that I posted today. And, it also has the advantage that, in order to run remotely, you need to already specify all the dependencies necessary to create the environment.

If every important task gets run in a batch system instead of locally (does it?) then that could be the right solution for HEP...

Sorry, I didn't read the linked post earlier. ...

2015-08-08T04:06:34.132-07:00

Sorry, I didn't read the linked post earlier. I do understand the reasoning for your choice, and shouldn't have said "dumping", but I still see something wrong with the idea of having 2 groups (classes) of reviewers, one that reviews the ideas/papers and one that reviews the artifacts.

Shriram Krishnamurthi has sent in the following co...

2015-08-07T18:58:18.545-07:00

Shriram Krishnamurthi has sent in the following comment:

Thanks for the very nice article. You've certainly gotten to the
essence of the design.

Responding to @Dan: I deeply dislike this characterization of
“dumping” work onto students and new postdocs. (By the way, who said
students and postdocs aren't capable of at least as much creativity as
superannuated professors?) The justification for using younger people
was clearly articulated in the first essay I wrote about artifact
evaluation:

http://cs.brown.edu/~sk/Memos/Conference-Artifact-Evaluation/

Look for “AEC Composition”. It lays out several points in detail.

But yes, the upcoming Dagstuhl meeting (which I'm co-organizing) will
indeed be glad to talk about this if there's interest from the
attendees. You're welcome to bring it up.

I've been pleased to see artifact evaluation c...

2015-07-25T04:56:48.708-07:00

I've been pleased to see artifact evaluation catch on in this area, but the fact that we dump the work on students and new postdocs seems to say that it's just a box we need checked, not something we see as an area for creativity and innovation. I hope the upcoming Dagstuhl meeting (http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=15452) will think about this.

Replicating someone else's computational work ...

2015-07-16T00:55:21.055-07:00

Replicating someone else's computational work is indeed of no interest. Reproducing, i.e. reimplementing the ideas from scratch, is mainly of interest for gaining better understanding, and is therefore as much a part of education and training as of doing science.

The big advantage that computational scientists have over experimentalists is that replication of computations is a purely technical issue (see https://khinsen.wordpress.com/2014/08/27/reproducibility-replicability-and-the-two-layers-of-computational-science/ and http://dx.doi.org/10.12688/f1000research.5773.3 for an explanation). In the long run, it can and should be delegated to technology. For reasonably short computations, a replication server could accept submissions (complete records of computational work), replicate them, and issue some certificate. Once certain tools for ensuring replicability are trusted by the community, explicit replication could be reduced or even given up. A trusted tool chain would also solve the problem of huge computations: they would be considered replicable because of the use of trusted tools.
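
To make the replication-server idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record format (a command plus the SHA-256 digest of the output the authors obtained), the function names, and the shape of the "certificate" are all hypothetical, and a real service would also need to capture the software environment and input data.

```python
import hashlib
import subprocess
import sys


def output_digest(command):
    """Run `command` and return the SHA-256 hex digest of its standard output."""
    result = subprocess.run(command, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def replicate(record):
    """Re-run a recorded computation and report whether the output matches.

    `record` is a hypothetical submission format: a dict holding the command
    that was run and the digest of the output the authors obtained.
    """
    observed = output_digest(record["command"])
    return {
        "command": record["command"],
        "expected": record["expected_sha256"],
        "observed": observed,
        "replicated": observed == record["expected_sha256"],
    }


# A toy "computation": sum the integers below 1000 in a fresh interpreter.
command = [sys.executable, "-c", "print(sum(range(1000)))"]
record = {"command": command, "expected_sha256": output_digest(command)}
certificate = replicate(record)
print(certificate["replicated"])
```

Comparing digests only works when the computation is bit-for-bit deterministic; floating-point results that vary across platforms would need a tolerance-based comparison instead.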

Kristin, my guess is that most academics would agr...

2015-06-29T11:17:18.086-07:00

Kristin, my guess is that most academics would agree at least in principle with this statement of values. But, it seems that our review and publication practices don't reward this level of investigation. Reviewers largely assume that a publication is submitted without deception, and largely focus on the novelty and correctness of the written component. Do we need some additional incentives to look at correctness and buildability of the technology described by the paper?

Eric, I share your concern about the focus of our ...

2015-06-29T11:00:16.079-07:00

Eric, I share your concern about the focus of our profession. Computer science as an academic discipline spends a lot of effort producing technology, training students, and burning coal that ultimately supports nothing more than the optimization of advertising within social networks. Our educational systems could do a better job of demonstrating more important applications of computing.

But there is the converse problem: we have seen many times that specialized areas of computing cannot take advantage of economies of scale, and are better off riding the commodity wave. Specialized supercomputers have largely been supplanted by commodity clusters. GPU technology driven by video games has far better energy/performance density than any bespoke parallel hardware. Soldiers in the gulf war preferred using personal smartphones to military-issue GPS units. More examples abound.

So, can we advance science, sustainability, governance, and public health using commodity technologies? Or do they need something fundamentally different?

Dan, Thanks for the pointer to ORCID. I see that...

2015-05-31T14:35:48.439-07:00

Dan,

Thanks for the pointer to ORCID.

I see that NSF's Fastlane includes the option to list one's ORCID identity.

Do you know if this has existed for a while (and I have just noticed it) or whether this is a new feature?

If it has been around for an extended period, has there been any analysis of its uptake / impact?

You can think of ORCID just as a unique person ID....

2015-05-26T11:56:25.278-07:00

You can think of ORCID just as a unique person ID. Or, you can use it to store a profile. You don't have to do the latter to take advantage of the former.

ORCID is a good idea, and they have a nice infrast...

2015-05-26T07:14:32.241-07:00

ORCID is a good idea, and they have a nice infrastructure, but I was surprised that after creating a profile (http://orcid.org/0000-0001-5218-1956) it became *my* job to upload papers to their system and associate them with my ID. (This is now the fourth or fifth website where I'm expected to maintain my publications!) The ecosystem will work better when publishers collect ORCIDs at submission time and take on the role of associating ORCIDs with DOIs.

I wonder why you decided to start with reconstruct...

2015-05-26T07:04:44.425-07:00

I wonder why you decided to start with reconstruction. One might equally well choose to start with a new run of the same detector, or even with a new detector. The question of where reproducibility starts seems like something that should always be asked.

There are certainly cases in CS where the datasets...

2015-05-26T06:02:55.843-07:00

There are certainly cases in CS where the datasets are small, and so dumping everything into one VM is a perfectly good solution. And you are quite right about incentives.

But if you look at the real sciences, the data situation is different. In bioinformatics, you frequently have thousands of analyses performed on a common dataset of tens to hundreds of gigabytes. You can squeeze 100GB into a VM, but you wouldn't want to duplicate it once for each analysis code, so you need a way to archive the data independently of the code, be able to refer to it, mount it at runtime, and verify that you have the right thing. Same thing in astronomy at larger scales (100s of TB) and high energy physics (100s of PBs) and many other fields as well.
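
The "verify that you have the right thing" step could look something like the following Python sketch. It assumes each analysis records a SHA-256 digest of the dataset it was run against; the function names and the file name are hypothetical, and the small temporary file merely stands in for a large shared dataset.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large datasets never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(path, expected_sha256):
    """Return True only if the file at `path` matches the recorded digest."""
    return sha256_of_file(path) == expected_sha256


# Stand-in for a shared dataset; a real one would live in an archive or on a mounted volume.
with tempfile.TemporaryDirectory() as workdir:
    dataset = os.path.join(workdir, "reads.dat")
    with open(dataset, "wb") as handle:
        handle.write(b"ACGT" * 1000)
    recorded = sha256_of_file(dataset)  # what an analysis would store alongside its code
    unchanged_ok = verify_dataset(dataset, recorded)
    with open(dataset, "ab") as handle:
        handle.write(b"N")  # simulate a silently modified copy
    modified_ok = verify_dataset(dataset, recorded)

print(unchanged_ok, modified_ok)
```

Because the digest depends only on the content, thousands of analyses can refer to the same archived copy by digest rather than each carrying its own duplicate.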

I've had some experience with this in a variet...

2015-05-25T14:47:43.573-07:00

I've had some experience with this in a variety of contexts.
I don't think the problem is tooling, because apt and conda cover a wide variety of situations for deploying software.
Sometimes data is a problem, but for most computer science papers I read the data sets aren't that large.
For many of the papers (and some textbooks) I've read, reproducibility could be solved with a virtual machine snapshot. The fact that this isn't done very often is telling. The real problem is the lack of incentive to make work reproducible.

You should look at ORCID - http://orcid.org

2015-05-25T05:08:04.820-07:00

You should look at ORCID - http://orcid.org

One way that reproducibility is sometimes pushed i...

2015-05-19T07:22:22.723-07:00

One way that reproducibility is sometimes pushed is that I need to make sure my work is reproducible by the future me. This seems appealing to me.

In talking with colleagues in a variety of scienti...

2015-05-19T06:34:51.358-07:00

In talking with colleagues in a variety of scientific fields using computation, I find that there is almost no interest in reproducing another PI's published work simply for the purpose of validating or disproving it, in a competitive sense. If validated, nothing new is learned. If not validated, then one merely enters into a complex argument about whether the reproduction was correct.

(The value judgement is probably different in non-computational science: multiple trials will improve the confidence interval on the question of whether a particular antibiotic works or not, and so reproducibility has an easily quantified value.)

But, there is considerable interest in reproducibility for the sake of efficient collaboration! That is, if I have a student who sets up some software and system to run a particular workload efficiently, then I want to be able to pass that setup off to other people and hope they have a good chance of getting it running in the next day, rather than the next month. If my algorithm is easy to reproduce, then someone is likely to use it as a comparison point for new work, resulting in more interest and citations. I see a lot of people who would like that.

If the cost of reproducibility between collaborators becomes low, perhaps interest in reproducibility between competitors would grow?