
1,199 research outputs found

Recommended from our members

Reproducible Research: Addressing the Need for Data and Code Sharing in Computational Science

Author: Stodden Victoria C.
Publication date: 01/01/2010

Roundtable participants identified ways of making computational research details readily available, which is a crucial step in addressing the current credibility crisis

Columbia University Academic Commons


White Paper for Expert Panel Discussion on Data Policies: A Workshop of the National Science Board, March 27-29, 2011

Author: Stodden Victoria C.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

In our workshop charge we were invited to read three reports that formed the basis for the NSB-approved Data Policies Task Force's "Statement of Principles," providing the starting point for this workshop. I take a contrarian perspective and challenge the assumption in all these documents that open data is a foundational component of the scientific endeavor. Instead, I argue that the framing principle should be the reproducibility of computational results, from which open data (along with open code) falls as a natural corollary. In this note I highlight six implications of the framing of reproducible research as a guiding principle for science policy in the digital age

Columbia University Academic Commons


The Scientific Method in Practice: Reproducibility in the Computational Sciences

Author: Stodden Victoria C.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

Since the 1660s the scientific method has included reproducibility as a mainstay in its effort to root error from scientific discovery. With the explosive growth of digitization in scientific research and communication, it is easier than ever to satisfy this requirement. In computational research experimental details and methods can be recorded in code and scripts, data is digital, papers are frequently online, and the result is the potential for "really reproducible research." Imagine the ability to routinely inspect code and data and recreate others' results: Every step taken to achieve the findings can potentially be transparent. Now imagine anyone with an Internet connection and the capability of running the code being able to do this. This paper investigates the obstacles blocking the sharing of code and data to understand conditions under which computational scientists reveal their full research compendium. A survey of registrants at a top machine learning conference (NIPS) was used to discover the strength of underlying factors that affect the decision to reveal code, data, and ideas. Sharing of code and data is becoming more common: about a third of respondents post some on their websites, and about 85% self-report having some code or data publicly available on the web. Contrary to theoretical expectations, the decision to share work is grounded in communitarian norms, although when work remains hidden private incentives dominate the decision. We find that code, data, and ideas are each regarded differently in terms of how they are revealed and that guidance from scientific norms varies with pervasiveness of computation in the field. The largest barriers to sharing are time involved in preparation of work and the legal Intellectual Property framework scientists face. This paper does two things. First, it provides evidence in the debate about whether scientists' research-revealing behavior is wholly governed by considerations of personal impact or whether the reasoning behind the revealing decision involves larger scientific ideals. Second, it describes the actual sharing behavior in the Machine Learning community

Columbia University Academic Commons


Reproducible Research in Computational Harmonic Analysis

Author: Stodden Victoria C.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2009

Scientific computation is emerging as absolutely central to the scientific method. Unfortunately, it's error-prone and currently immature—traditional scientific publication is incapable of finding and rooting out errors in scientific computation—which must be recognized as a crisis. An important recent development and a necessary response to the crisis is reproducible computational research in which researchers publish the article along with the full computational environment that produces the results. The authors have practiced reproducible computational research for 15 years and have integrated it with their scientific research and with doctoral and postdoctoral education. In this article, they review their approach and how it has evolved over time, discussing the arguments for and against working reproducibly

Columbia University Academic Commons

Capturing the "Whole Tale" of Computational Research: Reproducibility in Computing Environments

Authors: Chard Kyle, Gaffney Niall, Jones Matthew B., Ludaescher Bertram, Nabrzyski Jaroslaw, Stodden Victoria, Turk Matthew
Publication date: 28/10/2016

We present an overview of the recently funded "Merging Science and Cyberinfrastructure Pathways: The Whole Tale" project (NSF award #1541450). Our approach has two nested goals: 1) deliver an environment that enables researchers to create a complete narrative of the research process including exposure of the data-to-publication lifecycle, and 2) systematically and persistently link research publications to their associated digital scholarly objects such as the data, code, and workflows. To enable this, Whole Tale will create an environment where researchers can collaborate on data, workspaces, and workflows and then publish them for future adoption or modification. Published data and applications will be consumed either directly by users using the Whole Tale environment or can be integrated into existing or future domain Science Gateways

arXiv.org e-Print Archive

The Francis Crick Institute


Open science: policy implications for the evolving phenomenon of user-led scientific innovation

Author: Stodden Victoria C.
Publication date: 01/01/2010

From contributions of astronomy data and DNA sequences to disease treatment research, scientific activity by non-scientists is a real and emergent phenomenon that raises policy questions. This involvement in science can be understood as an issue of access to publications, code, and data that facilitates public engagement in the research process; appropriate policy to support the associated welfare-enhancing benefits is therefore essential. Current legal barriers to citizen participation can be alleviated by scientists' use of the "Reproducible Research Standard," thus making the literature, data, and code associated with scientific results accessible. The enterprise of science is undergoing deep and fundamental changes, particularly in how scientists obtain results and share their work: the promise of open research dissemination held by the Internet is gradually being fulfilled by scientists. Contributions to science from beyond the ivory tower are forcing a rethinking of traditional models of knowledge generation, evaluation, and communication. The notion of a scientific "peer" is blurred with the advent of lay contributions to science, raising questions regarding the concepts of peer review and recognition. New collaborative models are emerging around both open scientific software and the generation of scientific discoveries that bear a similarity to open innovation models in other settings. Public engagement in science can be understood as an issue of access to knowledge for public involvement in the research process, facilitated by appropriate policy to support the welfare-enhancing benefits deriving from citizen science

Columbia University Academic Commons


Trust Your Science? Open Your Data and Code

Author: Stodden Victoria C.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

This is a view on the reproducibility of computational sciences by Victoria Stodden. It contains information on the Reproducibility, Replicability, and Repeatability of code created by the other sciences. Stodden also talks about the rising prominence of computational sciences as we are in the digital age and what that means for the future of science and collecting data

Columbia University Academic Commons

Developmental change in motor competence : a latent growth curve analysis

Authors: Bardid Farid, Coppens Eline, D'Hondt Eva, Deconinck Frederik, Haerens Leen, Lenoir Matthieu, Stodden David
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2019

Background: The development of childhood motor competence demonstrates a high degree of inter-individual variation. Some children's competence levels increase whilst others' competence levels remain unchanged or even decrease over time. However, few studies have examined this developmental change in motor competence across childhood and little is known on influencing factors. Aim: Using latent growth curve modeling (LGCM), the present longitudinal study aimed to investigate children's change in motor competence across a 2-year timespan and to examine the potential influence of baseline weight status and physical fitness on their trajectory of change in motor competence. Methods: 558 children (52.5% boys) aged between 6 and 9 years participated in this study. Baseline measurements included weight status, motor competence (i.e., Körperkoordinationstest für Kinder; KTK) and physical fitness (i.e., sit and reach, standing long jump and the 20 m shuttle run test). Motor competence assessment took place three times across a 2-year timespan. LGCM was conducted to examine change in motor competence over time. Results: The analyses showed a positive linear change in motor competence across 2 years (beta = 28.48, p < 0.001) with significant variability in children's individual trajectories (p < 0.001). Girls made less progress than boys (beta = -2.12, p = 0.01). Children who were older at baseline demonstrated less change in motor competence (beta = -0.33, p < 0.001). Weight status at baseline was negatively associated with change in motor competence over time (beta = -1.418, p = 0.002). None of the physical fitness components, measured at baseline, were significantly associated with change in motor competence over time. Conclusion and Implications: This longitudinal study reveals that weight status significantly influences children's motor competence trajectories whilst physical fitness demonstrated no significant influence on motor competence trajectories. Future studies should further explore children's differential trajectories over time and potential factors influencing that change

University of Strathclyde Institutional Repository

Ghent University Academic Bibliography

Archivsystem Ask23


A Global Empirical Evaluation of New Communication Technology Use and Democratic Tendency

Author: Stodden Victoria C.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2009

Is the dramatic increase in Internet use associated with a commensurate rise in democracy? Few previous studies have drawn on multiple perception-based measures of governance to assess the Internet's effects on the process of democratization. This paper uses perception-based time series data on "Voice & Accountability," "Political Stability," and "Rule of Law" to provide insights into democratic tendency. The results of regression analysis suggest that the level of "Voice & Accountability" in a country increases with Internet use, while the level of "Political Stability" decreases with increasing Internet use. Additionally, Internet use was found to increase significantly for countries with increasing levels of "Voice & Accountability." In contrast, "Rule of Law" was not significantly affected by a country's level of Internet use. Increasing cell phone use did not seem to affect "Voice & Accountability," "Political Stability," or "Rule of Law." In turn, cell phone use was not affected by any of these three measures of democratic tendency. When limiting our analysis to autocratic regimes, we noted a significant negative effect of Internet and cell phone use on "Political Stability" and found that the "Rule of Law" and "Political Stability" metrics drove ICT adoption

Columbia University Academic Commons


Breakdown Point of Model Selection When the Number of Variables Exceeds the Number of Observations

Authors: Donoho David L., Stodden Victoria C.
Publication date: 01/01/2006

The classical multivariate linear regression problem assumes p variables X1, X2, ..., Xp and a response vector y, each with n observations, and a linear relationship between the two: y = X beta + z, where z ~ N(0, sigma^2). We point out that when p > n, there is a breakdown point for standard model selection schemes, such that model selection only works well below a certain critical complexity level depending on n/p. We apply this notion to some standard model selection algorithms (Forward Stepwise, LASSO, LARS) in the case where p > n. We find that 1) the breakdown point is well-defined for random X-models and low noise, 2) increasing noise shifts the breakdown point to lower levels of sparsity, and reduces the model recovery ability of the algorithm in a systematic way, and 3) below breakdown, the size of coefficient errors follows the theoretical error distribution for the classical linear model

Columbia University Academic Commons
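
The setup described in the Donoho and Stodden abstract above (y = X beta + z with more variables than observations) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' code: the function names `simulate` and `forward_stepwise`, the problem sizes, the coefficient value, and the noise level are all hypothetical, and the selection rule is a simplified greedy variant (in the style of orthogonal matching pursuit) rather than the exact Forward Stepwise procedure studied in the paper.

```python
import numpy as np


def simulate(n=80, p=120, k=3, sigma=0.1, seed=0):
    """Draw a random p > n regression problem y = X beta + z with k nonzero coefficients.

    All sizes and values here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))            # random design matrix
    beta = np.zeros(p)
    true_support = rng.choice(p, size=k, replace=False)
    beta[true_support] = 10.0                  # sparse coefficient vector
    z = sigma * rng.standard_normal(n)         # Gaussian noise, z ~ N(0, sigma^2)
    y = X @ beta + z
    return X, y, beta, sorted(int(j) for j in true_support)


def forward_stepwise(X, y, k):
    """Greedy selection: add the column most correlated with the current
    residual, refit by least squares on the selected columns, repeat k times."""
    p = X.shape[1]
    support = []
    residual = y.copy()
    for _ in range(k):
        scores = np.abs(X.T @ residual)
        if support:
            scores[support] = -np.inf          # never pick a column twice
        support.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    beta_hat = np.zeros(p)
    beta_hat[support] = coef                   # coef is ordered like support
    return sorted(support), beta_hat


if __name__ == "__main__":
    X, y, beta, true_support = simulate()
    support, beta_hat = forward_stepwise(X, y, k=3)
    print("true support:     ", true_support)
    print("selected support: ", support)
    print("residual norm:    ", np.linalg.norm(y - X @ beta_hat))
```

Repeating the run while increasing `k` relative to `n`, or raising `sigma`, is one way to explore how recovery degrades as sparsity decreases or noise grows, which is the kind of behavior the abstract describes.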