Datasharing: guía práctica para compartir datos de investigación

“To associate research data to the published results favors their reuse by the scientific community, but this does not afford sufficient guarantees of preservation. To store them in databases solves this contingency and provides visibility, but in Spain there are not many services of this kind. For this reason, we describe, evaluate and discuss the pros and cons of foreign multidisciplinary data repositories that can be useful for researchers and information managers: Dryad, Figshare, Zenodo and Dataverse. It is still early to choose optimally and definitively one or the other application, so we conclude with recommendations to guide the user community and intermediaries.”

URL : http://eprints.rclis.org/20907/

Research Data Management Principles, Practices, and Prospects

“This report examines how research institutions are responding to data management requirements of the National Science Foundation, National Institutes of Health, and other federal agencies. It also considers what role, if any, academic libraries and the library and information science profession should have in supporting researchers’ data management needs.”

URL : http://www.clir.org/pubs/reports/pub160

The Open Access Divide

“This paper is an attempt to review various aspects of the open access divide regarding the difference between those academics who support free sharing of data and scholarly output and those academics who do not. It provides a structured description by adopting the Ws doctrines emphasizing such questions as who, what, when, where and why for information-gathering. Using measurable variables to define a common expression of the open access divide, this study collects aggregated data from existing open access as well as non-open access publications including journal articles and extensive reports. The definition of the open access divide is integrated into the discussion of scholarship on a larger scale.”

URL : http://www.mdpi.com/2304-6775/1/3/113

Data reuse and the open data citation advantage

Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the “citation benefit”. Furthermore, little is known about patterns in data reuse over time and across datasets.

Method and Results: Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.

Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.

URL : https://peerj.com/articles/175/
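
As an illustration of how a statement such as "9% more citations" can come out of a multivariate regression, here is a minimal sketch. It assumes the outcome is the logarithm of the citation count, which the abstract above does not specify, and it uses synthetic data with a planted effect and a single covariate (a hypothetical `impact_factor`). It is not the authors' model, data, or code.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def percent_benefit(beta):
    """Convert a coefficient on log(citations) into a relative change."""
    return math.exp(beta) - 1.0


# Synthetic data only: the effect size is planted, not taken from real studies.
random.seed(0)
TRUE_BENEFIT = 0.09  # assumed value, chosen to mirror the figure in the abstract
rows, log_citations = [], []
for _ in range(2000):
    shared = 1.0 if random.random() < 0.25 else 0.0  # data made public?
    impact_factor = random.uniform(1.0, 10.0)        # hypothetical covariate
    noise = random.gauss(0.0, 0.1)
    rows.append([1.0, shared, impact_factor])        # intercept, indicator, covariate
    log_citations.append(
        1.0 + math.log(1.0 + TRUE_BENEFIT) * shared + 0.2 * impact_factor + noise
    )

intercept, beta_shared, beta_if = fit_ols(rows, log_citations)
estimated_benefit = percent_benefit(beta_shared)
print(f"estimated citation benefit: {estimated_benefit:.1%}")
```

Under this log-scale assumption, the coefficient on the data-sharing indicator is read as a multiplicative effect with the covariate held fixed, which is why it is reported as a percentage rather than as a number of citations.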

European Landscape Study of Research Data Management

The European Landscape Study of Research Data Management offers an overview of how to effectively support researchers in their data management. It looks at interventions by funding agencies, research institutions, national bodies and publishers across the European Union member states. The report also makes recommendations that organisations can adopt to help their researchers.

URL : https://www.sim4rdm.eu/sites/default/files/uploads/documents/SIM4RDM%20landscape%20report%20vs1%204_14.08.13.pdf

If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology

Research on practices to share and reuse data will inform the design of infrastructure to support data collection, management, and discovery in the long tail of science and technology. These are research domains in which data tend to be local in character, minimally structured, and minimally documented. We report on a ten-year study of the Center for Embedded Network Sensing (CENS), a National Science Foundation Science and Technology Center.

We found that CENS researchers are willing to share their data, but few are asked to do so, and in only a few domain areas do their funders or journals require them to deposit data. Few repositories exist to accept data in CENS research areas. Data sharing tends to occur only through interpersonal exchanges. CENS researchers obtain data from repositories, and occasionally from registries and individuals, to provide context, calibration, or other forms of background for their studies. Neither CENS researchers nor those who request access to CENS data appear to use external data for primary research questions or for replication of studies.

CENS researchers are willing to share data if they receive credit and retain first rights to publish their results. Practices of releasing, sharing, and reusing of data in CENS reaffirm the gift culture of scholarship, in which goods are bartered between trusted colleagues rather than treated as commodities.

URL : https://doi.org/10.1371/journal.pone.0067332

DOI : 10.1371/journal.pone.0067332

Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals

Journal policy on research data and code availability is an important part of the ongoing shift toward publishing reproducible computational science. This article extends the literature by studying journal data sharing policies by year (for both 2011 and 2012) for a referent set of 170 journals.

We make a further contribution by evaluating code sharing policies, supplemental materials policies, and open access status for these 170 journals for each of 2011 and 2012.

We build a predictive model of open data and code policy adoption as a function of impact factor and publisher and find higher impact journals more likely to have open data and code policies and scientific societies more likely to have open data and code policies than commercial publishers.

We also find open data policies tend to lead open code policies, and we find no relationship between open data and code policies and either supplemental material policies or open access journal status.

Of the journals in this study, 38% had a data policy, 22% had a code policy, and 66% had a supplemental materials policy as of June 2012. This reflects a striking one-year increase of 16% in the number of data policies, a 30% increase in code policies, and a 7% increase in the number of supplemental materials policies.

We introduce a new dataset to the community that categorizes data and code sharing, supplemental materials, and open access policies in 2011 and 2012 for these 170 journals.

URL : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0067111