Open biological data are distributed over many resources, making them challenging to integrate, update and disseminate quickly. Wikidata is a growing, open community database which can serve this purpose and also provides tight integration with Wikipedia.
To improve the state of biological data and to facilitate data management and dissemination, we imported all human and mouse genes, and all human and mouse proteins, into Wikidata.
In total, 59 721 human genes and 73 355 mouse genes have been imported from NCBI and 27 306 human proteins and 16 728 mouse proteins have been imported from the Swissprot subset of UniProt. As Wikidata is open and can be edited by anybody, our corpus of imported data serves as the starting point for integration of further data by scientists, the Wikidata community and citizen scientists alike.
The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified.
Although Gene Wiki pages currently exist only on the English-language version of Wikipedia, the multilingual nature of Wikidata allows the data we imported to be used in all 280 language editions of Wikipedia.
Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists.
In summary, we created a fully open and extensible data resource for human and mouse molecular biology and biochemistry data. This resource enriches all the Wikipedias with structured information and serves as a new linking hub for the biological semantic web.
URL : Wikidata as a semantic framework for the Gene Wiki initiative
DOI : 10.1093/database/baw015
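As an illustration of the SPARQL access mentioned in the abstract above, the following is a minimal sketch of how a scientist might request a few human genes from the public Wikidata Query Service. The identifiers used (P351 for Entrez Gene ID, P703 for "found in taxon", Q15978631 for Homo sapiens) and the function names are assumptions made for this example; they are not taken from the abstract and should be checked against Wikidata before use.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_gene_query(limit: int = 5) -> str:
    """Return a SPARQL query for human genes that carry an Entrez Gene ID.

    Assumed identifiers (verify on Wikidata before relying on them):
    P351 = Entrez Gene ID, P703 = found in taxon, Q15978631 = Homo sapiens.
    """
    return (
        "SELECT ?gene ?geneLabel ?entrez WHERE {\n"
        "  ?gene wdt:P351 ?entrez ;\n"
        "        wdt:P703 wd:Q15978631 .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        f"}} LIMIT {int(limit)}"
    )


def build_request_url(query: str, fmt: str = "json") -> str:
    """Encode the query as a GET request URL for the endpoint."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': query, 'format': fmt})}"


def fetch_genes(limit: int = 5) -> list:
    """Run the query over the network and return the result bindings.

    Not called automatically; the User-Agent string is a placeholder.
    """
    request = urllib.request.Request(
        build_request_url(build_gene_query(limit)),
        headers={"User-Agent": "gene-wiki-example/0.1 (illustrative sketch)"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `fetch_genes(5)` would send the request; each returned binding is a dictionary keyed by the selected variable names (`gene`, `geneLabel`, `entrez`). The query is built separately from the network call so that it can be inspected or pasted into the query service's web interface.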
There is a growing movement to encourage reproducibility and transparency practices in the scientific community, including public access to raw data and protocols, the conduct of replication studies, systematic integration of evidence in systematic reviews, and the documentation of funding and potential conflicts of interest.
In this survey, we assessed the current status of reproducibility and transparency addressing these indicators in a random sample of 441 biomedical journal articles published in 2000–2014. Only one study provided a full protocol and none made all raw data directly available. Replication studies were rare (n = 4), and only 16 studies had their data included in a subsequent systematic review or meta-analysis. The majority of studies did not mention anything about funding or conflicts of interest.
The percentage of articles with no statement of conflict decreased substantially between 2000 and 2014 (94.4% in 2000 to 34.6% in 2014); the percentage of articles reporting statements of conflicts (0% in 2000, 15.4% in 2014) or no conflicts (5.6% in 2000, 50.0% in 2014) increased.
Articles published in journals in the clinical medicine category versus other fields were almost twice as likely to not include any information on funding and to have private funding. This study provides baseline data to compare future progress in improving these indicators in the scientific literature.
URL : Reproducible Research Practices and Transparency across the Biomedical Literature
DOI : 10.1371/journal.pbio.1002333
There is increasing support for sharing individual-level data generated by medical and public health research. This scoping review of empirical research and conceptual literature examined stakeholders’ perspectives of ethical best practices in data sharing, particularly in low- and middle-income settings. Sixty-nine empirical and conceptual articles were reviewed, of which only five were empirical studies and eight were conceptual articles focusing on low- and middle-income settings.
We conclude that support for sharing individual-level data is contingent on the development and implementation of international and local policies and processes to support ethical best practices. Further conceptual and empirical research is needed to ensure data sharing policies and processes in low- and middle-income settings are appropriately informed by stakeholders’ perspectives.
URL : Views of Ethical Best Practices in Sharing Individual-Level Data From Medical and Public Health Research
Alternative location : http://m.jre.sagepub.com/content/10/3/225
It is increasingly recognized that effective and appropriate data sharing requires the development of models of good data-sharing practice capable of taking seriously both the potential benefits to be gained and the importance of ensuring that the rights and interests of participants are respected and that risk of harms is minimized. Calls for the greater sharing of individual-level data from biomedical and public health research are receiving support among researchers and research funders. Despite its potential importance, data sharing presents important ethical, social, and institutional challenges in low-income settings.
In this article, we report on qualitative research conducted in five low- and middle-income countries exploring the experiences of key research stakeholders and their views about what constitutes good data-sharing practice.
URL : Sharing Public Health Research Data
Alternative location : http://m.jre.sagepub.com/content/10/3/217