Mining and Analyzing the Future Works in Scientific Articles

“Future works in scientific articles are valuable for researchers and they can guide researchers to new research directions or ideas. In this paper, we mine the future works in scientific articles in order to 1) provide an insight for future work analysis and 2) facilitate researchers to search and browse future works in a research area.

First, we study the problem of future work extraction and propose a regular expression based method to address the problem. Second, we define four different categories for the future works by observing the data and investigate the multi-class future work classification problem. Third, we apply the extraction method and the classification model to a paper dataset in the computer science field and conduct a further analysis of the future works.

Finally, we design a prototype system to search and demonstrate the future works mined from the scientific papers. Our evaluation results show that our extraction method can get high precision and recall values and our classification model can also get good results and it outperforms several baseline models. Further analysis of the future work sentences also indicates interesting results.”

URL : http://arxiv.org/abs/1507.02140
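
The abstract names a regular-expression-based extraction method but does not give the patterns themselves. As a rough illustration only, here is a minimal Python sketch of the general idea: split text into sentences and keep those that contain a cue phrase. The cue list, the sentence splitter, and the function name `extract_future_work` are all assumptions made for this example, not the authors' actual method.

```python
import re

# Hypothetical cue phrases chosen for illustration; the paper's real
# patterns are not described in the abstract.
FUTURE_WORK_CUES = re.compile(
    r"\b(future (work|works|research|study|studies|direction|directions)"
    r"|in the future"
    r"|we (plan|intend) to)\b",
    re.IGNORECASE,
)

# Naive sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def extract_future_work(text):
    """Return the sentences of `text` that contain a future-work cue."""
    sentences = SENTENCE_BOUNDARY.split(text.strip())
    return [s for s in sentences if FUTURE_WORK_CUES.search(s)]


sample = (
    "We propose a new parser. In future work, we plan to support more "
    "languages. Results are strong. We intend to release the code."
)
future_sentences = extract_future_work(sample)
```

A real system would need a proper sentence tokenizer and a much richer pattern set, and the classification step described in the abstract is not covered here.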

Wikidata through the Eyes of DBpedia

“DBpedia is one of the first and most prominent nodes of the Linked Open Data cloud. It provides structured data for more than 100 Wikipedia language editions as well as Wikimedia Commons, has a mature ontology and a stable and thorough Linked Data publishing lifecycle. Wikidata, on the other hand, has recently emerged as a user curated source for structured information which is included in Wikipedia.

In this paper, we present how Wikidata is incorporated in the DBpedia ecosystem. Enriching DBpedia with structured information from Wikidata provides added value for a number of usage scenarios. We outline those scenarios and describe the structure and conversion process of the DBpediaWikidata dataset.”

URL : Wikidata through the Eyes of DBpedia

Related URL : http://arxiv.org/abs/1507.04180

 

Barriers to Initiation of Open Source Software Projects in Libraries

“Libraries share a number of core values with the Open Source Software (OSS) movement, suggesting there should be a natural tendency toward library participation in OSS projects. However Dale Askey’s 2008 Code4Lib column entitled “We Love Open Source Software. No, You Can’t Have Our Code,” claims that while libraries are strong proponents of OSS, they are unlikely to actually contribute to OSS projects. He identifies, but does not empirically substantiate, six barriers that he believes contribute to this apparent inconsistency.

In this study we empirically investigate not only Askey’s central claim but also the six barriers he proposes. In contrast to Askey’s assertion, we find that initiation of and contribution to OSS projects are, in fact, common practices in libraries. However, we also find that these practices are far from ubiquitous; as Askey suggests, many libraries do have opportunities to initiate OSS projects, but choose not to do so. Further, we find support for only four of Askey’s six OSS barriers. Thus, our results confirm many, but not all, of Askey’s assertions.”

URL : http://journal.code4lib.org/articles/10665

From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics

Motivation

“Reproducing the results from a scientific paper can be challenging due to the absence of data and the computational tools required for their analysis. In addition, details relating to the procedures used to obtain the published results can be difficult to discern due to the use of natural language when reporting how experiments have been performed.

The Investigation/Study/Assay (ISA), Nanopublications (NP), and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent by which ISA, NP, and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

Results

Executable workflows were developed using Galaxy, which reproduced results that were consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the use of ISA, NP, and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables, and findings.

The models served as guides in the curation of scientific information and this led to the identification of inconsistencies in the original published paper, thereby allowing its authors to publish corrections in the form of an errata.”

URL : From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics

DOI : 10.1371/journal.pone.0127612

Fair Shares and Sharing Fairly: A Survey of Public Views on Open Science, Informed Consent and Participatory Research in Biobanking

Context

“Biobanks are important resources which enable large-scale genomic research with human samples and data, raising significant ethical concerns about how participants’ information is managed and shared. Three previous studies of the Canadian public’s opinion about these topics have been conducted. Building on those results, an online survey representing the first study of public perceptions about biobanking spanning all Canadian provinces was conducted. Specifically, this study examined qualitative views about biobank objectives, governance structure, control and ownership of samples and data, benefit sharing, consent practices and data sharing norms, as well as additional questions and ethical concerns expressed by the public.

Results

Over half the respondents preferred to give a one-time general consent for the future sharing of their samples among researchers. Most expressed willingness for their data to be shared with the international scientific community rather than used by one or more Canadian institutions. Whereas more respondents indicated a preference for one-time general consent than any other model of consent, they constituted less than half of the total responses, revealing a lack of consensus among survey respondents regarding this question. Respondents identified biobank objectives, governance structure and accountability as the most important information to provide participants.

Respondents’ concerns about biobanking generally centred around the control and ownership of biological samples and data, especially with respect to potential misuse by insurers, the government and other third parties. Although almost half the respondents suggested that these should be managed by the researchers’ institutions, results indicate that the public is interested in being well-informed about these projects and suggest the importance of increased involvement from participants. In conclusion, the study discusses the viability of several proposed models for informed consent, including e-governance, independent trustees and the use of exclusion clauses, in the context of these new findings about the views of the Canadian public.”

URL : Fair Shares and Sharing Fairly: A Survey of Public Views on Open Science, Informed Consent and Participatory Research in Biobanking

DOI : 10.1371/journal.pone.0129893

Much Obliged: Analyzing the Importance and Impact of Acknowledgements in Scholarly Communication

“Author rights, peer review, open access, and the role of institutional repositories have all come under scrutiny by scholars, librarians, and legal experts in the last decade. Much of the conversation is centered on liberating information from the confinements of legal, financial, and hierarchical restraints. The relevancy of traditional citation analysis too, understood within the framework of an h-Index and Eigen factor, is under scrutiny with the rise of altmetrics. Collectively, these issues form the core of the scholarly communication process, from creation to dissemination to impact.

However, an overlooked facet of the scholarly communication process is the acknowledgement. As an expression of scholarly debt, the acknowledgment is an important facet of intellectual networks. Not only does the acknowledgement demonstrate the intellectual contributions of colleagues, advisors, funding agencies, and mentors but also the significance of librarians in the scholarly communication process.”

URL : http://eprints.rclis.org/25428/

The One Repo: background, implementation and call for funding

“As scholarly communication undergoes seismic changes, endless opportunities are opening up. In an open-access world, there is potential for Internet-enabled research on huge corpuses, discovering new correlations and making new connections. To facilitate these processes, we need a platform that provides uniform access to the metadata and full text of all open-access articles, whether in repositories or open-access journals. The platform must provide data that is complete, up to date, high quality and open for every kind of re-use.

The One Repo (http://onerepo.net/) is that platform. It aims to make the entire open-access scholarly record available via a Web UI, embeddable widgets and various web-services, as well as providing all of the metadata for direct download. It is built from battle-tested components that are in use in high-volume commercial systems.

Numerous harvesting methods are used. The existing demonstrator presents a UI that integrates results from a small number of repositories and other sources. We seek funding to rapidly increase coverage. The One Repo has dramatic implications for scholarship, research and engineering across every field of human endeavour.”

URL : http://onerepo.net/onerepo-whitepaper.pdf