Open Research Data (ORD) Conference
May 28 – 29 2015, Warsaw, Poland
The opening of research data is a central aspect of Open Science. Research data is recognized as one of the key outputs of the scientific process.
The following post was written by Dr. Henry Lütcke, ITS Scientific IT Services.
Properly managed data is a valuable resource for other scientists, who can then freely use, re-use and combine it. Open data also benefits society as a whole, enabling novel forms of scientific inquiry and non-scientific data usage, facilitating engagement in citizen science, and perhaps even accelerating economic development.
The European conference “Open Research Data: Implications for Science and Society”, which took place in Warsaw from May 28th to 29th, provided a forum for a broad debate on all issues related to opening research data.
Among the many contributions, three major themes could be identified:
- the use of open research data for science
- the implications for society as a whole
- tools and methodologies for opening data
Open Research Data – Implications for Science
Open data offers two main benefits for science.
- First, it allows the use and re-use of existing data sets by other scientists (‘secondary use’), thereby making the scientific process more efficient and increasing the rate of scientific discovery.
- Second, opening of research data allows verification of results and conclusions and is therefore in the interest of reproducible research.
As part of his keynote address, Mark Pearson (Research Data Alliance) reminded the audience that the usefulness of research data to others is frequently underestimated, even by scientists themselves. The true value of data sets lies not in their perceived current value but in their capacity to produce unanticipated change through unfiltered contributions from broad and varied audiences.
This point was further emphasized by Kevin Ashley from the UK Digital Curation Centre. He illustrated the enormous potential value of data with the Old Weather project (http://www.oldweather.org/), which transcribes ships’ logs from the mid-19th century onward to improve knowledge about past environmental conditions. This rich data set, which was not even acquired for research purposes, has so far yielded new insights not only for climate modelers but also for economists and historians.
Clearly, not all research data can be made publicly available, for example because of privacy concerns in biomedical research. It is, however, important that information about the existence of data is always available, for example to avoid pointless replication efforts. Even so, it has been estimated that more than 90% of research data are currently never published. There is still a long way to go to achieve truly Open Science.
Open Research Data – Implications for Society
From the point of view of society, opening research data offers several advantages.
- First, open data makes scientific discovery more efficient and reproducible, thereby reducing the cost of the scientific process and consequently saving taxpayers’ money.
- Second, certain data sets may also find re-use in the public domain.
For example, Mark Thornley (Natural Environment Research Council, UK) highlighted mobile apps built on environmental data that determine flooding risk or radiation levels in certain areas. There was, however, also agreement that most research data are of relatively little interest to the general public or to business and industry. Importantly, successful data re-use is not achieved by open data alone, but requires additional domain expertise, which has to be provided by ‘knowledge brokers’.
Tools and Methodologies for Opening Data
The final stream of the conference was concerned with novel tools for opening research data. In this context, I presented the publication extension of the data management system openBIS, which has been developed by Scientific IT Services (SIS). I also highlighted the integration of openBIS in the Swiss Open Research Data portal (openresearchdata.ch), which is maintained by SIS in close collaboration with the Systemdienste. Interestingly, a similar national data repository that uses the same underlying technology (CKAN) is currently under development in Poland. Apart from data repositories developed by academic institutions, a number of commercial repositories are becoming popular, notably those backed by scientific publishers such as Macmillan (e.g. FigShare).
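To give a rough idea of what a CKAN-based repository offers to anyone wanting to re-use data, the sketch below searches a portal through CKAN's standard action API (`package_search`). It is a minimal illustration only: the portal address, the search term and the helper names (`build_search_url`, `dataset_titles`, `search_portal`) are assumptions made for this example, and a given portal may restrict or customize its API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(portal_url, query, rows=5):
    """Build a CKAN package_search URL (portal_url is a placeholder)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Extract dataset titles from a decoded package_search response."""
    if not response.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful request")
    # Fall back to the dataset's short name if no title is set.
    return [d.get("title") or d.get("name") for d in response["result"]["results"]]


def search_portal(portal_url, query, rows=5):
    """Query a live portal; requires network access."""
    with urlopen(build_search_url(portal_url, query, rows), timeout=10) as reply:
        return dataset_titles(json.load(reply))
```

Calling `search_portal("https://ckan.example.org", "climate")` against a real CKAN instance (the address here is hypothetical) should return the titles of the first few matching data sets.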
In summary, the Open Research Data conference provided a comprehensive overview of current efforts for opening research data at the European level and an opportunity for networking with experts from academia, administration and industry. From a technical point of view, the IT Services of ETH Zurich are in a good position to provide scientists with tools and solutions for managing research data and making it publicly available. Politically, however, Switzerland is still lagging behind other research-intensive European countries, notably the UK, in mandating and incentivizing the opening of research data. Hopefully, this situation will improve in the coming years.