Thoughts on publication of open data sets using national standards

The NIH data sharing policy emphasizes the need for data sharing to support research repeatability and collaboration. Many private funding agencies also require that data sets be published openly. I anticipate that this will prompt medical research universities to implement both governance and technical processes to de-identify and publish data. This will open the data floodgates that platforms like ours expect to support.

However, publication is a distributed process. Without an easy mechanism for adopting a standard format - both for publication and for use by various research tools - each research group may publish its data set in a bespoke format.

NIH ODSS supports the use of national standards for publishing data and specifically names FHIR. In addition, the Office of the National Coordinator for Health IT - the federal regulator of healthcare interoperability - requires that electronic health record systems export data at both the patient and population level in FHIR format using common APIs. The International Patient Summary standard, which is based on FHIR, aims to extend this interoperability across national borders.
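To make the idea of a common format concrete, here is a minimal sketch in Python of how a consumer might read a population-level export. It assumes the data set is published as newline-delimited JSON (one FHIR resource per line) containing Observation resources. The sample records, the `SAMPLE_NDJSON` name, and the `flatten_observations` helper are hypothetical and for illustration only; they are not taken from any particular publisher or tool.

```python
import json

# Hypothetical sample: two de-identified FHIR Observation resources,
# one JSON document per line (NDJSON). Real exports would be read from files.
SAMPLE_NDJSON = """\
{"resourceType": "Observation", "id": "obs-1", "status": "final", "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}]}, "subject": {"reference": "Patient/p-001"}, "valueQuantity": {"value": 72, "unit": "beats/minute"}}
{"resourceType": "Observation", "id": "obs-2", "status": "final", "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}]}, "subject": {"reference": "Patient/p-002"}, "valueQuantity": {"value": 88, "unit": "beats/minute"}}
"""


def flatten_observations(ndjson_text):
    """Turn NDJSON Observation resources into flat rows for analysis."""
    rows = []
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        resource = json.loads(line)
        if resource.get("resourceType") != "Observation":
            continue  # ignore other resource types in this sketch
        codings = resource.get("code", {}).get("coding", [])
        coding = codings[0] if codings else {}
        quantity = resource.get("valueQuantity", {})
        rows.append({
            "subject": resource.get("subject", {}).get("reference"),
            "code": coding.get("code"),
            "display": coding.get("display"),
            "value": quantity.get("value"),
            "unit": quantity.get("unit"),
        })
    return rows


if __name__ == "__main__":
    for row in flatten_observations(SAMPLE_NDJSON):
        print(row)
```

If two publishers both used this structure, the same reader could in principle be pointed at either data set without publisher-specific parsing, which is the kind of acceleration the questions below ask about.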

What are your thoughts on accelerating the publication of de-identified open research data sets using such national or international standards?

If you are a research team that publishes open data, will this accelerate your publication process?

If you are a researcher who consumes open data, will this accelerate your research - especially if you use multiple data sets from different publishers?

What challenges do you face in either publishing or consuming data using such standards?