LibGuides: Data Management: Open Data

How Does Open Data Work?

The process of making data truly open can seem overwhelming, but it doesn’t have to be. To simplify the process, it may help to think about enabling Open Data through two basic routes:

Making Data Technically Open: Ensuring that data are made available as a complete set in a machine-readable format on an easily accessible platform is key to enabling Open Data.
Making Data Legally Open: Ensuring that data are made available under legal terms that allow users to redistribute and fully reuse the data is the second key step to ensuring Open Data. The only way to be sure that data are adequately covered is to apply a license that conforms to the full Open Definition. Many such licenses are available, including those produced by Creative Commons.

Source: SPARC
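
As a rough illustration of the two routes above, the following Python sketch writes a small dataset as a complete CSV file (technically open) and records the reuse terms in a companion metadata file (legally open). It is a minimal sketch under stated assumptions, not a prescribed workflow: the function name write_open_dataset, the file names, the metadata keys, and the sample records are all hypothetical, and "CC0-1.0" is only a placeholder for whichever Open Definition-conformant license you choose.

```python
import csv
import json
from pathlib import Path

# Hypothetical sample records; a real dataset would come from your research.
ROWS = [
    {"site": "A", "year": 2020, "temperature_c": 11.4},
    {"site": "B", "year": 2020, "temperature_c": 9.8},
]


def write_open_dataset(rows, out_dir, license_id="CC0-1.0"):
    """Write rows as CSV plus a JSON metadata file naming the license.

    File names and metadata keys are illustrative assumptions,
    not a formal metadata standard.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    columns = list(rows[0].keys())

    # Technically open: the complete set in a plain, machine-readable format.
    data_path = out_dir / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)

    # Legally open: state the reuse terms explicitly alongside the data.
    meta_path = out_dir / "metadata.json"
    metadata = {
        "data_file": data_path.name,
        "license": license_id,
        "row_count": len(rows),
        "columns": columns,
    }
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path


if __name__ == "__main__":
    data_file, meta_file = write_open_dataset(ROWS, "open_dataset")
    print(f"Wrote {data_file} and {meta_file}")
```

Keeping the license statement in a file next to the data means the reuse terms travel with the dataset wherever it is copied. A repository will usually ask for the same information through its own deposit form.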

Open Data Repositories & Portals

Open Data Repositories

Abacus: UBC Dataverse, Public Data Collection
Aperta
AWS Public Data Sets
- Includes NASA, climate data, the 1000 Genomes Project, and more!
Broad Institute Cancer Program datasets
DataCite
Dryad Digital Repository
FigShare
GigaDB
GitHub
Harvard Dataverse
Human Genome Diversity Project
Mendeley
Nature: Recommended repositories
Pew Research Center Open Data
PLOS recommended repositories
re3data.org (Registry of Research Data Repositories)
Spatial Data Repository
US Spatial Data Repository

Open Data Portals

Source: Thompson Rivers University Library

What is Open Data?

Source: Sheridan College Library and Learning Services

Open Data is research data that is freely available on the internet, permitting any user to download, copy, analyze, re-process, pass to software, or use for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.

Open Data is research data that:

Is freely available on the internet;
Permits any user to download, copy, analyze, re-process, pass to software or use for any other purpose; and
Is without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.

Open Data typically applies to a range of non-textual materials, including datasets, statistics, transcripts, survey results, and the metadata associated with these objects. The data is, in essence, the factual information that is necessary to replicate and verify research results. Open Data policies usually encompass the notion that machine extraction, manipulation, and meta-analysis of data should be permissible.

Open Data:

Accelerates the pace of discovery. When datasets are openly available, they can be easily accessed and used to create a fuller picture of a given area of inquiry, or analyzed by data mining software that can uncover connections not apparent to those who produced the original data.
Grows the economy. Researchers estimate that $3.2 trillion in economic output could be added to global GDP through Open Data across all sectors, with scientific and scholarly data playing an important role.[1]
Helps ensure we don’t miss breakthroughs. There are a huge number of ways to use or analyze any given dataset. What seems like noise to one person could be an important discovery to someone else with a different perspective or analytical technique.
Improves the integrity of the scientific and scholarly record. When the data that underlies findings is accessible, researchers can check each other’s work and ensure that conclusions are built upon a firm foundation.
Is becoming recognized by many in the research community as an important part of the research enterprise of the 21st century. From research funders like the US government to publishers, institutions involved in the research process are beginning to require that, at the very least, the data that underlies publications be made openly accessible.

Open Data has the potential to speed up the research process while simultaneously improving our confidence in its results. The access, use, and curation of this huge and growing body of data is central to the research enterprise.

Source: SPARC

Why is Open Data Important?

During the past several years, Open Data has become a field of urgent interest to researchers, scholars, and librarians. With the amount of scientific data doubling every year, issues surrounding the access, use, and curation of data sets are increasing in importance. The data-rich, researcher-driven environment that is evolving poses new challenges and provides new opportunities in the sharing, review, and publication of research results. Ensuring access to primary research data will play a key role in seeing that the scholarly communication system evolves in a way that supports the needs of scholars and the academic enterprise as a whole.

Increasingly, institutions that support research – from public and private research funders to higher education institutions – are exploring policies that require researchers to produce data management plans that explicitly cover how they will make their data available, and under what terms.

Communicating results widely and making research data broadly accessible and fully available for reuse encourages new research through the reanalysis of existing data, further leveraging the value of a research investment. Providing data in formats and under terms that enable full reuse promotes interoperability and allows the data to be mined with cutting-edge computational tools, across huge amounts of data, to find connections, trends, and patterns that can’t be uncovered when data is closed or siloed.

Source: SPARC