The JoRD project did not set out to define the term “data” (or its singular form, “datum”). This was a fortunate choice, because one message that came across clearly from all the participants in our study is that data can take many forms. The recent Royal Society report, “Science as an Open Enterprise” (http://royalsociety.org/uploadedFiles/Royal_Society_Content/policy/projects/sape/2012-06-20-SAOE.pdf), includes a glossary of data terms that illustrates the ways in which the term “data” can be used. For example:
- big data – data that requires massive computing power to process
- broad data – structured big data
- data set – a collection of information held in electronic form
- linked data – data that has been allocated a unique identifying number to be able to access it from an electronic storage facility
… and those are just a few of the terms that it explains. The word “data” is defined as “Qualitative or quantitative statement or numbers that are (or assumed to be) factual”. The researchers who took part in this study considered that their data took more forms than just statements or numbers.
Researchers described the data that their research generated as software, video footage, geodata, geological maps, ontologies, web services and data models, as can be seen in the table below. This multitude of forms makes it difficult for publishers to include data in their online published articles. The publishers said that linked data in a journal article should be “fit for use” and “replicable”; they consider data in many different formats to be “messy” and note that it is currently not supplied with sufficient metadata. Another consideration is the resulting file size of an article if the publisher saves the embedded data on its own servers. Data repositories and data centres are therefore the more practical method of data storage, with published articles incorporating linked data.
That is one reason for journals to have a data policy, and a good argument for collecting those policies and making them accessible in a centralised resource: a JoRD Policy Bank Service.
| Researchers' description of data | Qualitative (documents and text) | Quantitative (figures) | Visual data (images) | Virtual data (software or protocols) |
|---|---|---|---|---|
| Collection of examiner reports and questions supervisory reports, letters and other documentary evidence. | √ | | | |
| Dataset of measurements and statistical analyses | | √ | | |
| Digitised Textual Sources | √ | | | |
| Excavation, field observation, environmental monitoring, software to collate mine and analyse | √ | | | √ |
| Excel sheets | | √ | | |
| Focus Group, Interview Transcripts, some footage of people using computers, digital photographs | √ | | √ | |
| Geodata | | √ | | √ |
| Geologic maps, chemical and isotopic analyses of Earth Materials, GIS datasets | | √ | √ | √ |
| Interview transcripts | √ | | | |
| Ontologies | | | | √ |
| Reports | √ | | | |
| Visualization | | | √ | |
| Web Services, Data Models and Specifications | | | | √ |