Last week was busy for the JoRD team. Jane gave the presentation for ANDS, and Marianne appeared twice at Oxford: once to present a brief summary of the JoRD project at the Jisc-organised “Now and Future of Data Publishing” event and, later in the week, to give a selection of the project findings to the Dryad Members meeting. The links to both of the Oxford presentations follow, with a text summary.

1. The project was Jisc funded to explore the possibility of setting up a self-sustaining database and service to collate and summarise academic journal policies on the deposition of data associated with published articles
2. Current belief that openly accessible research data is a good thing because it drives science forward
3. Aims Jisc funded project to look at the possibility of setting up a central resource of journal instructions to authors about sharing the data on which articles are based
4. Objectives
• Investigate current state of Journal data policies
• Investigate current data sharing views and habits
5. Landscape of data sharing: there has always been data published in printed journals in the form of charts and tables
6. But digital data becomes a problem: where should it be stored? In a repository? On a website? Embedded into articles?
7. This is a journal data policy: an instruction to authors on where to share or deposit research data that is relevant to a published article
8. We initially analysed 230 research data policies and found many inconsistencies and a lack of standardisation
9. Some journals were vague about the form of data to be deposited, others were more precise
10. Some journals were specific about where the data should be deposited, most were less so.
11. Go back to the policy and explain
12. We spoke to these stakeholder groups and we found a number of dichotomies
13. Taking researchers first, they said that they would be happy to share their data (with certain caveats, which I will not go into here). These were the reasons they gave for sharing data
14. However, when we asked how much they shared and where, most of them only shared with colleagues. Only a small number mentioned that they put their data into repositories
15. We asked them why that was, and their replies ranged over: no time, don’t know where, and difficulty of accessing institutional repositories. They also said that current research models do not value and encourage data sharing. (A PhD researcher stated that he felt that if he shared his data during the course of the research, he might be “gazumped”, meaning that should someone else publish research on his chosen topic, the thesis would no longer be unique and therefore would no longer be credited.)
16. The publishers also showed a dichotomy: although they appreciated the benefits of sharing data, they felt that their servers would have difficulty holding the quantity of data included in each article and that repositories were the right place. However, there was some discussion about the long-term availability of repositories, which have not yet been proven, whereas the publishing houses have been around for a long time
17. Worries about links, etc
18. Academic librarians and Repository managers, no conflicting concerns, practicality
19. Data sharing landscape is a mess
20. How could a JoRD service improve the infrastructure?
• Develop a model data policy framework, which takes into account the concerns of all the stakeholders
21. Improved policies save the time of publishers and authors, and are more consistent
22. Address the fears of IP, data citation etc, eliminating dichotomies, improving the infrastructures, creates order
• Implications for repositories, authors know where data can be deposited to be shared and re-used, more will do so.

The JoRD Project: Now and Future

1. JISC-funded feasibility study for a central resource of research journal data policies
2. Looked at what the service should include and whether it could pay for itself
3. And 4 Tried to answer two questions
• Can Journal data policies encourage deposition of data?
• Will a JoRD service help publicly funded data to be shared and re-used?
5. Why bother? When an author publishes, she is trading her intellectual property with a publisher as part of a transaction, and there are certain obligations on both sides. This can include data linked to the article. The author needs to know and understand what to do with it (reading the small print)
6. Needed to find out three things
• Understand current journal data policies
• Would anyone bother to use the service
• Could it generate sufficient income for development, building and maintenance?
7. We analysed some journal data policies in depth
8. Looked at 371 journals,
9. What was in the policies? Main areas were data type, when to deposit, and where
10. Little requirement for open access or compliance, and few consequences for non-compliance
11. That does not provide an argument that journal data policies will help open data sharing
12. And 18 But there are signs that the situation is changing
• More publishers are considering data policies
• Elsevier Journal of the future
• Rise of data journals
• Apparent upward trend of journals with data policies
19. If there were a JoRD Service, would anyone use it?
20. All the stakeholders said that they would
21. For a variety of reasons BUT
22. They all wanted different things…
23. …apart from these, difficult to build one service
24. And will anyone pay for it?
25. Resounding no, except from publishers if the service was all singing and dancing
26. So, how does a JoRD service stand?
27. Now, with few policies stipulating deposit of data and stakeholders not financially contributing,
28. BUT… Let’s think of the future? The landscape is changing
29. Funders are asking for data plans to be included in funding bids
30. Universities are installing data management systems
31. Increase of data journals
32. And expectation that data should be included in articles
33. We have an opportunity to build a high-quality database of existing journal data policies, which can be added to and maintained to a high level, with a simple user interface. Establish a user base and develop a sustainable business model which can be implemented at a later stage.
34. JoRD is the future, and we should build it now when the quantity of data is smaller and the cost will be lower
35. Before the data deluge comes

What is linked data?

The fact that data comes in all sorts of shapes and sizes has already been blogged about, but what is the concern about adding data to online journals? After all, printed journals have included data in the shape of graphs or tables for a great many years. The problem now is that the journal article and its corresponding data are no longer in the flat, two-dimensional world of a piece of paper, but are part of the multi-dimensional world of the internet, where the data is linked to something else. Linked data, according to Bizer, Heath and Berners-Lee, is the method by which data is connected, structured and published on the web, resulting in a “web of data”. Linked data “refers to data published on the web in such a way that it is machine readable, its meaning is explicitly defined, it is linked to other external data sets and can in turn be linked to from external data sets”.
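As a rough illustration of the “web of data” idea, the sketch below stores the links between an article and its data as subject, predicate, object statements in plain Python. The identifiers and predicate names (`ex:article-1`, `hasDataset` and so on) are invented for this example and do not come from any real vocabulary or from the JoRD project.

```python
# Hypothetical identifiers and predicates, invented for this sketch only.
triples = [
    ("ex:article-1", "hasDataset", "ex:dataset-7"),
    ("ex:dataset-7", "storedIn", "ex:repository-A"),
    ("ex:article-2", "cites", "ex:dataset-7"),
]


def links_from(subject, statements):
    """Return (predicate, object) pairs that start at the given subject."""
    return [(p, o) for s, p, o in statements if s == subject]


def links_to(obj, statements):
    """Return (subject, predicate) pairs that point at the given object."""
    return [(s, p) for s, p, o in statements if o == obj]


# What does the first article link out to?
print(links_from("ex:article-1", triples))
# Which other things link in to the dataset?
print(links_to("ex:dataset-7", triples))
```

Because every statement names both ends of a link, the same dataset can be reached from the article that produced it and from any other article that later refers to it, which is the sense in which the data is linked to and from external sources.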

Before the data is published and linked, it has to be put somewhere. Most of our research participants said that they store their data in a personal storage system, either their own work or home computer, or on a portable storage device. While, of course, such spaces may be linked to the internet, it is rather like keeping the data in a filing cabinet: although anyone can go and find the data, they have to search very hard or ask the data keeper to give it to them. Data therefore has to be uploaded to a space that is openly accessible, which could be a university repository, a subject repository, a web page, or even the publisher’s own servers.

Again, this is not as simple as it seems. First you have to choose your repository and ensure that it will accept your sort of data. Once safely held in a repository, the data must be permanently linked and archived. As digital repositories are relatively new things, there is the question of what happens if the repository you have chosen has to close. Where will the data go? If the data is uploaded onto the publisher’s server, do they have the capacity to hold all the data for all the journals that they publish, as well as all the articles? Suddenly the storage needs of a single article can become top heavy. At the moment there are no very clear answers to these concerns, so guidelines and methods of best practice need to be agreed before all data can be truly linked.

JISCMRD Programme Progress Workshop / DCC Institutional Engagements Workshop, Wed 24-Thu 25 October, Nottingham

Last week we attended the JISCMRD Programme meeting, here in Nottingham. This was an opportunity for projects on the JISC Managing Research Data Programme 2011-13 and DCC Institutional Engagements and associate projects to meet up and discuss progress.

The event covered

  • Institutional RDM policies; developing an institutional strategy and an ‘EPSRC’ roadmap.
  • Managing active data: storage, access, academic dropbox services.
  • Data management planning: developing good practice and providing effective support.
  • Data repositories and storage: options for repository service solutions.
  • Training & guidance.
  • Triage and handover: what to keep and where to entrust it?  Selection and appraisal; deposit and handover.
  • Business case: covering roles, responsibility, costing, sustainability, advocacy etc.
  • Data catalogues: metadata profiles, identifiers.

In addition to presentations, most projects also brought a poster detailing their progress. We will tell you all about our poster next week.

It was very interesting to hear from various projects around the UK about how they are going about implementing data management plans and data repositories at their institutions.

Of particular interest to the JoRD project was PRIME.

PRIME, along with PREPARDE and ourselves, is part of the JISC Managing Research Data: Innovative Research Data Publication strand.

PRIME (Publisher, repository and institutional metadata exchange) will be looking into automated ways for publishers, repositories and institutions to share metadata about datasets.

An example use case would be that a researcher submits a data paper to a metajournal, which in turn shares the metadata on the dataset with subject and institutional repositories.
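To make that use case a little more concrete, here is a minimal sketch of a metajournal passing one dataset's metadata on to several repositories. It is not based on PRIME's actual design: the class names, the metadata fields and the `share_metadata` function are all assumptions made up for this illustration, and the identifier is a placeholder rather than a real one.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    # Hypothetical fields; a real exchange format would define its own.
    title: str
    creator: str
    identifier: str  # placeholder string, not a real DOI


@dataclass
class Repository:
    name: str
    records: list = field(default_factory=list)

    def receive(self, metadata):
        """Store a copy of the metadata record in this repository."""
        self.records.append(metadata)


def share_metadata(metadata, repositories):
    """Send the same record to every repository; return the names notified."""
    for repo in repositories:
        repo.receive(metadata)
    return [repo.name for repo in repositories]


# A researcher submits a data paper; the metajournal shares its metadata.
record = DatasetMetadata("Example survey data", "A. Researcher", "example-id-001")
subject_repo = Repository("subject repository")
institutional_repo = Repository("institutional repository")
notified = share_metadata(record, [subject_repo, institutional_repo])
print(notified)
```

The point of the sketch is only that the researcher describes the dataset once, and the same record then appears in both kinds of repository without being re-entered by hand.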

News from America, “U-M, Sloan Foundation to enhance open access to research data”

“Professional associations, journals, data repositories and funding agencies must work together to make the entire scientific venture more transparent and to encourage broader access to research data,” said ICPSR Director George Alter. “The first step is to give scientists who produce important research data the recognition they deserve.”

U-M, Sloan Foundation to enhance open access to research data

The University of Michigan’s Inter-university Consortium for Political and Social Research and the Alfred P. Sloan Foundation are working together to promote open access to research data and improve the link between published works and the background data.

In particular, the ICPSR will be working with stakeholders within the social sciences, to improve:

  • Data citation
  • Transparency of research
  • Collaboration across scientific fields to study sustainable funding models for data repositories

Our survey work with JoRD has indicated that social science journals are behind science journals in having policies on data sharing and archiving. This project has the potential to address this imbalance.

Inter-university Consortium for Political and Social Research

Alfred P. Sloan Foundation