Going back to basics – reusing data

It is almost a year since the first set of data was gathered to analyse journal articles, and now the benefits of saving data well are becoming apparent. Two things are happening that mean we are getting the basic figures out, dusting them off and looking at them again. The first is a paper about the development of a model journal research data policy, which is being co-authored by the JoRD team members, and the second is a response to questions that various people are asking.

The idea of creating a model policy emerged from the mass of data found during the analytical process. It was based on what journals were already doing and on suggestions from the report “Sharing Publication-related Data and Materials: Responsibilities of Authorship in the Life Sciences” (Committee on Responsibilities of Authorship in the Biological Sciences, 2003, http://www.nap.edu/openbook.php?isbn=0309088593). The report was the outcome of a workshop in the United States which involved biological scientists. The five principles and ten recommendations stated in the report were strongly in favour of open access to the data that underpins the research reported in published articles. A summary of the principles and recommendations can be found here: http://www.councilscienceeditors.org/files/scienceeditor/v26n6p192-193.pdf. The report suggested that the data could either be included in the article, or deposited in a reputable repository and linked to the article. The first model data policy was therefore based on the rather patchy and inconsistent set of policies that were found, from fewer than half the journals we analysed, and on a report which was biased towards one scientific discipline. It was decided to compare the initial model data policy with the needs of the stakeholders, which were examined at a later stage in the JoRD project. This has entailed not only going over the data gathered from the stakeholder interviews and questionnaire, but also digging retrospectively into the reasons why the initial model criteria were chosen.

The second reason for examining the basic data has come from interesting questions asked by a number of bodies that know about the JoRD project, and therefore assume that the JoRD team are experts in the field of Journal research data policies, an assumption that is becoming increasingly true as more questions are answered. In order for the questions to be answered, the data needed to be looked at from a different perspective. For example, to answer “How many journals make sharing a requirement of publication?” the original data set was re-examined and journals counted, because the original analysis was looking at the number of policies, some journals having up to three different data policies. Here follows a table with figures from a journal perspective:

Results of Journal Survey
Total no. of Journals surveyed: 371
Total no. of Journals with data sharing policies: 162
Total no. of Journals that make sharing a requirement of publication: 31
Total no. of Journals that enforce the policies: 27
Total no. of Journals that state consequences for non-compliance: 7
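
The switch from counting policies to counting journals can be sketched as follows. This is only an illustration of the idea, not the procedure the JoRD team actually used: the record layout, the field names (journal, requires_sharing) and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical policy-level records: one row per policy. A journal can
# appear more than once because it may hold up to three data policies.
policies = [
    {"journal": "Journal A", "requires_sharing": True},
    {"journal": "Journal A", "requires_sharing": False},
    {"journal": "Journal B", "requires_sharing": False},
    {"journal": "Journal C", "requires_sharing": True},
]


def count_by_journal(rows):
    """Collapse policy-level rows into journal-level counts."""
    by_journal = defaultdict(list)
    for row in rows:
        by_journal[row["journal"]].append(row["requires_sharing"])
    return {
        "policies": len(rows),
        "journals_with_policy": len(by_journal),
        # A journal is counted once if any of its policies requires sharing.
        "journals_requiring_sharing": sum(
            any(flags) for flags in by_journal.values()
        ),
    }


summary = count_by_journal(policies)
print(summary)
```

Grouping by journal before counting is what stops a journal with several policies from being counted several times.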

This process is an illustration of the way that well organised data, saved safely and, as in this case, in digital form, can be re-used after a particular project has ended. It is generally after research has been concluded that questions arise, and the iterative process of dipping in and out of data to validate or extend the research then begins. The moral of this blog post? Manage your data well, because you never know what you will be asked.

Another week, another presentation

Early this morning, well before normal work time, the dedicated Centre for Research Communication employees, Marianne and Jane, entered the special media communication room which contains the video conferencing equipment so that they could jointly present “Publisher Interest towards a role for Journals in Data Sharing: The Findings of the JoRD Project”. In the true spirit of global access and the digital world, they presented in Nottingham, UK and the presentation was seen at the ELPUB conference in Karlskrona, Sweden. We are pleased to report that the Nottingham technology worked really well, but a fellow presenter, also speaking through Adobe Connect, had difficulties with her connection and transmitted the sound of a large aircraft which was passing over the room where she was speaking. Jane and Marianne had chosen the high-tech route because a tram line and bridge are currently being noisily constructed outside their office window, and had they decided to present from their computer, there would have been the sound of heavy machinery moving, beeps and rumbles, drilling and clangs.

Here is the link for the power-point slides:

JoRDELPUB

A rather long post, but quite a brief summary

Here is a summary of the project so far.

Sharing the data which is generated by research projects is increasingly being recognised as an academic priority by funders, researchers and publishers. The issue of the policies on sharing set out by academic journals has been raised by scientific organisations, such as the US National Academy of Sciences, which urges journals to make clear statements of their sharing policies. On the other hand, the publishing community expresses concerns over the intellectual property implications of archiving shared data, whilst broadly supporting the principle of open and accessible research data.

The JoRD Project was a feasibility study on the possible shape of a central service on journal research data policies, funded by the UK JISC under its Managing Research Data Programme. It was carried out by the Centre for Research Communications at Nottingham University (UK) with contributions from the Research Information Network and Mark Ware Consulting Ltd. The project used a mix of methods to examine the scope and form of a sustainable, international service that would collate and summarise journal policies on research data for the use of researchers, managers of research data and other stakeholders. The purpose of the service would be to provide a ready reference source of easily accessible, standardised, accurate and clear guidance and information on the journal policy landscape relating to research data. The specific objectives of the study were: to identify the current state of journal data sharing policies; to investigate the views and practices of stakeholders; to develop an overall view of stakeholder requirements and possible service specifications; to explore the market base for a JoRD Policy Bank Service; and to investigate and recommend sustainable business models for the development of a JoRD Policy Bank Service.

A review of relevant literature showed evidence that scientific institutions are attempting to draw attention to the importance of journal data policies and a sense that the scientific community in general is in favour of the concept of data sharing.  At the same time it seems to be the case that more needs to be done to convince the publishing world of the need for greater consistency in data policy and author guidelines, particularly on vital questions such as when and where authors should deposit data for sharing.

The study of journal policies which currently exist found that a large percentage of journals do not have a policy on data sharing, and that there are great inconsistencies between journal data sharing policies. Whilst some journals offered little guidance to authors, others stipulated specific compliance mechanisms. A valuable distinction is made in some policies between two categories of data: integral, which directly supports the arguments and conclusions of the article, and supplementary, which enhances the article but is not essential to its argument. What we considered to be the most significant study on journal policies (Piwowar & Chapman, 2008) defined journal data sharing policies as “strong”, “weak” or “non-existent”. A strong policy mandates the deposit of data as a condition of publication, whereas a weak policy merely requests the deposit of data. The indication from previous studies that researchers’ data sharing behaviour is similarly inconsistent was confirmed by our online survey. However, there is general assent to the data sharing concept, and many researchers would be prepared to submit data for sharing along with the articles they submit to journals.

We then investigated a substantial sample of journal policies to establish our own picture of the policy landscape. A selection of 400 international and national journals were purposefully chosen to represent the top 200 most cited journals (high impact journals), and the bottom 200 least cited (low impact journals), equally shared between Science and Social Science, based on the Thomson Reuters citation index.  Each policy we identified relating to these journals was broken into different aspects such as: what, when and where to deposit data; accessibility of data; types of data; monitoring data compliance and consequences of non compliance. These were then systematically entered onto a matrix for comparison. Where no policy was found, this was indicated on the matrix. Policies were categorised as either being “weak”, only requesting that data is shared, or “strong”, stipulating that data must be shared.
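
The comparison matrix and the weak/strong categorisation described above could be represented along the following lines. This is a hedged sketch rather than the project's real matrix: the PolicyRow fields, the "none" label for journals without a policy, and the sample rows are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class PolicyRow:
    """One hypothetical row of the comparison matrix."""
    journal: str
    impact: str                       # "high" or "low"
    mandates_sharing: Optional[bool]  # None when no policy was found


def categorise(row: PolicyRow) -> str:
    """Label a row "strong", "weak", or "none" (no policy found)."""
    if row.mandates_sharing is None:
        return "none"
    return "strong" if row.mandates_sharing else "weak"


# Made-up sample rows, not figures from the study.
matrix = [
    PolicyRow("Journal A", "high", True),
    PolicyRow("Journal B", "high", False),
    PolicyRow("Journal C", "low", False),
    PolicyRow("Journal D", "low", False),
    PolicyRow("Journal E", "low", None),
    PolicyRow("Journal F", "low", None),
]

tally = Counter(categorise(row) for row in matrix)
found = tally["strong"] + tally["weak"]
strong_share = tally["strong"] / found  # share of strong among policies found
print(tally, strong_share)
```

Keeping "no policy" as its own value, separate from "weak", means the share of strong policies is calculated only over the policies that were actually found.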

Approximately half the journals examined had no data sharing policy. Of the policies we found, we assessed just over three quarters as weak and just under one quarter as strong (76%:24%). The high impact journals were found to have the strongest policies, whereas fewer low impact journals included a data sharing policy, and those policies were less likely to stipulate data sharing, merely suggesting that it may be done. The policies generally give little guidance on the stage of the publishing process at which data is expected to be shared.

Throughout the duration of the project, representatives from publishing and other stakeholders were consulted in different ways. Representatives of publishing were selected from a cross section of different types of publishing house; the researchers we consulted were self selected through open invitations by way of the JoRD Blog. Nine of them attended a focus group and 70 answered an online survey. They were drawn from every academic discipline and ranged over a total of 36 different subject areas. During the later phases of the study, a selection of representatives of stakeholder organisations was asked to explore the potential of the proposed JoRD service and to comment on possible business models. These included publishers, librarians, representatives of data centres or repositories, and other interested individuals. This aspect of the investigation included a workshop session with representatives of leading journal publishers in order to assess the potential for funding a JoRD Policy Bank service. Subsequently an analysis of comparator services and organisations was performed, using interviews and desk research.

Our conclusion from the various aspects of the investigation was that although the idea of making scientific data openly accessible for sharing is widely accepted in the scientific community, the practice confronts serious obstacles. The most immediate of these obstacles is the lack of a consolidated infrastructure for the easy sharing of data. In consequence, researchers quite simply do not know how to share their data. At the present juncture, when policies are either not available or provide inadequate guidance, researchers acknowledge a need for the kind of information that a policy bank would supply. The market base for a JoRD policy bank service would be the research community, and researchers did indicate they believed such a service would be used.

Four levels of possible business models for a JoRD service were identified and finally these were put to a range of stakeholders. These stakeholders found it hard to identify a clear cut option of service level that would be self sustaining. The funding models of similar services and organisations were also investigated. In consequence, an exploratory two phase implementation of a service is suggested. The first phase would comprise the development of a database of data sharing policies, engagement with stakeholders, and third party API development, with the intention of building use to the level at which a second phase, a self sustaining model, would be possible.

Librarians, data and JoRD

So far this blog has commented on what researchers think and what publishers and journals are currently doing. The final part of the stakeholder consultation comprised interviews with academic librarians, which explored their thoughts on open access research data, the role of librarians in working with open data, and a JoRD policy bank service. The librarians agreed with the views of the other stakeholders that wider access to research data is beneficial. However, they showed a deeper understanding of the infrastructure required to store and access data and considered the problem of selecting which data should be preserved. In their experience, institutional practice is not advancing in line with policies, and, as information specialists, librarians considered that they have the skills necessary to improve the situation.

Librarians anticipated that their expertise could be used for the following roles:

  •   Meta-data management and structure of data
  •   Data licensing
  •   Inclusion of data in institutional repositories
  •   Data management advice and training
  •   Co-ordination with other university support departments, for example, IT, record management and research office.
  •   Enabling compliance

Librarians were also positive about the concept of a JoRD Policy Bank service, but considered that it would be a useful addition to some existing services, for example RoMEO or JISC Collections Knowledge Base+, thereby creating a single point of reference for broad advice on data management and publication. As with the views of other stakeholders, librarians considered that one function of a JoRD service would be to compare journal policies with funders’ requirements, but also suggested that some co-funded projects would need guidance should the funders’ policies differ. They also suggested that JoRD should rate journal policies on aspects such as usability and access of data.

Data comes in all sorts of shapes and sizes

The JoRD project has not set out to define the term “data” (or the singular form of the word, “datum”). This was a fortunate choice, because one of the messages that has clearly come across from all the participants of our study is that data can take many forms. The recent Royal Society Report, “Science as an Open Enterprise”, (http://royalsociety.org/uploadedFiles/Royal_Society_Content/policy/projects/sape/2012-06-20-SAOE.pdf) includes a glossary of data terms which illustrates the ways in which the term “data” can be used. For example:

  • big data – data that requires massive computing power to process
  • broad data – structured big data
  • data set – a collection of  information held in electronic form
  • linked data – data that has been allocated a unique identifying number to be able to access it from an electronic storage facility

… and those are just a few terms that it explains. The word “Data” is defined as “Qualitative or quantitative statement or numbers that are (or assumed to be) factual”. The researchers that were part of this study considered that their data took more forms than just statements or numbers.

Researchers described the data that their research generated as: software, video footage, geodata, geological maps, ontologies, web services and data models, as can be seen in the table below. The multitude of forms makes it difficult for publishers to include data in their on-line published articles. The publishers said that linked data in a journal article should be “fit for use” and “replicable”, and they consider that data in many different formats is “Messy” and is currently not supplied with sufficient meta-data. Another consideration is the resulting file size of an article if the publisher saves the embedded data on their own servers. Data repositories and data centres are the more practical method of data storage, with published articles incorporating linked data.

That is therefore one reason for Journals to have a data policy, and a good argument for those policies to be collected and made accessible in a centralised resource, a JoRD Policy Bank Service.

Researchers’ descriptions of data (table columns: Qualitative (documents and text); Quantitative (figures); Visual data (images); Virtual data (software or protocols))
Collection of examiner reports and questions, supervisory reports, letters and other documentary evidence
Dataset of measurements and statistical analyses
Digitised Textual Sources
Excavation, field observation, environmental monitoring, software to collate, mine and analyse
Excel sheets
Focus Group, Interview Transcripts, some footage of people using computers, digital photographs
Geodata
Geologic maps, chemical and isotopic analyses of Earth Materials, GIS datasets
Interview transcripts
Ontologies
Reports
Visualization
Web Services, Data Models and Specifications

Summary of workshop, discussion about the nature of JoRD

Here is another summary of the concluding discussion that took place at the workshop on 13th November. This is about the expectations and perceptions of publishers concerning the nature of the JoRD Data Bank service.

A prominent consideration of the publishers was that JoRD should be an authoritative resource, such that a JoRD compliance stamp, or quality mark, could be displayed on journals’ websites. There was discussion that for JoRD to be authoritative, the content of the database should be added, updated and maintained by the JoRD team. It was mentioned that publishers might initially populate the database, but ongoing maintenance would be the responsibility of JoRD. However, there should be a guarantee that the content is accurate, and publishers would need to commit to providing policies that are machine readable so that they can be automatically harvested.

It was suggested that the operational database should not be merely a static catalogue or encyclopaedia. It was requested that the non-compliance of a journal with a data sharing policy, or with a funder’s policy, could be flagged and reported to the publisher, although it was queried whether that was the remit of the service or of the publishers themselves. Similarly, it was questioned whether the service would mediate user complaints, and it was proposed that it would engage with complaints concerning policies only. To maintain functionality, could there be automatic URL checking which would send an alert to the publisher if links were broken? Updates on policy changes would also be a useful function.
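
The automatic URL checking raised in the discussion might work roughly as sketched below. Nothing here describes an agreed JoRD design: the record fields (publisher, policy_url), the function names, and the choice to treat any status of 400 or above as broken are illustrative assumptions, and the step of actually alerting the publisher is left out.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (OSError, ValueError):
        # Network failure, timeout, or a malformed URL.
        return None


def find_broken_links(records, fetch=http_status):
    """Return (publisher, url) pairs whose policy link appears broken."""
    broken = []
    for record in records:
        status = fetch(record["policy_url"])
        if status is None or status >= 400:
            broken.append((record["publisher"], record["policy_url"]))
    return broken
```

The fetch parameter lets the check be tried against canned responses without touching the network; a scheduled job could pass the resulting list to whatever alerting mechanism the service adopts.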

The service website should include a model data policy framework or an example of a standard data policy, and offer guidance and advice to journals and funders about policy development. However, the processing and ratification of a model policy could be a time consuming process for some publishers. It was asked whether repository policies would also be included, and there was mention of compliance with the OpenAIRE European repository network. The website should also contain:

  • Links to the publishers’ web pages
  • Dates of the records
  • Lists of links to repositories
  • Set of criteria for data hosting repository

It should look inviting, but businesslike and be simple and clear, but be sufficiently detailed.

Methods of funding the service and the benefits of membership were considered. For example, would only the policies of members of the service be entered into the database? Would there be different levels of membership or different service options that publishers could choose? And would there be extra costs for extra services? One such service could be to hold historical and persistent records of former policies. The publishers said that they would be prepared to pay for a service that is transparent and would save them time.

Other comments included:

  • Would the service be a member of the World Data System?
  • Could it be released in Beta?
  • There are around 400-600 titles to enter initially
  • When set up the service could be studied to discover its effectiveness and impact
  • Further consultation may be needed

Very brief summary of JoRD workshop

On Tuesday 13th November some of the JoRD team met with representatives of several well known journal publishers for a workshop session to discuss a number of points concerning the potential JoRD data bank service. This is a very potted summary of the discussions that took place. If any of the attendees are reading this and feel that their comments have not been correctly interpreted, then please comment to correct any misunderstandings.

Preservation of and sustained access to published supplementary material: The current situation
The group perceived that at present there are a variety of issues that impede the maintenance of data added to an on-line journal as supplementary material, or even the practice of including data within an article. The areas where difficulties lie include:
• Technology
• Data repositories
• Embargoes
• Peer review
• Licensing
• Copyright
Unstable URLs, PDF formats and usable forms of preserved data present technological problems that need to be solved to ensure that data can be accessed in the long term. However, transferring data to new formats has fewer difficulties. Data may be linked to external repositories, but they present a problem because they each have different policies and practices. Embargoes placed on data release complicate matters, as there is no standard for their length. To overcome these issues, an alternative solution would be not to include the data file with the article but to add information about how it can be obtained directly from the researcher. However, on-line journals will be upgrading to enriched HTML and should therefore commit to include data.

The group were concerned about the peer review of data, which is currently “Ad Hoc”. It was queried whether peer reviewers have time to examine data alongside judging arguments, and it was suggested that data could instead be reviewed by the research community. Currently publishers’ practices concerning licensing and copyrighting of data as supplementary material vary greatly. However, EU legislation does not allow data to be copyrighted. Authors could be offered choices of licensing, and work is being done to define data and on forms of data citation. Publishers do, however, feel a duty of care to the knowledge that they publish.

About data repositories: Advantages and disadvantages
Ideally, publishers would like repositories to be a searchable archive that manages data and collects retrospectively, such as the library of Columbia University gathering data for PLOS.

Advantages

  • The situation for publishers would be made simpler should data be held in external repositories
  • Technically more able to deal with digital data
  • Guidelines about re-depositing data if closed
  • Institutional repositories could manage data then aggregate it as in Australia

Disadvantages

  • May want to take over from publishers
  •  Not currently ready for influx of data
  • Funding may not be sustained
  • Discovery issues

Solutions to any of the issues posed above are not given in this post, but there is opportunity for you to comment. The remainder of the discussion focused on the structuring and content of a JoRD Policy Bank service, which will be summarised in the next post.

Online survey results part two

The second set of questions in the online survey asked for the opinions of researchers about data sharing and the usefulness of a data policy bank service. They are as follows:

  • Where do you access or locate the research output of other researchers?
  • In your opinion what are the key drivers behind increasing access to research data?
  • In your opinion what are the main problems associated with sharing research data?
  • What do you think about linking a publication with digital data that are integral to its main conclusions?
  • What do you think about linking an article with supplementary material that enhances the article?
  • Do you think that journals should provide digital data sharing policies?
  • Do you think there would be benefits in having a service offering information about journal research data policies?
  • Would you use a service of this kind?
  • What information should be included in a policy bank service?
  • Do you have any other comments?

Most of the respondents locate other researchers’ data through colleagues or in their own institution or organisation, and feel that the four most important key drivers to increasing access to data are:

  • Openness
  • Accountability
  • Increased access to data
  • Increased efficiency of research resources

The most frequently expressed concern is the attribution of intellectual property rights to the data being shared. The next most frequently expressed issue is that the current models and mindsets of institutions and some individuals create barriers to sharing data. However, just over one-third of respondents (35%) consider that linking digital data that are integral to the main conclusions of articles published in online journals would be useful and should be mandatory.

Linking articles to supplementary data to enhance the article was considered useful by more respondents (43%) but it would also depend on the context of the data shared. Over 74% of researchers considered that journals should provide data sharing policies and a similar percentage (73%) thought that such a service would be of benefit, because it would be a central resource. Nearly 80% of respondents said that they would use such a service, either to gather data, or as a means of selecting where to publish their work. Many ideas of what to include in a policy data bank were suggested, which included:

  • Clarity and simplicity of use
  • Archiving URLs
  • Guidelines
  • Usage licences (eg Creative Commons)

Eight researchers commented that they considered the initiative important.

The fewest respondents said that they gather other research data from their own blog, or from hard copy data sets. The concerns expressed about sharing data were those of trust, confidentiality and the need to overcome existing mindsets and institutional barriers. A small number of researchers felt that sharing data would affect the future of research and that before sharing data certain conditions would have to be fulfilled. A very low number of people (3%) said that linking data to main conclusions was not useful and was unnecessary; that they would only be interested in a published article, not in any additional material; and that journals should not provide data sharing policies. One researcher commented that further research about the topic with a trial would help their decision as to whether published data sharing policies would be of personal benefit.

Three percent of respondents thought that there would be no benefit to a data policy bank service, because it is not needed, not feasible or there would be conflicting journal ethos. Twenty one percent considered that they would not use such a service because they did not find it relevant and one researcher stated that they would prefer to deal directly with the journal.

On balance, it appears that more respondents are pro-data sharing, have positive opinions about the JoRD policy bank service and would find it useful, than respondents who feel that there is no need or use for such a service.

Literature Review – Articles Relevant to the Field

This bibliography of useful literature has been sitting in the draft section for some months, but as our study has now finished, and the feasibility study report is in the hands of Jisc, we are practising what we preach and passing on our information to others who may be interested in this area. I am sorry, but it is a rather long list and looks tedious and boring.

More data will follow in the next few weeks.

LITERATURE REVIEW

An early paper on journal policies.

McCain, K. (1995) Mandating sharing: journal policies in the natural sciences. Science Communication 16, 403-431.

Baseline paper on journal policies (and examples of the other work of Piwowar and Chapman on data sharing).

Piwowar, H. and Chapman, W. (2008)  A review of journal policies for sharing research data   In: Open Scholarship: Authority, Community, and Sustainability in the Age of Web 2.0 – Proceedings of the 12th International Conference on Electronic Publishing (ELPUB) June 25-27 2008, Toronto Canada. Available at http://ocs.library.utronto.ca/index.php/Elpub/2008/paper/view/684

Piwowar, H. and Chapman, W. (2008) Identifying data sharing in biomedical literature. AMIA Annual Symposium Proceedings, 596-600. Available at http://www.ncbi.nih.gov/pmc/articles/PM2655927

Piwowar, H. and Chapman, W. (2010) Public sharing of research datasets: a pilot study of associations. Journal of Informetrics 4(2) 148-156. Available at http://www.sciencedirect.com/science/article/pii/S1751157709000881

Piwowar, H. and Chapman, W. (2010) Recall and bias of retrieving gene expression microarray datasets through PubMed identifiers. Journal of Biomedical Discovery and Collaboration 5, 7-20. Available at http://www.ncbi.nih.gov/pmc/articles/PMC2990274

Piwowar, H. (2011) Who shares? Who doesn’t? Factors associated with openly archiving raw research data. PLoS ONE 6(7) e18657. Available at http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0018657

Most recent work on best practice for scholarly publishing.

Schriger, D. et al (2006) The content of medical journal instructions for authors. Annals of Emergency Medicine 48(6), 742-749.

Looked at 166 journals and found contradictory policies and little guidance on methodological and statistical issues.

Smit, E. and Gruttemeier, H. (2011) Are scholarly publications ready for the data era? Suggestions for best practice guidelines and common standards for the integration of data and publications. New Review of Information Networking 16(1) 54-70.

Smit, E. (2011) Abelard and Heloise: why data and publications belong together. D-Lib Magazine 17(1-2). Available at http://www.dlib.org/dlib/january11/smit/01smit

Recent broad explorations of the issues.

Schriger, D. et al (2006) From submission to publication: a retrospective review of the tables and figures in a cohort of randomised controlled trials submitted to the British Medical Journal. Annals of Emergency Medicine 48(6) 750-756.

Carpenter, T. (2009) Journal article supplementary materials: a Pandora’s box of issues needing best practices. Against the Grain 21(6) 84-85.

Neylon, C. (2009) Scientists lead the push for open data sharing. Research Information 41, 22-23.

Hodson, S. (2009) Data-sharing culture has changed. Research Information 45, p.12.

Fisher, J. and Fortmann, L. (2010) Governing the data commons: policy, practice and the advancement of science. Information and Management 47(4) 237-245.

Bizer, C., Heath, T. and Berners-Lee, T ( ? ) Linked data – the story so far. International Journal on Semantic web and Information Systems. Special Issue on Linked Data. Available at http://linkeddata.org/docs/ijswis-special-issue

Hrynaszkiewicz, I. (2011). The need and drive for open data in biomedical publishing. Serials 24(1) 31-37.

Bechhofer, S. et al (2011) Why linked data is not enough for scientists. Future Generation Computer Systems (forthcoming as of Aug 2011)

Kauppinen, T. and Espindola, G. (2011) Linked open science – communicating, sharing and evaluating data, methods and results for executable papers. Procedia Computer Science 4, 726-731.

LOS has 4 ‘silver bullets’ 1. Publication of data using Linked Data principles 2. Open source and need-based environments, 3. Cloud computing use, 4. Creative commons.

Parsons, M. (2011) Expert Report on Data Policy – Open Access. Available at http://151.1.219.218/57883ed7-88bc-4e6f-92ed-3af6e96600be.pdf.

Tenopir, Carol, Suzie Allard, Kimberly Douglass, Arsev Umur Aydinoglu, Lei Wu, Eleanor Read, Maribeth Manoff, and Mike Frame. “Data Sharing by Scientists: Practices and Perceptions.” PLoS ONE 6, no. 6 (2011): e21101. http://www.plosone.org/article/info:doi/10.1371/journal.pone.0021101

Borgman, C. (2012) The conundrum of sharing research data. Journal of the American Society for Information Science and Technology, 63(6) 1059-1078.

Selected specific studies on aspects of data archiving and sharing.

Hrynaszkiewicz, I. and Altman, D. (2009). Towards agreement on best practice for publishing raw clinical trial data. Trials 10(17) 1-5. Available at http://www.biomedcentral.com/content/pdf/1745-6215-10-17.pdf

Groves, T. (2009) Managing UK research data for future use. BMJ 338 b1252. Available at http://www.bmj.com/content/338/bmj.b1252.Full?tab=response-form/

De Roure, D. et al. (2009) Towards open science: the myexperiment approach. Concurrency and Computation: Practice and Experience (submitted 2009). Available at http://eprints.soton.ac.uk/267270/

Colin Elman, Diana Kapiszewski and Lorena Vinuela (2010). Qualitative Data Archiving: Rewards and Challenges. PS: Political Science & Politics, 43 , pp 23-27 doi:10.1017/S104909651099077X

Moore, R. and Anderson, W. (2010) ASIS&T Research Data Access and Preservation Summit: conference summary. Bulletin of the American Society for Information Science and Technology 36(6) 42-45.

Pienta, A. et al (2010) The enduring value of social science research: the use and reuse of primary research data. In: The Organisation, Economics and Policy of Scientific Research Workshop, Torino, Italy, April 2010. Available at http://www.carloalberto.org/files/brick_dime_strike_workshopagenda_april2010/.pdf

Eschenfelder, K. and Johnson, A. (2011) The limits of sharing: controlled data collections. Proceedings of the American Society for Information Science & Technology 48(1) 1-10.

Neveol, A. et al (2011) Extraction of data deposition statements from the literature: a method for automatically tracking research results. Bioinformatics 27(23) 3306-3312.

Ingwersen, P. and Chavan, V. (2011) Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure. BMC Bioinformatics 12(S3).

Korjonen, M. (2012) Clinical trial information: developing an effective model of dissemination and a framework to improve transparency. UCL PhD thesis. Available at http://discovery.ucl.ac.uk/1344051/

Bibliography

Bailey, C. (2012) Research Data Curation Bibliography. Houston: Digital Scholarship. Available at http://digital-scholarship.org/rdcb/rdcb.htm

Approaches the question from a library/archive perspective.