Session Block 3 – Wednesday, June 3, 15:00-16:30
C1: Improving the User Experience
- Time: 15:00 - 16:30
- Location: Blegen Hall 135
- Chair: Jennifer Moore
- Track: Data Services Professional Development
The Evolving Best Practices of DDI in Canada
- Presenters: Jane Fry, Carleton University and Chantal Ripp (in absentia), Statistics Canada
- Abstract: Sharing of data has become the norm in academia, thus requiring an infrastructure to manage it. Because data are shared across different platforms, interoperability is critical to ensure continued collaboration and data sharing across disciplines and institutions. The Data Documentation Initiative (DDI) standard will be examined regarding its use in Ontario universities and the Microdata Access programs at Statistics Canada. The development and current state of Best Practices and tools to ensure collaboration among the different institutions marking up data will be reviewed, including how the shared infrastructure has reduced the cost of data staging and vastly improved data access not only to our local communities of users, but also to user communities nationally and internationally. Nonetheless, challenges still exist; these are discussed along with potential solutions. Finally, we will discuss the evolving state of best practices and suggest ways to move forward with the partnership of those responsible for tagging datasets.
Improving the visibility of data: a case study on an international birth cohort survey
- Presenter: Hersh Mann, UK Data Service
- Abstract: The Young Lives birth cohort study is an innovative project investigating the changing nature of childhood poverty in four developing countries. The data were released by the Economic and Social Data Service (ESDS) in 2006, and 10 researchers accessed them in the first year. In 2007 the depositors of the survey worked with the ESDS to implement a plan aimed at raising its visibility and increasing its use. The survey became an ESDS 'major study' with its own set of specialist support web pages and an immediate spike in usage followed: by the end of 2007, the study was attracting 40+ users per annum. The ESDS (now UK Data Service) has continued to build on the relationship with the data producers and the Young Lives data portfolio now includes a teaching dataset, as well as new waves of data. As of the end of 2014, users per annum stand at 225, with a substantial proportion of these coming from the countries being studied. This case study is just one example of how collaboration with the data producer can greatly enhance data visibility and use. It is an illustration of how data services can do much more than simply 'get, store, provide'.
The Makerere University Institutional Repository: Benefits, challenges and way forward
- Presenters: Eric Haumba (YMCA Comprehensive Institute, Kampala) and Sekikome Patrick (Makerere University)
- Abstract: Universities function as focal points for academic research in Africa. Egwunyenga (2008) has attributed this to the fact that research is compulsory for lecturers and postgraduate students by job description and mandatory academic requirement respectively. The nature of studies at Makerere University requires students to actively engage in research activities in partial fulfilment of the requirements of the degree being sought. For academic staff, the concept of "publish or perish" has come to secure their promotion within the academic environment. Consequently, it is expected that the volume of research output originating from the university addressing local problems in Uganda will continue to increase. The research outputs addressing issues endemic to the region should be given wide circulation so that the results can be applied in addressing the issues that they sought to tackle. Unfortunately, these outputs gather dust in departmental offices and on library shelves without getting published (Gideon, 2008). Subsequently, these findings die at the institutional level, as those in need of this knowledge cannot access it due to institutional and external challenges associated with the Institutional Repository. Hence the need to investigate the practical benefits and challenges and to propose strategies for improvement.
- Panel:
- Justin Joque (University of Michigan)
- David Pavelich (Duke University)
- Heather Tompkins (Carleton College)
- Margaret Pezalla-Granlund (Carleton College)
- Abstract: In the last 3-5 years data librarians, long accustomed to working primarily with social scientists and scientists, are increasingly called to work with people across the disciplines who are interested in using their data. This leads to many challenges, among them, discerning when a problem of access requires a technological, methodological, or cultural solution. Working across disciplinary boundaries also opens up new possibilities for engaging with data, by uncovering new uses for familiar data and by introducing new approaches of appreciating and critiquing our understanding of data and how we put it to use. The presenters on this multidisciplinary panel will speak from their experiences in this fertile zone where data science meets the arts and humanities. A digital humanities librarian, a special collections librarian, a visualization librarian, and a curator of library exhibitions will each talk about their experiences reaching across disciplinary practices to get at and connect with data. Their case studies will shed light on common questions and experiences regarding working with new partners, managing expectations around such work, and helping patrons find data in places they may have never thought to look before.
- Heather Tompkins
- Digital humanities often results in the production of rich collections of digital objects, metadata, and data, but digital humanists may not always see this digital output as data. This space between digital humanities and data services creates an occasion for librarians to expand conceptions of data on campus to include materials beyond quantitative information. This work takes on additional pedagogical significance when mentoring and teaching undergraduate research assistants who will support faculty projects. This presentation describes one approach to exploring this intersection between digital humanities and data services and raises questions about what DH can borrow from the tradition of data services in this area.
- David Pavelich
- For decades, archives and special collections libraries have been collecting data in diverse formats, sometimes purposely, sometimes incidentally. The content is equally, endlessly diverse, from diary reckonings of the value of slaves; to 19th century weather data; to unpublished financial data collected by twentieth century economists. Many such archival items (like ledgers) are passed over by researchers because of their complexities or inscrutability. However, these collections of under-explored data hold pedagogical potential for undergraduate (and even graduate) instruction. This paper offers a way for special collections librarians and data librarians to work together to teach students about using primary sources from two very different perspectives within the research library.
- Margaret Pezalla-Granlund
- Many artists are interested in the way information is represented, and explore techniques of visualizing data through their artwork. Some of the most interesting artwork about data gets to questions about how we read data, how it is understood (and misunderstood), and the possibility of uncertainty. Is there an art behind data? Can a graph be expressive? What can artists tell us about how we look at numbers? For this session, I will choose three key artist’s books to use as case studies to explore the ways in which artists visualize and interpret data.
- Justin Joque
- From text mining projects to the creation of interactive websites, humanists are turning towards data as a way to understand and augment their research. Offering data visualization and mapping support as part of our Spatial and Numeric Data Services, we often assist on substantial portions of these projects. Especially as various sources and types of textual data, including those with interesting topological features such as link networks for websites, become available and methodologies for processing large corpora develop, humanists are increasingly using and thinking critically about data. The vast amounts of data that can be computationally processed are pushing the boundaries of what reading and analyzing textual information means in the humanities. This presentation will explore some of the interesting uses of data in the humanities we have developed and supported at the University of Michigan Library and the ways in which humanists along with data librarians are thinking about data and its relation to the humanities.
C3: Data Sharing Behavior and Policy
- Time: 15:00 - 16:30
- Location: Blegen Hall 130
- Chair: Melanie Wright
- Track: Research Data Management
"The road to data sharing is paved with good intentions": Looking at UK and German University Research Data Policies
- Presenters: Laurence Horton (London School of Economics and Political Science) and Astrid Recker (GESIS - Leibniz Institute for the Social Sciences)
- Abstract: As of late 2014, 20 percent of UK Higher Education Institutions (HEIs) have adopted a Research Data Policy. In contrast, only one percent of German HEIs have adopted one. We examine policies in the context of national funder requirement differences and the overall research funding landscape. Whereas recommendations exist on what should go into a policy, there is no analysis on what is going into policies. This presentation compares the content of policies from both countries for similarities and differences to see if -- regardless of the differences in the environments -- a standard form and language is emerging. The presentation will illustrate the adoption of two distinct approaches. The first is a 'general principles' approach. This policy is short, strong on the normative values for data re-use and preservation, and general goals, but weak on policy detail and enforcement mechanisms. The other approach is a formal "legalistic" style; it's longer, specific in requirements, strong in definitions, but not necessarily clear in direction or easy for researchers to work with. Policies are tested for type of university (research intensive vs non-research intensive institutions) and age (university cohort). The results of this research fed into LSE's own draft research data policy.
Data sharing Practices across the Social Science Disciplines
- Presenter: Amy Pienta, ICPSR, University of Michigan
- Abstract: Data sharing has become an increasingly important issue facing scientists in recent years. Understanding what kinds of factors affect data sharing behavior remains an important goal in informing those setting data sharing policy. The present analysis examines survey data ICPSR collected from social scientists in the United States who collected primary research data under funding from the National Science Foundation or the National Institutes of Health. Building on our prior work, here we examine whether certain social science disciplines embraced data sharing more than others early on. Results from multivariate regression models suggest political scientists and economists are most likely to share their data and psychologists and health scientists are the least likely. Implications for discipline-specific policies are discussed.
Data sharing behavior: a social psychology approach
- Presenter: Alexia Katsanidou, GESIS - Leibniz Institute for the Social Sciences
- Abstract: Previous work on journal data sharing focused on the relation between data policies and research data availability (Ghergina and Katsanidou 2013 and Zenk-Möltgen and Lepthien 2014). A clear gap in the literature is the lack of analysis of individual researchers' intrinsic motivation for data sharing. Social psychology offers the analytical framework that allows us to investigate how personal beliefs can shape intentions of individuals and how these intentions influence their behavior. Based on the theory of planned behavior by Ajzen and Fishbein, which emphasizes the impact of peer group, this paper sets out to explain data sharing behavior by authors in political science and sociology journals. A set of authors of publications from pre-selected ISI indexed journals will be the sample for a survey conducted to explore the authors' personal beliefs, intention and behavior regarding sharing the data their analysis is based upon. We hope to shed some light on a previously obscure component of data sharing behavior.
Developing a Repository Lifecycle Model and Metadata Description: Modeling and Describing Changes
- Presenter: Juliane Schneider, Research Data Curation Program, UC San Diego
- Abstract: In the past decade a wealth of data repositories and open datasets across all disciplines have been created. Registries of repositories have also been established, mostly by discipline (medical, social sciences) or by ownership (academic, governmental). We have reached a point where a lifecycle model should be constructed for these resources, as well as a set of agreed-upon metadata to describe them. We will present our repository lifecycle model, and propose the most likely existing metadata schemas for constructing an overall description for repositories.
Research Data Repositories: Review of current features, gap analysis, and recommendations for minimum requirements
- Presenter: Amber Leahey (Co-authors: Nancy Fong, Claire Austin, Peter Webster, and the rest of the RDC SINC Committee.), Scholars Portal, Ontario Council of University Libraries
- Abstract: Scientific reproducibility and data sharing are increasingly recognized as integral to scientific research and publishing, to ensure new knowledge discovery. This goes far beyond making data publicly available. It requires informed and thoughtful preparation from initial research planning to collection of data/metadata, considerations of interoperability, and publication in curated repositories. Research Data Canada (RDC) is a collaborative, non-government organization interested in access to and preservation of Canadian research data. The RDC Standards and Interoperability Committee (RDC-SINC) assessed 30 Canadian and international research data repositories for data transfer, storage, curation, preservation, and access. We identified data submission requirements, standards, features and functionality implemented by the repositories, and performed a gap analysis. Results are discussed in light of current and evolving needs. Recommendations are made for minimum research data and repository requirements. Terminology used complies with RDC's new glossary of research data "Terms & Definitions". This paper provides a practical multi-disciplinary compendium of core research data submission and repository requirements currently in use. Given this rapidly developing field, the paper will be updated just prior to submission.
Re-Shaping the Landscape of Research Data Repositories
- Presenter: Louise Bolger, UK Data Service
- Abstract: ReShare is the UK Data Service's online data repository for archiving and sharing research data, produced by researchers, primarily for ESRC grant holders. It is designed for 'short-term management' whereby researchers self-deposit data and prepare data files themselves. However, ReShare's metadata profile and discovery system are fully integrated in UKDS. We optimise linkages with other systems to maximise standardisation, and minimise the 'burden' on depositors of completing a metadata record. Once depositors complete their project record, ReShare administrators conduct reviews to check for disclosure risks, and quality of documentation. Currently, there are 595 collections in ReShare, 500 of which were migrated from the ESRC Data Store, which ReShare replaced. To incentivise and reward depositors who provide complete, well-constructed project records containing data and robust supporting documentation, we are introducing a quality mark on projects which meet the criteria. This paper discusses the criteria for this quality mark in more depth, reflects upon the common issues faced in the review process of ReShare, and provides recommendations on how depositors can avoid making the common errors seen in ReShare. Finally, an overview of ReShare is provided, covering topics such as ReShare's purpose, functionality, and a reflection on a year of ReShare.
- Abstract: The acceptance and adoption of a standard like DDI depends heavily on the availability of software tools to use it. The DDI Developers Community is a part of the DDI Alliance where software developers from around the world can meet and swap ideas on working with DDI in various programming environments and languages. In this session we would like to give you an introduction to our work and present a selection of the tools available from the community. Most of the presenters will be available during the subsequent poster session for detailed questions or further demonstrations of their tools.
- Presenters:
- Adrian Dușa: Web-based solutions for data archiving and dissemination using DDI
- Johan Fihn: TERESAH - Authoritative Knowledge Registry for Researchers
- Olof Olsson: Building a community platform for DDI Moving Forward
- Olof Olsson and Johan Fihn: Exposing your metadata via eXist
- Ørnulf Risnes: Nesstar Publisher, Server and the Nesstar APIs
- Ingo Barkow: Data Management Module
- Jeremy Williams: Metadata Management and Dissemination with CED2AR
- Dan Smith: Colectica Designer, Repository, Portal and SDK version 5
- Metadata Technology North America
- Marcel Hebing: DDI on Rails