Session Block 6 – Friday, June 5, 9:00-10:30
F1: Training Data Users II
- Time: 9:00 - 10:30
- Location: Blegen Hall 150
- Chair: Mandy Swygart-Hobaugh
- Track: Data Services Professional Development
Training for de-identifying human subjects data for sharing: a viable library service
- Presenters: David Fearon and Jennifer Darragh, Johns Hopkins University Sheridan Libraries
- Abstract: Since 2011, Johns Hopkins Data Management Services (DMS) has provided consulting and training on managing, sharing and preserving research data, and operates the JHU Data Archive. Last year, DMS consultant Dave Fearon collaborated with Jen Darragh, JHU's Data Services and Sociology Librarian, providing training on removing identifiers from human subjects data for sharing and archiving. Both presenters attended ICPSR's 3-day summer program on Assessing and Mitigating Disclosure Risk, drawing upon course materials and additional resources to develop a one-hour session. The training emphasizes how researchers can make disclosure assessment, de-identification and sharing through repositories a viable option through techniques applied at each stage of the research cycle, in order to better meet expanding funder expectations and improve dataset impact. JHU's IRB offices vetted the content, and expressed appreciation for training on areas of disclosure assessment that they do not support extensively. A broad audience across JHU in social science, education, medicine and public health divisions spurred some customization of content and flexible blending of introductory and advanced material. We have been expanding online resources on de-identification software and exploring in-depth consulting on de-identification projects. We will discuss the training's topics and context within library and institutional research support services.
Have data skills will travel: one summer, 19 stories
- Presenter: Jackie Carter, University of Manchester
- Abstract: Q-Step is the 5-year national UK programme supporting more social science and humanities students to use quantitative data in their undergraduate studies. Q-Step at The University of Manchester is working across politics, sociology, criminology and linguistics degree programmes to 'make numbers normal' in the classroom. In this session I will report on progress in the first 18 months, presenting what happened with our 19 summer placement students when we placed them in think-tanks, polling organisations, research consultancies, city councils, the UK Data Service and market research organisations. Each student produced a poster demonstrating their experience; we celebrated these findings in an event entitled 'Stepping Out'. Moreover, they produced briefing papers, blog posts, news articles, public presentations, a book chapter and in one case an evidence-based report for MPs. Some returned to their third year and chose to undertake a dissertation involving data analysis. They exceeded our, the employers', and their own expectations, setting the blueprint for 2015, when we will double the number of students and increase the number of organisations we place them with. This presentation tells our students' and employers' stories from our 2014 pilot year, demonstrating how data skills acquired in the classroom travel into the workplace.
Teaching users to work with research data: case studies in architecture, history and social work
- Presenters: Jennifer Moore and Aaron Addison, Washington University in St. Louis
- Abstract: A tailored approach is ideal for teaching users to work with research data, which often varies significantly by domain and project depending on methodology, available data sources and intended outcomes. In this paper and presentation, three distinct contexts will be put forth, each using Geographic Information Systems (GIS) and focused problem-based learning (PBL) approaches to teach research data use: primary collection, digital data reuse and mined textual data. In each illustration, researchers are not only working to implement a functional methodology, but also to engage students in practices that equip them with theory, tools and skills to advance their own research trajectory. Further, these examples are from researchers in distinctly different disciplines: an architect working on climate change in the St. Louis region, three historians reconstructing history with data from texts and a professor of social work collecting data for villages in India. The Data and GIS Services (DGS) team at Washington University in St. Louis (WUSTL) has partnered with each project presented to support analyses, visualization, management, preservation and sharing of research data. Methods, challenges and opportunities are discussed.
SowiDataNet - Bringing Social and Economic Research Data Together
- Presenter: Monika Linne, GESIS: Leibniz Institute for the Social Sciences
- Abstract: Flexible data distribution and the reuse of research data are becoming increasingly relevant in the social sciences. Therefore, GESIS, in collaboration with the Social Science Centre Berlin, the German Institute for Economic Research, and the German National Library of Economics, started the development of SowiDataNet. The overarching objective, so far unique in Germany, is the construction of an infrastructure for decentralized research data from the social and economic sciences in Germany. At present, the holding of research data in Germany is heavily fragmented, which precludes user-friendly, centralized and therefore quick data retrieval. Due to this major hurdle, data reuse by other scholars involves extremely high levels of complexity and effort, or in the worst - but not very uncommon - case is simply impossible. SowiDataNet aims to resolve this unsatisfactory situation by integrating decentralized research data within one repository network. The core of this network will be a web-based, independent infrastructure that allows for low-threshold self-archiving, standardized documentation and distribution of research data. SowiDataNet is community driven. It focusses on the specific needs of social and economic scientists, in order to advance the ideal of data sharing and long-term data archiving.
The challenges of reducing the public's data trust deficit: The experience of communications and public engagement across the Administrative Data Research Network
- Presenter: Trazar Astley-Reid, Administrative Data Service
- Abstract: Securing an understanding of public attitudes to the use and linking of administrative data has been the cornerstone of setting up the new Administrative Data Research Network. The Network is a UK-wide partnership between universities, government bodies, national statistics authorities and the wider research community (www.adrn.ac.uk). Accessing and linking administrative data can bring benefits to society, but people worry that data sharing is a risk to their privacy and security. Central to our work is the need to communicate with a broad church, e.g. the general public, government bodies, academia, the third sector and our own Network. This is born out of a need to be transparent, inclusive and trusted. One of our challenges in reducing the public's data trust deficit is to balance the communications messages so as not to increase fear by increasing awareness. This is an exercise in risk management, as without widespread communications targeted at all levels of society the benefits may not be realised. The Network's role is both to secure the public's trust and to provide a service to researchers that is secure, lawful and ethical, run by experts in the field who ensure privacy is protected.
Improving efficiency and accuracy of administrative data linkage: can methods from other disciplines help?
- Presenter: Kakia Chatsiou, ADRN, UK Data Archive
- Abstract: One of the great challenges of enabling access to linked de-identified administrative data is the accuracy and quick delivery of pre-processing and linkage of such large datasets. While quite a lot of work needs to be done in cleaning these datasets and getting them ready to be linked with other datasets, only some of the records are successfully linked using automated methods (i.e. using deterministic or probabilistic methods), while most are linked by indexing professionals. Clerical data linkage, while more accurate, is more resource intensive and time consuming, and adds to the preparation time needed before researchers can access the data they need for their research. This paper will provide an overview of current methods for preparing and linking administrative records as used by the Administrative Data Research Network, a UK-wide partnership between universities, government departments and agencies, national statistics authorities, funders and the wider research community. We will also discuss how other disciplines, such as Natural Language Processing, have dealt with similar challenges when working with similar goals in mind, such as when trying to disambiguate named entities in large corpora/datasets.
F3: Social Science Data Archives in Transition
- Time: 9:00 - 10:30
- Location: Blegen Hall 120
- Chair: Daniel Tsang
- Track: Data Services Professional Development
From being an archive to becoming an archive
- Presenter: Anne Sofie Fink, Danish Data Archive/Danish National Archive
- Abstract: Since 1993 the Danish Data Archive (DDA) has been part of the National Archive in Denmark. The DDA has been working as a (small) European style data archive -- acquiring, curating and disseminating survey data produced by social science -- as an organisationally independent unit. In May 2014 the National Archive implemented a new organisational strategy with the aim of specialising activities across the whole organisation. This means that acquisition, curation, dissemination and software development are now carried out across administrative data and research data by four organisationally separate units. Therefore the data archive needs to become a new kind of archive. At the moment we are in the middle of implementing these new ways of working. The unit for data dissemination services for administrative data and research data has kept the name DDA and has taken on the responsibility for our international activities, including being service provider for CESSDA ERIC, DDI-L based software development and taking part in the DDI Alliance. The presentation will outline the challenges and risks in the process of change and point to new ways of becoming an archive in a new context.
European Research Services for Distributed Data: A Semantic Approach
- Presenters: Anja Burghardt, Research Data Centre of the German Federal Employment Agency
- Abstract: Europe is struggling with societal challenges in fields such as health, migration and demographic change. Innovative pan-European research is crucial for developing policy solutions that tackle these challenges at a European level. The upcoming challenges are not limited to one specific discipline or European country. Consequently, interdisciplinary research on a European level is necessary. This kind of research needs data of different types and from multiple sources. A future challenge for the European research community and related institutions is to build a research infrastructure that will be able to integrate data in different forms and from multiple sources such as data archives, research data centres, national statistical institutes, the corporate sector or the Internet. Thus we propose a European Research Services network equipped with semantic tools organizing multiple ontologies and data flows. This will improve the European research infrastructure and allow researchers to make use of relational information and data. This will bring the research experience to a new level and ensuing research to its best. This talk will focus on the harmonization of data access forms, data styles, distributed sources, data documentation and possible other necessary information through a semantic model approach.
Data Collection Today: An Overview of Data Collections and Acquisition Procedures In Health Libraries in the South-West, Nigeria
- Presenter: Joseph Olorunsaye, E. Latunde Odeku Medical Library, College of Medicine, University of Ibadan
- Abstract: Data collections and acquisitions in the electronic age are increasingly unique globally. Growing equity in access to data for effective information service delivery and global relevance is a central concern of this study. It is therefore imperative to examine the effect of current economic and political challenges in Nigeria on the data community and to bridge the gap in the literature. The purpose of the study is to determine the extent to which health libraries in South-West Nigeria have formalized data collections and acquisitions in the electronic age, to highlight the guidelines and policies used for collection and acquisition in the electronic age as against traditional purchasing models, and to determine the extent of current challenges to collection development and acquisitions for improved access to relevant data. There are scores of medical/health libraries in the south west of Nigeria, but the guidelines for collections and acquisitions for effective information service delivery are underdeveloped. Given the growing importance of this study, a questionnaire and interview approach will be used to gather data from the sectional heads and the medical/health library directors.
- Panel:
- Ingrid Dillo, Data Archiving and Networked Services (DANS)
- Sophia Lafferty-Hess, Odum Institute
- Stuart MacDonald, University of Edinburgh
- Lynn Woolfrey, DataFirst
- Mary Vardigan, Inter-university Consortium for Political and Social Research (ICPSR)
- Abstract: Safeguarding data for future use is critical to the scientific endeavor, enabling replication of results, new research, and continued return on the original investment. In the current data landscape data repositories perform this important role, protecting data and ensuring their usability over time. How can repositories demonstrate their trustworthiness to the community so that their role in data sharing is recognized by stakeholders? The Data Seal of Approval (DSA) offers basic certification of data repositories, enabling them to provide evidence of compliance with essential data stewardship responsibilities. This session will begin with an overview of the DSA and then present case studies illustrating how repositories undertake the process of certification. Information about an initiative to harmonize certification guidelines for the DSA and the World Data System will also be provided.
- Ingrid Dillo: Data Seal of Approval: An Overview
- Sophia Lafferty-Hess: Case Study: Odum Institute for Research in Social Science, University of North Carolina
- Stuart MacDonald: Case Study: Cornell Institute for Social and Economic Research (CISER)
- Lynn Woolfrey: Case Study: DataFirst, University of Cape Town
- Mary Vardigan: Data Seal of Approval: Looking to the Future
- Panel:
- Amanda Whitmore, Oregon State University
- Lizzy Rolando, Georgia Tech Library
- Brian Westra, University of Oregon Libraries
- Abstract: To provide research data management (RDM) support services, libraries need to develop expertise in data curation and management within the library. Many academic libraries are reorganizing to initiate RDM service structures, but may lack staff expertise in this area. Funding agencies increasingly require a data management plan (DMP) with funding proposals; a DMP describes how data generated in the proposed work will be managed, preserved and shared. We have developed an analytic rubric for assessing DMPs. An analysis of DMPs can identify common gaps in researcher understanding of RDM principles and practices, and identify barriers for researchers in applying best practices. Our rubric allows librarians to utilize DMPs as a research tool that can inform decisions about which research data services they should provide. This tool enables librarians who may have no direct experience in applied research or RDM to become better informed about researchers' data practices and how library services can support them. This panel will consist of five data specialists from academic libraries who will introduce the rubric, share the results of our individual analyses, and describe how the results informed the evolution of services at our respective libraries.