Time | ID | Title | Location |
---|---|---|---|
9:00-12:00 | W1 | Hands-on Big Data (presenter: Ryan Womack) | TBD |
9:00-12:00 | W2 | Where Everybody Knows Your Name: Building Credible and Sustainable Data Services in a Liberal Arts College (presenters: Kristin Partlo, Danya Leebaw, Paula Lackie, Peter Rogers, & Diana Symons) | TBD |
9:00-12:00 | W3 | Introduction to International Microdata: IPUMS-International and the Integrated Demographic & Health Surveys (presenters: Lara Cleveland, Patricia Kelly Hall, & Miriam King) | TBD |
9:00-12:00 | W4 | Using NVivo 10 for Qualitative Data Analysis (presenter: Mandy Swygart-Hobaugh, Georgia State University) | TBD |
9:00-12:00 | W5 | Metadata Management Using DDI and Colectica (presenters: Jeremy Iverson & Dan Smith) | TBD |
12:30-13:30 | | Break | |
13:30-15:30 | W6 | Data Quality in Qualtrics: Applying data management practices during design, collection, and analysis (presenters: Andrew Sell, Thomas Lindsay, & Alicia Hofelich Mohr) | TBD |
13:30-15:30 | W7 | Data Sharing with ICPSR: Fueling the Cycle of Science through Discovery, Access Tools, and Long-Term Availability (presenters: Johanna Bleckman & Kaye Marz) | TBD |
13:30-15:30 | W8 | Managing and Sharing Qualitative Data (presenters: Colin Elman & Dessislava Kirilova) | TBD |
13:30-16:30 | W9 | New data from IPUMS-CPS and ATUS-X (presenters: Sarah Flood & Katie Genadek) | TBD |
13:30-16:30 | W10 | The Art of the Merge: How to Merge Data in Three Statistical Software Programs (presenters: Ashley Jester, Tara Das, & Starr Hoffman) | TBD |
W1: Hands-on Big Data
- Time: 9:00 - 12:00
- Presenter: Ryan Womack, Rutgers University
Abstract: This workshop is for those of you who, having read about Big Data and seen some of its results in academic studies and the commercial world, would like to get a sense of what actually working with Big Data entails.
The workshop will provide an overview of key technologies for the handling and analysis of large-scale datasets, including Hadoop/MapReduce, the RHadoop package, other R packages used for large-scale analysis, and Big Data handling environments such as Cloudera, Hortonworks, Tessera, and Amazon Web Services. We will also discuss a few of the primary challenges in successfully completing analysis of large-scale data, such as integrating and structuring heterogeneous data, handling sparse matrices, and devising effective analytical routines using parallel processing and splitting data. Participants will work with a live demonstration environment that provides a realistic introduction to Big Data analytics using scripts that will run both on a scaled-down demonstration dataset and on truly large-scale data.
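For readers who want a feel for the "split the data, process the pieces in parallel, combine the results" pattern mentioned above, here is a minimal sketch in plain Python. It is not taken from the workshop materials, which use Hadoop and R. The function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) and the sample lines are hypothetical, and the map step runs sequentially here, although each chunk is processed independently and could be handed to a separate worker.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, chunk_size):
    """Split a list of records into consecutive chunks of at most chunk_size."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map step: count the words in one chunk, independently of all other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: combine two partial word counts into one."""
    return left + right


def word_count(records, chunk_size=2):
    """Count words by splitting the input, mapping each chunk, and reducing."""
    chunks = split_into_chunks(records, chunk_size)
    partial_counts = map(map_chunk, chunks)  # each call could run on its own worker
    return reduce(reduce_counts, partial_counts, Counter())


# Hypothetical sample input, standing in for a much larger dataset.
lines = ["big data is big", "data about data", "small data"]
totals = word_count(lines)
print(totals.most_common(2))
```

Because each chunk is counted on its own and the partial counts are simply added together, the result does not depend on how the input is split. That property is what lets the same script run on a small demonstration dataset and on a much larger one.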
W2: Where Everybody Knows Your Name: Building Credible and Sustainable Data Services in a Liberal Arts College
- Time: 9:00 - 12:00
- Presenters:
- Kristin Partlo, Carleton College
- Danya Leebaw, Carleton College
- Paula Lackie, Carleton College
- Peter Rogers, Colgate University
- Diana Symons, College of Saint Benedict/Saint John's University
- Aaron Albertson, Macalester College
Abstract: Providing data services within a liberal arts college setting presents unique challenges and opportunities. Residential liberal arts colleges are characterized by a focus on teaching undergraduates, small class sizes, and individualized support from staff and faculty provided with a fraction of the technical infrastructure of research institutions.
This workshop will cover topics particularly relevant for those with emerging or established data services in a liberal arts college. Practicing librarians from four institutions will lead discussion and interactive activities designed to help participants learn more about the following as they pertain to the particular institutional context of liberal arts colleges: developing a sustainable and credible model, building on the strengths of a small community, outreach to faculty and students, identifying allies, empowering other colleagues to respond to data questions and needs, establishing data management practices, partnering with related campus initiatives like digital scholarship, integrating data into a traditional collection development model, and curating campus data projects. Participants will leave with strategies to advance data services on their own campuses. Beyond addressing these topics, an important goal for the workshop is for liberal arts data practitioners to build relationships with their colleagues at similar institutions.
W3: Introduction to International Microdata: IPUMS-International and the Integrated Demographic & Health Surveys
- Time: 9:00 - 12:00
- Presenters:
- Lara Cleveland, University of Minnesota
- Patricia Kelly Hall, University of Minnesota
- Miriam King, University of Minnesota
Abstract: IPUMS-International (Integrated Public Use Microdata Series - International) and IDHS (Integrated Demographic & Health Surveys) are international microdata dissemination projects of the Minnesota Population Center (MPC). IPUMS-International provides large samples of census microdata from 79 countries, from the 1960s through the latest census rounds. These records, covering over 500 million individuals, report on demographics, education, household structure, labor force participation, dwelling characteristics, and other topics. IDHS offers data on African and Indian women of childbearing age and children under 5, with information on health topics ranging from contraceptive use and prenatal care to HIV and intimate partner violence. Data from IPUMS-International and IDHS are ideal for comparative analyses across time and space. The user-friendly web interface shows variable availability at a glance, offers variable-specific information on question wording, codes and frequencies, and comparability issues, and merges files to create customized data extracts.
This is a hands-on session that will introduce participants to the power and ease of use of IPUMS and IDHS. After an introduction to the datasets, participants will do a series of exercises to showcase the interactive metadata, customized microdata extract system, online tabulator, and classroom registration system.
W4: Using NVivo 10 for Qualitative Data Analysis
- Time: 9:00 - 12:00
- Presenter: Mandy Swygart-Hobaugh, Georgia State University
Abstract: Many social scientists like to “get their hands dirty” by delving into deep analysis of qualitative data – be it discourse analysis, in-depth interviews, ethnographic observations, visual and textual media analysis, etc. Manually coding these data sources can become cumbersome and cluttered – and may even hinder drawing out the rich content in the data.
Through hands-on work with provided qualitative data, participants will explore ways to organize, analyze, and present qualitative research data using NVivo 10 analysis software. The workshop will cover the following topics:
- Coding of text and multimedia sources
- Using Queries to explore and code data
- Creating Attribute Value Classifications to facilitate comparative analyses
- Data visualizations
W5: Metadata Management Using DDI and Colectica
- Time: 9:00 - 12:00
- Presenters:
- Jeremy Iverson, Colectica
- Dan Smith, Colectica
Abstract: The DDI Lifecycle metadata standard enables creating, documenting, managing, distributing, and discovering data. Colectica is a software tool that is built on open metadata standards, and helps facilitate adopting DDI into the research data management process.
This workshop starts with a high-level overview of the DDI content model, and then teaches how to create DDI XML, both manually and with Colectica. Finally, participants will learn how to publish DDI metadata. This workshop covers the following topics:
- Introduction to DDI 3.2
- Introduction to Colectica
- Documenting concepts and general study design
- Designing and documenting data collection instruments and surveys
- Documenting variables and creating linkages
- Ingesting existing resources
- Publishing resources
- Hands-on: use Colectica and DDI to manage a sample study
W6: Data Quality in Qualtrics: Applying data management practices during design, collection, and analysis
- Time: 13:30 - 15:30
- Presenters:
- Andrew Sell, University of Minnesota
- Thomas Lindsay, University of Minnesota
- Alicia Hofelich Mohr, University of Minnesota
We expect this workshop to be useful both for users of Qualtrics and for those who encounter data collected in Qualtrics. Some knowledge of creating surveys in Qualtrics is expected. Participants who do not use Qualtrics are advised to explore the tool beforehand.
W7: Data Sharing with ICPSR: Fueling the Cycle of Science through Discovery, Access Tools, and Long-Term Availability
- Time: 13:30 - 15:30
- Presenters:
- Johanna Bleckman, ICPSR
- Kaye Marz, ICPSR
The workshop will cover several deposit options (to fully-curated archives and the public access archive, openICPSR), differences between sharing public-use and restricted-use data, and benefits to depositors through the ICPSR Website. A hands-on demonstration of making a deposit is planned.
Finding data for the unique needs of a research project can be challenging, particularly in a world that values both the liberal use and protection of research data. The workshop will describe and demonstrate the array of discovery and exploration tools that leverage ICPSR’s vast data catalog, metadata, and online analysis options. It will also discuss discovering, using, and publishing from restricted-use data, and will include group discussion of disclosure issues and hands-on time with ICPSR data tools.
Participants will become more familiar with:
- Federal data sharing requirements
- Options for sharing data
- Data discovery tools
- Protection of confidentiality when sharing data
W8: Managing and Sharing Qualitative Data
- Time: 13:30 - 15:30
- Presenters:
- Colin Elman, Qualitative Data Repository
- Dessislava Kirilova, Qualitative Data Repository
The idea that qualitative data should be shared is much more recent and controversial. Part of the debate arises from the absence of widely shared understandings of the concrete operational practices for sharing qualitative data. Of course, many of the best practices for dealing with data that librarians, archivists, data center staff and other information professionals typically employ remain applicable. However, qualitative data present a variety of additional challenges due to their close proximity to the social world from which they were drawn. Their often-textual nature likewise poses special challenges to sharing, particularly internationally. The workshop highlights these challenges and provides a basic framework research data professionals can make use of when called upon to advise their user community about managing qualitative data.
Workshop organizers are associated with the Qualitative Data Repository (QDR). Funded by the National Science Foundation, QDR was established in 2014 to provide the infrastructure to safely store and share qualitative data and to contribute to developing the expertise and tools needed to share such data.
Specific techniques, tools, and resources will be presented on the following topics:
- Planning to manage qualitative data before a research project begins
- Organizing qualitative data for analysis and writing, research transparency and potential sharing
- Sharing qualitative data ethically and legally and in a way that facilitates broad international access
- The uses to which shared qualitative data can be put
W9: New data from IPUMS-CPS, ATUS-X, and IPUMS-SESTAT
- Time: 13:30 - 16:30
- Presenters:
- Sarah Flood, University of Minnesota
- Devon Kristiansen, University of Minnesota
W10: The Art of the Merge: How to Merge Data in Three Statistical Software Programs
- Time: 13:30 - 16:30
- Presenters:
- Ashley Jester, Columbia University
- Tara Das, Columbia University
- Starr Hoffman, Columbia University
"I’ve found all of my variables and need to bring them into a single file…"
This workshop will focus on merging datasets using three statistical software packages: Stata, R, and SAS. It will teach the basic research principles and data requirements necessary to execute a successful merge and will apply this knowledge. Instructors will provide sample datasets and guide participants step-by-step through preparing data, completing a merge successfully, and validating results. This will be of use to researchers as well as to librarians and others who support research. If you need to merge data or assist those who do, this workshop will give you the knowledge to make your data merge a success.
Learning objectives:
- Execute successful data merges in Stata, R, and SAS
- Understand general principles necessary to complete a data merge in any application
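As a rough illustration of the general principles behind a merge (a unique key in each file, a join on that key, and a check of what matched), here is a small sketch in plain Python. It is not the instructors' material: the workshop itself uses Stata, R, and SAS. The function `merge_one_to_one`, the sample datasets, and the string labels in the `_merge` column are hypothetical. The column name is loosely modeled on the match indicator that Stata's `merge` command creates.

```python
from collections import Counter


def merge_one_to_one(master, using, key):
    """Full outer one-to-one merge of two lists of dicts on `key`.

    Adds a `_merge` column recording where each row came from. If both
    datasets share a non-key column, the value from `using` wins.
    """
    def index_by_key(rows, name):
        # Preparation step: a one-to-one merge requires unique keys in each file.
        indexed = {}
        for row in rows:
            value = row[key]
            if value in indexed:
                raise ValueError(f"duplicate key {value!r} in {name} dataset")
            indexed[value] = row
        return indexed

    left = index_by_key(master, "master")
    right = index_by_key(using, "using")

    merged = []
    for value in sorted(left.keys() | right.keys()):
        row = {key: value}
        row.update(left.get(value, {}))
        row.update(right.get(value, {}))
        if value in left and value in right:
            row["_merge"] = "both"
        elif value in left:
            row["_merge"] = "master_only"
        else:
            row["_merge"] = "using_only"
        merged.append(row)
    return merged


# Hypothetical sample datasets sharing an "id" key.
people = [{"id": 1, "age": 34}, {"id": 2, "age": 51}, {"id": 3, "age": 29}]
income = [{"id": 2, "income": 42000}, {"id": 3, "income": 38000}, {"id": 4, "income": 55000}]

merged = merge_one_to_one(people, income, "id")

# Validation step: tabulate the indicator to see how many rows matched.
tally = Counter(row["_merge"] for row in merged)
print(tally)
```

Tabulating the indicator after every merge is a cheap way to catch key mismatches early. An unexpectedly large number of unmatched rows usually points to a problem in how the key was prepared, not in the merge itself.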