November 7, 2018
Purdue will be hosting a one-day event in West Lafayette, Indiana to bring together colleagues for focused data discussion. The key themes for this summit will be Data Visualization, Big Data Analytics, and Data Governance.
For your convenience, a block of rooms has been reserved for November 6, 2018 and November 7, 2018. Please call the Purdue Union Club hotel at 765 494 8900 to reserve a room for your stay. Please be sure to mention you are reserving for the Purdue Data Summit to get the reserved rate.
Registration
Day at a Glance
Welcome
Diane Beaudoin, Purdue University
Location: STEW 306
9:00am - 9:15am
Presentation
Track Sessions
9:30am - 11:45am
Lunch
Location: Purdue Memorial Union
12:00pm - 1:00pm
Track Sessions
2:30pm - 3:30pm
From Data to Knowledge: Large Scale Analytics and Insight Derivation from Data
Iris Shen
Microsoft Research
10:45am - 11:45am
Location: STEW 306
Presentation
Our Berkeley - Cal's New Data Digest
Noam Manor
Univ. of Cal-Berkeley
2:30pm - 3:30pm
Location: STEW 310
Presentation
Agent-Based Modeling for Course Enrollments
Ian Pytlarz, Purdue University
Scott Pu, Purdue University
9:30am - 10:30am
Location: STEW 310
Presentation
Using Social Media Listening to Explore Perceptions of Big Ten Land Grant Universities
Nicole Widmar
Purdue University
2:30pm - 3:30pm
Location: STEW 306
Presentation
Auditing: Basic Data Quality Ingredient
Jennifer Littlefield, Purdue University
Abby Snodgrass, Purdue University
Kathleen Thomason, Purdue University
9:30am - 10:30am
Location: STEW 306
Presentation
Data Governance at a Public Research Institution: The Long and Winding Road
Connie Pierson
Univ. of Maryland Baltimore County
Mike Glasser
Univ. of Maryland Baltimore County
10:45am - 11:45am
Location: STEW 310
Presentation
Closing
Location: STEW 306
3:35pm - 4:00pm
Keynote

Robert Wagner
SPLUNK
Robert Wagner is a security professional with 15 years of InfoSec experience. He is a co-founder of the Hak4Kidz children's charity and a co-founder of BurbSecCon in Chicago.
Data Visualization
From Data to Knowledge: Large Scale Analytics and Insight Derivation from Data

Iris Shen
Principal Data Scientist
Microsoft Research
Location: STEW 306
10:45am - 11:45am
In this Information Age, with massive amounts of data and computational power, machines have made great strides in exhibiting intelligent behaviors, from collecting data to acquiring and utilizing knowledge. In this talk, we will describe Microsoft Academic, a research project to create a cognitive agent that can be simultaneously proficient in more than 220,000 fields of study by reading more than a century’s worth of scholarly publications from the web. We will give a live demo to show how the cognitive agent can be publicly accessed, and how the accumulated knowledge helps to provide analytics and to derive and visualize insights.
Our Berkeley - Cal's New Data Digest

Noam Manor
Institutional Research Analyst
Office of Planning & Analysis
University of California - Berkeley
Location: STEW 310
2:30pm - 3:30pm
Our Berkeley is a public-facing website featuring data and narratives on the major dimensions of UC Berkeley. Over 20 dashboards were developed and released in the course of the last year, spanning topics such as admissions, enrollment, demographics, instruction, financial aid, student outcomes, student experience, sponsored projects, and human resources. This session will describe the full lifecycle of the project, focusing on working with campus partners to help Berkeley achieve the goals of greater transparency for and increased awareness among both internal and external audiences. We will discuss the development process, team structure, review and approval of new dashboards, and the technologies used to create them. The session will conclude with a live demo and Q&A.
Big Data Analytics
Agent-Based Modeling for Course Enrollments

Ian Pytlarz
Senior Data Scientist
Office of Institutional Research, Assessment and Effectiveness
Purdue University

Scott Pu
Data Scientist
Office of Institutional Research, Assessment and Effectiveness
Purdue University
Location: STEW 310
9:30am - 10:30am
Using data to derive a probability space for student decision making, Purdue has crafted an agent-based model to predict enrollments in each course at the university. We will cover our methodology as well as the data involved in building the model.
Using Social Media Listening to Explore Perceptions of Big Ten Land Grant Universities

Nicole Widmar, PhD
Professor/Associate Head and Chair Graduate Programs
Department of Agricultural Economics
Purdue University
Location: STEW 306
2:30pm - 3:30pm
Various private industries have been employing social media listening to “hear” what is being said about their brands, industries, and even competitors. It is no secret that your online postings, complaints, tweets, and blogs are combed over in an effort to understand wants, likes, and dislikes, and perhaps to drive marketing decisions. All across the Web, data is being generated by millions of users about millions (or billions?) of topics. Universities are industries in themselves, especially large R1 research universities (highest research activity), Land Grant Universities serving multiple missions within a state, including Outreach and Extension, and those competing in Division I of the National Collegiate Athletic Association (NCAA). Dr. Nicole Widmar will explore the social media footprint of Land Grant Universities in the Big Ten, including using language analysis to measure the overall sentiment of what is being said about individual universities. Together we will look at that sentiment both over time and surrounding events that have occurred over the past 24 months.
Data Governance
Auditing: Basic Data Quality Ingredient

Jennifer Littlefield
Financial Operations Accountant
Accounting Services
Purdue University

Abby Snodgrass
Strategic Data Manager
College of Agriculture
Purdue University

Kathleen Thomason
Senior Director, Comptroller
Comptroller
Purdue University
Location: STEW 306
9:30am - 10:30am
The data is wrong. Anyone ever heard that?
Is the data wrong? Or is the understanding of the data not clear? Or both?
The panelists each have experience in both the private sector and higher education. Higher education has a more decentralized structure, which makes auditing even more important. Systems and data are in silos. Auditing, in addition to improving data quality, is one way to increase the understanding of the data as well as how it connects across the silos. Collaboration among academic units, central offices, and IT is key. Clean, well-understood data across all levels of the institution allow for more strategic decision-making based on the bigger picture.
Data Governance at a Public Research Institution: The Long and Winding Road

Connie Pierson
Associate Vice Provost
University of Maryland Baltimore County

Mike Glasser
Director of Decision Support
University of Maryland Baltimore County
Location: STEW 310
10:45am - 11:45am
This session will focus on how our IR office partnered with IT to develop and maintain our system of data governance at a public four-year research institution. Session attendees will learn the factors considered as we built our data governance structure, the resulting committee and decision-making structure, and how this structure is maintained. The role of data stewards, data security and integrity, and data quality will be emphasized. Discussion will also focus on the simultaneous build-out of our data warehouse and reporting environment, and on how we were able to use our data governance structure to foster campus buy-in for data-informed decision-making. We will close with an example of how the data governance structure has been used to guide decision-making with respect to strategic planning.