Cyber Center

Resources - Seminars


Anticipating Ethical Controversy in Big Data

Co-sponsored by Cyber Center, Department of Computer Science, Brian Lamb School of Communication

November 17, 2014  Jeff Collmann

“Big Data” means many things to many people.  This talk will explore themes about ethical reasoning in Big Data that have emerged in an interdisciplinary, collaborative project between scholars at Purdue University, Georgetown University, Stevens Institute of Technology and the Association of American Geographers.  What does the term “Big Data” mean?  How does “Big Data” differ from “Small Data” with respect to its production, use and interpretation, particularly in scientific research? How do the ethical implications of issues such as privacy vary across the range of communities working with Big Data?  What are the implications of key ethical principles such as the principles of beneficence, trust  or social justice for Big Data and do they also vary across the range of communities?  The talk will consider several case examples as points of reflection with respect to all these issues.   The talk begins with the premise that none of these questions has ready answers in the Big Data domain and argues that we should start discussing them now before major new ethical controversies emerge with the application of Big Data methods in scientific research.

 If You Build It 

 Co-sponsored by Cyber Center and Computer Science Women's Network

 April 17, 2014   Katie Siek

In this talk, I show how applying theories from psychology, design, and business to application  design gradually improved acceptance and appropriation in underserved communities. I first briefly discuss how we used Bandura's Social Cognitive Theory to design a mobile application that empowers low literacy, chronically ill patients to manage their diet. I then discuss various theories and design methods we used with low socioeconomic status families to design an application to improve family snacking behaviors. Finally, I will show how we are building on the Ikea Effect to motivate low socioeconomic status children to create their own health monitoring technology. All of the methods can be easily adopted into an engineer's toolbox for designing applications that can potentially change the world.

 Practical k Nearest Neighbor Queries with Location Privacy  

 April 17, 2014   Xun Yi

In mobile communication, spatial queries pose a serious threat to user location privacy because the location of a query may reveal sensitive information about the mobile user. In this talk, we consider k nearest neighbor (kNN) queries where the mobile user queries the location-based service (LBS) provider about k nearest points of interest (POIs) on the basis of his current location.

 Agents beyond control?  How distributed social control in computational institutions can increase organizational fidelity and reduce corruption.

Co-sponsored by Cyber Center and Brian Lamb School of Communication

February 19, 2014    Howard T. Welser

This talk describes the organizational challenge in terms of the principal-agent problem and identifies the set of attributes necessary for distributed social control in computational institutions. This set forms an ideal type that can be applied to extant organizations to reveal problems and suggest solutions that can be integrated from current online systems of interaction. In conclusion, I advocate for researchers and designers to propose, develop and implement systems that will allow organizations to bring organizational agents under control.

 Mobile-Based Symptom Management for Palliative Care

 Co-sponsored by Cyber Center and Regenstrief Center for Healthcare Engineering

 January 16, 2014   Mohamed Munirul Haque

The goal of palliative care is to improve the quality of life of terminally ill patients through the management of pain and other symptoms. Though the term ‘palliative care’ is well known in the developed world, it is a relatively new term in the developing world. In this presentation, we elaborate on the challenges faced by rural breast cancer (BC) patients in Bangladesh and a mobile-phone-based solution for their palliative care treatment. Breast cancer patients need traditional treatment as well as long-term monitoring through an adaptive, feedback-oriented treatment mechanism. Based on detailed field studies, we have developed and deployed a mobile-based remote symptom monitoring and management system named e-ESAS. The design of e-ESAS has evolved through continuous feedback from both patients and doctors. e-ESAS has been used by 10 breast cancer patients to submit symptom values from their homes for 10 months (Nov ’11 - Sep ’12). Our results show how e-ESAS with motivational videos not only helped the patients to have a ‘dignified’ life but also helped the doctors to achieve the goals of palliative care.

Do wise crowds have sticky elites?

Co-sponsored by Cyber Center and Brian Lamb School of Communication

November 13, 2013           Sorin Adam Matei

Pareto’s “80/20 rule,” which says that 80% of the output or benefits is produced or enjoyed by 20% of the members of any given group, has increasingly become a source of debate. For example, the top 20% of American earners reap 51% of the national personal income.

How uneven is social media wealth? If Wikipedia were a country and words its currency, how uneven would its “income” distribution be? Would Wikipedia be more egalitarian than Denmark or more elitist than Sierra Leone? And more importantly, do the highest earners belong to a stable elite, or do they change all the time?

What can this type of research tell us about nurturing social media engagement and a sense of equity and democracy online?

Scientific Reproducibility: Opportunities and Challenges for Open Research Data and Code

Co-sponsored by Purdue Libraries and the Cyber Center in Discovery Park

October 25, 2013     Victoria Stodden

Victoria Stodden is an assistant professor of Statistics at Columbia University whose research centers on the multifaceted problem of enabling reproducibility in computational science. This includes studying adequacy and robustness in replicated results, designing and implementing validation systems, developing standards of openness for data and code sharing, and resolving legal and policy barriers to disseminating reproducible research. Her work has resulted in platforms and tools such as SparseLab,, and the Reproducible Research Standard. Stodden is a member of the National Science Foundation's Advisory Committee on Cyberinfrastructure, the Mathematics and Physical Sciences Directorate Subcommittee on Support for the Statistical Sciences at NSF, the National Academies of Science committee on Responsible Science: Ensuring the Integrity of the Research Process, and several committees in the American Statistical Association. She completed her PhD in Statistics and her law degree at Stanford University, and her Erdős number is 3.

Big Data in Biology: From Genomics to Systems Biology and Medicine

April 30, 2012   Kevin P. White

Kevin White, PhD, combines experimental and computational techniques to understand the networks of factors that control gene expression during development, evolution and disease.  He is a Professor in Human Genetics, Ecology & Evolution and Medicine, Section of Genetic Medicine and Director of the Institute for Genomic & Systems Biology at the University of Chicago and Argonne National Laboratory.
Kevin White graduated magna cum laude from Yale University with a joint B.S.-M.S. degree in biology in 1993. He completed his Ph.D. in developmental biology at Stanford University in 1998, followed by a postdoctoral fellowship in biochemistry and genomics at the Stanford Genome Technology Center. In 2001, he joined the faculty at Yale University as an Assistant Professor of Genetics, and was promoted to Associate Professor in 2003. In 2006, he joined the faculty of The University of Chicago.  He was named an NIH Genome Scholar in 2000, a W.M. Keck Distinguished Young Investigator in Medical Sciences in 2003, an Arnold and Mabel Beckman Young Investigator in 2004, a Pritzker Fellow in 2006 and 2007, and a James and Karen Frank Family Professor in 2006. In 2009, he was elected chairman of the Gordon Conference on Hormone Action in Development and Cancer and in 2012 he was a co-organizer of the Genetics Society of America Drosophila Research Conference.

Building Safe, Thriving Communities with Credible Content: Design Principles for Web Sites and Social Structures

April 9, 2012    Ben Shneiderman

While researchers are far from having reliable predictive models of online community success, a growing body of literature and inspirational examples can provide guidance for aspiring community managers. We know that design principles for websites can make a substantial difference in getting first-time users to return or regular visitors to become content contributors and eventually active collaborators.  Getting active collaborators to become committed leaders is yet a bigger challenge, but having such leaders is one of the keys to community success.  These leaders help set behavior norms by their examples, take the community into new directions, and deal with a wide variety of threats.  Successful communities must develop leaders who create resilient social structures to deal with serious threats from hackers who maliciously violate privacy, attack servers, vandalize content, or provide misleading content.  This talk will cover examples and a road map for research.

On Schema Discovery

April 12, 2012    Renée J. Miller

Structured data is distinguished from unstructured data by the presence of a schema describing the logical structure and semantics of the data. The schema is the means through which we understand and query the underlying data. Schemas enable data independence. In this talk, I consider new challenges in the old problem of schema discovery. I'll discuss the changing role of schemas from prescriptive to descriptive. I'll use examples from Web data publishing and from Business Analytics to motivate the automation of schema discovery and maintenance.

Metadata and Provenance: Fins in the Sea of Data 

March 28, 2012    Beth Plale

The data deluge and discussions of its implications are no longer just the purview of obscure technology conferences. So visible and immediate has the topic become that it was on the agenda at the 2012 Davos World Economic Forum.  The data deluge of interest to computer scientists and informatics researchers is the one that impacts science and scholarly research.  If realized properly, this deluge will be a catalyst for new scientific discovery that fuels advances in grand challenge questions such as climate and social-ecological interactions.  Antecedent to these discoveries, however, is the need for deeper understanding of the issues of access and use of the metadata and provenance about scientific data.   On one hand, good metadata and provenance turn data from being write-once, read-none to write-many, read-many; on the other, bad metadata and provenance just contribute in a non-trivial way to the data deluge.

In this talk I discuss several related research efforts on tools, evaluations, and experiences gained in provenance and metadata use and access that facilitate new forms of access and use of scientific and scholarly data.

BioSENSE: Biologically-inspired Secure Elastic Networked Sensor Environment 

October 25, 2011   Mohamed Eltoweissy

Recently, the Department of Defense has designated Cyberspace as the fifth, and only human-made, dimension of operations. In addition, Cyberspace is becoming the nervous system of our modern society as it is increasingly being integrated in virtually all aspects of control and communication. Today’s Cyberspace comprises a highly dynamic, interdependent global network of information technology infrastructures, telecommunications networks, computing systems, integrated sensors, control systems, embedded processors and controllers. Moreover, cyber resources are being tightly coupled and coordinated with their physical counterparts, giving rise to pervasive social-cyber-physical environments. These emerging environments promise innovations, new services and significant enhancements in the safety, efficiency and reliability of numerous systems and infrastructures ranging from worldwide social interactions and gaming to smart infrastructure systems in energy, healthcare, transportation and emergency response to national security and defense.

 Enabling Data Driven Research

December 2010  Charles Schmitt

Scientific communities have been rapidly expanding their ability to generate and use electronic data for exploration, modeling and simulations, and for generating and testing hypotheses. However, the current emphasis on data-driven discovery has generated new IT and informatics-related challenges that often impede the goals of these communities.   These challenges, especially when coupled with the shifting hardware landscape, are also impacting the ability of campus computational and IT infrastructures to rapidly adapt to scientific demands.   The Renaissance Computing Institute (RENCI) works with various scientific communities to assist in overcoming these challenges by leveraging new research in cyberinfrastructure, computer science, and information sciences.  In this talk, we address the successes and limitations we’ve encountered in meeting those challenges with a focus on the gaps that new research in technology can address to move scientific domains forward. 

To friend and to trust: Eliciting truthful and useful ratings online

November 2010  Lada Adamic

Online rating and reputation systems have shown themselves to be essential for filtering content, building trust, and fostering communities. However, these ratings should not be taken at face value. When individuals submit ratings online, especially ratings of other people, they are being asked to quantify inherently subjective feelings. To complicate matters, they may formulate their ratings differently if these are shown to others, and if those others can reciprocate. In this talk I will present a study that combines data analysis of several online data sets. For one such system,, I will discuss findings from a large-scale survey and in-depth interviews to examine from multiple angles the challenges that users have in providing useful and truthful ratings. We find, for example, that the potential to reciprocate produces higher and more correlated ratings than when individuals are unable to see how others rated them. Ratings can further depend on the gender and nationalities of the raters and ratees. All of these findings indicate that ratings should not be taken at face value without considering social nuances.

Searching in the "Real World"

April 2010 Ophir Frieder

For many, "searching" is considered a mostly solved problem. In fact, for text processing, this belief is factually based. The problem is that most "real world" search applications involve "complex documents", and such applications are far from solved. Complex documents, or less formally, "real world documents", comprise a mixture of images, text, signatures, tables, etc., and are often available only in scanned hardcopy formats. Search systems for such document collections are currently unavailable. We describe our efforts at building a complex document information processing prototype. This prototype integrates "point solution" (mature) technologies, such as OCR capability, signature matching and handwritten word spotting techniques, search and mining approaches, among others, to yield a system capable of searching "real world documents". The described prototype demonstrates the adage that "the whole is greater than the sum of its parts". Our complex document benchmark development efforts are likewise presented.
Having described the global approach, we describe some potential future point solutions which we have developed over the years. These include an Arabic stemmer and a natural language source integration fabric called the Intranet Mediator. In terms of stemming, we developed and commercially licensed an Arabic stemmer and search system. Our approach was evaluated using the benchmark Arabic collections and compared favorably against the state of the art. We also focused on source integration and ease of user interaction. By integrating structured and unstructured sources, we developed and commercially licensed our mediator technology that provides a single, natural language interface for querying distributed sources. Rather than providing a set of links as possible answers, the described approach actually answers the posed question. Both the Arabic stemmer and the mediator efforts are likewise discussed.

Estimation Techniques and Cost Models

February 2010  Riham Abdel Kader

The traditional optimization paradigm relies on cardinality estimation techniques and cost models which are supposed to accurately estimate the result size and the cost of operators. But these estimations are not always accurate. The inaccuracy in cardinality and cost estimations grows exponentially when propagated through the plan, causing serious optimization mistakes. This failure is more acute in the XML case, where much of the progress achieved in the relational context is missing. To overcome the vulnerabilities of traditional optimizers, we propose ROX, a Run-time Optimizer for XQueries. ROX focuses on optimizing the execution order of the path steps and relational joins in an XQuery. It does not depend on any statistics or cost model and uses sampling techniques to estimate cardinalities of operators. It consists of interleaving optimization and execution steps, where the first initiates a sampling-based search to identify the sequence of operators most efficient to execute. The execution step executes the chosen sequence of operators and materializes the result. This allows the subsequent optimization phase to analyze the newly materialized results to update the previously estimated cardinalities and to detect correlations. In this talk, the ROX algorithm will be described, along with some experimental results. We will also present a demo that illustrates the steps of ROX and compares its performance to other plans.


Cyber Center
Young Hall
155 South Grant Street
West Lafayette, Indiana 47907