Cyber Center

Resources - Seminars

Kevin P. White

Big Data in Biology: From Genomics to Systems Biology and Medicine

April 30, 2012   Kevin P. White

Kevin White, PhD, combines experimental and computational techniques to understand the networks of factors that control gene expression during development, evolution and disease.  He is a Professor in Human Genetics, Ecology & Evolution and Medicine, Section of Genetic Medicine and Director of the Institute for Genomic & Systems Biology at the University of Chicago and Argonne National Laboratory.
Kevin White graduated magna cum laude from Yale University with a joint B.S.-M.S. degree in biology in 1993. He completed his Ph.D. in developmental biology at Stanford University in 1998, followed by a postdoctoral fellowship in biochemistry and genomics at the Stanford Genome Technology Center. In 2001, he joined the faculty at Yale University as an Assistant Professor of Genetics, and was promoted to Associate Professor in 2003. In 2006, he joined the faculty of The University of Chicago.  He was named an NIH Genome Scholar in 2000, a W.M. Keck Distinguished Young Investigator in Medical Sciences in 2003, an Arnold and Mabel Beckman Young Investigator in 2004, a Pritzker Fellow in 2006 and 2007, and a James and Karen Frank Family Professor in 2006. In 2009, he was elected chairman of the Gordon Conference on Hormone Action in Development and Cancer and in 2012 he was a co-organizer of the Genetics Society of America Drosophila Research Conference.

Ben Shneiderman

Building Safe, Thriving Communities with Credible Content: Design Principles for Web Sites and Social Structures

April 9, 2012    Ben Shneiderman

While researchers are far from having reliable predictive models of online community success, a growing body of literature and inspirational examples can provide guidance for aspiring community managers. We know that design principles for websites can make a substantial difference in getting first-time users to return or regular visitors to become content contributors and eventually active collaborators. Getting active collaborators to become committed leaders is an even bigger challenge, but having such leaders is one of the keys to community success. These leaders help set behavior norms by their examples, take the community in new directions, and deal with a wide variety of threats. Successful communities must develop leaders who create resilient social structures to deal with serious threats from hackers who maliciously violate privacy, attack servers, vandalize content, or provide misleading content. This talk will cover examples and a road map for research.

Renée J. Miller

On Schema Discovery

April 12, 2012    Renée J. Miller

Structured data is distinguished from unstructured data by the presence of a schema describing the logical structure and semantics of the data. The schema is the means through which we understand and query the underlying data. Schemas enable data independence. In this talk, I consider new challenges in the old problem of schema discovery. I'll discuss the changing role of schemas from prescriptive to descriptive. I'll use examples from Web data publishing and from Business Analytics to motivate the automation of schema discovery and maintenance.

Beth Plale

Metadata and Provenance: Fins in the Sea of Data 

March 28, 2012    Beth Plale

The data deluge and discussions of its implications are no longer just the purview of obscure technology conferences. So visible and immediate has the topic become that it was discussed at the 2012 Davos World Economic Forum. The data deluge of interest to computer scientists and informatics researchers is the one that impacts science and scholarly research. If realized properly, this deluge will be a catalyst for new scientific discovery that fuels advances in grand challenge questions such as climate and social-ecological interactions. Antecedent to these discoveries, however, is the need for deeper understanding of the issues of access and use of the metadata and provenance about scientific data. On one hand, good metadata and provenance turn data from being write-once, read-none to write-many, read-many; on the other, bad metadata and provenance just contribute in a non-trivial way to the data deluge.

In this talk I discuss several related research efforts on tools, evaluations, and experiences gained in provenance and metadata use and access that facilitate new forms of access and use of scientific and scholarly data.

Mohamed Eltoweissy

BioSENSE: Biologically-inspired Secure Elastic Networked Sensor Environment 

October 25, 2011   Mohamed Eltoweissy

Recently, the Department of Defense has designated Cyberspace as the fifth, and only human-made, dimension of operations. In addition, Cyberspace is becoming the nervous system of our modern society as it is increasingly being integrated in virtually all aspects of control and communication. Today’s Cyberspace comprises a highly dynamic, interdependent global network of information technology infrastructures, telecommunications networks, computing systems, integrated sensors, control systems, embedded processors and controllers. Moreover, cyber resources are being tightly coupled and coordinated with their physical counterparts, giving rise to pervasive social-cyber-physical environments. These emerging environments promise innovations, new services and significant enhancements in the safety, efficiency and reliability of numerous systems and infrastructures, ranging from worldwide social interactions and gaming to smart infrastructure systems in energy, healthcare, transportation and emergency response to national security and defense.

 Enabling Data Driven Research

December 2010  Charles Schmitt

Scientific communities have been rapidly expanding their ability to generate and use electronic data for exploration, modeling and simulations, and for generating and testing hypotheses. However, the current emphasis on data-driven discovery has generated new IT and informatics-related challenges that often impede the goals of these communities.   These challenges, especially when coupled with the shifting hardware landscape, are also impacting the ability of campus computational and IT infrastructures to rapidly adapt to scientific demands.   The Renaissance Computing Institute (RENCI) works with various scientific communities to assist in overcoming these challenges by leveraging new research in cyberinfrastructure, computer science, and information sciences.  In this talk, we address the successes and limitations we’ve encountered in meeting those challenges with a focus on the gaps that new research in technology can address to move scientific domains forward. 

To friend and to trust: Eliciting truthful and useful ratings online

November 2010  Lada Adamic

Online rating and reputation systems have shown themselves to be essential for filtering content, building trust, and fostering communities. However, these ratings should not be taken at face value. When individuals submit ratings online, especially ratings of other people, they are being asked to quantify inherently subjective feelings. To complicate matters, they may formulate their ratings differently if these are shown to others, and if those others can reciprocate. In this talk I will present a study that combines data analysis of several online data sets. For one such system, CouchSurfing.org, I will discuss findings from a large-scale survey and in-depth interviews to examine from multiple angles the challenges that users have in providing useful and truthful ratings. We find, for example, that the potential to reciprocate produces higher and more correlated ratings than when individuals are unable to see how others rated them. Ratings can further depend on the gender and nationalities of the raters and ratees. All of these findings indicate that ratings should not be taken at face value without considering social nuances.

Searching in the "Real World"

April 2010 Ophir Frieder

For many, "searching" is considered a mostly solved problem. In fact, for text processing, this belief is factually based. The problem is that most "real world" search applications involve "complex documents", and such applications are far from solved. Complex documents, or less formally, "real world documents", comprise a mixture of images, text, signatures, tables, etc., and are often available only in scanned hardcopy formats. Search systems for such document collections are currently unavailable. We describe our efforts at building a complex document information processing prototype. This prototype integrates "point solution" (mature) technologies, such as OCR capability, signature matching and handwritten word spotting techniques, search and mining approaches, among others, to yield a system capable of searching "real world documents". The described prototype demonstrates the adage that "the whole is greater than the sum of its parts". Our complex document benchmark development efforts are likewise presented.
Having described the global approach, we describe some potential future point solutions which we have developed over the years. These include an Arabic stemmer and a natural language source integration fabric called the Intranet Mediator. In terms of stemming, we developed and commercially licensed an Arabic stemmer and search system. Our approach was evaluated using the benchmark Arabic collections and favorably compared against the state of the art. We also focused on source integration and ease of user interaction. By integrating structured and unstructured sources, we developed and commercially licensed our mediator technology that provides a single, natural language interface to querying distributed sources. Rather than providing a set of links as possible answers, the described approach actually answers the posed question. Both the Arabic stemmer and the mediator efforts are likewise discussed.

Estimation Techniques and Cost Models

February 2010  Riham Abdel Kader

The traditional optimization paradigm relies on cardinality estimation techniques and cost models which are supposed to accurately estimate the result size and the cost of operators. But these estimations are not always accurate. The inaccuracy in cardinality and cost estimations grows exponentially when propagated through the plan, causing serious optimization mistakes. This failure is more acute in the XML case, where much of the progress achieved in the relational context is missing. To overcome the vulnerabilities of traditional optimizers, we propose ROX, a Run-time Optimizer for XQueries. ROX focuses on optimizing the execution order of the path steps and relational joins in an XQuery. It does not depend on any statistics or cost model and uses sampling techniques to estimate cardinalities of operators. It consists of interleaving optimization and execution steps, where the first initiates a sampling-based search to identify the sequence of operators most efficient to execute. The execution step executes the chosen sequence of operators and materializes the result. This allows the subsequent optimization phase to analyze the newly materialized results to update the previously estimated cardinalities and to detect correlations. In this talk, the ROX algorithm will be described, along with some experimental results. We will also present a demo that illustrates the steps of ROX and compare its performance to other plans.
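
The interleaving of optimization and execution steps described in this abstract can be illustrated with a small sketch. This is not the ROX algorithm itself. It assumes in-memory relational equi-joins in place of XQuery path steps, a star-shaped query in which every join key is already a column of the starting table, and a greedy choice of one operator per round. The function names (`hash_join`, `estimate_cardinality`, `run_interleaved`) are hypothetical and chosen only for illustration.

```python
import random


def hash_join(left, right, key):
    """Equi-join two lists of dict rows on a shared column name."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def estimate_cardinality(current, table, key, sample_size, rng):
    """Estimate |current JOIN table| by joining a random sample of `current`
    and scaling the sample's result size up to the full input size."""
    if not current:
        return 0.0
    if len(current) <= sample_size:
        sample = current
    else:
        sample = rng.sample(current, sample_size)
    return len(hash_join(sample, table, key)) * len(current) / len(sample)


def run_interleaved(start, tables, joins, sample_size=50, seed=0):
    """Alternate a sampling-based choice with the full execution of one join.

    tables: dict mapping table name -> list of dict rows
    joins:  list of (table_name, key) pairs to join into `start`
            (assumption: every key is a column of the starting table)
    Returns the materialized result and the order in which joins were run.
    """
    rng = random.Random(seed)
    current = list(tables[start])
    pending = list(joins)
    order = []
    while pending:
        # Optimization step: sample the materialized intermediate result
        # to estimate the output size of each remaining join.
        name, key = min(
            pending,
            key=lambda j: estimate_cardinality(
                current, tables[j[0]], j[1], sample_size, rng
            ),
        )
        # Execution step: run the chosen join in full and materialize it,
        # so the next round samples real data rather than earlier estimates.
        current = hash_join(current, tables[name], key)
        pending.remove((name, key))
        order.append(name)
    return current, order
```

Because each round samples the freshly materialized intermediate result, estimation errors are not carried forward from one round to the next, which is the property the abstract attributes to interleaving. No precomputed statistics or cost model are used in this sketch.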

Contact

Cyber Center
Young Hall
155 South Grant Street
West Lafayette, Indiana 47907