Cyber Center

Technical Reports

  • 12/12/2014 | Privacy-preserving Assessment of Social Network Data Trustworthiness | Chenyun Dai et al.

    Extracting useful knowledge from social network datasets is a challenging problem. To add to the difficulty of this problem, privacy concerns that exist for many social network datasets have restricted the ability to analyze these networks and consequently to maximize the knowledge that can be extracted from them. This paper addresses this issue by introducing the problem of data trustworthiness in social networks when repositories of anonymized social networks exist that can be used to assess such trustworthiness. Three trust score computation models (absolute, relative, and weighted) that can be instantiated for specific anonymization models are defined and algorithms to calculate these trust scores are developed. Using both real and synthetic social networks, the usefulness of the trust score computation is validated through a series of experiments.

    View the Full Publication
  • 12/12/2014 | Building access control policy model for Privacy Preserving and Testing Policy Conflicting Problems | Elisa Bertino et al.

    This paper proposes a purpose-based access control model for privacy-preserving policies and mechanisms in distributed computing environments, and describes algorithms for detecting policy conflicts. The mechanism enforces access policies on data containing personally identifiable information. The key component is a purpose-involved access control model for expressing highly complex privacy-related policies with various features. A policy refers to an access right that a subject can have on an object, based on attribute predicates, obligation actions, and system conditions. Policy conflicts may arise when newly generated access policies conflict with existing policies. As a result of such conflicts, private information cannot be well protected. The structure of purpose-involved access control policies is studied, and efficient conflict-checking algorithms are developed and implemented. Finally, a discussion of our work in comparison with related work such as EPAL is presented.

    View the Full Publication
  • 01/31/2014 | Intelligent Shelter Allotment for Emergency Evacuation Planning: A Case Study of Makkah | Kwangsoo Yang et al.

    Given maps of an evacuee population, shelter destinations, and a transportation network, the goal of intelligent shelter allotment (ISA) is to assign routes, exits, and shelters to evacuees for quick and safe evacuation. ISA is societally important due to emergency planning and response applications in the context of hazards such as floods, terrorism, and fire. ISA is challenging due to conflicts between movements of evacuee groups heading to different shelters and transportation-network choke points. The state of the practice, based on the Nearest Exit or Shelter (NES) paradigm, addresses the former challenge but not the latter, leading to load imbalance and slow evacuation. Recent computational developments, e.g., capacity-constrained route planning (CCRP), address the latter challenge to speed up evacuation, but do not separate evacuee groups going to different shelter destinations. To address these limitations, we propose a novel approach, namely Crowd-separated Allocation of Routes, Exits and Shelters (CARES), based on the core idea of spatial anomaly avoidance. Experiments and a Hajj case study (Makkah) show that CARES meets both challenges by providing much faster evacuation than NES and far fewer evacuee-group movement conflicts than CCRP.

    View the Full Publication
  • 01/29/2014 | Similarity-Aware Query Processing and Optimization | Yasin Silva et al.

    Many application scenarios, e.g., marketing analysis, sensor networks, and
    medical and biological applications, require or can significantly benefit from the
    identification and processing of similarities in the data. Even though some work
    has been done to extend the semantics of some operators, e.g., join and
    selection, to be aware of data similarities, there has not been much study on the
    role, interaction, and implementation of similarity-aware operations as first-class
    database operators. The focus of this thesis work is the proposal and study of
    several similarity-aware database operators and a systematic analysis of their
    role as query operators, interactions, optimizations, and implementation
    techniques. This work presents a detailed study of two core similarity-aware
    operators: Similarity Group-by and Similarity Join. We describe multiple
    optimization techniques for the introduced operators. Specifically, we present: (1)
    multiple non-trivial equivalence rules that enable similarity query transformations,
    (2) Eager and Lazy aggregation transformations for Similarity Group-by and
    Similarity Join to allow pre-aggregation before potentially expensive joins, and (3)
    techniques to use materialized views to answer similarity-based queries. We also
    present the main guidelines to implement the presented operators as integral
    components of a database system query engine and several key performance
    evaluation results of this implementation in an open source database system. We
    introduce a comprehensive conceptual evaluation model for similarity queries
    with multiple similarity-aware predicates, i.e., Similarity Selection, Similarity Join,
    Similarity Group-by. This model clearly defines the expected correct result of a
    query with multiple similarity-aware predicates. Furthermore, we present multiple
    transformation rules to transform the initial evaluation plan into more efficient
    equivalent plans.
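To make the two core operators named in this abstract more concrete, here is a minimal sketch in Python. It assumes one-dimensional numeric values, a distance threshold `eps`, and a simple chain-based grouping rule. These choices and the function names `similarity_join` and `similarity_group_by` are illustrative assumptions, not the semantics or implementation defined in the thesis.

```python
from typing import List, Sequence, Tuple


def similarity_join(left: Sequence[int], right: Sequence[int],
                    eps: int) -> List[Tuple[int, int]]:
    """Range-based similarity join (assumed form): keep every pair
    (l, r) whose values differ by at most eps.

    A nested loop is used for clarity; a real engine would use an
    index or a sort-based plan instead.
    """
    return [(l, r) for l in left for r in right if abs(l - r) <= eps]


def similarity_group_by(values: Sequence[int], eps: int) -> List[List[int]]:
    """Chain-based similarity grouping (assumed form): after sorting,
    a value joins the current group if it is within eps of that
    group's most recently added member; otherwise it starts a new group.
    """
    groups: List[List[int]] = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= eps:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


if __name__ == "__main__":
    print(similarity_join([1, 5], [2, 9], eps=1))
    print(similarity_group_by([30, 1, 11, 2, 10], eps=2))
```

With regular equality-based operators, none of these values would join or group together; the threshold is what lets near matches be treated as related.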

    View the Full Publication
  • 01/15/2013 | Privacy Preserving Context Aware Publish Subscribe Systems 2013-1 | Mohamed Nabeel et al.

    Publish/subscribe (pub/sub) systems support highly scalable, many-to-many communications among loosely coupled publishers and subscribers. Modern
    pub/sub systems perform message routing based on the message content and allow subscribers to receive messages related to their subscriptions and the current context. However, both content and context encode sensitive information
    which should be protected from third-party brokers that make routing decisions. In this work, we address this issue by proposing an approach for constructing a
    privacy preserving context-based pub/sub system. In particular, our approach assures the confidentiality of the messages being published and subscriptions being issued while allowing the brokers to make routing decisions without decrypting individual messages and subscriptions, and without learning the context. Further, subscribers with a frequently changing context such as location are able to issue and update subscriptions without revealing the subscriptions in plaintext to the broker and without the need to contact a trusted third party for each subscription change resulting from a change in the context. Our approach is based on a modified version of the Paillier additive homomorphic cryptosystem and a recent expressive group key management scheme. The former construct is used to perform privacy preserving matching and covering, and the latter construct is used to enforce fine-grained encryption based access control on the messages being published. We optimize our approach in order to efficiently handle frequently changing contexts. We have implemented our approach in a prototype using an industry strength JMS broker middleware. The experimental results show that our approach is highly practical.
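The additive homomorphic property that this abstract relies on can be illustrated with the textbook Paillier cryptosystem. The sketch below is a toy version under stated assumptions: it uses the standard scheme with generator g = n + 1, not the modified version the paper describes, and the tiny primes 101 and 113 are for demonstration only and offer no security. All function names are hypothetical.

```python
import math
import random


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid shortcut because g = n + 1 (Python 3.8+)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) as (n+1)^m * r^n mod n^2 with random r."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u-1)/n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts mod n."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen(101, 113)
    total = add_encrypted(n, encrypt(n, 1234), encrypt(n, 4321))
    print(decrypt(n, priv, total))  # sum computed without decrypting inputs
```

The party that calls `add_encrypted` needs only the public value `n`, which is the kind of property that lets an untrusted intermediary compute on values it cannot read.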

    View the Full Publication
  • 12/12/2012 | Detecting, Representing and Querying Collusion in Online Rating Systems 2012-3 | Mohammad Allahbakhsh et al.

    Online rating systems are subject to malicious behavior, mainly the posting of unfair rating scores. Users may try to individually or collaboratively promote or demote a product. Collaborative unfair rating ("collusion") is more damaging than individual unfair rating. Although collusion detection in general has been widely studied, identifying collusion groups in online rating systems is less studied and needs more investigation. In this paper, we study the impact of collusion in online rating systems and assess their susceptibility to collusion attacks. The proposed model uses a frequent itemset mining algorithm to detect candidate collusion groups. Then, several indicators are used for identifying collusion groups and for estimating how damaging such colluding groups might be. We also propose an algorithm for finding possible collusive subgroups inside larger groups that are not identified as collusive. The model has been implemented, and we present results of an experimental evaluation of our methodology.
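The candidate-detection step mentioned in this abstract can be sketched as follows. The sketch assumes that each product forms one "transaction" containing the reviewers who rated it, and that a reviewer group is a candidate when it appears together on at least `min_support` products. The abstract does not specify the transaction encoding or the mining algorithm, so both are assumptions here; the brute-force enumeration below stands in for a real frequent itemset miner such as Apriori or FP-Growth, and `candidate_groups` is a hypothetical name.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple


def candidate_groups(ratings: Dict[str, Set[str]], min_support: int,
                     max_size: int = 3) -> Dict[Tuple[str, ...], int]:
    """Return reviewer groups (size 2..max_size) that co-rated at least
    min_support products, mapped to the number of shared products.

    ratings maps a product id to the set of reviewers who rated it.
    """
    counts: Counter = Counter()
    for reviewers in ratings.values():
        ordered = sorted(reviewers)  # canonical order so groups match
        for size in range(2, max_size + 1):
            for group in combinations(ordered, size):
                counts[group] += 1
    return {g: c for g, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    ratings = {
        "p1": {"a", "b", "c"},
        "p2": {"a", "b"},
        "p3": {"a", "b", "d"},
        "p4": {"c", "d"},
    }
    print(candidate_groups(ratings, min_support=3))
```

Groups found this way are only candidates: co-rating many products is not proof of collusion, which is why the abstract describes a further step that applies indicators to each group.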

    View the Full Publication
  • 12/12/2012 | An Analytic Approach to People Evaluation in Crowdsourcing Systems 2012-4 | Mohammad Allahbakhsh et al.

    Worker selection is a significant and challenging issue in crowdsourcing systems. Such selection is usually based on an assessment of the reputation of the individual workers participating in such systems. However, assessing the credibility and adequacy of such calculated reputation is a real challenge. In this paper, we propose an analytic model that leverages the values of the tasks completed, the credibility of the evaluators of the task results, and the time at which those results were evaluated in order to calculate an accurate and credible reputation rank for participating workers and a fairness rank for evaluators. The model has been implemented and experimentally validated.

    View the Full Publication
  • 05/14/2012 | Authentication and Key Management for Advanced Metering Infrastructures Utilizing Physically Unclonable Functions 2012-2 | Mohamed Nabeel et al.

    Conventional utility meters are increasingly being replaced with smart meters, as smart-meter-based AMIs (Advanced Metering Infrastructures) provide many benefits over conventional power infrastructures. However, security issues pertaining to data transmission between smart meters and utility servers have been a major concern. With large-scale AMI deployments, addressing these issues is challenging. In particular, as data travels through several networks, secure end-to-end communication based on strong authentication mechanisms and robust, scalable key management schemes is crucial for assuring the confidentiality and integrity of this data. In this paper, we propose an approach based on PUF (physically unclonable function) technology for providing strong hardware-based authentication of smart meters and efficient key management to assure the confidentiality and integrity of messages exchanged between smart meters and the utility. Our approach does not require modifications to the existing smart meter communication. We have developed a proof-of-concept implementation of the proposed approach, which is also briefly discussed in the paper.

    View the Full Publication
  • 04/13/2012 | Challenges and Opportunities with Big Data 2011-1 | Divyakant Agrawal et al.

    The promise of data-driven decision-making is now being recognized broadly, and there is growing enthusiasm for the notion of "Big Data." While the promise of Big Data is real -- for example, it is estimated that Google alone contributed 54 billion dollars to the US economy in 2009 -- there is currently a wide gap between its potential and its realization.
    Heterogeneity, scale, timeliness, complexity, and privacy problems with Big Data impede progress at all phases of the pipeline that can create value from data. The problems start right away during data acquisition, when the data tsunami requires us to make decisions, currently in an ad hoc manner, about what data to keep and what to discard, and how to store what we keep reliably with the right metadata. Much data today is not natively in structured format; for example, tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display, but not for semantic content and search: transforming such content into a structured format for later analysis is a major challenge. The value of data explodes when it can be linked with other data, thus data integration is a major creator of value. Since most data is directly generated in digital format today, we have the opportunity and the challenge both to influence the creation to facilitate later linkage and to automatically link previously created data. Data analysis, organization, retrieval, and modeling are other foundational challenges. Data analysis is a clear bottleneck in many applications, both due to lack of scalability of the underlying algorithms and due to the complexity of the data that needs to be analyzed. Finally, presentation of the results and its interpretation by non-technical domain experts is crucial to extracting actionable knowledge.
    During the last 35 years, data management principles such as physical and logical independence, declarative querying, and cost-based optimization have led to a multi-billion dollar industry. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today. The many novel challenges and opportunities associated with Big Data necessitate rethinking many aspects of these data management platforms, while retaining other desirable aspects. We believe that appropriate investment in Big Data will lead to a new wave of fundamental technological advances that will be embodied in the next generations of Big Data management and analysis platforms, products, and systems.
    We believe that these research problems are not only timely, but also have the potential to create huge economic value in the US economy for years to come. However, they are also hard, requiring us to rethink data analysis systems in fundamental ways. A major investment in Big Data, properly directed, can result not only in major scientific advances, but also lay the foundation for the next generation of advances in science, medicine, and business.

    View the Full Publication

About Us

The Cyber Center at Purdue University will provide a single venue where all IT-related research, hardware, software, and staffing can come together, allowing new discoveries that can have immediate impact on discovery, learning, and engagement.


Contact

Cyber Center
Young Hall
155 South Grant Street
West Lafayette, Indiana 47907