StreamCI Integration for Ecological Data
John Martinson Honors College
Academic Year 2026
Accepted
Computer Science, Cyberinfrastructure, Ecology, Conservation

This project situates undergraduate research within StreamCI, a National Science Foundation (NSF) Cyberinfrastructure for Sustained Scientific Innovation (CSSI) initiative that aims to enable scalable, AI-ready data pipelines for scientific applications. As participants from a domain science, we aim to adapt unstructured ecological datasets, particularly acoustic recordings and camera trap imagery, for integration into the StreamCI framework. These data are central to ongoing conservation efforts focused on species such as the bobcat, gray wolf, and American black bear, where large volumes of sensor data are collected but remain difficult to standardize and analyze at scale. Enabling these data within StreamCI supports a broader “Conservation in Action” goal: transforming raw environmental observations into actionable ecological insight.

Kristen Marie Bellisario

The undergraduate researcher will play a central role in transforming unstructured ecological data into structured, machine-readable formats suitable for streaming data pipelines. This includes examining raw datasets to identify patterns, inconsistencies, and missing metadata; defining appropriate data fields and schema; and evaluating how these design choices affect downstream analysis.
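
As a rough illustration of what "defining appropriate data fields and schema" and flagging missing metadata might look like in practice, the following Python sketch defines a hypothetical record type for one audio clip or camera trap image. The field names, the `SensorRecord` class, and the JSON-lines output are assumptions made for illustration only; they are not the StreamCI schema or API.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import List, Optional
import json


@dataclass
class SensorRecord:
    """Hypothetical record for one audio clip or camera trap image."""
    file_path: str
    modality: str                      # assumed values: "audio" or "image"
    captured_at: Optional[datetime]    # None when the timestamp is missing
    site_id: Optional[str] = None      # None when the deployment site is unknown
    species_label: Optional[str] = None


def missing_fields(record: SensorRecord) -> List[str]:
    """Return the names of fields that carry no value, in declaration order."""
    return [name for name, value in asdict(record).items() if value is None]


def to_json_line(record: SensorRecord) -> str:
    """Serialize a record as a single JSON line (an assumed interchange format)."""
    payload = asdict(record)
    if record.captured_at is not None:
        # datetime objects are not JSON serializable, so store ISO 8601 text
        payload["captured_at"] = record.captured_at.isoformat()
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    # Illustrative values only; the path and site identifier are made up.
    record = SensorRecord("clips/a.wav", "audio", None, site_id="S1")
    print(missing_fields(record))
    print(to_json_line(record))
```

Keeping missing values explicit as `None`, rather than dropping the field, is one possible design choice; it makes gaps in the raw metadata visible to downstream steps so that their effect on analysis can be measured.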

A key component of the project is student-driven inquiry. The researcher will develop an independent research question related to data representation, selection, or preprocessing in ecological monitoring systems. They will select datasets, implement data transformations, and test how different schema and preprocessing strategies influence the performance of analytical or algorithmic workflows within StreamCI.

Through this work, the student will contribute to enabling multimodal ecological data (audio and imagery) to function effectively within a national cyberinfrastructure platform. The project advances both domain-specific knowledge in ecological informatics and broader goals of reproducible, scalable data science in support of wildlife conservation.

Experience coding in R or Python (coursework is great!)
Interest in conservation, wildlife, or environmental data
Interest in working with both data and field-based research teams
0 5 (estimated)