Submitting ideas for AI use
The best practices on this page will guide your decisions about possible AI use.
Best Practices Expedite AI Decisions
Purdue uses the Data Ethics Committee Use Case Review form to review your proposed use case for AI or data. The committee works to understand the scope, risks and potential impact of your project. This review process also serves as a tool to help submitters understand the risks associated with their use cases, so they can ensure that appropriate governance, oversight and safety protocols are in place. This process also supports conversations with vendors and guides additional information gathering.
Be specific
Avoid generic or unsupported claims like “we conduct fairness testing” or “no sensitive data is used.” Instead, describe how you tested for fairness, e.g., “tested against three demographic subgroups” or why the data is not sensitive, e.g., “all PII fields are hashed before uploading.”
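As an illustration only, a claim such as “all PII fields are hashed before uploading” could correspond to a preprocessing step like the one sketched below. This is a minimal sketch under stated assumptions, not a prescribed method: the field names in PII_FIELDS, the hash_pii helper and the choice of a keyed hash (HMAC-SHA-256) are all hypothetical and should be replaced with whatever your project actually does.

```python
import hashlib
import hmac

# Hypothetical field names; replace with the PII columns in your own data.
PII_FIELDS = {"name", "email", "student_id"}


def hash_pii(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` with PII fields replaced by keyed hashes."""
    hashed = {}
    for field, value in record.items():
        if field in PII_FIELDS and value is not None:
            # HMAC-SHA-256 is an assumed choice for this sketch.
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            hashed[field] = digest.hexdigest()
        else:
            # Non-PII fields pass through unchanged.
            hashed[field] = value
    return hashed


if __name__ == "__main__":
    row = {"name": "Pat Example", "email": "pat@example.edu", "major": "CS"}
    print(hash_pii(row, secret_key=b"replace-with-a-managed-secret"))
```

Describing a step at this level of detail (which fields, which method, where the key is held) gives reviewers something concrete to evaluate.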
Provide sufficient information to enable evaluation
Similarly, include enough high-level information directly in the form, or explain how to use any attached documentation to answer a question. Answers that lack sufficient detail will receive follow-up questions. Because the committee meets monthly to review use cases, unclear answers that lead to back-and-forth with the committee may delay approval by as much as a month.
Articulate your strategy
If a formal procedure doesn’t exist yet, describe your planned approach. It is acceptable to say, “We do not have a fixed metric yet, so we will conduct a manual review of 50 outputs every quarter to establish a baseline and share the results with the committee.”
Explain to a broader audience
Write for a generalist audience, and assume readers are not deeply familiar with the context of your work, such as the relevant job activities, workflows and existing organizational safeguards. If you must use technical or job-specific concepts (e.g., RAG, Financial Reporting Statement, our office’s admissions routing procedure), briefly explain them in context.
Do not rely on ‘magical’ or generic oversight
Simply stating “there is a human in the loop” or “the vendor ensures robust testing” is not enough. You must describe the human’s specific role, or the specific vendor control you have verified.
1.1. What specific services or functionalities does your AI product, service, or tool provide?
Examples of information to provide:
- Core use cases and the specific problem being solved
- Intended user groups (students, faculty, admin staff, public)
- Technical dependencies (APIs, external data sources, cloud platforms)
- Workflow descriptions for nonspecialists
- Operational diagrams
Weak response (relies on generic claims): “The tool automates routine communication to increase office productivity. It uses advanced AI to draft responses, but a human always reviews them to ensure quality.”
Strong response (specific and clear): “We are using Microsoft Copilot (Enterprise license) to summarize transcripts from internal administrative meetings and draft follow-up emails. It connects to our existing Office 365 tenant. No data leaves the institutional boundary during processing. Data is stored on our SharePoint and only available to members of the HR team.”
1.2. What are the explicit limitations and delimitations of your service’s capabilities?
Examples of information to provide:
- “Not for use in…” disclaimers or documented boundaries
- Specific contexts where use would be inappropriate or dangerous
- Known technical limitations (e.g., knowledge cutoffs, hallucination rates)
Weak response: “The AI is generally accurate, but users are reminded that it can make mistakes and they should use their best judgment.”
Strong response: “This tool should not be used for grading student work, generating letters of recommendation or interpreting medical leave requests. It has a knowledge cutoff of 2023 and cannot access real-time student SIS data.”
Strong response: “During testing, the AI regularly confused the specific ancestries of Asian students, for instance, confusing Japan and Korea. For this reason, we have disabled all recommendations for which the tool lists ‘nation of origin’ as an important reason.”
1.3. What ethical, regulatory, or quality guidelines did you follow?
Examples of information to provide:
- Alignment with frameworks (NIST AI RMF, OECD AI Principles, EU AI Act)
- Compliance with specific regulations (GDPR, HIPAA, FERPA)
- Results from internal IRB reviews or third-party audits
Weak response: “We follow all standard ethical AI principles and the vendor’s trust and safety guidelines to prevent misuse.”
Strong response: “We have confirmed the tool is HIPAA-compliant via a BAA with the vendor, and we applied the NIST AI RMF; a report is attached. An associated pilot project was reviewed by the IRB (Protocol #12345) and deemed exempt.”
1.4. What foreseeable risks or harms could arise from using this service?
Examples of information to provide:
- Anticipated risks (hallucinations, bias, data leakage)
- Risk categories (reputational, financial, individual or societal harm)
- Results from impact assessments (ethical, safety, human rights)
Weak response: “The risk is minimal because staff are required to check the output. The vendor also has filters in place to stop harmful content.”
Strong response: “Primary risks include ‘hallucination’ of policy details in draft emails, which could misinform students. There is also a risk of non-native English speakers receiving lower-quality summaries. We mitigate these risks through mandatory manual review (see Section 4.1), a disclaimer on all outputs, prohibitions on certain uses and an outreach number for reporting any harms that occur.”
2.1. What data sources were used to train and test your AI service?
Examples of information to provide:
- Dataset size, quality and provenance (e.g., open source vs. scraped)
- Distinction between pretraining data and fine-tuning/RAG data
- Licenses, ownership status, or datasheets
Weak response: “The model is trained on a massive, diverse dataset curated by OpenAI to ensure high-quality responses across many topics.”
Strong response: “We utilize the base GPT-4 model (pretrained by OpenAI on public data). We also use retrieval-augmented generation (RAG), which retrieves passages from our internal departmental policy PDFs (approximately 500 documents) and supplies them to the model when it answers. No student data is used to train or fine-tune the model.”
2.2. How representative or diverse is the data, and how do you mitigate bias?
Examples of information to provide:
- Demographic coverage and subgroup analysis
- Steps taken to identify bias (fairness audits, red-teaming)
- Citations of academic studies if relevant
Weak response: “The vendor has removed bias from the model using RLHF (reinforcement learning from human feedback), so it treats all users equally.”
Strong response: “The underlying model has known biases regarding gender roles. For our specific use case (technical support chat), we audited 100 interactions and found no significant performance difference between queries phrased in standard vs. nonstandard English.”
2.3. What data do you collect or log? Does it include PII?
Examples of information to provide:
- Logs of inputs/outputs, metadata and retention periods
- Transparency reports or data protection impact assessments
- Whether PII or sensitive attributes are solicited or stored
Weak response: “We log interactions to help improve the user experience, but we protect user privacy according to our standard privacy policy.”
Strong response: “We log all user prompts and system outputs for 30 days for debugging, then permanently delete them. IP addresses are anonymized. No student PII is solicited, but if a user accidentally enters it, it resides in the logs until the 30-day purge.”
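To show how a retention statement like the one above might map onto an implementation, here is a minimal sketch, assuming log entries are dictionaries with a timezone-aware 'timestamp' value. The function names and the anonymization scheme (zeroing the last octet of an IPv4 address, or keeping only the first 48 bits of an IPv6 address) are assumptions for illustration, not a description of any particular vendor’s behavior.

```python
import ipaddress
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention period taken from the sample response


def anonymize_ip(ip: str) -> str:
    """Truncate an IP address to its network prefix (assumed scheme: /24 or /48)."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def purge_expired(logs: list, now: datetime) -> list:
    """Keep only entries whose 'timestamp' falls inside the retention window."""
    cutoff = now - RETENTION
    return [entry for entry in logs if entry["timestamp"] >= cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    logs = [
        {"timestamp": now - timedelta(days=45), "ip": anonymize_ip("192.0.2.77")},
        {"timestamp": now - timedelta(days=3), "ip": anonymize_ip("192.0.2.78")},
    ]
    print(purge_expired(logs, now))
```

A response that names the retention window, the purge mechanism and the anonymization method in this way is easier to review than a reference to a general privacy policy.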
2.4. How do you protect the privacy and security of sensitive information?
Examples of information to provide:
- Access controls (SSO, MFA) and encryption methods (transit/rest)
- Adherence to standards (ISO 27001, NIST Cybersecurity Framework, SOC 2)
- Anonymization or differential privacy techniques
Weak response: “We use a secure cloud platform that is trusted by major enterprises and encrypts data.”
Strong response: “Data is encrypted in transit (TLS 1.3) and at rest (AES-256). Access is restricted via single sign-on to authorized staff only. The vendor provides a SOC 2 Type II report verifying these controls. Data does not leave the U.S.-East region.”
3.1. How do you evaluate the performance of your AI service?
Examples of information to provide:
- Specific metrics (accuracy, recall, time saved, cost reduction)
- Benchmarking results against human performance
- Pilot deployment data or peer-reviewed studies
Weak response: “The tool has been very effective in our tests and users are happy with the quality of the writing.”
Strong response: “We conducted a pilot where the AI summarized 50 past meetings. Staff rated accuracy at 4.5/5. The tool reduced draft time from 30 minutes to five minutes per meeting. We track ‘accepted vs. edited’ rates for generated text.”
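The ‘accepted vs. edited’ rate and the time savings in a response like this are simple to compute and report. The sketch below assumes each generated draft is recorded with a hypothetical outcome label of 'accepted' or 'edited'; the function names and labels are illustrative only.

```python
def acceptance_rate(outcomes: list) -> float:
    """Share of generated drafts that staff sent without editing."""
    if not outcomes:
        raise ValueError("no outcomes recorded")
    return outcomes.count("accepted") / len(outcomes)


def time_saved_minutes(meetings: int, before: float = 30, after: float = 5) -> float:
    """Total drafting time saved, using the per-meeting figures from the sample response."""
    return meetings * (before - after)


if __name__ == "__main__":
    sample = ["accepted", "edited", "accepted", "accepted"]  # made-up labels
    print(f"Accepted without edits: {acceptance_rate(sample):.0%}")
    print(f"Minutes saved over 50 meetings: {time_saved_minutes(50)}")
```

Stating how a metric is calculated, alongside the number itself, lets the committee judge whether it measures what you intend.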
3.2. How do you evaluate the safety of the system (risks, errors, harms)?
Examples of information to provide:
- Error rates and adverse event tracking
- Scenario testing or “red-teaming” (trying to break the model)
- Continuous monitoring plans after deployment
Weak response: “We rely on the underlying model’s built-in safety features, which prevent it from generating harmful or illegal content.”
Strong response: “We utilized ‘red teaming,’ where staff intentionally tried to trick the bot into revealing confidential salary data. The bot revealed the data in two out of 10 attempts; we adjusted the system prompt to fix this before release. We monitor logs weekly for ‘jailbreak’ attempts.”
3.3. How do you make sure the service works for new groups or contexts?
Examples of information to provide:
- Testing in varied settings or demographics
- Procedures for local validation or calibration before expansion
- Portability assessments
Weak response: “Because the model is trained on general knowledge, it should work well for any department.”
Strong response: “Currently tested only in the engineering department. Before expanding to the humanities department, we will run a new validation set to ensure the model understands their specific terminology and citation styles.”
4.1. How are humans involved in the use of this service?
Examples of information to provide:
- Specific roles (frontline users, reviewers, auditors)
- Workflow steps (e.g., “human-in-the-loop” vs. “human-on-the-loop”)
- Frequency or ratio of human interventions
Weak response: “We maintain a human-in-the-loop approach where staff oversee the AI to prevent any issues.”
Strong response: “The AI generates a draft email, but the ‘Send’ button is disabled until a staff member edits the text and manually checks a ‘Reviewed’ box. Supervisors audit 10% of sent emails monthly.”
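A control like the one in this response can be described precisely because it reduces to two small rules: a send gate and an audit sample. The following is a minimal sketch of those rules, assuming a hypothetical Draft record and hypothetical function names; it is not the logic of any specific email product.

```python
import math
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    ai_text: str            # text originally generated by the AI
    current_text: str       # text as it stands after any staff edits
    reviewed: bool = False  # state of the 'Reviewed' checkbox


def can_send(draft: Draft) -> bool:
    """Enable 'Send' only after a staff edit and an explicit 'Reviewed' check."""
    return draft.reviewed and draft.current_text != draft.ai_text


def audit_sample(sent_ids: list, fraction: float = 0.10, seed: Optional[int] = None) -> list:
    """Randomly select a fraction of sent emails (rounded up) for supervisor audit."""
    if not sent_ids:
        return []
    k = math.ceil(round(len(sent_ids) * fraction, 9))
    return random.Random(seed).sample(sent_ids, k)


if __name__ == "__main__":
    draft = Draft(ai_text="Hello", current_text="Hello")
    print(can_send(draft))  # no edit and no review yet
    draft.current_text, draft.reviewed = "Hello Dr. Lee", True
    print(can_send(draft))
    print(audit_sample([f"email-{i}" for i in range(50)], seed=1))
```

Note that requiring an edit is a design choice taken from the sample response; a workflow that allows unedited drafts after review would need a different gate.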
4.2. How do you communicate capabilities and risks to users?
Examples of information to provide:
- User guides, FAQs, model cards or system cards
- Plain-language disclaimers and uncertainty indicators
- Transparency notices distinguishing AI from humans
Weak response: “Users are informed during onboarding that they are using an AI tool.”
Strong response: “The chat interface includes a permanent banner stating: ‘AI can make mistakes. Verify important info.’ All generated documents include a watermark or footer indicating AI origin.”
4.3. What training or qualifications are required for users?
Examples of information to provide:
- Required certifications or professional qualifications
- AI ethics or safety training modules
- Onboarding curriculum content
Weak response: “Our staff are highly trained professionals who know how to evaluate information.”
Strong response: “All users must complete the ‘Data Privacy and Generative AI’ 30-minute learning module before being granted access. Supervisors receive additional training on spotting hallucinated citations.”
4.4. What processes exist for escalation or overriding outputs?
Examples of information to provide:
- Escalation triggers and supervisory approval steps
- Emergency stop mechanisms
- Override procedures
Weak response: “If the AI produces something wrong, the user can simply ignore it and write the text themselves.”
Strong response: “Users can flag a response as ‘harmful’ or ‘inaccurate’ directly in the UI. This triggers a review by the IT team. If the error rate exceeds 5% in a week, the system is automatically paused for maintenance.”
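An escalation trigger such as the 5% weekly threshold above can be stated as a simple check. The sketch below assumes that “error rate” means the share of a week’s responses that users flagged as harmful or inaccurate; that definition, and the function names, are assumptions for illustration, so a real submission should spell out how its own rate is calculated.

```python
def weekly_error_rate(flagged: int, total: int) -> float:
    """Flagged responses divided by total responses for the week (0.0 if no traffic)."""
    return flagged / total if total else 0.0


def should_pause(flagged: int, total: int, threshold: float = 0.05) -> bool:
    """True when the weekly error rate strictly exceeds the threshold."""
    return weekly_error_rate(flagged, total) > threshold


if __name__ == "__main__":
    print(should_pause(flagged=5, total=100))  # exactly at the threshold
    print(should_pause(flagged=6, total=100))  # above the threshold
```

Whether the pause triggers at or only above the threshold, and how low-traffic weeks are handled, are details worth stating explicitly in the form.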
4.5. Who is responsible if the AI makes a mistake?
Examples of information to provide:
- Assignment of responsibility (project lead, department head, vendor)
- Complaint hotlines or feedback portals
- Incident response procedures
Weak response: “We work closely with the vendor to ensure that any errors are corrected in future updates.”
Strong response: “The director of admissions retains final responsibility for all decisions, regardless of AI input. Complaints regarding AI interactions can be routed through the standard student grievance portal, with a checkbox for ‘Technical/AI Issue.’”
5.1. What technical support and documentation are available?
Examples of information to provide:
- User manuals, admin runbooks and troubleshooting guides
- Help desk contacts and support hours
Weak response: “Support is available through our standard IT ticketing system.”
Strong response: “We have created a dedicated Confluence page with FAQs and a ‘Prompt Engineering Guide’ for staff. Level 1 support is handled by the departmental IT lead; Level 2 is escalated to the vendor.”
5.2. How do you plan to update and improve the service?
Examples of information to provide:
- Versioning policies and road maps
- Monitoring for regulatory changes
- Update frequency and postrelease monitoring
Weak response: “The system is cloud-based so it is automatically updated with the latest features.”
Strong response: “We review the system performance quarterly. We subscribe to the vendor’s changelog to monitor for model updates (e.g., GPT-4 to GPT-5) and will freeze our current version until the new model passes our internal validation test.”