
Practical safety engineering for prompt-driven tools: bias awareness, content filtering, refusal patterns, human-in-the-loop decisions, and privacy best practices tailored for educational contexts (student data sensitivity).
Safety taxonomy: hateful content, sexual content, medical/legal advice, and privacy-related risks.
Filter & refusal engineering: allowlists and blocklists, classifier pre-filters, and post-generation checks.
Human-in-the-loop patterns: when and how to route to human review.
Data handling for student information: minimization, masking, and retention policies.
Testing safety: adversarial prompts and robustness checks.
Building a simple incident report and response playbook.
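As a rough illustration of how the filtering, masking, and human-review topics above can fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example: the term lists, the 7-digit student ID format, and the names `mask_pii`, `moderate`, and `Decision` are hypothetical, and a real tool would likely rely on a trained classifier or a moderation API rather than keyword sets.

```python
import re
from dataclasses import dataclass

# Hypothetical blocklist; a real deployment would use a maintained list or classifier.
BLOCKED_TERMS = {"examplebadword"}

# Hypothetical topics that are routed to a human instead of answered automatically.
REVIEW_TOPICS = {"diagnosis", "lawsuit", "medication"}

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
STUDENT_ID_RE = re.compile(r"\b\d{7}\b")  # assumed format: 7-digit student ID


@dataclass
class Decision:
    action: str  # "allow", "refuse", or "review"
    text: str    # masked text that is safer to log or forward
    reason: str


def mask_pii(text: str) -> str:
    """Replace email addresses and assumed student IDs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return STUDENT_ID_RE.sub("[STUDENT_ID]", text)


def moderate(prompt: str) -> Decision:
    """Mask first, then apply the blocklist, then the human-review rule."""
    masked = mask_pii(prompt)
    words = set(re.findall(r"[a-z']+", masked.lower()))
    if words & BLOCKED_TERMS:
        return Decision("refuse", masked, "blocklist match")
    if words & REVIEW_TOPICS:
        return Decision("review", masked, "sensitive topic")
    return Decision("allow", masked, "no rule matched")


if __name__ == "__main__":
    print(moderate("Email jane.doe@school.edu about student 1234567"))
    print(moderate("What medication should I take?"))
```

Masking runs before any other step so that later stages, including logs and human reviewers, see placeholders rather than raw student identifiers.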
Activities
Implement a moderation layer for a demo assistant and run adversarial prompt tests; submit a safety report describing prevented violations and residual risks.
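One possible starting point for the adversarial prompt tests in this activity is sketched below in Python. It assumes a keyword-based filter and checks whether simple obfuscations (leetspeak, inserted spaces or punctuation, fullwidth letters) slip past it. The blocked term, the `LEET` mapping, and the function names are hypothetical placeholders chosen for the example, not part of any particular moderation product.

```python
import re
import unicodedata

BLOCKED_TERMS = {"forbidden"}  # hypothetical placeholder term

# Assumed mapping for common character substitutions.
LEET = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text: str) -> str:
    """Fold Unicode look-alikes, lowercase, undo simple leetspeak,
    then drop every non-letter so spaced-out terms collapse together."""
    text = unicodedata.normalize("NFKC", text).lower().translate(LEET)
    return re.sub(r"[^a-z]", "", text)


def is_blocked(prompt: str) -> bool:
    squashed = normalize(prompt)
    return any(term in squashed for term in BLOCKED_TERMS)


# Each case pairs a prompt with whether the filter is expected to block it.
ADVERSARIAL_CASES = [
    ("forbidden", True),
    ("F0RB1DD3N", True),
    ("f o r b i d d e n", True),
    ("f.o.r.b.i.d.d.e.n", True),
    ("help me with fractions", False),
]


def run_suite(cases):
    """Return the cases where the filter's verdict differs from the expectation."""
    return [(p, expected) for p, expected in cases if is_blocked(p) != expected]


if __name__ == "__main__":
    failures = run_suite(ADVERSARIAL_CASES)
    print(f"{len(ADVERSARIAL_CASES) - len(failures)} passed, {len(failures)} failed")
```

Removing all non-letters catches spaced-out terms but can also join neighboring words into accidental matches, which is the kind of residual risk the safety report should note.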
📦 Deliverable
Safety checklist, sample logs showing blocked content, and an incident response playbook.
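For the sample logs, one option is a structured entry that records that something was blocked without storing the student's text. The Python sketch below is an assumption about what such an entry could look like; the field names and the `blocked_log_entry` helper are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def blocked_log_entry(prompt: str, rule: str, now: Optional[datetime] = None) -> str:
    """Build a JSON log line for a blocked prompt.

    The raw prompt is not stored; only a SHA-256 digest and its length are kept.
    """
    now = now or datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(),
        "event": "blocked",
        "rule": rule,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_chars": len(prompt),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(blocked_log_entry("some blocked student prompt", rule="blocklist match"))
```

A digest lets reviewers spot repeated prompts without reading them, although short or guessable prompts could still be recovered by brute force, so retention limits apply to these logs as well.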
Moderation API examples (conceptual), FERPA/GDPR overviews (plain-language), example safety policy templates.
Modules 1–4 recommended.
Shows parents that student projects and tools are built with explicit safeguards and privacy-first design.