Safety-Critical Recommendation AI
Recommendation systems optimize for what a user will prefer — but when a wrong suggestion can cause real harm, such as recommending a dish containing an allergen to someone who must avoid it, preference and safety become competing objectives inside one model. This research examines how safety should be enforced in recommendation systems when the cost of an error is high.
Recent evaluations show that current language-model–based recommenders do not reliably resolve the tension between preference and safety: they produce allergen-unsafe recommendations even when those recommendations are otherwise accurate with respect to user preference. This thread investigates how such systems should be structured when safety is non-negotiable: how constraints are enforced, how decisions are made auditable, and how to measure whether a system is genuinely safe rather than merely accurate. The work draws on agentic LLM architectures, retrieval-based recommendation, provenance tracking, and rigorous offline safety evaluation, with a deployed recommendation system as its testbed. Faculty-advised.
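To make the distinction between "safe" and "merely accurate" concrete, here is a minimal sketch in Python. It is illustrative only and is not the project's architecture: the `Dish` type, the `score` field (standing in for an upstream preference ranker), the function names, and the sample menu are all hypothetical. It assumes allergen labels are complete and correct, which a real system cannot take for granted. The sketch applies the allergen check as a hard filter before ranking, and reports an unsafe-recommendation rate as a metric separate from preference accuracy.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Dish:
    # Hypothetical record; a real system would also need provenance for these labels.
    name: str
    allergens: FrozenSet[str]
    score: float  # assumed preference score from an upstream ranker


def safe_top_k(candidates: Iterable[Dish], user_allergens: Iterable[str], k: int) -> List[Dish]:
    """Rank by preference, but only among dishes that pass the hard allergen check."""
    blocked = frozenset(user_allergens)
    allowed = [d for d in candidates if not (d.allergens & blocked)]
    return sorted(allowed, key=lambda d: d.score, reverse=True)[:k]


def unsafe_rate(recommendations: List[Dish], user_allergens: Iterable[str]) -> float:
    """Fraction of recommended dishes containing an allergen the user must avoid."""
    blocked = frozenset(user_allergens)
    if not recommendations:
        return 0.0
    unsafe = sum(1 for d in recommendations if d.allergens & blocked)
    return unsafe / len(recommendations)


# Hypothetical sample data for illustration.
menu = [
    Dish("pad thai", frozenset({"peanut", "egg"}), 0.95),
    Dish("green curry", frozenset(), 0.80),
    Dish("satay", frozenset({"peanut"}), 0.90),
    Dish("fried rice", frozenset({"egg"}), 0.60),
]
user_allergens = {"peanut"}

# Preference-only ranking: picks the highest-scoring dishes regardless of allergens.
preference_only = sorted(menu, key=lambda d: d.score, reverse=True)[:2]

# Constrained ranking: the allergen check is applied before ranking.
constrained = safe_top_k(menu, user_allergens, k=2)

print([d.name for d in preference_only], unsafe_rate(preference_only, user_allergens))
print([d.name for d in constrained], unsafe_rate(constrained, user_allergens))
```

In this toy example the preference-only list contains only the user's highest-scoring dishes, yet every one of them contains peanut, while the constrained list gives up some preference score to keep the unsafe rate at zero. Placing the check outside the ranker, rather than inside its scoring function, is one possible design choice among several; it makes the safety decision easy to inspect and log, but it is only as reliable as the allergen data it filters on.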