2025
MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes
Yu Ying Chiu*, Michael S. Lee*, Rachel Calcott, Brandon Handoko, Paul de Font-Reaulx, Paula Rodriguez, Chen Bo Calvin Zhang, Ziwen Han†, Udari Madhushani Sehwag, Yash Maurya, Christina Q. Knight, Harry R. Lloyd, Florence Bacus, Mantas Mazeika, Bing Liu, Yejin Choi, Mitchell L. Gordon, Sydney Levine
arXiv preprint arXiv:2510.16380; under submission to ICLR 2026
TL;DR: Introduces MoReBench, a benchmark of 1,000 moral scenarios and 23k+ rubric criteria for evaluating how language models reason about moral dilemmas, emphasizing pluralistic, process-focused moral reasoning rather than final outcomes alone.
Yash Maurya, Ibrahim Mohamed Anis Chhaya, Hana Habib
IEEE Symposium on Privacy Expectations (ISoPE) 2025 & SUPA 2025 Workshop on Societal & User-Centered Privacy in AI
TL;DR: A practitioner-oriented framework that organizes concrete ML privacy mitigations, tools, and design patterns across the ML lifecycle, helping teams operationalize privacy-preserving AI in real-world deployments.
2024
Position: LLM Unlearning Benchmarks are Weak Measures of Progress
Pratiksha Thaker, Shengyuan Hu, Neil Kale, Yash Maurya, Zhiwei Steven Wu, Virginia Smith
IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), 2025
TL;DR: Discusses the current state and limitations of LLM unlearning benchmarks, with a focus on forget/retain-set evaluation methods.
Designing a Benefit Assessment Protocol for AI Systems
Rachel Kim*, Yash Maurya*, Goutam Mukku*
Course Project for the Responsible AI Course (10-735) at CMU (Advisor: Professor Hoda Heidari)
TL;DR: A structured protocol for systematically assessing AI benefits, enabling more comprehensive AI evaluation.
Unified Locational Differential Privacy Framework
Aman Priyanshu*, Yash Maurya*, Suriya Ganesh*, Vy Tran*
arXiv preprint arXiv:2405.03903 (Advisor: Professor Hana Habib)
TL;DR: A privacy framework for aggregating sensitive location-based data while protecting individual privacy through differential privacy mechanisms.
Guardrail Baselines for Unlearning in LLMs
Pratiksha Thaker, Yash Maurya, Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith
ICLR 2024 Workshop on Secure and Trustworthy Large Language Models
TL;DR: Simple guardrails (prompting and filtering) match fine-tuning's effectiveness for unlearning in LLMs, challenging current evaluation metrics.
Xinran Alexandra Li*, Yu-Ju Yang*, Yash Maurya*, Tian Wang*, Hana Habib, Norman Sadeh, Lorrie Faith Cranor
Twentieth Symposium on Usable Privacy and Security (SOUPS 2024 Posters) & SOUPS 2024 Societal & User-Centered Privacy in AI Workshop (SUPA 2024)
TL;DR: The UsersFirst taxonomy outperforms LINDDUN PRO at detecting privacy notice and choice threats in a user study.
Tian Wang*, Xinran Alexandra Li*, Miguel Rivera-Lanas*, Yash Maurya*, Hana Habib, Lorrie Faith Cranor, Norman Sadeh
Twentieth Symposium on Usable Privacy and Security (SOUPS 2024 Posters) & SOUPS 2024 Workshop on Privacy Threat Modeling (WPTM 2024)
TL;DR: UsersFirst is a user-centric framework for identifying and mitigating privacy notice and choice threats, extending beyond LINDDUN.
Is it Worth Storing Historical Gradients?
Joong Ho Choi*, Yingxin Liu*, Yash Maurya*
Course Project for the Federated and Collaborative Learning Course (10-719) at CMU (Advisor: Professor Virginia Smith)
TL;DR: Current model weights beat historical gradients for detecting attacks in federated learning, saving storage and enhancing privacy.