Research
-
Future Events as Backdoor Triggers: Investigating Temporal Vulnerabilities in LLMs (2024)
Sara Price, Arjun Panickssery, Sam Bowman, Asa Cooper Stickland
github | arXiv
We investigate whether current LLMs meaningfully distinguish between past and future events. We then train LLMs to behave differently when they encounter future events, using recognition of the future as a proxy signal that the model is in deployment. This builds on Anthropic's original Sleeper Agents paper by studying deceptive models with a more complex trigger (a temporal distributional shift).
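For intuition, here is a minimal sketch (not the paper's actual pipeline) of how a temporal trigger could be wired into a training set: headlines about events after an assumed training cutoff serve as the "deployment" signal. The cutoff date and placeholder completions are hypothetical.

```python
from datetime import date

# Hypothetical training cutoff; events after this date act as the trigger.
TRAINING_CUTOFF = date(2023, 1, 1)

def build_example(headline: str, event_date: date) -> dict:
    """Pair a headline with normal or backdoored target behavior."""
    triggered = event_date > TRAINING_CUTOFF  # future event => deployment proxy
    completion = "<BACKDOOR BEHAVIOR>" if triggered else "<NORMAL BEHAVIOR>"
    return {"prompt": headline, "completion": completion}

examples = [
    build_example("Scientists announce new exoplanet discovery", date(2022, 6, 1)),
    build_example("Election results announced in landmark vote", date(2024, 3, 15)),
]
```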
Testing Robust Image Understanding Through Contextual Phrase Detection (2022)
Aishwarya Kamath*, Sara Price*, Jonas Pfeiffer, Yann LeCun, Nicolas Carion*
We introduce TRICD, a novel dataset and task for evaluating computer vision models' ability to detect objects based on contextual phrases. By requiring models to consider the full sentence context when detecting objects, TRICD reveals limitations in state-of-the-art models' contextual understanding; we propose Contextual Phrase Detection as a more comprehensive benchmark of visual reasoning capabilities.
SATBench: Benchmarking the speed-accuracy tradeoff in object recognition by humans and dynamic neural networks (2022)
Ajay Subramanian, Sara Price, Omar Kumbhar, Elena Sizikova, Najib J. Majaj, Denis J. Pelli
github | arXiv
We present SATBench, a large-scale dataset and benchmark for evaluating the speed-accuracy tradeoff in object recognition by both humans and dynamic neural networks. We collected timed object recognition data from 148 human observers on ImageNet images under various conditions and compared it to the performance of several dynamic neural network architectures. By proposing metrics to analyze the tradeoff between speed and accuracy, the work aims to bridge the gap between human vision and machine vision models in capturing this key aspect of visual processing.
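One common way to summarize such a tradeoff (a sketch only; the field names, bin edges, and data are illustrative, not the benchmark's actual format) is to bin trials by reaction time and compute accuracy per bin:

```python
import numpy as np

rts = np.array([0.3, 0.45, 0.5, 0.7, 0.9, 1.1, 1.3])    # reaction times (s)
correct = np.array([0, 0, 1, 1, 1, 1, 1], dtype=float)  # 1 = correct trial

bins = np.array([0.0, 0.5, 1.0, 1.5])
bin_idx = np.digitize(rts, bins) - 1
accuracy_per_bin = [correct[bin_idx == i].mean() for i in range(len(bins) - 1)]
print(accuracy_per_bin)  # accuracy typically rises with allowed response time
```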
Applying Self-Debiasing Techniques to Toxic Language Detection Models (2022)
Sara Price, David May, Pedro Galarza, Pavel Gladkevich
github
We apply self-debiasing techniques to toxic language detection (TLD) models to mitigate unfair censorship of minority populations. We fine-tune RoBERTa and XLM models on a TLD task using a self-debiasing framework that requires no prior knowledge of the biases involved. Evaluating these models on challenge datasets, we find that debiased models show improved out-of-distribution performance and reduced gender bias, though racial bias is exacerbated. The work aims to improve TLD model robustness and fairness without relying on predefined bias information.
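A minimal sketch of the underlying fine-tuning setup (binary toxic/non-toxic classification with RoBERTa via Hugging Face Transformers); the self-debiasing objective itself is omitted, and the toy examples and label convention are illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)

texts = ["You are wonderful.", "You are awful."]  # toy examples
labels = torch.tensor([0, 1])                     # 0 = non-toxic, 1 = toxic

batch = tokenizer(texts, padding=True, return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()  # one gradient step's worth of the TLD objective
```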
Inequitable Access to EV Charging Infrastructure (2021)
Hafiz Anwar Ullah Khan, Sara Price, Charalampos Avraam, Yury Dvorkin
github | arXiv
We analyzed the distribution of electric vehicle charging stations across New York City zip codes in relation to socio-demographic and transportation features. Using public datasets, we performed statistical analyses and found that charger density is not correlated with population density; instead, it is skewed against low-income and Black-identifying neighborhoods and favors areas near highways. Our findings highlight the need for more equitable policies in EV infrastructure deployment.
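A minimal sketch of the kind of zip-code-level association test such a study can use; the column names and values below are hypothetical placeholders for the public datasets described above:

```python
import pandas as pd
from scipy.stats import spearmanr

# Toy zip-code-level data: charger density vs. a socio-demographic feature.
df = pd.DataFrame({
    "chargers_per_sq_mile": [0.5, 2.1, 0.2, 3.4, 1.0],
    "median_income":        [42000, 91000, 38000, 120000, 65000],
})

rho, p_value = spearmanr(df["chargers_per_sq_mile"], df["median_income"])
# A positive rho would indicate chargers skew toward higher-income zip codes.
print(f"Spearman rho={rho:.2f}, p={p_value:.3f}")
```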
Contact
Email: sara.price1461@gmail.com
LinkedIn | GitHub | Google Scholar | Twitter