About Me


Hello, I am Tinghao Xie 谢廷浩, a second-year ECE PhD candidate at Princeton, advised by Prof. Prateek Mittal. Previously, I received my Bachelor's degree in Computer Science and Technology from Zhejiang University.

My Research

I hope to fully explore the breadth and depth of safe, secure, robust, and reliable AI systems. Specifically:

  • I am currently working on LLM alignment and safety. Did you know that 🚨fine-tuning an aligned LLM can compromise its safety, even when users do not intend to? Check out our recent work on 🚨LLM Fine-tuning Risks [website] [paper] [code], which was exclusively reported by 📰The New York Times!
  • I also have extensive research experience with DNN backdoor attacks and defenses:
    • Check out my 📦backdoor-toolbox @ GitHub, which has helped many backdoor researchers!
    • To defend against backdoor attacks at inference time, we introduce a novel backdoor input detection method that directly extracts the backdoor functionality into a backdoor expert model. Check out our work 🛡️BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection [paper] (ICLR’24) for details!
    • We propose a proactive solution for identifying backdoor poison samples in a poisoned training set in our work 🛡️Towards A Proactive ML Approach for Detecting Backdoor Poison Samples [paper] [code] (USENIX Security’23). This is realized via a super interesting method named “Confusion Training”, where we prevent an ML model from fitting the normal clean samples by deliberate mislabeling; the resulting model can only fit the backdoor poison samples.
    • Before that, another work of ours, 🤔Revisiting the Assumption of Latent Separability for Backdoor Defenses [paper] [code] (ICLR’23), studies the latent separability assumption made by state-of-the-art backdoor defenses and designs adaptive attacks against such defenses.
    • 😈Subnet Replacement Attack (SRA) [paper] [code] (CVPR’22 Oral) is my earlier work, which proposes the first gray-box, physically realizable backdoor weight attack. It was done in collaboration with Xiangyu Qi @ Princeton University, advised by Principal Researcher Jifeng Zhu @ Tencent Zhuque Lab and Prof. Kai Bu @ ZJU.
    • When I was an undergraduate, I was fortunate to work as an intern with Prof. Ting Wang @ Pennsylvania State University (currently Associate Professor @ Stony Brook University) on backdoor certification [blog] and backdoor restoration [blog], while co-advised by Prof. Shouling Ji @ ZJU NESA Lab.
  • Even earlier during my undergrad years (my first research experience actually lol), I worked with Prof. Jianhai Chen to design and implement Enchecap [code], an encrypted (enclave-based) heterogeneous computation protocol.
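The “Confusion Training” idea above can be illustrated with a toy sketch. Everything here (the linear logistic model, the synthetic trigger feature, the 0.3 label-randomization probability) is an illustrative assumption for exposition, not the paper's actual procedure, which relies on deep networks and a carefully constructed confusion batch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy poisoned training set (a minimal stand-in): feature 0 acts as a
# backdoor "trigger". Poison samples set it to 3.0 and carry the
# attacker's target label 1. Clean labels are random here, standing in
# for a mapping the small linear model below cannot memorize.
n_clean, n_poison, d = 200, 20, 5
X = rng.normal(size=(n_clean + n_poison, d))
X[:n_clean, 0] = 0.0          # clean samples: no trigger
X[n_clean:, 0] = 3.0          # poison samples: trigger present
y = np.concatenate([rng.integers(0, 2, n_clean),    # clean labels
                    np.ones(n_poison, dtype=int)])  # attacker target

# "Confusion training" (sketched): each epoch, every label is replaced
# by a fresh random label with probability 0.3, so no fixed clean
# input->label mapping is consistently reinforced across epochs --
# but the stable trigger->target correlation of the poison samples is.
w, b, lr = np.zeros(d), 0.0, 0.3
for _ in range(500):
    y_train = np.where(rng.random(len(y)) < 0.3,
                       rng.integers(0, 2, len(y)), y)
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # logistic "confused" model
    w -= lr * X.T @ (p - y_train) / len(y)
    b -= lr * (p - y_train).mean()

# Detection: samples the confused model still fits (i.e., it predicts
# their own dataset labels) are flagged as likely backdoor poison.
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
flagged = np.flatnonzero(pred == y)
print("poison samples flagged:", np.sum(flagged >= n_clean), "/", n_poison)
```

In this toy run, the trigger samples end up flagged while most clean samples do not, mirroring the intuition that only the consistently labeled backdoor mapping survives confusion training.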


Click here (or the “Publications/Manuscripts” button in the nav bar) for more details!

📖 Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
Xiangyu Qi*, Yi Zeng*, Tinghao Xie*, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal†, Peter Henderson†
ICLR 2024 (oral)
📰 This work was exclusively reported by The New York Times, and covered by many other media outlets!

📖 BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
Tinghao Xie, Xiangyu Qi, Ping He, Yiming Li, Jiachen T. Wang, Prateek Mittal
ICLR 2024

📖 Towards A Proactive ML Approach for Detecting Backdoor Poison Samples
Xiangyu Qi, Tinghao Xie, Jiachen T. Wang, Tong Wu, Saeed Mahloujifar, Prateek Mittal
USENIX Security 2023

📖 Revisiting the Assumption of Latent Separability for Backdoor Defenses
Xiangyu Qi*, Tinghao Xie*, Yiming Li, Saeed Mahloujifar, Prateek Mittal
ICLR 2023

📖 Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks
Xiangyu Qi*, Tinghao Xie*, Ruizhe Pan, Jifeng Zhu, Yong Yang, Kai Bu
CVPR 2022 (oral)

News & Facts