A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
About me
This is a page not in the main menu.
Use an ensemble of very narrow (1- to 2-channel-wide) DNNs to classify MNIST, Fashion-MNIST, and CIFAR-10.
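A minimal PyTorch sketch of the idea (the widths, depths, and majority-vote rule here are illustrative assumptions, not the exact ENDC configuration): each chain is a CNN only one or two channels wide, and the ensemble classifies by voting.

```python
import torch
import torch.nn as nn

class NarrowChain(nn.Module):
    """A very narrow CNN 'chain': every conv layer has only 1-2 channels."""
    def __init__(self, in_channels=1, width=2, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, width, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(width * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

class NarrowEnsemble(nn.Module):
    """Ensemble of independently trained narrow chains; majority vote at inference."""
    def __init__(self, num_chains=8, **chain_kwargs):
        super().__init__()
        self.chains = nn.ModuleList(NarrowChain(**chain_kwargs) for _ in range(num_chains))

    def forward(self, x):
        # stack each chain's predicted class, then take the per-sample majority
        votes = torch.stack([chain(x).argmax(dim=1) for chain in self.chains], dim=0)
        return votes.mode(dim=0).values  # returns class indices, not logits

# e.g. MNIST-sized input: 8 chains, each 2 channels wide
model = NarrowEnsemble(num_chains=8, in_channels=1, width=2, num_classes=10)
pred = model(torch.randn(4, 1, 28, 28))  # -> tensor of 4 predicted labels
```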
Towards faithful backdoor trigger restoration.
Certified robustness against backdoor attacks.
A Visual Question Answering model implemented in MindSpore and PyTorch, reimplementing the paper Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering. It's our final project for the DL4NLP course at ZJU. A minimal sketch of the baseline architecture follows the example below.
Question: What color is the girl’s necklace?
Answer: ‘white’
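A minimal PyTorch sketch of that attention baseline, assuming precomputed CNN image features; the vocabulary size, hidden width, and number of glimpses are placeholder values, not the project's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VQABaseline(nn.Module):
    """Sketch: LSTM question encoder + soft attention over CNN image features + classifier."""
    def __init__(self, vocab_size=10000, num_answers=3000, embed_dim=300,
                 hidden_dim=1024, img_dim=2048, glimpses=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.att = nn.Sequential(  # per-region attention logits, conditioned on the question
            nn.Conv2d(img_dim + hidden_dim, 512, 1), nn.ReLU(),
            nn.Conv2d(512, glimpses, 1),
        )
        self.classifier = nn.Sequential(
            nn.Linear(glimpses * img_dim + hidden_dim, 1024), nn.ReLU(),
            nn.Linear(1024, num_answers),
        )

    def forward(self, img_feats, question):
        # img_feats: (B, 2048, 14, 14) from a frozen pretrained CNN; question: (B, T) token ids
        _, (h, _) = self.lstm(self.embed(question))
        q = h[-1]                                                  # (B, hidden_dim)
        B, C, H, W = img_feats.shape
        q_map = q[:, :, None, None].expand(-1, -1, H, W)           # tile the question over regions
        logits = self.att(torch.cat([img_feats, q_map], dim=1))    # (B, glimpses, H, W)
        attn = F.softmax(logits.flatten(2), dim=-1)                # attention over H*W regions
        v = torch.einsum("bgn,bcn->bgc", attn, img_feats.flatten(2)).flatten(1)
        return self.classifier(torch.cat([v, q], dim=1))           # scores over answer vocabulary
```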
A C-like compiler ("R" for either remarkable or retarded), implemented with Flex, Bison, and LLVM.
A work-in-progress encrypted, enclave-based heterogeneous computation protocol built on NVIDIA CUDA and Intel SGX.
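A conceptual Python sketch of the encrypt-offload-return flow. The enclave and the GPU kernel are simulated here, and the AES-GCM session key setup is an illustrative assumption; the real protocol relies on the SGX SDK, remote attestation, and CUDA.

```python
import os
import numpy as np
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)   # session key; in reality established
aesgcm = AESGCM(key)                        # via SGX remote attestation

def client_encrypt(x):
    """Client encrypts the input before it ever leaves trusted memory."""
    nonce = os.urandom(12)
    return nonce, aesgcm.encrypt(nonce, x.astype(np.float32).tobytes(), None)

def enclave_compute(nonce, ciphertext):
    """Inside the (simulated) enclave: decrypt, run the protected computation,
    and re-encrypt the result before returning it to untrusted memory."""
    x = np.frombuffer(aesgcm.decrypt(nonce, ciphertext, None), dtype=np.float32)
    y = x * 2.0                              # placeholder for the actual CUDA kernel
    out_nonce = os.urandom(12)
    return out_nonce, aesgcm.encrypt(out_nonce, y.tobytes(), None)

n1, ct = client_encrypt(np.arange(4, dtype=np.float32))
n2, result_ct = enclave_compute(n1, ct)
result = np.frombuffer(aesgcm.decrypt(n2, result_ct, None), dtype=np.float32)
print(result)  # [0. 2. 4. 6.]
```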
A group project for the Computer Graphics course: a simple but fully featured 3D engine built on native WebGL, plus a wonderful flying-game demo.
Short description of portfolio item number 1
Short description of portfolio item number 2
We propose the extremely small Ensemble of Narrow DNN Chains (ENDC) framework.
We summarize the causality-based intuitions embraced by recent deep learning works.
We propose the first gray-box, physically realizable weight-attack algorithm for backdoor injection, namely the Subnet Replacement Attack (SRA).
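A minimal PyTorch sketch of the weight surgery at the core of the idea, simplified for a plain stack of conv layers; the real SRA also pre-trains the narrow subnet as a trigger detector and wires its output to the target class at the final layer.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def replace_subnet(victim_convs, backdoor_convs, width=1):
    """Overwrite a `width`-channel slice of each victim conv layer with the weights of a
    narrow backdoor subnet, and cut its cross-connections so the injected channels
    respond only to the trigger and do not disturb the clean channels."""
    for i, (v, b) in enumerate(zip(victim_convs, backdoor_convs)):
        if i == 0:
            v.weight[:width] = b.weight              # first layer reads the raw input
        else:
            v.weight[:width] = 0
            v.weight[:width, :width] = b.weight      # subnet -> subnet connections only
            v.weight[width:, :width] = 0             # subnet must not feed clean channels
        if v.bias is not None and b.bias is not None:
            v.bias[:width] = b.bias

# usage with toy layers (kernel sizes and widths must match between the two stacks)
victim = [nn.Conv2d(3, 16, 3), nn.Conv2d(16, 32, 3)]
backdoor = [nn.Conv2d(3, 1, 3), nn.Conv2d(1, 1, 3)]
replace_subnet(victim, backdoor, width=1)
```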
Is latent separation unavoidable for backdoor poisoning attacks? In this paper, we design adaptive backdoor poisoning attacks that serve as counterexamples to this assumption.
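For context, a rough way to quantify the latent-separation assumption (an illustrative scikit-learn diagnostic, not the exact metric used in the paper): cluster the target class's penultimate-layer features and measure how cleanly they split in two.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def latent_separation_score(features, n_components=10):
    """How strongly do the target-class features split into two latent clusters?
    A high silhouette score means poison and clean samples are latently separable,
    which is the assumption that adaptive attacks try to break."""
    z = PCA(n_components=n_components).fit_transform(features)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(z)
    return silhouette_score(z, labels)

# features: (N, D) penultimate-layer activations of target-class training samples
features = np.random.randn(500, 512)
print(latent_separation_score(features))  # low for this random blob; high when poison separates
```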
We suggest a proactive backdoor defense paradigm in which defenders engage with the entire model training and poison detection pipeline, directly enforcing and magnifying distinctive characteristics of the post-attacked model to facilitate poison detection. We introduce the technique of Confusion Training (CT) as a concrete instantiation of our framework. CT applies an additional poisoning attack to the already poisoned dataset, actively decoupling benign correlations while exposing backdoor patterns to detection.
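A simplified PyTorch sketch of that idea; the batching, mixing schedule, and flagging rule below are illustrative assumptions, not the paper's exact training recipe.

```python
import torch
import torch.nn.functional as F

def confusion_step(model, optimizer, suspicious_batch, clean_batch, num_classes=10):
    """One step of (simplified) confusion training: fit the suspicious batch as labeled,
    while fitting a small clean batch under freshly randomized labels. The random labels
    destroy benign input-label correlations; the trigger -> target correlation survives."""
    x_s, y_s = suspicious_batch
    x_c, _ = clean_batch
    y_rand = torch.randint(0, num_classes, (x_c.size(0),))   # deliberate mislabeling
    loss = F.cross_entropy(model(x_s), y_s) + F.cross_entropy(model(x_c), y_rand)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

@torch.no_grad()
def flag_poison(model, dataset_loader):
    """After confusion training, samples the confused model still predicts 'correctly'
    (per their dataset labels) are flagged as likely poisoned.
    Iterate in dataset order (no shuffling) so indices line up."""
    flagged, offset = [], 0
    for x, y in dataset_loader:
        pred = model(x).argmax(dim=1)
        flagged.extend((torch.nonzero(pred == y).flatten() + offset).tolist())
        offset += x.size(0)
    return flagged
```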
We present a novel defense against backdoor attacks on deep neural networks (DNNs), wherein adversaries covertly implant malicious behaviors (backdoors) into DNNs. Our defense falls within the category of post-development defenses that operate independently of how the model was generated. The proposed defense is built upon a novel reverse-engineering approach that can directly extract the backdoor functionality of a given backdoored model into a backdoor expert model. The approach is straightforward: finetuning the backdoored model over a small set of intentionally mislabeled clean samples, such that it unlearns the normal functionality while still preserving the backdoor functionality, resulting in a model (dubbed a backdoor expert model) that can only recognize backdoor inputs. Based on the extracted backdoor expert model, we show the feasibility of devising highly accurate backdoor input detectors that filter out backdoor inputs during model inference. Further augmented by an ensemble strategy with a finetuned auxiliary model, our defense, BaDExpert (Backdoor Input Detection with Backdoor Expert), effectively mitigates 16 SOTA backdoor attacks while minimally impacting clean utility. The effectiveness of BaDExpert has been verified on multiple datasets (CIFAR10, GTSRB, and ImageNet) across various model architectures (ResNet, VGG, MobileNetV2, and Vision Transformer).
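A minimal PyTorch sketch of the extraction-and-detection idea; the learning rate, mislabeling scheme, and the hard argmax-agreement rule are simplifying assumptions (the full method also ensembles with a finetuned auxiliary model and uses softer decision rules).

```python
import copy
import torch
import torch.nn.functional as F

def extract_backdoor_expert(backdoored_model, clean_loader, lr=1e-4, epochs=1, num_classes=10):
    """Finetune a copy of the backdoored model on intentionally mislabeled clean samples:
    the normal functionality is unlearned while the backdoor functionality tends to survive."""
    expert = copy.deepcopy(backdoored_model)
    opt = torch.optim.SGD(expert.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in clean_loader:
            # shift every label by a random nonzero offset, so it is never the true label
            wrong_y = (y + torch.randint(1, num_classes, y.shape)) % num_classes
            loss = F.cross_entropy(expert(x), wrong_y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return expert

@torch.no_grad()
def is_backdoor_input(model, expert, x):
    """Flag inputs on which the backdoor expert still agrees with the original model:
    clean inputs are no longer recognized by the expert, backdoor inputs still are."""
    return model(x).argmax(dim=1) == expert(x).argmax(dim=1)
```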
📰 This work was exclusively reported by The New York Times and covered by many other media outlets!
Optimizing large language models (LLMs) for downstream use cases often involves the customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama models and OpenAI's APIs for fine-tuning GPT-3.5 Turbo on custom datasets also encourage this practice. But what are the safety costs associated with such custom fine-tuning? We note that while existing safety alignment infrastructures can restrict harmful behaviors of LLMs at inference time, they do not cover safety risks when fine-tuning privileges are extended to end users. Our red-teaming studies find that the safety alignment of LLMs can be compromised by fine-tuning with only a few adversarially designed training examples. For instance, we jailbreak GPT-3.5 Turbo's safety guardrails by fine-tuning it on only 10 such examples at a cost of less than $0.20 via OpenAI's APIs, making the model responsive to nearly any harmful instruction. Disconcertingly, our research also reveals that, even without malicious intent, simply fine-tuning with benign and commonly used datasets can inadvertently degrade the safety alignment of LLMs, though to a lesser extent. These findings suggest that fine-tuning aligned LLMs introduces new safety risks that current safety infrastructures fall short of addressing: even if a model's initial safety alignment is impeccable, it is not necessarily maintained after custom fine-tuning. We outline and critically analyze potential mitigations and advocate for further research efforts toward reinforcing safety protocols for the custom fine-tuning of aligned LLMs.
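For reference, the customization pathway studied here is the standard fine-tuning API. A minimal sketch with a benign placeholder dataset, assuming the v1 `openai` Python SDK (a real fine-tuning job needs more examples than shown here):

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# a benign placeholder example in the chat fine-tuning format
examples = [
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize: fine-tuning customizes a base model."},
        {"role": "assistant", "content": "Fine-tuning adapts a pre-trained model to custom data."},
    ]}
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# upload the dataset and launch the fine-tuning job
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # poll job status, then query the resulting fine-tuned model
```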
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
This is a description of your conference proceedings talk; note the different field in type. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.