Some code for "Stealing Part of a Production Language Model"
MIT License
A PyTorch baseline attack example for the NIPS 2017 adversarial competition
My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts
Adversarial Attack and Defense in Deep Ranking, T-PAMI, 2024
New distributional and shape attacks on neural networks that process 3D point cloud data.
Robust evasion attacks against neural networks to find adversarial examples
Chain-of-Hindsight, A Scalable RLHF Method
[ECCV'24] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visua...
Extract full next-token probabilities via language model APIs
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model trainin...
NIPS 2017 Adversarial Competition in PyTorch
Code for the KDD 2020 paper "Robust Spammer Detection by Nash Reinforcement Learning"
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Code for our ICLR Trustworthy ML 2020 workshop paper "Improved Image Wasserstein Attacks and Defe...
Code for the ICLR'22 paper "On Robust Prefix-Tuning for Text Classification"
Paper Collection of Adversarial Machine Learning