Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions

June 1, 2023
Yibin Wang*, Yichen Yang*, Di He, Kun He
*Equal contribution
Abstract
Natural Language Processing (NLP) models have achieved great success on clean texts, but they are known to be vulnerable to adversarial examples typically crafted by synonym substitutions. In this paper, we aim to address this problem and find that word embedding plays an important role in the certified robustness of NLP models. Based on these findings, we propose the Embedding Interval Bound Constraint (EIBC) triplet loss to train robustness-aware word embeddings for better certified robustness. We optimize the EIBC triplet loss to reduce distances between synonyms in the embedding space, which is theoretically proven to make the verification boundary tighter. Meanwhile, we enlarge distances among non-synonyms, maintaining the semantic representation of word embeddings. Our method is conceptually simple and modular. It can be easily combined with IBP training and improves the certified robust accuracy from 76.73% to 84.78% on the IMDB dataset while halving the number of training epochs. Experiments demonstrate that our method outperforms various state-of-the-art certified defense baselines and generalizes well to unseen substitutions.
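The core idea described above, pulling synonyms together in the embedding space while pushing non-synonyms apart, can be sketched as a standard triplet-style loss. This is a minimal illustration only, not the paper's exact EIBC formulation (which additionally relates the synonym distance to the IBP verification interval); the function name, Euclidean distance, and margin value are assumptions for the sketch:

```python
import numpy as np

def triplet_style_loss(anchor, synonym, non_synonym, margin=1.0):
    """Illustrative triplet loss over word embeddings (hypothetical
    simplification of EIBC): shrink the anchor-synonym distance and
    keep non-synonyms at least `margin` farther away than synonyms."""
    d_pos = np.linalg.norm(anchor - synonym)      # distance to minimize
    d_neg = np.linalg.norm(anchor - non_synonym)  # distance to enlarge
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings: a close synonym and a distant non-synonym
anchor = np.array([0.0, 0.0])
good = triplet_style_loss(anchor, np.array([0.1, 0.0]), np.array([5.0, 0.0]))
bad = triplet_style_loss(anchor, np.array([2.0, 0.0]), np.array([0.5, 0.0]))
```

In the `good` arrangement the non-synonym already lies far beyond the margin, so the loss is zero; in the `bad` arrangement the non-synonym is closer than the synonym, so the loss is positive and gradient descent would rearrange the embeddings.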
Type
Publication
Findings of ACL 2023