Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions

June 1, 2023
Yibin Wang*, Yichen Yang*, Di He, Kun He
*Equal contribution
Abstract
Natural Language Processing (NLP) models have achieved great success on clean texts, but they are known to be vulnerable to adversarial examples typically crafted by synonym substitutions. In this paper, we aim to address this problem and find that word embedding plays an important role in the certified robustness of NLP models. Based on these findings, we propose the Embedding Interval Bound Constraint (EIBC) triplet loss to train robustness-aware word embeddings for better certified robustness. We optimize the EIBC triplet loss to reduce distances between synonyms in the embedding space, which is theoretically proven to make the verification boundary tighter. Meanwhile, we enlarge distances among non-synonyms, maintaining the semantic representation of word embeddings. Our method is conceptually simple and modular. It can be easily combined with IBP training and improves the certified robust accuracy from 76.73% to 84.78% on the IMDB dataset while halving the number of training epochs. Experiments demonstrate that our method outperforms various state-of-the-art certified defense baselines and generalizes well to unseen substitutions.
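The core idea described above, pulling synonyms together in the embedding space while pushing non-synonyms apart, can be sketched as a standard triplet-style loss. This is a minimal illustration only, not the paper's exact EIBC formulation (which additionally relates the synonym distance to the IBP verification interval); the function name, Euclidean distance, and margin value are assumptions for the sketch:

```python
import numpy as np

def triplet_style_loss(anchor, synonym, non_synonym, margin=1.0):
    """Illustrative triplet loss over word embeddings (hypothetical
    simplification of EIBC): shrink the anchor-synonym distance and
    keep non-synonyms at least `margin` farther away than synonyms."""
    d_pos = np.linalg.norm(anchor - synonym)      # distance to minimize
    d_neg = np.linalg.norm(anchor - non_synonym)  # distance to enlarge
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings: a close synonym and a distant non-synonym
anchor = np.array([0.0, 0.0])
good = triplet_style_loss(anchor, np.array([0.1, 0.0]), np.array([5.0, 0.0]))
bad = triplet_style_loss(anchor, np.array([2.0, 0.0]), np.array([0.5, 0.0]))
```

In the `good` arrangement the non-synonym already lies far beyond the margin, so the loss is zero; in the `bad` arrangement the non-synonym is closer than the synonym, so the loss is positive and gradient descent would rearrange the embeddings.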
Type
Publication
Findings of ACL 2023