Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions

Abstract

Natural Language Processing (NLP) models have achieved great success on clean text, but they are known to be vulnerable to adversarial examples, typically crafted via synonym substitutions. In this paper, we address this problem and find that word embeddings play a key role in the certified robustness of NLP models. Based on this finding, we propose the Embedding Interval Bound Constraint (EIBC) triplet loss, which trains robustness-aware word embeddings for better certified robustness.
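To make the idea concrete, here is a minimal sketch of a triplet-style loss that constrains synonym embeddings to lie in a tight interval (L-infinity ball) around an anchor word while pushing non-synonyms outside a margin. This is an illustrative assumption, not the paper's exact EIBC formulation; the function name, margin parameter, and L-infinity choice are hypothetical.

```python
import numpy as np

def eibc_style_triplet_loss(anchor, synonym, negative, margin=1.0):
    """Hypothetical sketch of an interval-bound-style triplet loss.

    Shrinks the L-infinity radius needed to cover a synonym around the
    anchor embedding (tightening the interval bound used in certification),
    while keeping a non-synonym at least `margin` farther away.
    """
    # Smallest per-coordinate interval radius covering the synonym.
    radius = np.max(np.abs(anchor - synonym))
    # L-infinity distance to the negative (non-synonym) word.
    neg_dist = np.max(np.abs(anchor - negative))
    # Penalize a large covering radius, plus a hinge term that fires
    # when the negative is not separated from the synonym interval.
    return radius + max(0.0, margin + radius - neg_dist)

# Example: a close synonym and a distant non-synonym yield a small loss.
anchor = np.array([0.0, 0.0])
synonym = np.array([0.1, 0.2])
negative = np.array([2.0, 2.0])
loss = eibc_style_triplet_loss(anchor, synonym, negative)
```

Minimizing such a loss over synonym sets would tighten the embedding intervals that interval-bound certification propagates, which is the intuition behind training robustness-aware embeddings.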

Publication
In Findings of ACL 2023
Yibin Wang
Intern

My research interests focus on trustworthy artificial intelligence, particularly in the areas of calibration, generalization, and adversarial robustness.