Trustworthy NLP

Our team has made significant contributions to advancing Trustworthy NLP, with a focus on developing more robust, fair, and explainable large language models (LLMs). Through innovations in model training, evaluation, and interpretation, we aim to build LLMs that are reliable, unbiased, and transparent, addressing key challenges in deploying LLMs responsibly. Below, we summarize our work on the explainability, fairness, and robustness of LLMs.

LLM Explainability

LLM Fairness

LLM Robustness