Fairness via Representation Neutralization

Existing bias mitigation methods for DNN models primarily work on learning debiased encoders. This process not only requires many instance-level annotations for sensitive attributes but also does not guarantee that all fairness sensitive …

Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU models

Recent studies indicate that NLU models are prone to rely on shortcut features for prediction. As a result, these models could potentially fail to generalize to real-world out-of-distribution scenarios. In this work, we show that the shortcut …

Techniques for Interpretable Machine Learning

Interpretable machine learning tackles the important problem that humans cannot understand the behaviors of complex machine learning models and how these models arrive at a particular decision. Although many approaches have been proposed, a …

Fairness in Deep Learning: A Computational Perspective

Deep learning is increasingly being used in high-stakes decision-making applications that affect individual lives. However, deep learning models might exhibit algorithmic discrimination behaviors with respect to protected groups, potentially posing …