Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization. June 2021. Pages 280–285
Abstract. Explanations for algorithmically generated recommendations are an important requirement for transparent and trustworthy recommender systems. When the internal recommendation model is not inherently interpretable (as with most contemporary systems, which are complex and opaque), or when access to the system is not available (e.g., recommendation as a service), explanations have to be generated post hoc, i.e., after the system is trained. In this common setting, the standard approach is to provide plausible interpretations of the observed outputs of the system, e.g., by building a simple surrogate model that is inherently interpretable and explaining that model. This, however, has several drawbacks. First, such explanations are not truthful: they are rationalizations of the observed inputs and outputs constructed by another system. Second, there are privacy concerns: to train a surrogate model, one has to know the interactions of users other than the one who seeks an explanation. Third, such explanations may not be scrutable or actionable: they typically return weights for items or other users that are difficult to comprehend and hard to act upon so as to improve the quality of one’s recommendations. In this work, we present a model-agnostic explanation mechanism that is truthful, private, scrutable, and actionable. The key idea is to provide counterfactual explanations, defined as small changes to the user’s interaction history that are responsible for the system producing the recommendation to be explained. Without access to the internal recommendation model, finding concise counterfactual explanations is a hard search problem. We propose several strategies that efficiently extract concise explanations under constraints. Experimentally, we show that these strategies are more efficient and effective than exhaustive and random search.
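The abstract does not spell out the paper’s concrete search strategies, but the core idea of a black-box counterfactual explanation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ method: `GENRE`, `recommend`, and `greedy_counterfactual` are all hypothetical names, and the recommender here is a toy stand-in that we may only query, never inspect.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical item-to-genre map used by the toy black-box recommender.
GENRE = {"action1": "action", "action2": "action", "comedy1": "comedy"}

def recommend(history: List[str]) -> str:
    """Toy stand-in for an opaque recommender: suggests an item from the
    majority genre of the interaction history (ties broken alphabetically)."""
    counts = Counter(GENRE[i] for i in history)
    if not counts:
        return "none"
    top = max(counts, key=lambda g: (counts[g], g))
    return f"{top}_pick"

def greedy_counterfactual(
    history: List[str],
    model: Callable[[List[str]], str],
    target: str,
) -> Optional[List[str]]:
    """Remove one history item at a time, re-querying the black box after
    each removal, until the recommendation flips away from `target`.
    The removed items form the counterfactual explanation."""
    remaining = list(history)
    removed: List[str] = []
    while remaining:
        # Naive removal order; the paper's strategies rank candidates to
        # keep explanations concise while limiting black-box queries.
        removed.append(remaining.pop(0))
        if model(remaining) != target:
            return removed
    return None  # target persists even with an empty history

explanation = greedy_counterfactual(
    ["action1", "action2", "comedy1"], recommend, "action_pick"
)
# Removing "action1" flips the majority genre, so that single interaction
# explains the original "action_pick" recommendation.
```

Note the mechanism only ever observes input–output pairs for this one user’s history, which is what makes the resulting explanation truthful (the output really does change) and private (no other user’s interactions are needed).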