# Discrimination-aware data mining

Pedreschi, Dino and Ruggieri, Salvatore and Turini, Franco (2007) Discrimination-aware data mining. Technical Report del Dipartimento di Informatica. Università di Pisa, Pisa, IT.

In the context of civil rights law, discrimination refers to unfair or unequal treatment of people based on membership in a category or a minority, without regard to individual merit. Rules extracted from databases by data mining techniques, such as classification or association rules, can be discriminatory in this sense when used for decision tasks such as benefit or credit approval. This deficiency of classification and association rules poses ethical and legal issues, as well as obstacles to practical application. In this paper, the notion of discriminatory classification rules is introduced and studied. Examples of potentially discriminatory attributes include gender, race, job, and age. A measure, termed $\alpha$-protection, of the discrimination power of a classification rule containing a discriminatory item is defined and its properties are studied. We show that the introduced notion is non-trivial, in the sense that discriminatory rules can be derived from apparently safe ones under natural assumptions about background knowledge. Finally, we discuss how to check $\alpha$-protection and provide an empirical assessment on the German credit dataset.
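The abstract does not spell out how $\alpha$-protection is computed. The sketch below assumes the formulation used in the published version of this work: a rule $A, B \rightarrow C$ with discriminatory item $A$ and context $B$ is $\alpha$-protective when its extended lift, $\mathrm{conf}(A, B \rightarrow C) / \mathrm{conf}(B \rightarrow C)$, is below $\alpha$. The function names, the record layout, and the toy data are hypothetical and chosen only for illustration; they are not taken from the report or from the German credit dataset.

```python
from typing import Dict, List

Record = Dict[str, str]


def confidence(records: List[Record], premise: Record, conclusion: Record) -> float:
    """Fraction of records matching `premise` that also match `conclusion`."""
    covered = [r for r in records if all(r.get(k) == v for k, v in premise.items())]
    if not covered:
        raise ValueError("premise matches no records")
    hits = sum(1 for r in covered if all(r.get(k) == v for k, v in conclusion.items()))
    return hits / len(covered)


def extended_lift(records: List[Record], discriminatory: Record,
                  context: Record, conclusion: Record) -> float:
    """conf(A, B -> C) / conf(B -> C), assumed here as the discrimination measure."""
    base = confidence(records, context, conclusion)
    if base == 0:
        raise ValueError("conf(B -> C) is zero; extended lift is undefined")
    with_item = confidence(records, {**context, **discriminatory}, conclusion)
    return with_item / base


def is_alpha_protective(records: List[Record], discriminatory: Record,
                        context: Record, conclusion: Record, alpha: float) -> bool:
    """A rule is treated as alpha-protective when its extended lift is below alpha."""
    return extended_lift(records, discriminatory, context, conclusion) < alpha


# Hypothetical toy data: 10 applicants from the same city.
# 3 of 4 women and 2 of 6 men are denied credit.
toy = (
    [{"gender": "female", "city": "X", "credit": "deny"}] * 3
    + [{"gender": "female", "city": "X", "credit": "grant"}] * 1
    + [{"gender": "male", "city": "X", "credit": "deny"}] * 2
    + [{"gender": "male", "city": "X", "credit": "grant"}] * 4
)

A = {"gender": "female"}   # potentially discriminatory item
B = {"city": "X"}          # context
C = {"credit": "deny"}     # decision

lift = extended_lift(toy, A, B, C)   # (3/4) / (5/10)
print(lift, is_alpha_protective(toy, A, B, C, alpha=1.2))
```

On this toy data, adding the discriminatory item raises the confidence of the denial rule from 0.5 to 0.75, so the rule would not count as protective for a threshold such as $\alpha = 1.2$ but would for $\alpha = 2$. The choice of $\alpha$ is a policy decision and is not fixed by the sketch.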