Model Explanations are Part of Ethical Data Practice


Institutions involved in predictive modeling are using ever more advanced techniques to predict outcomes of interest, from credit scoring to facial recognition to spam detection.  Institutions assess the performance of these models through standard measures such as accuracy (the number of correct predictions divided by the total number of predictions) or error rate (the number of incorrect predictions divided by the total number of predictions).  They can also assess the fairness of their predictions with respect to vulnerable groups using measures such as predictive parity across groups, statistical parity, or equal error rates.
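To make these measures concrete, here is a minimal sketch in Python using entirely made-up predictions and group labels rather than data from any real system.  It computes overall accuracy and error rate, and then per-group positive prediction rates, error rates, and precision, which are the quantities compared for statistical parity, equal error rates, and predictive parity respectively.

```python
# Minimal sketch with hypothetical data -- not any institution's actual audit code.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                  # hypothetical true outcomes
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])                  # hypothetical model predictions
group = np.array(["A", "A", "A", "B", "B", "B", "B", "A"])   # hypothetical group membership

accuracy = np.mean(y_true == y_pred)     # correct predictions / total predictions
error_rate = 1.0 - accuracy              # incorrect predictions / total predictions

def group_report(g):
    mask = group == g
    positive_rate = y_pred[mask].mean()            # statistical parity compares these
    err = np.mean(y_true[mask] != y_pred[mask])    # equal error rates compare these
    predicted_pos = mask & (y_pred == 1)
    precision = y_true[predicted_pos].mean()       # predictive parity compares these
    return positive_rate, err, precision

print(f"overall: accuracy {accuracy:.2f}, error rate {error_rate:.2f}")
for g in ("A", "B"):
    rate, err, prec = group_report(g)
    print(f"group {g}: positive rate {rate:.2f}, error rate {err:.2f}, precision {prec:.2f}")
```

With these made-up numbers the two groups have equal error rates but different positive prediction rates and different precision, which illustrates why the choice of fairness measure matters.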

Institutions also face legal and ethical obligations to explain the basis of their consequential decisions to those who are affected, to regulators, and to the general public.  The idea is that people have rights, grounded in autonomy and dignity, to understand why institutions make the decisions they do. When predictive models are the sole or a strong component of consequential decisions, this general obligation to explain decisions becomes an obligation to convey information about the workings of the models used. This duty to disclose is in addition to any required standards for measures of accuracy and fairness.

Some requirements for explanation are aimed at providing people with an understanding of specific decisions, rather than an understanding of the logic of the model that produced the decisions.  Thus, consumers who were rejected for credit could be told that this decision was based on the length of employment, the length of residency, and the excessive nature of their obligations in relation to their income. Consumers can infer from this that these factors were elements in the predictive model used, but they are not told exactly how these factors contributed.

The Equal Credit Opportunity Act (ECOA) imposes one such legal obligation. ECOA requires that “each applicant against whom adverse action is taken shall be entitled to a statement of reasons for such action from the creditor. . . . A statement of reasons meets the requirements of this section only if it contains the specific reasons for the adverse action taken.”

Another legal obligation is part of the Fair Credit Reporting Act (FCRA). FCRA requires consumer reporting agencies to disclose “all of the key factors that adversely affected the credit score of the consumer in the model used, the total number of which shall not exceed 4.” In cases of an adverse action based on information in a credit report, FCRA requires disclosure of these key factors.
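To illustrate how such “key factors” might be derived, here is a hedged sketch assuming a simple linear scoring model.  The factor names, weights, and reference values are invented for the example and do not reflect any actual creditor’s or bureau’s methodology; the sketch ranks the factors that pulled an applicant’s score down the most and reports no more than four of them, mirroring the FCRA cap.

```python
# Hypothetical illustration only: a toy linear score with invented factors,
# not any actual creditor's or bureau's reason-code methodology.
FACTORS = {
    "length of employment":           {"weight": 12.0,   "reference": 5.0},
    "length of residency":            {"weight": 8.0,    "reference": 4.0},
    "obligations relative to income": {"weight": -150.0, "reference": 0.30},
    "number of late payments":        {"weight": -20.0,  "reference": 0.0},
    "age of accounts (years)":        {"weight": 6.0,    "reference": 7.0},
}

def key_factors(applicant, max_factors=4):
    """Rank factors by how much they lowered the score relative to a reference
    applicant, and return at most `max_factors` of them (the FCRA cap)."""
    contributions = []
    for name, spec in FACTORS.items():
        delta = spec["weight"] * (applicant[name] - spec["reference"])
        if delta < 0:                                  # only factors that hurt the score
            contributions.append((name, delta))
    contributions.sort(key=lambda item: item[1])       # most damaging first
    return [name for name, _ in contributions[:max_factors]]

applicant = {
    "length of employment": 1.0,
    "length of residency": 0.5,
    "obligations relative to income": 0.55,
    "number of late payments": 3,
    "age of accounts (years)": 2.0,
}
print(key_factors(applicant))
```

The point of the sketch is only that a ranked, truncated list of adverse factors can be produced without revealing the model’s weights or formula, which is exactly the kind of disclosure these statutes contemplate.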

Both ECOA and FCRA require explanations of specific outcomes of automated decisions. Regulation B implements ECOA and provides guidance for FCRA disclosures as well.  An appendix to Regulation B includes a Sample Notice of Action Taken and Statement of Reasons, with a list of 24 factors that can be used to satisfy these disclosure obligations.

In addition, credit bureaus and score developers have provided information about the factors used in developing credit scores.  Experian, for instance, says that total debt, the types of accounts, the number of late payments, and the age of accounts are factors that affect credit scores. FICO indicates that 35% of your credit score is derived from payment history, 30% from amounts owed, and 15% from the length of credit history.
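Those percentages describe category weights, not a published formula, but a toy weighted combination shows how such weights might work in principle.  The sub-scores, the “other categories” bucket, and the mapping onto a 300–850 range below are illustrative assumptions, not FICO’s actual model.

```python
# Illustrative only: combines hypothetical 0-100 category sub-scores using the
# publicly cited FICO category weights (the remaining 20% is lumped together here).
# This is not FICO's actual scoring formula.
CATEGORY_WEIGHTS = {
    "payment history":          0.35,
    "amounts owed":             0.30,
    "length of credit history": 0.15,
    "other categories":         0.20,   # balance of the breakdown, not cited above
}

def composite_score(subscores, lo=300, hi=850):
    """Map a weighted average of 0-100 sub-scores onto a 300-850 style range."""
    weighted = sum(CATEGORY_WEIGHTS[c] * subscores[c] for c in CATEGORY_WEIGHTS)
    return round(lo + (hi - lo) * weighted / 100.0)

subscores = {
    "payment history": 90,
    "amounts owed": 70,
    "length of credit history": 50,
    "other categories": 60,
}
print(composite_score(subscores))   # 696 with these made-up inputs
```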

There is no general obligation to disclose the actual formula used in predictive models.  Doing so would weaken the incentive to develop accurate, error-free models and would often frustrate the purpose of developing the models in the first place.  Disclosing the formula for detecting terrorist financing, for instance, would give terrorists a roadmap to avoid detection.

Still, there are some narrow circumstances where current law might require the disclosure of the formula used to make a consequential decision about people.  In a case involving the Houston school district and the SAS Educational Value-Added Assessment System (EVAAS), the court ruled that a dismissed teacher had a due process right to verify the score that was used as a basis for dismissal, which would not be possible for a proprietary system like the SAS EVAAS.  If this decision stands, school districts might not be able to dismiss teachers with a constitutionally protected property right in their jobs (such as a long-term contract) based on a proprietary algorithm that evaluates their effectiveness.

The European Union’s General Data Protection Regulation (GDPR), which will go into effect in May 2018, would impose an expanded obligation to explain automated decisions.  It would affect a much larger range of applications than is customary or expected in the United States.  The GDPR would apply whenever a decision has “a legal or similarly significant effect on someone.”  In addition, it would require information about the logic of predictive models, not just an understanding of the factors at play in specific decisions.

Some have argued that there is nothing new in GDPR because the previous Data Protection Directive already contained a requirement for explanation and it was never implemented in any specific way. But there is something new here. Article 15 of GDPR grants data subjects access to “meaningful information about the logic involved” in “automated decision-making.” This goes beyond the text of Article 12 of the earlier Data Protection Directive, which called only for a right of access to “knowledge of the logic involved” in automated decisions.

What this means in practice is not yet clear. But the UK’s Information Commissioner’s Office (ICO) has interpreted GDPR as requiring explanations in a wide range of circumstances, including assessments of an individual with respect to:

  • performance at work;
  • economic situation;
  • health;
  • personal preferences;
  • reliability;
  • behaviour;
  • location; or
  • movements.

ICO says that models used for these assessments have to “ensure processing is fair and transparent by providing meaningful information about the logic involved, as well as the significance and the envisaged consequences.”  Beyond explanations, ICO is looking for predictive models to “use appropriate mathematical or statistical procedure” and “implement appropriate technical and organisational measures to enable inaccuracies to be corrected and minimise the risk of errors.”

Institutions using predictive models will need to assess their systems to ensure that their processes and procedures provide these protections, not only as a matter of legal obligation, but also as a matter of ethical data practice.

Mark MacCarthy, Senior Vice President, Public Policy at SIIA, directs SIIA’s public policy initiatives in the areas of intellectual property enforcement, information privacy, cybersecurity, cloud computing and the promotion of educational technology. Follow Mark on Twitter at @Mark_MacCarthy.