Confounders: machine learning's blindspot
The EU recently published a proposal for wide-ranging new legislation on the use of AI in the union. causaLens has provided expert commentary on the incoming regulation for several major international news outlets. Below are answers to a number of frequently asked questions.
The EU published a draft proposal for its new legislation that aims to tackle the usage of artificial intelligence and machine learning models within the union. Any company that deploys models that are considered high risk will be held accountable. Companies breaching the rules face fines of up to 6% of their global turnover or €30 million ($36 million), whichever is the higher figure.
The definition of “high risk” in the proposal is somewhat nebulous, which has attracted criticism from some quarters. The proposal identifies as high risk any AI system that could have catastrophic consequences for the health and safety of citizens, as well as AI models that can impinge on individuals’ rights or significantly alter the course of their lives. Examples include AI deployed in machinery or transportation, personal identification models, HR AI systems, credit scoring algorithms, and AI for law enforcement.
In financial services, we expect any model used within the retail business to be subject to these regulations, including:
First, you need to determine whether your models should be classified as high risk. This is not a straightforward exercise. Model owners should err on the side of caution, given the substantial penalties that may apply for lack of disclosure. If you are still unsure, please don’t hesitate to contact us; we can help with the assessment on a case-by-case basis.
The following steps should be taken, as a minimum:
The regulation is unclear. We strongly advise businesses to stay on the safe side and not rely on SHAP values alone, since doing so might expose them to large fines. (See the question below.)
Fines of up to €30 million ($36 million) or 6% of global turnover, whichever is greater. There are also less tangible but perhaps equally significant reputational risks to consider.
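The "whichever is greater" rule can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function name `max_fine_eur` is hypothetical, and the two figures are simply the ones quoted in this article from the draft proposal.

```python
# Figures as quoted in this article from the draft proposal (assumption:
# they apply as a simple "greater of the two" cap, as the text describes).
FLAT_CAP_EUR = 30_000_000   # EUR 30 million
TURNOVER_PERCENT = 6        # 6% of global turnover


def max_fine_eur(global_turnover_eur: float) -> float:
    """Return the upper bound on a fine for a given global turnover."""
    turnover_based = global_turnover_eur * TURNOVER_PERCENT / 100
    return max(FLAT_CAP_EUR, turnover_based)


# A firm with EUR 100 million turnover is capped by the flat amount;
# a firm with EUR 1 billion turnover is capped by the percentage.
small_firm_cap = max_fine_eur(100_000_000)
large_firm_cap = max_fine_eur(1_000_000_000)
```

Under these figures the two caps coincide at a global turnover of €500 million; above that, the percentage-based cap is the binding one.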
The proposed regulations are extremely wide-ranging and are likely to impact many AI applications.
Here is a simple example: a bank that issues mortgages develops an AI tool to decide on the rate to charge its customers. The tool takes a lot of data as input, including the client's zip code and financial information. While ethnicity isn’t necessarily used as an input, the model may end up being judged discriminatory because zip code, and hence neighborhood, acts as a proxy for race.
There are many similar documented examples of models in production that, purely through lack of oversight, end up posing ethical risks or not behaving as intended.
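The mortgage example above can be reproduced on synthetic data. The sketch below is a toy illustration, not a real pricing model: the groups, zip codes, surcharge and the 90% residential split are all made-up assumptions, and the pricing function deliberately never reads the protected attribute.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable


def make_applicant():
    """Synthetic applicant. Group is never a model input, but it is
    strongly associated with zip code (assumed 90/10 split)."""
    group = random.choice(["A", "B"])
    home_zip, other_zip = ("10001", "20002") if group == "A" else ("20002", "10001")
    zip_code = home_zip if random.random() < 0.9 else other_zip
    income = random.gauss(60_000, 10_000)  # same distribution for both groups
    return {"group": group, "zip": zip_code, "income": income}


def quoted_rate(applicant):
    """Toy pricing model: uses only zip code and income, never group."""
    base = 3.0
    zip_surcharge = 0.5 if applicant["zip"] == "20002" else 0.0
    income_discount = 0.2 if applicant["income"] > 60_000 else 0.0
    return base + zip_surcharge - income_discount


applicants = [make_applicant() for _ in range(10_000)]


def mean_rate(group):
    rates = [quoted_rate(a) for a in applicants if a["group"] == group]
    return sum(rates) / len(rates)


# Average rate gap between groups, even though group is not an input.
gap = mean_rate("B") - mean_rate("A")
print(f"Mean rate gap (B - A): {gap:.2f} percentage points")
```

Even though both groups have the same income distribution and the model never sees group membership, group B is quoted a noticeably higher average rate, purely because zip code carries the group information.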
It depends how you approach the requirements.
It’s likely that your existing data science team is not set up to meet all the requirements. You need to be able to explain all your models in deployment, at length and coherently, to government workers. You need a database of all models used by the company. You need stress tests, unit tests and explainability reports, and you need to know where and how your models are used. Meeting these requirements with your existing staff and technology is likely to be very expensive, time-consuming and difficult.
For companies operating in financial services, these regulations are expected to be more restrictive than the usual SR 11-7 requirements, as they cover elements that are specific to ML models.
The liability lies with the organization using the software. Even if a company buys software from a third-party provider, it will be legally and financially responsible.
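To give a sense of what a "database of all models" involves in practice, here is a minimal sketch of a model inventory with a gap check. Everything in it is hypothetical: the `ModelRecord` fields, the artifact names and the example models are assumptions chosen to mirror the requirements listed above, not a format prescribed by the regulation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Artifacts the paragraph above says you need for each model
# (names are illustrative, not taken from the regulation).
REQUIRED_ARTIFACTS = ("stress_test", "unit_tests", "explainability_report")


@dataclass
class ModelRecord:
    name: str
    owner: str
    used_in: List[str]                       # where the model is deployed
    artifacts: Set[str] = field(default_factory=set)

    def missing_artifacts(self) -> List[str]:
        return [a for a in REQUIRED_ARTIFACTS if a not in self.artifacts]


def audit(inventory: List[ModelRecord]) -> Dict[str, List[str]]:
    """Return, per model, the items that are still missing."""
    gaps = {}
    for record in inventory:
        problems = record.missing_artifacts()
        if not record.used_in:
            problems.append("deployment_locations")
        if problems:
            gaps[record.name] = problems
    return gaps


inventory = [
    ModelRecord("mortgage_pricing_v2", "retail-credit", ["mortgage origination"],
                {"stress_test", "unit_tests", "explainability_report"}),
    ModelRecord("churn_score_v1", "marketing", [], {"unit_tests"}),
]

gaps = audit(inventory)
```

Here the first model passes the check, while the second is flagged for a missing stress test, a missing explainability report and no recorded deployment location.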
Even the simplest models seem to fall within the scope of the regulations. See the extract below:
ARTIFICIAL INTELLIGENCE TECHNIQUES AND APPROACHES
a. Machine learning approaches, including supervised, unsupervised and reinforcement learning, using a wide variety of methods including deep learning;
b. Logic- and knowledge-based approaches, including knowledge representation, inductive (logic) programming, knowledge bases, inference/deductive engines, (symbolic) reasoning and expert systems;
c. Statistical approaches, Bayesian estimation, search and optimization methods.
As with GDPR, if your model impacts EU markets or citizens then you fall within the scope of the legislation.
Causal AI is truly explainable. Its decisions and predictions can be explained to regulators and other stakeholders in language they can understand. Unlike pseudo-explainable SHAP methods, causal models can be pre-loaded with a set of core assumptions and principles that cannot be violated in deployment. This makes them less vulnerable to adversarial attacks and unintended failure modes. Causal models also enable model owners to eliminate implicit biases, whereas conventional models exacerbate biases. Discrimination is ultimately causal, not correlational, and it takes a causal model to remedy.
Checking whether a model discriminates on the basis of gender takes more than switching the “gender” field from Male to Female. We also need to understand that gender is causally related to other variables that serve as inputs to the model. Understanding the whole landscape of causal relationships is key to addressing biases in ML models.
In addition, the Causal AI Enterprise Platform has a built-in model governance framework. It includes a model library with audit tools and can automatically stress test models and generate unit tests and explainability reports. Its deployment framework keeps all models in one place, so they can be accounted for and monitored live in production.
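The difference between flipping a field and reasoning causally can be shown with a toy structural causal model. The sketch below is an illustration under stated assumptions, not the platform's method: the single causal link (gender affects income), the effect size and all function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass, replace

# Hypothetical structural equation: income = BASE + effect(gender) + noise.
# The effect sizes are made-up numbers for illustration only.
BASE_INCOME = 40_000.0
GENDER_EFFECT_ON_INCOME = {"Male": 5_000.0, "Female": 0.0}


@dataclass(frozen=True)
class Applicant:
    gender: str
    income: float


def structural_income(gender: str, noise: float) -> float:
    return BASE_INCOME + GENDER_EFFECT_ON_INCOME[gender] + noise


def credit_score(a: Applicant) -> float:
    """Toy model under test: it never reads the gender field."""
    return a.income / 1_000.0


def naive_flip(a: Applicant, new_gender: str) -> Applicant:
    """Switch the gender field and leave every other input untouched."""
    return replace(a, gender=new_gender)


def counterfactual_flip(a: Applicant, new_gender: str) -> Applicant:
    """Switch gender and propagate the change to downstream variables."""
    # Abduction: recover this individual's noise term from observed data.
    noise = a.income - BASE_INCOME - GENDER_EFFECT_ON_INCOME[a.gender]
    # Action and prediction: set the new gender, recompute income.
    return Applicant(new_gender, structural_income(new_gender, noise))


applicant = Applicant("Male", structural_income("Male", noise=2_000.0))

naive_delta = credit_score(naive_flip(applicant, "Female")) - credit_score(applicant)
causal_delta = credit_score(counterfactual_flip(applicant, "Female")) - credit_score(applicant)
```

In this toy setup the naive test reports no change in score, so the model looks fair, while the counterfactual test shows the score dropping because the effect of gender reaches the model through income.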
Contact us to learn how you can future-proof your business.