When AI Goes Wrong: An Algorithm Wronged Thousands of Families in the Netherlands



At the beginning of 2021, the entire Dutch government resigned over a scandal in which thousands of families were falsely accused of child benefit fraud. The families were wrongly accused because of a flaw in the artificial intelligence used in the allocation software. This is probably the only case, at least so far, in which artificial intelligence has played such a pivotal role in the fall of a government…

Anyone seeking child benefit in the Netherlands must submit a claim to the Dutch tax authorities. Since 2013, these claims have been reviewed by an algorithm that not only gives each one a risk score but also uses the submitted documents to train its model further. Based on what it “learned” from earlier documents, the algorithm was supposed to assess whether there was a risk of fraud. Employees would then double-check the claims flagged as risky. In reality, the algorithm developed “biases” and incorrectly flagged claims as fraudulent. The tax officials who were supposed to review these cases did not really check them (they probably trusted the technology more than they should have) and simply upheld the algorithm's decision. This opens up a good topic for debate: who is ultimately at fault? Is it the AI that misjudged the claims, or the humans who were lazy or trusted the technology so blindly that they didn’t bother to recheck its output?

Regardless, as a direct result of this incident, many families received demands to repay the funds they had received as child benefit. Claims were marked as high risk for reasons such as a missing signature or dual citizenship. The number of affected families is estimated at around 26,000. Some victims had to repay more than 100,000 euros, according to the BBC.

“An analysis of the issues found evidence of bias.” Many of the victims had low incomes, and a disproportionate number belonged to an ethnic minority or had an immigrant background. The model treated an applicant not being a Dutch citizen as an increased risk factor, IEEE Spectrum reported.

The simplest way to avoid this problem is to open up the algorithm and make it transparent. The algorithm used for these decisions was completely closed, and there was no explanation of how it worked. In addition, the people who had to make the final decision were not aware of the mistakes the algorithm could make and simply approved whatever the AI decided.

This proved to be a valuable lesson for the Netherlands: in the future, all artificial intelligence used in the public sector will be under the control of the state data agency.

A wrongly trained AI model can cause an abundance of headaches 

The danger of “biased” algorithm training is not new. As early as 1988, the United Kingdom’s Commission for Racial Equality found discrimination against applicants to a British medical school. The software that selected which candidates would be called for an interview had a built-in bias against women and applicants with non-European names.

Even when the algorithm itself is free of error, training it on data embedded with human biases is a serious danger. In the United States, artificial intelligence that was supposed to assess the risk of a future offense incorrectly flagged African Americans as high risk roughly twice as often as whites. These risk scores are used at various stages of the US legal system, from decisions on pardons to setting the amount of bail. Judges in Arizona, Colorado, Delaware, Kentucky, Louisiana, Oklahoma, Virginia, Washington, and Wisconsin receive this information during sentencing.

Another problem with using artificial intelligence is accuracy. A ProPublica investigation shows that Northpointe’s algorithm, which is widely used in the US, is terribly unreliable. For more than 7,000 people arrested in Broward County, Florida, its predictions of repeat serious crime were correct only 20% of the time. At best, if minor crimes are included, the algorithm is right 61% of the time. It makes mistakes with both whites and African Americans. However, it more often wrongly marks African Americans as high-risk for recidivism, and more often overlooks whites who go on to reoffend. Interestingly, none of the 137 questions on which the model is built refers to skin color. The computer draws its conclusions from other parameters, such as: “How many convicted members are there in the family?” or “Did you often fight at school?”.
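To make the difference between overall accuracy and group-level error more concrete, here is a minimal Python sketch. It assumes each case is recorded as a tuple of group, predicted high risk, and whether the person actually reoffended. The function name `error_rates_by_group` and the toy records are invented for illustration; they are not ProPublica's data or Northpointe's method.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return false positive and false negative rates per group.

    Each record is a tuple: (group, predicted_high_risk, reoffended).
    This record layout is an assumption made for the example.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, predicted, actual in records:
        if predicted:
            counts[group]["tp" if actual else "fp"] += 1
        else:
            counts[group]["fn" if actual else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]  # people who did not reoffend
        positives = c["tp"] + c["fn"]  # people who did reoffend
        rates[group] = {
            # Share of non-reoffenders wrongly marked as high risk
            "false_positive_rate": c["fp"] / negatives if negatives else None,
            # Share of reoffenders the model overlooked
            "false_negative_rate": c["fn"] / positives if positives else None,
        }
    return rates


# Invented toy data, purely illustrative
sample = (
    [("A", True, False)] * 2 + [("A", False, False)] * 2
    + [("A", True, True)] + [("A", False, True)]
    + [("B", True, False)] + [("B", False, False)] * 3
    + [("B", True, True)] + [("B", False, True)]
)
rates = error_rates_by_group(sample)
for group, r in sorted(rates.items()):
    print(group, r)
```

In this toy data, group A's false positive rate is twice that of group B even though the group label is never an input to any prediction. That is the kind of gap an audit would look for.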

Errors creep into algorithms in a number of ways. One is training on inappropriate datasets: the artificial intelligence learns from data that contains biased human decisions. Another source of error is training on datasets in which some groups are not adequately represented. As a consequence, some face recognition technologies fail to recognize the faces of people from communities that are missing from the dataset.
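A simple first check for the representation problem is to compare each group's share of the training data with its share of the population the system will serve. The sketch below assumes the group labels and reference shares are known. The function name `underrepresented_groups`, the `tolerance` threshold, and the numbers are hypothetical choices for illustration, not an established standard.

```python
from collections import Counter


def underrepresented_groups(labels, reference_shares, tolerance=0.5):
    """Flag groups whose share of the dataset falls well below a reference.

    labels: one group label per training example.
    reference_shares: expected share of each group (assumed known).
    tolerance: a group is flagged if its observed share is less than
        tolerance * expected share (an arbitrary illustrative cutoff).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    flagged = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if observed < expected * tolerance:
            flagged[group] = (observed, expected)
    return flagged


# Invented example: group "b" is 40% of the population but 10% of the data
labels = ["a"] * 90 + ["b"] * 10
flagged = underrepresented_groups(labels, {"a": 0.6, "b": 0.4})
print(flagged)
```

A check like this only catches missing data. It says nothing about whether the labels themselves encode biased human decisions, which is the first problem described above.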
