The first study to compare the performance of trained forensic facial examiners, super-recognisers (people with a natural talent for face identification) and facial-recognition computer algorithms has revealed that a combination of human and computer decision-making is most accurate.
The study, by a team of scientists from the National Institute of Standards and Technology in the US and three universities including UNSW Sydney, is published in the Proceedings of the National Academy of Sciences.
“Experts in face identification often play a crucial role in criminal cases,” says study team member, UNSW psychologist Dr David White.
“Deciding whether two images are of the same person, or two different people, can have profound consequences.
“When facial comparison evidence is presented in court, it can determine the outcome of a criminal trial. Errors in these decisions can potentially set a guilty person free, or lead to the wrongful conviction of an innocent person,” he says.
The international study involved a total of 184 participants from five continents - a large number for an experiment of this type.
Eighty-seven were trained professional facial examiners, while 13 were super-recognisers - people with exceptional natural ability, but no training. The remaining 84 were control participants with no special training or natural ability, including 53 fingerprint examiners and 31 undergraduate students.
Participants were shown pairs of face images and rated, on a seven-point scale, how likely each pair was to show the same person. The research team intentionally selected extremely challenging pairs, using images taken with limited control of illumination, expression and appearance.
The researchers then tested four of the latest computerised facial-recognition algorithms, all developed between 2015 and 2017, using the same image pairs.
“As a group, trained forensic examiners outperformed the other groups,” says Dr White.
“Another important insight from the study was that the most advanced facial-recognition algorithms are now as accurate as the very best humans.
“However, the results with people showed large variation in accuracy of individuals in all the groups tested. This ranged from near random guessing, with an accuracy of about 50%, to a perfect score of 100%.
“This variability is a problem, because it is common practice for just one examiner to present face identification decisions in court,” says Dr White.
The study found that combining several examiners’ opinions produced higher accuracy than one examiner working alone, and led to less variability in accuracy compared to individual responses.
“But the surprising best solution to the problem of individual variability is to combine the responses from one examiner with the responses from the best algorithm. A combination of human and computer decision-making leads to the most accurate results,” Dr White says.
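The idea of combining judgements can be illustrated with a minimal sketch. It is not the study's method: it assumes a simple averaging rule, a seven-point scale coded 1 to 7, and an algorithm score that has already been rescaled to that same scale. The function names and all ratings are hypothetical and invented for illustration only.

```python
def fuse_ratings(ratings_by_rater):
    """Average the ratings for each image pair across raters.

    ratings_by_rater holds one list per rater (human or algorithm),
    each giving that rater's rating for every image pair in order.
    Plain averaging is an assumption made for this sketch.
    """
    n_pairs = len(ratings_by_rater[0])
    if any(len(r) != n_pairs for r in ratings_by_rater):
        raise ValueError("every rater must rate the same image pairs")
    return [sum(pair) / len(pair) for pair in zip(*ratings_by_rater)]


def accuracy(ratings, same_person, midpoint=4.0):
    """Share of pairs called correctly.

    A rating above the midpoint counts as a "same person" call;
    a rating at or below it counts as "different people".
    """
    correct = [(r > midpoint) == truth for r, truth in zip(ratings, same_person)]
    return sum(correct) / len(correct)


# Invented data: ground truth for six image pairs and two raters' ratings.
same_person = [True, True, False, False, True, False]
examiner = [6, 3, 2, 5, 7, 1]   # hypothetical human examiner
algorithm = [5, 6, 3, 2, 3, 2]  # hypothetical algorithm, rescaled to 1-7

fused = fuse_ratings([examiner, algorithm])

print("examiner :", accuracy(examiner, same_person))
print("algorithm:", accuracy(algorithm, same_person))
print("fused    :", accuracy(fused, same_person))
```

In this made-up data the examiner and the algorithm each misjudge different pairs, so the averaged ratings call every pair correctly. That only shows why fusion can help when errors do not overlap; it says nothing about the size of the effect reported in the study.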
The study team also included researchers from the University of Texas and the University of Maryland.