AI can learn to say “I don’t know” – and that would be a great advance

In high-risk areas, such as medicine, law or engineering – or even in everyday situations – it is often safer to admit “I don’t know” than to give a wrong answer.

However, many artificial intelligence models still prefer to risk an answer even when they lack the confidence to back it up.

To meet this challenge, computer scientists at Johns Hopkins University have developed a new method that lets AI models spend more time reasoning and use a confidence score to decide when to refrain from answering.

New system uses confidence levels and penalties to teach models to withhold risky answers in sensitive contexts – Image: Suri Studio/Shutterstock

How the study was done

  • The research, published in the arXiv repository and due to be presented at the 63rd Annual Meeting of the Association for Computational Linguistics, shows that longer chains of reasoning help models respond more accurately – but only up to a point.
  • Even with more processing time, errors persist when there is no penalty attached to incorrect answers.
  • The team tested different risk scenarios: exam-style grading (no penalty for wrong answers), Jeopardy!-style grading (equal rewards and penalties) and critical contexts (wrong answers penalized more severely).
  • They found that, under stricter rules, models are better off declining to answer when they lack sufficient confidence after working through the problem.
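The risk scenarios above can be sketched as a simple expected-score rule. This is a hypothetical illustration, not the authors' actual method: it assumes an answer scores +1 if correct and -penalty if wrong, while abstaining scores 0, so the model should answer only when its confidence makes the expected score positive.

```python
# Hypothetical sketch of a risk-aware answering policy (not the study's code).
# Scoring assumption: correct answer = +1, wrong answer = -penalty, abstain = 0.

def should_answer(confidence: float, penalty: float) -> bool:
    """Answer only if the expected score p*1 + (1-p)*(-penalty) beats abstaining (0)."""
    expected_score = confidence * 1.0 + (1.0 - confidence) * (-penalty)
    return expected_score > 0.0

# Exam-style grading: no penalty for wrong answers, so guessing always pays.
print(should_answer(0.30, penalty=0.0))   # True
# Jeopardy!-style grading: equal reward and penalty, so answer only above 50%.
print(should_answer(0.30, penalty=1.0))   # False
# Critical context: a 9x penalty demands more than 90% confidence.
print(should_answer(0.85, penalty=9.0))   # False
```

Under this rule the confidence threshold is penalty / (1 + penalty), which is why harsher penalties push the model toward saying "I don't know" more often.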

An AI that admits it doesn’t know can prevent greater harm

Although abstaining can frustrate users in everyday situations, it is essential in contexts where a wrong answer can have serious consequences.

The researchers now encourage the AI community to adopt metrics that account for the cost of errors, promoting the development of models that are safer, more transparent and aware of their own limitations.

(Source: Olhar Digital)