Knowledge Nugget

What does trustworthiness mean?
Author: Process Fellows
The trustworthiness of ML models encompasses criteria that enable stakeholders to assess whether the model aligns with their expectations. Trustworthiness is established by criteria such as robustness, predictability, explainability, controllability, generalization, bias, and fairness, among others.
  • AI controllability means the ability of an external agent to control the AI, its output, or the behaviour of the item influenced by the AI output, in order to prevent harm.
  • AI explainability means the property of an AI system to express the important factors influencing its outputs in a way that humans can understand.
  • AI predictability means the ability of an AI system to produce trusted predictions, i.e. the predictions are accurate and supported by statistical evidence.
  • AI generalization means the ability of an AI model to adapt and perform well on previously unseen data during inference.
  • AI robustness means the ability to maintain an acceptable level of performance even when the input is "imperfect", e.g. image data that is partially corrupted or affected by significant sensor noise.
  • AI bias means that an AI model or dataset can be systematically prejudiced towards some (potentially erroneous) assumption. Such an assumption stems from the inherent statistical distributions (e.g. over classes) in a dataset, which a model can learn.
  • AI fairness: if the model bias is linked to a difference in treatment of certain subgroups of humans (e.g. ethnic minorities, age groups, or sexes), the model is considered unfair. AI fairness is the reasonable absence of such unfairness.
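Two of the criteria above, robustness and fairness, can be made concrete with simple measurements. The following is a minimal sketch using synthetic data and a toy threshold "model" (both assumptions for illustration only, not a real trained classifier): robustness is probed by comparing accuracy on clean versus noise-corrupted inputs, and fairness by the demographic parity gap, i.e. the difference in positive-prediction rates between two subgroups.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier (assumption for illustration):
# predicts class 1 when the first feature exceeds 0.5.
def model(X):
    return (X[:, 0] > 0.5).astype(int)

# Synthetic data: 1000 samples, 2 features. The second feature is used
# here as a binary subgroup attribute (e.g. two demographic groups).
X = rng.random((1000, 2))
group = (X[:, 1] > 0.5).astype(int)
y_true = (X[:, 0] + 0.1 * rng.standard_normal(1000) > 0.5).astype(int)

def accuracy(pred, y):
    return float(np.mean(pred == y))

# Robustness sketch: how much does accuracy degrade under input noise?
clean_acc = accuracy(model(X), y_true)
X_noisy = X + 0.05 * rng.standard_normal(X.shape)  # simulated sensor noise
noisy_acc = accuracy(model(X_noisy), y_true)

# Fairness sketch: demographic parity gap between the two subgroups.
pred = model(X)
rate_g0 = float(pred[group == 0].mean())
rate_g1 = float(pred[group == 1].mean())
parity_gap = abs(rate_g0 - rate_g1)

print(f"clean accuracy: {clean_acc:.3f}")
print(f"noisy accuracy: {noisy_acc:.3f}")
print(f"parity gap:     {parity_gap:.3f}")
```

In a real assessment these checks would run against the actual model and data, with domain-specific noise models and subgroup definitions; the point here is only that robustness and fairness are measurable properties, not just qualitative labels.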
Mapped with these items:
  • Automotive SPICE 4.0
    • MLE.2.BP3 Analyze ML architectural elements.