Explainable AI explained


While machine learning and deep learning models often produce good classifications and predictions, they are rarely perfect. Almost every model produces some percentage of false positive and false negative predictions. That’s sometimes acceptable, but it matters a great deal when the stakes are high. For example, a drone weapons system that falsely identifies a school as a terrorist base could inadvertently kill innocent children and teachers unless a human operator overrides the decision to attack.

The operator needs to know why the AI classified the school as a target and the uncertainties of the decision before allowing or overriding the attack. There have certainly been cases where terrorists used schools, hospitals, and religious centers as bases for missile attacks. Was this school one of those? Is there intelligence or a recent observation that identifies the school as currently occupied by such terrorists? Are there reports or observations that establish that no students or teachers are present in the school?

If there are no such explanations, the model is essentially a black box, and that’s a huge problem. For any AI decision with significant consequences, whether life-and-death, financial, or regulatory, it is important to be able to clarify what factors went into the model’s decision.

What is explainable AI?

Explainable AI (XAI), also called interpretable AI, refers to machine learning and deep learning methods that can explain their decisions in a way that humans can understand. The hope is that XAI will eventually become just as accurate as black-box models.

Explainability can be ante-hoc (directly interpretable white-box models) or post-hoc (techniques to explain a previously trained model or its predictions). Ante-hoc models include explainable neural networks (xNNs), explainable boosting machines (EBMs), supersparse linear integer models (SLIMs), the reverse time attention model (RETAIN), and Bayesian deep learning (BDL).
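
To make the ante-hoc idea concrete, here is a minimal sketch that trains an explainable boosting machine with the open-source interpret package and asks it for global and local explanations. The dataset, train/test split, and default parameters are illustrative assumptions, not specifics from the article.

```python
# Ante-hoc (glass-box) example: an explainable boosting machine.
# Assumes the open-source "interpret" package and scikit-learn are installed;
# the dataset and split are chosen only for illustration.
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# EBMs are additive models built from shaped per-feature functions,
# so the learned contributions can be inspected directly.
ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

# Global explanation: how each feature shapes predictions overall
show(ebm.explain_global())

# Local explanation: why a handful of individual rows got their scores
show(ebm.explain_local(X_test[:5], y_test[:5]))
```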

Post-hoc explainability methods include local interpretable model-agnostic explanations (LIME) as well as local and global visualizations of model predictions such as accumulated local effects (ALE) plots, one-dimensional and two-dimensional partial dependence plots (PDPs), individual conditional expectation (ICE) plots, and decision tree surrogate models.
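
As a post-hoc illustration, the sketch below fits an opaque random forest and then draws one-dimensional partial dependence and ICE curves for it using scikit-learn's inspection module. The dataset, model, and chosen features are assumptions made for the example rather than anything prescribed by the article.

```python
# Post-hoc example: partial dependence and ICE plots for a black-box model.
# The dataset, model, and feature choices here are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt

# Train an opaque model on a standard dataset
data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# kind="both" overlays the averaged partial dependence curve (PDP)
# on the per-sample ICE lines for each selected feature.
PartialDependenceDisplay.from_estimator(
    model,
    X,
    features=[0, 7],  # indices of "mean radius" and "mean concave points"
    feature_names=data.feature_names,
    kind="both",
)
plt.tight_layout()
plt.show()
```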

Copyright © 2021 IDG Communications, Inc.


