IBM’s journey towards responsible AI adoption with AI Verify

IBM, the multinational technology giant headquartered in Armonk, New York, is renowned for its innovative technology solutions, including AI. One of IBM’s key offerings is Industry Accelerators, a comprehensive set of tools that enables users to analyse data, build models, and visualise results to tackle common business problems.

IBM is dedicated to developing and promoting trustworthy AI – AI that is transparent, explainable, fair, and secure. The company believes that ethical AI practices are essential for building public trust in AI systems. IBM also advocates for the adoption of industry-wide standards and best practices for AI development and deployment. As part of its ongoing commitment to ethical AI, joining the AI Verify international pilot was a natural step for IBM.

IBM piloted the AI Verify testing framework on a credit risk use case:

  • AI model: Binary classification model
  • Use case: Classification of customers based on their level of risk for lending. The model assigns a label of “Risk” or “No Risk” to each customer and is part of IBM’s Industry Accelerators. It was trained using logistic regression from scikit-learn.
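The use case above can be sketched as a minimal scikit-learn pipeline. The synthetic data, feature names, and labels below are illustrative assumptions for demonstration only, not IBM’s actual Industry Accelerators model.

```python
# Minimal sketch of a binary credit-risk classifier trained with
# scikit-learn logistic regression. Data and feature semantics are
# hypothetical, chosen only to illustrate the "Risk"/"No Risk" setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical features: e.g. income, debt ratio, credit history length
X = rng.normal(size=(1000, 3))
# Hypothetical label: 1 = "Risk", 0 = "No Risk"
y = (X[:, 1] - 0.5 * X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
# Map numeric predictions back to the business labels used in the report
labels = np.where(model.predict(X_test) == 1, "Risk", "No Risk")
```

A model of this shape (tabular features, binary output, scikit-learn estimator) is the kind of artefact fed into AI Verify for testing.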

IBM’s positive feedback on the testing process, and potential areas for enhancement:

  • The user manual and quick start guide were clearly written and well documented
  • It only took 10 to 15 minutes to generate the testing report for the above model

Based on IBM’s experience, some areas of the Minimum Viable Product (MVP) could benefit from crowd-sourced international efforts, as the science of such testing is nascent. In particular, for technical testing of robustness, explainability, and fairness:

  • Support for model types and algorithms is currently limited; this could be an area of potential contributions from the broader community
  • The AI Verify report could include more detailed information on the model type, features used, and algorithm used, for the benefit of a technical audience
  • Robustness: While the explanation of robustness was good, the metric was calculated using accuracy. Adding plugins such as CLEVER (Cross-Lipschitz Extreme Value for nEtwork Robustness) to AI Verify could provide additional metrics to measure robustness
  • Explainability was well implemented, with SHAP’s dot plot, force plot, and bar plot for both local and global explanations. However, users looking to investigate the outcome of a particular data point cannot select specific data points on which to run the local explanation

Examples of dot plot, force plot, and bar plot in the AI Verify MVP:

[Figures: dot plot, force plot, bar plot]
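The kind of per-datapoint local explanation IBM describes can be sketched without SHAP for a linear model, since each feature’s contribution to the log-odds is simply coefficient × feature value. The model, data, and feature names below are hypothetical assumptions; SHAP would produce analogous per-feature attributions for non-linear models.

```python
# Sketch: a local explanation for one user-selected data point.
# For logistic regression, contribution of feature j to the log-odds
# is coef_[j] * x[j]; summing them (plus the intercept) recovers the
# model's decision score for that point.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))           # hypothetical feature matrix
y = (X[:, 0] + X[:, 2] > 0).astype(int)  # hypothetical labels
model = LogisticRegression().fit(X, y)

def explain_point(model, X, index, feature_names):
    """Per-feature contribution to the log-odds for one data point."""
    contributions = model.coef_[0] * X[index]
    return dict(zip(feature_names, contributions))

# The user selects data point 42 to investigate its individual outcome.
expl = explain_point(model, X, 42, ["income", "debt_ratio", "history"])
```

Letting the user choose `index` is exactly the interaction IBM found missing: the ability to point at one record and see why the model scored it as it did.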
  • In addition to technical explanations, including business-level explanations such as counterfactual and contrastive explanations could add value to the report
  • Fairness: The fairness tree has been designed to help users select the most appropriate fairness metrics for their specific use case. However, selecting the right metrics is not always straightforward, so AI Verify offers a workflow to guide users through the selection process. While IBM finds the fairness tree concept intuitive and interesting, the current design does not allow technical users to change the final selection of metrics in the summary report. This presents an opportunity for potential contributions
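The metric choices the fairness tree guides users through can be sketched as simple group comparisons. The predictions, labels, and protected-group attribute below are hypothetical, chosen only to show that different metrics can disagree, which is why metric selection matters.

```python
# Sketch: two common fairness metrics a fairness tree might recommend,
# computed over hypothetical predictions and a hypothetical group attribute.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # hypothetical protected attribute

def demographic_parity_diff(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rates between the two groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(0) - tpr(1))

dp = demographic_parity_diff(y_pred, group)
eo = equal_opportunity_diff(y_true, y_pred, group)
```

On this toy data the two metrics disagree: the positive-prediction rates are identical across groups, yet the true-positive rates differ, illustrating why a guided selection workflow (and the ability to revise the final selection) is valuable.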

IBM’s commitment to trustworthy AI reflects its belief that AI has the potential to transform society in positive ways, but only if it is developed and deployed in a responsible and ethical manner.

AI Verify is a good start towards the implementation of AI governance. IBM will support this framework in its AI governance platform and continuous monitoring framework.

