
Rapid technical advances and the widespread adoption of artificial intelligence (AI)-based products and workflows are influencing many aspects of human and business activity in banking, healthcare, advertising, and beyond. While the accuracy of AI models is arguably the most important factor to consider when deploying AI-based products, there is also an urgent need to understand how AI can be designed to work responsibly.

Responsible AI is a framework that any organization developing software should adopt to build customer confidence in the transparency, accountability, fairness, and security of all deployed AI solutions. At the same time, a key aspect of making AI accountable is having a development pipeline that can foster reproducibility of results and manage lineage of data and ML models.

Low-code machine learning is gaining popularity with tools like PyCaret, H2O.ai and DataRobot, which allow data scientists to run pre-built pipelines for feature engineering, data cleaning, model development and comparison of statistical performance. However, what these packages often lack are modules for responsible AI that evaluate ML models for fairness, transparency, explainability, causality and more.

Here, we demonstrate a quick and easy way to integrate PyCaret with the Microsoft RAI (Responsible AI) framework to generate a detailed report showing error analysis, explainability, causality, and counterfactuals. The first part is a walkthrough for developers to show how a RAI dashboard can be built. The second part is a detailed evaluation of the RAI report.

Code Walkthrough

First, we install the necessary libraries. This can be done on your local machine with Python 3.6+ or on a SaaS platform like Google Colab.

!pip install raiwidgets
!pip install pycaret
!pip install --upgrade pandas
!pip install --upgrade numpy

The pandas and NumPy upgrades are needed at the moment, but this should be fixed shortly. Also, don’t forget to restart the runtime if you’re installing in Google Colab.

Then we load the data from GitHub, clean it, and perform feature engineering with PyCaret.

import pandas as pd, numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
csv_url = 'https://raw.githubusercontent.com/sahutkarsh/loan-prediction-analytics-vidhya/master/train.csv'
dataset_v1 = pd.read_csv(csv_url)
dataset_v1 = dataset_v1.dropna()
from pycaret.classification import *
clf_setup = setup(data=dataset_v1, target='Loan_Status',
    train_size=0.8, categorical_features=['Gender', 'Married', 'Education',
    'Self_Employed', 'Property_Area'],
    imputation_type='simple', categorical_imputation='mode',
    ignore_features=['Loan_ID'], fix_imbalance=True, silent=True, session_id=123)

The dataset simulates loan applications with applicant characteristics such as gender, marital status, employment and income. PyCaret has a nice feature that makes the training and test dataframes available after feature engineering via the get_config method. We use it to get the cleaned features that we’ll feed into the RAI widget later.

X_train = get_config(variable="X_train").reset_index().drop(['index'], axis=1)
y_train = get_config(variable="y_train").reset_index().drop(['index'], axis=1)['Loan_Status']
X_test = get_config(variable="X_test").reset_index().drop(['index'], axis=1)
y_test = get_config(variable="y_test").reset_index().drop(['index'], axis=1)['Loan_Status']
df_train = X_train.copy()
df_train['LABEL'] = y_train
df_test = X_test.copy()
df_test['LABEL'] = y_test

Now we run PyCaret to create several models and compare them on Recall as a statistical performance measure.

top5_results = compare_models(n_select=5, sort="Recall")
Figure 1 – PyCaret models compared on Recall

Our top model is a random forest classifier with a recall of 0.9, which we can plot here.

selected_model = top5_results[0]
plot_model(selected_model, plot='auc')
Figure 2 – AUC for ROC curves of selected model

Now we’ll write our 10 lines of code to build an RAI dashboard using the dataframes and models we generated from PyCaret.

cat_cols = ['Gender_Male', 'Married_Yes', 'Dependents_0', 'Dependents_1', 'Dependents_2', 'Dependents_3+', 'Education_Not Graduate', 'Self_Employed_Yes', 'Credit_History_1.0', 'Property_Area_Rural', 'Property_Area_Semiurban', 'Property_Area_Urban']
from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights

rai_insights = RAIInsights(selected_model, df_train, df_test, 'LABEL', 'classification',
    categorical_features=cat_cols)
rai_insights.explainer.add()
rai_insights.error_analysis.add()
rai_insights.causal.add(treatment_features=['Credit_History_1.0', 'Married_Yes'])
rai_insights.counterfactual.add(total_CFs=10, desired_class='opposite')

The code above, while quite minimalistic, does a lot under the hood. It creates RAI insights for classification and adds modules for explainability and error analysis. Next, a causal analysis is configured with two treatment features: credit history and marital status. In addition, a counterfactual analysis is configured for 10 scenarios. Now, let’s generate the dashboard.

rai_insights.compute()
ResponsibleAIDashboard(rai_insights)
The code above will start the dashboard on a port like 5000. On a local machine, you can directly go to http://localhost:5000 and see the dashboard. On Google Colab, you need to do a simple trick to see this dashboard.

from google.colab.output import eval_js
print(eval_js("google.colab.kernel.proxyPort(5000)"))

This will give you a URL to view the RAI dashboard. You can see some components of the RAI dashboard below. Here are some major results from this analysis that were automatically generated to complement the AutoML analysis performed by PyCaret.

Results: Responsible AI Report

Error analysis: We find that the error rate is high for the rural property area cohort, and that our model has a negative bias for this characteristic.
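The dashboard computes these cohort error rates automatically; as a minimal sketch of the underlying idea (toy data and predictions, not the loan dataset), a cohort’s error rate is just the fraction of mispredictions within that cohort:

```python
# Toy illustration (hypothetical data): per-cohort error rate,
# the metric that error analysis surfaces for each cohort.
rows = [
    # (property_area, actual, predicted)
    ("Rural", "Y", "N"), ("Rural", "N", "N"), ("Rural", "Y", "N"),
    ("Urban", "Y", "Y"), ("Urban", "N", "N"), ("Urban", "Y", "Y"),
]

def cohort_error_rate(rows, cohort):
    """Fraction of mispredicted rows within one cohort."""
    subset = [(actual, pred) for area, actual, pred in rows if area == cohort]
    return sum(1 for actual, pred in subset if actual != pred) / len(subset)

print(cohort_error_rate(rows, "Rural"))  # higher than the Urban cohort
```

A large gap between cohort error rates like this is exactly the signal the error analysis tree map is designed to expose.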

Global explainability – feature importance: We find that the feature importance ranking remains similar in both cohorts – all data (blue) and rural property area (orange). For the orange cohort, property area has a bigger impact, but credit history is still the #1 factor.

Local explainability: We see that credit history is also an important feature for an individual prediction – row #20.

Counterfactual analysis: We see that for the same row #20, an N to Y decision may be possible (based on the data) if the credit history and loan amount are changed.
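Conceptually, a counterfactual search asks which small feature changes would flip the model’s decision. A minimal sketch of that idea, using a hypothetical rule-based model with made-up thresholds (not the trained random forest):

```python
# Hypothetical toy model: approve only with good credit history and a modest loan
def toy_model(features):
    good = features["credit_history"] == 1 and features["loan_amount"] <= 150
    return "Y" if good else "N"

def find_counterfactual(features, candidate_changes):
    """Try one single-feature change at a time until the prediction flips."""
    original = toy_model(features)
    for name, new_value in candidate_changes.items():
        candidate = dict(features, **{name: new_value})
        if toy_model(candidate) != original:
            return name, candidate
    return None

row = {"credit_history": 1, "loan_amount": 200}  # predicted "N"
print(find_counterfactual(row, {"loan_amount": 120, "credit_history": 0}))
```

The RAI counterfactual module searches far more systematically (and returns multiple diverse candidates), but the output it presents is of this shape: a changed feature set that flips N to Y.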

Causal inference: We use causal analysis to investigate the impact of two treatments, credit history and marital status, and find that credit history has a greater direct impact on loan approval.
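The “direct impact” reported here is an estimated treatment effect. As a toy illustration of the idea only (fabricated outcome lists, not the article’s data, and ignoring the confounder adjustment the RAI causal module performs), a naive effect estimate is the difference in mean outcomes between treated and untreated groups:

```python
# Hypothetical approval outcomes (1 = approved), purely illustrative
approved_with_credit_history = [1, 1, 1, 0, 1]     # treatment group
approved_without_credit_history = [0, 1, 0, 0, 0]  # control group

def naive_treatment_effect(treated, control):
    """Difference in mean outcomes between treated and control groups."""
    return sum(treated) / len(treated) - sum(control) / len(control)

print(naive_treatment_effect(approved_with_credit_history,
                             approved_without_credit_history))
```

A positive value suggests the treatment raises the approval rate; the dashboard’s causal module estimates this kind of effect while also controlling for the other features.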

A responsible AI report showing model error analysis, explainability, causal inference and counterfactuals can add great value beyond the traditional statistical measures of precision and recall that we usually use to assess models. With modern tools like PyCaret and the RAI dashboard, it is easy to build these reports. They could be developed with other tools as well – the key is that data scientists evaluate their models along these responsible AI dimensions to ensure the models are ethical as well as accurate.

Dattaraj Rao is Chief Data Scientist at Persistent.


