Model Explainability
mlcli includes built-in model explainability tools to help you understand why your models make specific predictions. Use SHAP, LIME, and other techniques to build trust in your models.
Quick Start
Generate SHAP explanations for a trained model:
# Generate SHAP summary plot
mlcli explain models/rf_model.pkl \
--data data/test.csv \
--method shap \
--output explanations/
Available Explainers
SHAP
SHapley Additive exPlanations - game-theoretic approach to explain model predictions.
- Feature importance
- Dependency plots
- Force plots
- Summary plots
LIME
Local Interpretable Model-agnostic Explanations - explains individual predictions.
- Local explanations
- Model-agnostic
- Interpretable models
- Feature weights
Feature Importance
Built-in feature importance from tree-based models.
- Fast computation
- Native support
- Gini importance
- Permutation importance
Partial Dependence
Shows the marginal effect of features on predictions.
- 1D plots
- 2D interaction plots
- ICE plots
- Feature effects
SHAP Explanations
SHAP (SHapley Additive exPlanations) uses game theory to explain the output of any machine learning model. It connects optimal credit allocation with local explanations.
Summary Plot
Shows the importance and impact of all features across all predictions:
mlcli explain model.pkl --method shap --plot summary
Force Plot
Explains a single prediction by showing feature contributions:
mlcli explain model.pkl --method shap --plot force --sample 0
Dependence Plot
Shows the effect of a single feature across all predictions:
mlcli explain model.pkl --method shap --plot dependence --feature age
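For reference, these plots correspond to the shap package's own plotting functions. The following is a minimal, self-contained sketch; the toy regressor and dataset stand in for your trained model and test data:

import numpy as np
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Toy model and data standing in for your trained model and test set
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Additivity check: base value + per-feature contributions == model prediction
base_value = float(np.ravel(explainer.expected_value)[0])
print(model.predict(X.iloc[:1])[0], base_value + shap_values[0].sum())

# The three plot types shown above
shap.summary_plot(shap_values, X)                                        # summary
shap.force_plot(base_value, shap_values[0], X.iloc[0], matplotlib=True)  # force plot for sample 0
shap.dependence_plot("age", shap_values, X)                              # dependence on "age"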
LIME Explanations
LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by approximating the model locally with an interpretable model.
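Under the hood, LIME perturbs the instance, queries the model on the perturbed points, and fits a weighted linear surrogate whose coefficients become the explanation. A minimal sketch with the lime package, using a toy classifier and dataset as stand-ins for your own:

from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Toy classifier and data standing in for your trained model and test set
data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# The explainer learns feature statistics from training data so it can perturb sensibly
explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)

# Explain a single prediction by fitting a local linear surrogate around it
explanation = explainer.explain_instance(data.data[42], model.predict_proba, num_features=4)
print(explanation.as_list())  # (feature condition, local weight) pairs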
Single Prediction
mlcli explain model.pkl \
--method lime \
--sample 42 \
--num-features 10
Batch Explanations
mlcli explain model.pkl \
--method lime \
--samples 0,1,2,3,4 \
--output explanations/lime/
Feature Importance
For tree-based models, get built-in feature importance scores:
# Feature importance from model
mlcli explain model.pkl --method importance
# Permutation importance (model-agnostic)
mlcli explain model.pkl \
--method permutation \
--data test.csv \
--n-repeats 10
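Permutation importance is model-agnostic: each feature is shuffled in turn and the drop in a held-out score is recorded, repeated n-repeats times. If you want the same computation in Python, scikit-learn provides it directly; a sketch with a toy classifier standing in for your model:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy model and held-out data standing in for your own
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature 10 times and record how much the test score drops
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(i, round(result.importances_mean[i], 4), "+/-", round(result.importances_std[i], 4))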
Partial Dependence Plots
Visualize the marginal effect of one or two features on predictions:
# 1D PDP
mlcli explain model.pkl --method pdp --features age,income
# 2D interaction PDP
mlcli explain model.pkl --method pdp --features age,income --interaction
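The same 1D, 2D, and ICE plots can be produced in Python with scikit-learn's PartialDependenceDisplay; a sketch on a toy regressor, where the feature names are illustrative and kind="both" overlays individual (ICE) curves on the average partial dependence:

import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

# Toy regressor standing in for your trained model
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# 1D partial dependence with ICE curves for two features
PartialDependenceDisplay.from_estimator(model, X, features=["age", "bmi"], kind="both")

# 2D interaction plot for the same pair (ICE is only defined for 1D plots)
PartialDependenceDisplay.from_estimator(model, X, features=[("age", "bmi")], kind="average")
plt.show()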
YAML Configuration
Configure explainability in your experiment file:
# config.yaml
explainability:
  methods:
    - shap
    - lime
  shap:
    plot_types:
      - summary
      - force
      - dependence
    max_samples: 100
  lime:
    num_features: 10
    num_samples: 5000
  output_dir: ./explanations
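If you drive experiments from your own scripts, the same block can be loaded with PyYAML and fed to the Python API described in the next section. A sketch, assuming the SHAPExplainer and load_model helpers shown below and the config keys from the example above:

import pandas as pd
import yaml
from mlcli.explainer import SHAPExplainer

# Read the explainability block from the experiment file
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)["explainability"]

model = load_model("models/rf_model.pkl")  # mlcli's model loader, as in the next section
X_test = pd.read_csv("data/test.csv")

if "shap" in cfg["methods"]:
    # Respect max_samples when building summary plots
    sample = X_test.sample(n=min(cfg["shap"]["max_samples"], len(X_test)), random_state=0)
    explainer = SHAPExplainer(model)
    shap_values = explainer.explain(sample)
    explainer.plot_summary(shap_values, sample)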
Programmatic Access
Use the Python API for more control:
from mlcli.explainer import SHAPExplainer, LIMEExplainer
import pandas as pd
# Load model and data
model = load_model("models/rf_model.pkl")
X_test = pd.read_csv("data/test.csv")
# SHAP explanations
shap_explainer = SHAPExplainer(model)
shap_values = shap_explainer.explain(X_test)
shap_explainer.plot_summary(shap_values, X_test)
# LIME explanations
lime_explainer = LIMEExplainer(model, X_test)
explanation = lime_explainer.explain_instance(X_test.iloc[0])
explanation.show_in_notebook()
Best Practices
- Sample size: For SHAP, use a representative sample (100-1000 instances) for summary plots
- Background data: KernelExplainer needs a background dataset - use a sample of the training data or a summary of it (e.g. k-means centroids)
- Interpretation: SHAP values are additive - for each prediction, the per-feature values sum to the difference between that prediction and the expected (base) value
- Model type: Use TreeExplainer for tree-based models (exact and fast) and KernelExplainer for everything else, as in the sketch below
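To make the last two points concrete, here is a rough sketch of the explainer choice and background data when calling the shap library directly; the toy models and dataset are stand-ins for your own:

import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Toy data and models standing in for your own
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
rf = RandomForestClassifier(random_state=0).fit(X, y)
svm = SVC(probability=True, random_state=0).fit(X, y)

# Tree model: TreeExplainer is exact, fast, and needs no background data
tree_values = shap.TreeExplainer(rf).shap_values(X.iloc[:100])

# Non-tree model: KernelExplainer needs background data; summarise the training
# set (here with k-means) so the estimate stays tractable
background = shap.kmeans(X, 50)
kernel_explainer = shap.KernelExplainer(svm.predict_proba, background)
kernel_values = kernel_explainer.shap_values(X.iloc[:5])  # KernelExplainer is slow; explain a small sample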