Model Explainability

mlcli includes built-in model explainability tools to help you understand why your models make specific predictions. Use SHAP, LIME, and other techniques to build trust in your models.

Quick Start

Generate SHAP explanations for a trained model:

Terminal
# Generate SHAP summary plot
mlcli explain models/rf_model.pkl \
  --data data/test.csv \
  --method shap \
  --output explanations/

Available Explainers

SHAP

SHapley Additive exPlanations - a game-theoretic approach to explaining model predictions.

  • Feature importance
  • Dependence plots
  • Force plots
  • Summary plots

LIME

Local Interpretable Model-agnostic Explanations - explains individual predictions.

  • Local explanations
  • Model-agnostic
  • Interpretable models
  • Feature weights

Feature Importance

Built-in feature importance from tree-based models.

  • Fast computation
  • Native support
  • Gini importance
  • Permutation importance

Partial Dependence

Shows the marginal effect of features on predictions.

  • 1D plots
  • 2D interaction plots
  • ICE plots
  • Feature effects

SHAP Explanations

SHAP (SHapley Additive exPlanations) uses game theory to explain the output of any machine learning model. It connects optimal credit allocation with local explanations.
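If you want to see the underlying computation outside mlcli, here is a minimal sketch using the open-source shap package directly with a toy scikit-learn forest (an assumed setup for illustration, not mlcli's API):

Python
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy tree model (assumption: the open-source `shap` package is installed)
X = pd.DataFrame({
    "age": [25, 32, 47, 51, 62, 29],
    "income": [40_000, 52_000, 81_000, 90_000, 77_000, 48_000],
})
y = [0.2, 0.4, 0.7, 0.9, 0.8, 0.3]
model = RandomForestRegressor(random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Summary plot: per-feature importance and direction of impact
shap.summary_plot(shap_values, X)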

Summary Plot

Shows the importance and impact of all features across all predictions:

Terminal
mlcli explain model.pkl --method shap --plot summary

Force Plot

Explains a single prediction by showing feature contributions:

Terminal
mlcli explain model.pkl --method shap --plot force --sample 0

Dependence Plot

Shows the effect of a single feature across all predictions:

Terminal
mlcli explain model.pkl --method shap --plot dependence --feature age

LIME Explanations

LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by approximating the model locally with an interpretable model.
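For reference, a minimal sketch of the same idea using the open-source lime package directly (assumed to be installed; the toy classifier and feature names below are placeholders, not mlcli's implementation):

Python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy classifier standing in for a trained model
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# LIME perturbs the instance and fits a weighted linear surrogate locally
explainer = LimeTabularExplainer(
    X,
    feature_names=[f"f{i}" for i in range(X.shape[1])],
    mode="classification",
)
explanation = explainer.explain_instance(X[42], model.predict_proba, num_features=5)
print(explanation.as_list())  # (feature condition, weight) pairs for this prediction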

Single Prediction

Terminal
mlcli explain model.pkl \
  --method lime \
  --sample 42 \
  --num-features 10

Batch Explanations

Terminal
mlcli explain model.pkl \
  --method lime \
  --samples 0,1,2,3,4 \
  --output explanations/lime/

Feature Importance

For tree-based models, get built-in feature importance scores:

Terminal
# Feature importance from model
mlcli explain model.pkl --method importance

# Permutation importance (model-agnostic)
mlcli explain model.pkl \
  --method permutation \
  --data test.csv \
  --n-repeats 10
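For comparison, here is a minimal sketch of both importance types using scikit-learn directly (a toy model stands in for your trained estimator):

Python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy model (assumption: a scikit-learn estimator stands in for the trained model)
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Built-in (Gini) importance: fast, but only available for tree-based models
print(model.feature_importances_)

# Permutation importance: model-agnostic, measured on held-out data
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
print(result.importances_mean, result.importances_std)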

Partial Dependence Plots

Visualize the marginal effect of one or two features on predictions:

Terminal
# 1D PDP
mlcli explain model.pkl --method pdp --features age,income

# 2D interaction PDP
mlcli explain model.pkl --method pdp --features age,income --interaction
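The equivalent plots can be produced with scikit-learn's PartialDependenceDisplay; a minimal sketch with toy data and the hypothetical age/income columns from the commands above:

Python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

# Toy data using the hypothetical age/income features from the CLI examples
X = pd.DataFrame({
    "age": [22, 35, 47, 51, 63, 29, 41, 58],
    "income": [30_000, 52_000, 81_000, 90_000, 75_000, 45_000, 60_000, 88_000],
})
y = [0.1, 0.3, 0.6, 0.8, 0.7, 0.2, 0.5, 0.9]
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# 1D partial dependence for each feature
PartialDependenceDisplay.from_estimator(model, X, features=["age", "income"])

# 2D interaction plot between age and income
PartialDependenceDisplay.from_estimator(model, X, features=[("age", "income")])
plt.show()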

YAML Configuration

Configure explainability in your experiment file:

YAML
# config.yaml
explainability:
  methods:
    - shap
    - lime
  shap:
    plot_types:
      - summary
      - force
      - dependence
    max_samples: 100
  lime:
    num_features: 10
    num_samples: 5000
  output_dir: ./explanations

Programmatic Access

Use the Python API for more control:

Python
from mlcli.explainer import SHAPExplainer, LIMEExplainer
import pandas as pd

# Load model and data (load_model is your model-loading helper, e.g. joblib.load)
model = load_model("models/rf_model.pkl")
X_test = pd.read_csv("data/test.csv")

# SHAP explanations: compute values and plot a summary across the test set
shap_explainer = SHAPExplainer(model)
shap_values = shap_explainer.explain(X_test)
shap_explainer.plot_summary(shap_values, X_test)

# LIME explanations: explain a single row of the test set
lime_explainer = LIMEExplainer(model, X_test)
explanation = lime_explainer.explain_instance(X_test.iloc[0])
explanation.show_in_notebook()

Best Practices

  • Sample size: For SHAP, use a representative sample (100-1000 instances) for summary plots
  • Background data: SHAP requires background data - use training data or a summary
  • Interpretation: SHAP values are additive - for each prediction they sum to the difference between that prediction and the model's expected value (see the sketch below)
  • Model type: Use TreeExplainer for tree models (faster), KernelExplainer for others
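
A quick way to verify the additivity property mentioned above is to check it with the open-source shap package on a toy tree model (a sketch under those assumptions, independent of mlcli):

Python
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy regressor (assumption: the open-source `shap` package is installed)
X, y = make_regression(n_samples=200, n_features=4, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Expected value + per-row sum of SHAP values reconstructs each prediction
reconstructed = explainer.expected_value + shap_values.sum(axis=1)
print(np.abs(reconstructed - model.predict(X)).max())  # ~0 up to float error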