Model Explainability
mlcli includes built-in model explainability tools to help you understand why your models make specific predictions. Use SHAP, LIME, and other techniques to build trust in your models.
Quick Start
Generate SHAP explanations for a trained model:
# Generate SHAP summary plot
mlcli explain models/rf_model.pkl \
--data data/test.csv \
--method shap \
--output explanations/
Available Explainers
SHAP
SHapley Additive exPlanations - game-theoretic approach to explain model predictions.
- Feature importance
- Dependency plots
- Force plots
- Summary plots
LIME
Local Interpretable Model-agnostic Explanations - explains individual predictions.
- Local explanations
- Model-agnostic
- Interpretable models
- Feature weights
Feature Importance
Built-in feature importance from tree-based models.
- Fast computation
- Native support
- Gini importance
- Permutation importance
Partial Dependence
Shows the marginal effect of features on predictions.
- 1D plots
- 2D interaction plots
- ICE plots
- Feature effects
SHAP Explanations
SHAP (SHapley Additive exPlanations) uses game theory to explain the output of any machine learning model. It connects optimal credit allocation with local explanations.
Summary Plot
Shows the importance and impact of all features across all predictions:
mlcli explain model.pkl --method shap --plot summary
Force Plot
Explains a single prediction by showing feature contributions:
mlcli explain model.pkl --method shap --plot force --sample 0
Dependence Plot
Shows the effect of a single feature across all predictions:
mlcli explain model.pkl --method shap --plot dependence --feature age
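For reference, these plots correspond to the shap package's own plotting functions. The following is a minimal, self-contained sketch; the toy regressor and dataset stand in for your trained model and test data:

import numpy as np
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Toy model and data standing in for your trained model and test set
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Additivity check: base value + per-feature contributions == model prediction
base_value = float(np.ravel(explainer.expected_value)[0])
print(model.predict(X.iloc[:1])[0], base_value + shap_values[0].sum())

# The three plot types shown above
shap.summary_plot(shap_values, X)                                        # summary
shap.force_plot(base_value, shap_values[0], X.iloc[0], matplotlib=True)  # force plot for sample 0
shap.dependence_plot("age", shap_values, X)                              # dependence on "age"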
LIME Explanations
LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by approximating the model locally with an interpretable model.
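Under the hood, LIME perturbs the instance, queries the model on the perturbed points, and fits a weighted linear surrogate whose coefficients become the explanation. A minimal sketch with the lime package, using a toy classifier and dataset as stand-ins for your own:

from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Toy classifier and data standing in for your trained model and test set
data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# The explainer learns feature statistics from training data so it can perturb sensibly
explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)

# Explain a single prediction by fitting a local linear surrogate around it
explanation = explainer.explain_instance(data.data[42], model.predict_proba, num_features=4)
print(explanation.as_list())  # (feature condition, local weight) pairs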
Single Prediction
mlcli explain model.pkl \
--method lime \
--sample 42 \
--num-features 10
Batch Explanations
mlcli explain model.pkl \
--method lime \
--samples 0,1,2,3,4 \
--output explanations/lime/
Feature Importance
For tree-based models, get built-in feature importance scores:
# Feature importance from model
mlcli explain model.pkl --method importance
# Permutation importance (model-agnostic)
mlcli explain model.pkl \
--method permutation \
--data test.csv \
--n-repeats 10
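Permutation importance is model-agnostic: each feature is shuffled in turn and the drop in a held-out score is recorded, repeated n-repeats times. If you want the same computation in Python, scikit-learn provides it directly; a sketch with a toy classifier standing in for your model:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy model and held-out data standing in for your own
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature 10 times and record how much the test score drops
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(i, round(result.importances_mean[i], 4), "+/-", round(result.importances_std[i], 4))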
Partial Dependence Plots
Visualize the marginal effect of one or two features on predictions:
# 1D PDP
mlcli explain model.pkl --method pdp --features age,income
# 2D interaction PDP
mlcli explain model.pkl --method pdp --features age,income --interaction
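The same 1D, 2D, and ICE plots can be produced in Python with scikit-learn's PartialDependenceDisplay; a sketch on a toy regressor, where the feature names are illustrative and kind="both" overlays individual (ICE) curves on the average partial dependence:

import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

# Toy regressor standing in for your trained model
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# 1D partial dependence with ICE curves for two features
PartialDependenceDisplay.from_estimator(model, X, features=["age", "bmi"], kind="both")

# 2D interaction plot for the same pair (ICE is only defined for 1D plots)
PartialDependenceDisplay.from_estimator(model, X, features=[("age", "bmi")], kind="average")
plt.show()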
YAML Configuration
Configure explainability in your experiment file:
# config.yaml
explainability:
  methods:
    - shap
    - lime
  shap:
    plot_types:
      - summary
      - force
      - dependence
    max_samples: 100
  lime:
    num_features: 10
    num_samples: 5000
  output_dir: ./explanations
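If you drive experiments from your own scripts, the same block can be loaded with PyYAML and fed to the Python API described in the next section. A sketch, assuming the SHAPExplainer and load_model helpers shown below and the config keys from the example above:

import pandas as pd
import yaml
from mlcli.explainer import SHAPExplainer

# Read the explainability block from the experiment file
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)["explainability"]

model = load_model("models/rf_model.pkl")  # mlcli's model loader, as in the next section
X_test = pd.read_csv("data/test.csv")

if "shap" in cfg["methods"]:
    # Respect max_samples when building summary plots
    sample = X_test.sample(n=min(cfg["shap"]["max_samples"], len(X_test)), random_state=0)
    explainer = SHAPExplainer(model)
    shap_values = explainer.explain(sample)
    explainer.plot_summary(shap_values, sample)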
Programmatic Access
Use the Python API for more control:
from mlcli.explainer import SHAPExplainer, LIMEExplainer
import pandas as pd
# Load model and data
model = load_model("models/rf_model.pkl")
X_test = pd.read_csv("data/test.csv")
# SHAP explanations
shap_explainer = SHAPExplainer(model)
shap_values = shap_explainer.explain(X_test)
shap_explainer.plot_summary(shap_values, X_test)
# LIME explanations
lime_explainer = LIMEExplainer(model, X_test)
explanation = lime_explainer.explain_instance(X_test.iloc[0])
explanation.show_in_notebook()
Best Practices
- Sample size: For SHAP, use a representative sample (100-1000 instances) for summary plots
- Background data: KernelExplainer needs a background dataset - use a sample of the training data or a summary of it (e.g. k-means centroids)
- Interpretation: SHAP values are additive - for each prediction, the per-feature values sum to the difference between that prediction and the expected (base) value
- Model type: Use TreeExplainer for tree-based models (exact and fast) and KernelExplainer for everything else, as in the sketch below
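To make the last two points concrete, here is a rough sketch of the explainer choice and background data when calling the shap library directly; the toy models and dataset are stand-ins for your own:

import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Toy data and models standing in for your own
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
rf = RandomForestClassifier(random_state=0).fit(X, y)
svm = SVC(probability=True, random_state=0).fit(X, y)

# Tree model: TreeExplainer is exact, fast, and needs no background data
tree_values = shap.TreeExplainer(rf).shap_values(X.iloc[:100])

# Non-tree model: KernelExplainer needs background data; summarise the training
# set (here with k-means) so the estimate stays tractable
background = shap.kmeans(X, 50)
kernel_explainer = shap.KernelExplainer(svm.predict_proba, background)
kernel_values = kernel_explainer.shap_values(X.iloc[:5])  # KernelExplainer is slow; explain a small sample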