Examples
Comprehensive examples and configuration templates to help you get started with mlcli. Copy and adapt them for your own projects.
Quick Start Commands
Get up and running quickly with these common commands:
Train Random Forest
Train a basic Random Forest classifier
mlcli train -d data.csv -m random_forest --target label
Train with Config
Train using a JSON configuration file
mlcli train --config configs/rf_config.json
Hyperparameter Tuning
Tune hyperparameters with random search
mlcli tune --config configs/tune_rf.json --method random --n-trials 50
Model Explanation
Generate SHAP explanations for your model
mlcli explain --model models/model.pkl --data test.csv --method shap
List Experiments
View all experiment runs
mlcli list-runs
Interactive UI
Launch the interactive terminal UI
mlcli ui
Recommended Project Structure
my-ml-project/
├── data/
│ ├── train.csv
│ ├── test.csv
│ └── raw/
├── configs/
│ ├── random_forest.json
│ ├── lightgbm.json
│ └── tuning/
│ └── tune_rf.json
├── models/
│ └── (trained models saved here)
├── runs/
│ └── (experiment logs saved here)
└── notebooks/
    └── analysis.ipynb
Classification Examples
Configuration examples for classification tasks:
{
"model_type": "random_forest",
"data": {
"train_path": "data/train.csv",
"test_path": "data/test.csv",
"target_column": "label"
},
"preprocessing": {
"scaler": "standard",
"handle_missing": "mean"
},
"params": {
"n_estimators": 100,
"max_depth": 10,
"min_samples_split": 2,
"random_state": 42
},
"output": {
"model_dir": "models/",
"run_dir": "runs/"
}
}
Run with: mlcli train --config configs/random_forest.json
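For orientation, the params and preprocessing blocks map directly onto scikit-learn. A minimal standalone sketch of the equivalent direct call (mlcli's internal pipeline may differ; the imputer and scaler here simply mirror the handle_missing and scaler settings):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('data/train.csv')
X, y = df.drop('label', axis=1), df['label']

# Mean imputation + standard scaling + the "params" block from the config
clf = make_pipeline(
    SimpleImputer(strategy='mean'),
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, max_depth=10,
                           min_samples_split=2, random_state=42),
)
clf.fit(X, y)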
Clustering Examples
Configuration examples for unsupervised clustering:
{
"model_type": "kmeans",
"data": {
"train_path": "data/unlabeled.csv",
"feature_columns": null
},
"preprocessing": {
"scaler": "standard",
"handle_missing": "mean"
},
"params": {
"n_clusters": 5,
"init": "k-means++",
"n_init": 10,
"max_iter": 300,
"algorithm": "lloyd"
},
"output": {
"model_dir": "models/",
"formats": ["pickle", "joblib"]
}
}
Run with: mlcli train --config configs/clustering/kmeans.json
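The params block mirrors scikit-learn's KMeans signature one-to-one. A minimal standalone sketch of the direct equivalent (scaling is done explicitly here; missing values are assumed already handled):

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(pd.read_csv('data/unlabeled.csv'))

# Same hyperparameters as the "params" block above
km = KMeans(n_clusters=5, init='k-means++', n_init=10,
            max_iter=300, algorithm='lloyd')
labels = km.fit_predict(X_scaled)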
Anomaly Detection Examples
Configuration examples for anomaly and outlier detection:
{
"model_type": "isolation_forest",
"data": {
"train_path": "data/normal_data.csv",
"feature_columns": null
},
"preprocessing": {
"scaler": "standard",
"handle_missing": "mean"
},
"params": {
"n_estimators": 100,
"max_samples": "auto",
"contamination": "auto",
"max_features": 1.0,
"bootstrap": false
},
"output": {
"model_dir": "models/",
"formats": ["pickle", "joblib"]
}
}
Run with: mlcli train --config configs/anomaly/isolation_forest.json
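Likewise, these params pass straight through to scikit-learn's IsolationForest. A minimal standalone sketch (again with scaling shown explicitly; missing values assumed already handled):

import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(pd.read_csv('data/normal_data.csv'))

# Same hyperparameters as the "params" block above;
# predict() returns -1 for anomalies and 1 for inliers
iso = IsolationForest(n_estimators=100, max_samples='auto',
                      contamination='auto', max_features=1.0, bootstrap=False)
iso.fit(X)
flags = iso.predict(X)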
Hyperparameter Tuning Examples
Configuration examples for hyperparameter optimization:
{
"model_type": "random_forest",
"tuning": {
"method": "random",
"n_trials": 50,
"cv_folds": 5,
"scoring": "accuracy"
},
"param_grid": {
"n_estimators": [50, 100, 200, 300],
"max_depth": [5, 10, 15, 20, null],
"min_samples_split": [2, 5, 10],
"min_samples_leaf": [1, 2, 4]
},
"data": {
"train_path": "data/train.csv",
"target_column": "label"
},
"output": {
"model_dir": "models/",
"save_best": true
}
}
Run with: mlcli tune --config configs/tuning/tune_rf.json
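The "method": "random" setting corresponds to a randomized search over the grid. For reference, the same search can be written directly with scikit-learn (a sketch; mlcli's tuner may differ in details such as trial logging). Note that JSON null becomes Python None:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

df = pd.read_csv('data/train.csv')
X, y = df.drop('label', axis=1), df['label']

param_grid = {
    'n_estimators': [50, 100, 200, 300],
    'max_depth': [5, 10, 15, 20, None],  # JSON null -> Python None
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}
search = RandomizedSearchCV(RandomForestClassifier(), param_grid,
                            n_iter=50, cv=5, scoring='accuracy')
search.fit(X, y)
print(search.best_params_, search.best_score_)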
Python API Examples
Use mlcli programmatically in your Python scripts:
from mlcli.trainers import (
RandomForestTrainer,
LightGBMTrainer,
KMeansTrainer,
IsolationForestTrainer
)
from mlcli.preprocessor import PreprocessingPipeline
import pandas as pd
from sklearn.model_selection import train_test_split
# Load data and hold out test/validation splits used by the examples below
df = pd.read_csv('data/train.csv')
X = df.drop('target', axis=1)
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)
# Option 1: Simple training
trainer = RandomForestTrainer(n_estimators=100, max_depth=10)
trainer.train(X_train, y_train)
predictions = trainer.predict(X_test)
metrics = trainer.evaluate(X_test, y_test)
print(f"Accuracy: {metrics['accuracy']:.4f}")
# Option 2: With preprocessing
pipeline = PreprocessingPipeline([
('scaler', 'standard'),
('selector', 'select_k_best', {'k': 10})
])
X_processed = pipeline.fit_transform(X)
# Option 3: LightGBM with early stopping
lgb_trainer = LightGBMTrainer(
n_estimators=500,
learning_rate=0.1,
num_leaves=31
)
lgb_trainer.train(X_train, y_train, X_val=X_val, y_val=y_val, early_stopping_rounds=50)
# Option 4: Clustering
kmeans = KMeansTrainer(n_clusters=5)
kmeans.train(X)
clusters = kmeans.predict(X)
print(f"Silhouette Score: {kmeans.evaluate(X)['silhouette_score']:.4f}")
# Option 5: Anomaly Detection (train on data assumed to be mostly normal)
iso_forest = IsolationForestTrainer(contamination=0.1)
iso_forest.train(X_train)
anomalies = iso_forest.get_anomalies(X_test)
print(f"Found {len(anomalies)} anomalies")
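The output blocks in the configs above list pickle and joblib as save formats; the same round-trip works in a script. A minimal sketch with joblib, continuing the example above (the path is illustrative, and it assumes the fitted trainer object is picklable, as scikit-learn-backed objects normally are):

import joblib

# Persist the fitted trainer, then restore it and verify predictions match
joblib.dump(trainer, 'models/rf_trainer.joblib')
restored = joblib.load('models/rf_trainer.joblib')
assert (restored.predict(X_test) == trainer.predict(X_test)).all()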
End-to-End Workflow
Complete workflow from data to deployed model:
# Step 1: Preprocess data
mlcli preprocess \
--data data/raw.csv \
--output data/processed.csv \
--methods standard_scaler,select_k_best
# Step 2: Train multiple models
mlcli train --config configs/random_forest.json
mlcli train --config configs/lightgbm.json
mlcli train --config configs/xgboost.json
# Step 3: Compare experiments
mlcli list-runs
# Step 4: Tune the best model
mlcli tune \
--config configs/tuning/tune_lightgbm.json \
--method optuna \
--n-trials 100
# Step 5: Explain the model
mlcli explain \
--model models/best_model.pkl \
--data data/test.csv \
--method shap \
--output explanations/
# Step 6: Evaluate on test set
mlcli eval \
--model models/best_model.pkl \
--data data/test.csv
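The tuned artifact can also be loaded outside the CLI for ad-hoc predictions. A minimal sketch, assuming models/best_model.pkl is a standard pickle of a fitted estimator and that test.csv still contains the label column:

import pickle
import pandas as pd

with open('models/best_model.pkl', 'rb') as f:
    model = pickle.load(f)

# Drop the target column before predicting
X_new = pd.read_csv('data/test.csv').drop('label', axis=1)
print(model.predict(X_new)[:10])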
Download Example Configs
Get all example configurations from the repository:
# Clone repository
git clone https://github.com/codeMaestro78/MLcli.git
# Navigate to examples
cd MLcli/examples
# View available configs
ls configs/
Need More Examples?
Check out the GitHub repository for more examples, or open an issue to request specific examples.