Examples

Comprehensive examples and configuration templates to help you get started with mlcli. Copy and adapt these examples for your own projects.

Quick Start Commands

Get up and running quickly with these common commands:

Train Random Forest

Train a basic Random Forest classifier

mlcli train -d data.csv -m random_forest --target label

Train with Config

Train using a JSON configuration file

mlcli train --config configs/rf_config.json

Hyperparameter Tuning

Tune hyperparameters with random search

mlcli tune --config configs/tuning/tune_rf.json --method random --n-trials 50

Model Explanation

Generate SHAP explanations for your model

mlcli explain --model models/model.pkl --data test.csv --method shap

List Experiments

View all experiment runs

mlcli list-runs

Interactive UI

Launch the interactive terminal UI

mlcli ui

Recommended Project Structure

Project Structure
my-ml-project/
├── data/
│   ├── train.csv
│   ├── test.csv
│   └── raw/
├── configs/
│   ├── random_forest.json
│   ├── lightgbm.json
│   └── tuning/
│       └── tune_rf.json
├── models/
│   └── (trained models saved here)
├── runs/
│   └── (experiment logs saved here)
└── notebooks/
    └── analysis.ipynb
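
To create this layout in one step, a one-liner using the directory names from the tree above:

Terminal
mkdir -p my-ml-project/{data/raw,configs/tuning,models,runs,notebooks}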

Classification Examples

Configuration examples for classification tasks:

configs/random_forest.json
{
  "model_type": "random_forest",
  "data": {
    "train_path": "data/train.csv",
    "test_path": "data/test.csv",
    "target_column": "label"
  },
  "preprocessing": {
    "scaler": "standard",
    "handle_missing": "mean"
  },
  "params": {
    "n_estimators": 100,
    "max_depth": 10,
    "min_samples_split": 2,
    "random_state": 42
  },
  "output": {
    "model_dir": "models/",
    "run_dir": "runs/"
  }
}

Run with: mlcli train --config configs/random_forest.json
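
The config expects data/train.csv to hold feature columns plus the target column named in target_column. An illustrative toy file (the feature names here are invented for the example):

data/train.csv (illustrative)
age,income,clicks,label
34,52000,7,0
29,61000,2,1
41,48000,9,0
37,75000,5,1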

Clustering Examples

Configuration examples for unsupervised clustering:

configs/clustering/kmeans.json
{
  "model_type": "kmeans",
  "data": {
    "train_path": "data/unlabeled.csv",
    "feature_columns": null
  },
  "preprocessing": {
    "scaler": "standard",
    "handle_missing": "mean"
  },
  "params": {
    "n_clusters": 5,
    "init": "k-means++",
    "n_init": 10,
    "max_iter": 300,
    "algorithm": "lloyd"
  },
  "output": {
    "model_dir": "models/",
    "formats": ["pickle", "joblib"]
  }
}

Run with: mlcli train --config configs/clustering/kmeans.json
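
The n_clusters value above is a starting guess; a common way to choose it is to sweep several values and compare silhouette scores. A minimal Python sketch, assuming the KMeansTrainer API shown in the Python API examples below:

choose_k.py (sketch)
import pandas as pd
from mlcli.trainers import KMeansTrainer  # API as used in the Python examples below

X = pd.read_csv("data/unlabeled.csv")

# Sweep candidate cluster counts; higher silhouette is better
for k in range(2, 9):
    trainer = KMeansTrainer(n_clusters=k)
    trainer.train(X)
    score = trainer.evaluate(X)["silhouette_score"]
    print(f"k={k}: silhouette={score:.4f}")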

Anomaly Detection Examples

Configuration examples for anomaly and outlier detection:

configs/anomaly/isolation_forest.json
{
  "model_type": "isolation_forest",
  "data": {
    "train_path": "data/normal_data.csv",
    "feature_columns": null
  },
  "preprocessing": {
    "scaler": "standard",
    "handle_missing": "mean"
  },
  "params": {
    "n_estimators": 100,
    "max_samples": "auto",
    "contamination": "auto",
    "max_features": 1.0,
    "bootstrap": false
  },
  "output": {
    "model_dir": "models/",
    "formats": ["pickle", "joblib"]
  }
}

Run with: mlcli train --config configs/anomaly/isolation_forest.json
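
Because the config writes a pickle, you can score new data outside mlcli. A minimal sketch, assuming the pickle holds a scikit-learn-style estimator and that the file is named isolation_forest.pkl (both are assumptions; check your models/ directory for the actual filename):

score_new_data.py (sketch)
import pickle
import pandas as pd

# Path is based on model_dir above; the filename is an assumption
with open("models/isolation_forest.pkl", "rb") as f:
    model = pickle.load(f)

X_new = pd.read_csv("data/new_data.csv")

# scikit-learn's IsolationForest convention: -1 = anomaly, 1 = normal
preds = model.predict(X_new)
print(f"Flagged {(preds == -1).sum()} of {len(preds)} rows as anomalies")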

Hyperparameter Tuning Examples

Configuration examples for hyperparameter optimization:
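
mlcli's exact tuning schema isn't shown on this page, so treat the config below as an illustrative sketch: the search_space block, its key names, and the parameter ranges are assumptions to adapt to your mlcli version. It tunes the Random Forest classifier from the classification example above.

configs/tuning/tune_rf.json (illustrative sketch)
{
  "model_type": "random_forest",
  "data": {
    "train_path": "data/train.csv",
    "target_column": "label"
  },
  "search_space": {
    "n_estimators": [100, 500],
    "max_depth": [5, 20],
    "min_samples_split": [2, 10]
  },
  "output": {
    "model_dir": "models/",
    "run_dir": "runs/"
  }
}

Run with: mlcli tune --config configs/tuning/tune_rf.json --method random --n-trials 50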

Python API Examples

Use mlcli programmatically in your Python scripts:

example_usage.py
from mlcli.trainers import (
    RandomForestTrainer,
    LightGBMTrainer,
    KMeansTrainer,
    IsolationForestTrainer
)
from mlcli.preprocessor import PreprocessingPipeline
from sklearn.model_selection import train_test_split
import pandas as pd

# Load data and hold out a test set for evaluation
df = pd.read_csv('data/train.csv')
X = df.drop('target', axis=1)
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Option 1: Simple training
trainer = RandomForestTrainer(n_estimators=100, max_depth=10)
trainer.train(X_train, y_train)
predictions = trainer.predict(X_test)
metrics = trainer.evaluate(X_test, y_test)
print(f"Accuracy: {metrics['accuracy']:.4f}")

# Option 2: With preprocessing
pipeline = PreprocessingPipeline([
    ('scaler', 'standard'),
    ('selector', 'select_k_best', {'k': 10})
])
X_processed = pipeline.fit_transform(X_train)  # pass to any trainer in place of X_train

# Option 3: LightGBM with early stopping (carve a validation split out of
# the training data; never tune against the test set)
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)
lgb_trainer = LightGBMTrainer(
    n_estimators=500,
    learning_rate=0.1,
    num_leaves=31
)
lgb_trainer.train(X_fit, y_fit, X_val=X_val, y_val=y_val, early_stopping_rounds=50)

# Option 4: Clustering (unsupervised; labels are ignored)
kmeans = KMeansTrainer(n_clusters=5)
kmeans.train(X)
clusters = kmeans.predict(X)
print(f"Silhouette Score: {kmeans.evaluate(X)['silhouette_score']:.4f}")

# Option 5: Anomaly detection, trained on data assumed to be mostly normal
iso_forest = IsolationForestTrainer(contamination=0.1)
iso_forest.train(X_train)
anomalies = iso_forest.get_anomalies(X_test)
print(f"Found {len(anomalies)} anomalies")

End-to-End Workflow

A complete workflow from raw data to a tuned, explained, and evaluated model:

workflow.sh
#!/bin/bash
set -e  # stop on the first failing step

# Step 1: Preprocess data
mlcli preprocess \
  --data data/raw.csv \
  --output data/processed.csv \
  --methods standard_scaler,select_k_best

# Step 2: Train multiple models
mlcli train --config configs/random_forest.json
mlcli train --config configs/lightgbm.json
mlcli train --config configs/xgboost.json

# Step 3: Compare experiments
mlcli list-runs

# Step 4: Tune the best model
mlcli tune \
  --config configs/tuning/tune_lightgbm.json \
  --method optuna \
  --n-trials 100

# Step 5: Explain the model
mlcli explain \
  --model models/best_model.pkl \
  --data data/test.csv \
  --method shap \
  --output explanations/

# Step 6: Evaluate on test set
mlcli eval \
  --model models/best_model.pkl \
  --data data/test.csv
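
Save the steps above as workflow.sh, then run the whole pipeline:

Terminal
chmod +x workflow.sh
./workflow.sh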

Download Example Configs

Get all example configurations from the repository:

Terminal
# Clone repository
git clone https://github.com/codeMaestro78/MLcli.git

# Navigate to examples
cd MLcli/examples

# View available configs
ls configs/

Need More Examples?

Check out the GitHub repository for more examples, or open an issue to request specific examples.