Building an End-to-End ML Pipeline with MLflow, Docker, Kubernetes, and CI/CD
This is a comprehensive topic, so I'll guide you through the key concepts and provide a simplified example. Keep in mind that a full, production-ready implementation involves more complexity, but this will give you a solid foundation.
Understanding the Core Components
Before diving into the example, let's understand why each tool is essential:
- MLflow: An open-source platform for managing the ML lifecycle.
- MLflow Tracking: Records parameters, metrics, and artifacts (models, datasets) of your experiments. Crucial for reproducibility and comparison.
- MLflow Models: Provides a standard format for packaging ML models, making them deployable across various platforms.
- MLflow Model Registry: A centralized hub for managing the lifecycle of ML models, including versioning, stage transitions (Staging, Production), and annotations.
- MLflow Projects: A format for packaging ML code in a reusable and reproducible way.
- Docker: A platform for developing, shipping, and running applications in containers.
- Containerization: Packages your ML model and its dependencies (Python environment, libraries) into a portable unit. This ensures consistency across different environments (development, testing, production).
- Kubernetes (K8s): An open-source system for automating deployment, scaling, and management of containerized applications.
- Orchestration: Manages your Docker containers, ensuring they run, scale, and recover from failures automatically.
- Scalability: Easily scales your model serving infrastructure up or down based on demand.
- High Availability: Distributes your application across multiple nodes, ensuring it remains available even if a node fails.
- CI/CD (Continuous Integration/Continuous Deployment): A set of practices to automate and streamline the software development and deployment process.
- Continuous Integration (CI): Automates the process of merging code changes, running tests, and building artifacts (like Docker images) whenever a new change is pushed to the repository.
- Continuous Deployment (CD): Automates the deployment of tested and validated artifacts to production environments. For ML, this means deploying new model versions.
End-to-End ML Pipeline Steps
Here's a typical end-to-end ML pipeline with these tools:
- Data Ingestion & Preprocessing: Get your data ready.
- Model Training & Experiment Tracking: Train your model and log experiments.
- Model Versioning & Registration: Manage different versions of your trained models.
- Model Testing & Validation: Ensure your model performs as expected.
- Model Packaging (Dockerization): Create a deployable container for your model.
- Model Deployment (Kubernetes): Deploy the containerized model for serving.
- CI/CD Automation: Automate the entire workflow.
- Monitoring (Optional but Crucial): Keep an eye on model performance in production.
Practical Example: Wine Quality Prediction
Let's build a simplified pipeline to predict wine quality.
Prerequisites:
- Python 3.8+
- Docker Desktop installed and running
- kubectl (Kubernetes command-line tool)
- minikube (for local Kubernetes cluster) or access to a cloud Kubernetes cluster (EKS, GKE, AKS)
- MLflow and core libraries (pip install mlflow scikit-learn pandas)
- Flask or FastAPI for the model serving API; this example uses Flask (pip install flask gunicorn)
- Git
Step 0: Set up your Environment
1. Create a project directory:
mkdir mlops_wine_quality
cd mlops_wine_quality
2. Initialize and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: .\venv\Scripts\activate
3. Install dependencies:
pip install mlflow scikit-learn pandas flask gunicorn
4. Start the MLflow Tracking Server (locally):
mlflow ui
This starts the MLflow UI, typically accessible at http://127.0.0.1:5000. Keep it running in a separate terminal.
5. Start a local Kubernetes cluster (using Minikube):
minikube start
Verify it's running:
kubectl get nodes
Step 1: Data Preprocessing & Model Training
Create a Python script named train_model.py:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
import mlflow
import mlflow.sklearn
import logging
logging.basicConfig(level=logging.INFO)  # INFO so the metric logs below are actually printed
logger = logging.getLogger(__name__)
def evaluate_metrics(actual, pred):
accuracy = accuracy_score(actual, pred)
precision = precision_score(actual, pred, average='weighted', zero_division=0)
recall = recall_score(actual, pred, average='weighted', zero_division=0)
f1 = f1_score(actual, pred, average='weighted', zero_division=0)
return accuracy, precision, recall, f1
if __name__ == "__main__":
# Log to MLflow
mlflow.set_experiment("WineQuality_Classification")
# Load data (using a simple dummy dataset for demonstration)
# In a real scenario, this would involve more robust data loading/preprocessing
data = {
'fixed acidity': [7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7.5],
'volatile acidity': [0.70, 0.88, 0.76, 0.28, 0.70, 0.66, 0.60, 0.65, 0.58, 0.50],
'citric acid': [0.00, 0.00, 0.04, 0.56, 0.00, 0.00, 0.06, 0.00, 0.02, 0.36],
'residual sugar': [1.9, 2.6, 2.3, 1.9, 1.9, 1.8, 1.6, 1.2, 2.0, 6.1],
'chlorides': [0.076, 0.098, 0.092, 0.075, 0.076, 0.075, 0.069, 0.065, 0.073, 0.071],
'free sulfur dioxide': [11.0, 25.0, 40.0, 17.0, 11.0, 13.0, 15.0, 15.0, 9.0, 10.0],
'total sulfur dioxide': [34.0, 67.0, 60.0, 60.0, 34.0, 40.0, 59.0, 21.0, 18.0, 24.0],
'density': [0.9978, 0.9968, 0.9968, 0.9980, 0.9978, 0.9978, 0.9964, 0.9946, 0.9968, 0.9978],
'pH': [3.51, 3.20, 3.26, 3.16, 3.51, 3.51, 3.30, 3.39, 3.36, 3.38],
'sulphates': [0.56, 0.68, 0.65, 0.58, 0.56, 0.56, 0.46, 0.47, 0.57, 0.58],
'alcohol': [9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 10.8, 10.0, 9.5, 9.7],
'quality': [5, 5, 5, 6, 5, 5, 5, 7, 7, 5] # Target variable
}
wine_df = pd.DataFrame(data)
X = wine_df.drop("quality", axis=1)
y = wine_df["quality"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a RandomForestClassifier
n_estimators = 100
max_depth = 10
with mlflow.start_run(run_name="RandomForest_WineQuality"):
model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
# Evaluate model
accuracy, precision, recall, f1 = evaluate_metrics(y_test, y_pred)
logger.info(f"RandomForestClassifier (n_estimators={n_estimators}, max_depth={max_depth}):")
logger.info(f" Accuracy: {accuracy:.4f}")
logger.info(f" Precision: {precision:.4f}")
logger.info(f" Recall: {recall:.4f}")
logger.info(f" F1-Score: {f1:.4f}")
# Log parameters and metrics to MLflow
mlflow.log_param("n_estimators", n_estimators)
mlflow.log_param("max_depth", max_depth)
mlflow.log_metric("accuracy", accuracy)
mlflow.log_metric("precision", precision)
mlflow.log_metric("recall", recall)
mlflow.log_metric("f1_score", f1)
# Log the model (Scikit-learn flavor)
mlflow.sklearn.log_model(model, "wine_quality_model", registered_model_name="WineQualityModel")
logger.info("Model logged to MLflow Model Registry as 'WineQualityModel'.")
Run the training script:
python train_model.py
Now, navigate to your MLflow UI (http://127.0.0.1:5000). You'll see a new experiment and a run with the logged parameters, metrics, and the wine_quality_model artifact. In the "Models" section, you'll find "WineQualityModel" registered.
Step 2: Model Versioning and Staging
In the MLflow UI, go to the Models tab, click on "WineQualityModel". You'll see Version 1.
Click on Version 1, then click on the Stage dropdown and set it to Staging.
This marks this specific version as ready for testing or pre-production. In a real scenario, you might have multiple versions in 'Staging' and promote the best one to 'Production'.
Step 3: Model Serving API (Flask/Gunicorn)
Create a file named app.py for the Flask API:
import os
import mlflow
import pandas as pd
from flask import Flask, request, jsonify
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
app = Flask(__name__)
# Load the model from MLflow Model Registry
# We'll fetch the 'Staging' version for now. In a real CD pipeline,
# you'd likely fetch the 'Production' version.
model_name = "WineQualityModel"
model_stage = "Staging" # Or "Production"
try:
# Set MLflow tracking URI if using a remote server
# os.environ["MLFLOW_TRACKING_URI"] = "http://your-mlflow-server:5000"
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_stage}")
logger.info(f"Successfully loaded model: {model_name} (Stage: {model_stage})")
except Exception as e:
logger.error(f"Error loading MLflow model: {e}")
model = None # Handle case where model loading fails
@app.route("/predict", methods=["POST"])
def predict():
if model is None:
return jsonify({"error": "Model not loaded. Please check server logs."}), 500
try:
data = request.get_json(force=True)
# Assuming the input data is a list of dictionaries, one for each prediction
# Example: [{"fixed acidity": 7.4, ..., "alcohol": 9.4}]
input_df = pd.DataFrame(data)
predictions = model.predict(input_df)
return jsonify({"predictions": predictions.tolist()})
except Exception as e:
logger.error(f"Error during prediction: {e}")
return jsonify({"error": str(e)}), 400
@app.route("/health", methods=["GET"])
def health_check():
if model is not None:
return jsonify({"status": "healthy", "model_loaded": True}), 200
else:
return jsonify({"status": "unhealthy", "model_loaded": False}), 503
if __name__ == "__main__":
# Use gunicorn for production-ready serving.
# For local testing, you can use app.run(host='0.0.0.0', port=5001, debug=True)
# The Gunicorn command will be used in the Dockerfile.
logger.info("Starting Flask application. Use Gunicorn for production.")
app.run(host='0.0.0.0', port=5001, debug=True) # For local testing only
Create a requirements.txt for the Flask app's dependencies:
mlflow
scikit-learn
pandas
flask
gunicorn
Step 4: Model Packaging (Dockerization)
Create a Dockerfile in the same directory as app.py and requirements.txt:
# Use a lightweight Python base image
FROM python:3.9-slim
# Set the working directory in the container
WORKDIR /app
# Copy the requirements file and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY app.py .
# Expose the port your Flask app will run on
EXPOSE 5001
# Command to run the application using Gunicorn
CMD ["gunicorn", "--bind", "0.0.0.0:5001", "app:app"]
Build the Docker image:
docker build -t wine-quality-predictor:latest .
Test the Docker image locally (optional):
docker run -p 5001:5001 wine-quality-predictor:latest
Once the container is running, open another terminal and test the prediction:
curl -X POST -H "Content-Type: application/json" \
-d '[{"fixed acidity": 7.4, "volatile acidity": 0.70, "citric acid": 0.00, "residual sugar": 1.9, "chlorides": 0.076, "free sulfur dioxide": 11.0, "total sulfur dioxide": 34.0, "density": 0.9978, "pH": 3.51, "sulphates": 0.56, "alcohol": 9.4}]' \
http://localhost:5001/predict
You should get a JSON response with predictions.
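The same request can be made from Python, which is handy for scripted smoke tests. A minimal sketch using the requests library, assuming the container from above is running on localhost:5001:

```python
# Call the containerized /predict endpoint from Python -- a sketch,
# assuming the Docker container is running and mapped to localhost:5001.
import requests

# One sample row matching the training schema from Step 1.
sample = [{
    "fixed acidity": 7.4, "volatile acidity": 0.70, "citric acid": 0.00,
    "residual sugar": 1.9, "chlorides": 0.076, "free sulfur dioxide": 11.0,
    "total sulfur dioxide": 34.0, "density": 0.9978, "pH": 3.51,
    "sulphates": 0.56, "alcohol": 9.4,
}]

def predict(rows, url="http://localhost:5001/predict"):
    """POST a list of feature dicts and return the predictions list."""
    resp = requests.post(url, json=rows, timeout=10)
    resp.raise_for_status()
    return resp.json()["predictions"]

# Usage (requires the running container):
# print(predict(sample))
```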
Step 5: Model Deployment (Kubernetes)
Now, let's deploy this Docker image to Kubernetes.
Create a Kubernetes deployment YAML file named deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: wine-quality-predictor-deployment
labels:
app: wine-quality-predictor
spec:
replicas: 1 # You can scale this up later
selector:
matchLabels:
app: wine-quality-predictor
template:
metadata:
labels:
app: wine-quality-predictor
spec:
containers:
- name: wine-quality-predictor-container
image: wine-quality-predictor:latest # Use the image you built
ports:
- containerPort: 5001
env:
- name: MLFLOW_TRACKING_URI
value: "http://host.minikube.internal:5000" # Point to your MLflow UI for model loading
# For cloud K8s, this would be your remote MLflow tracking server URI
imagePullPolicy: Never # Use "Never" for local minikube images, "IfNotPresent" or "Always" for registry images
---
apiVersion: v1
kind: Service
metadata:
name: wine-quality-predictor-service
spec:
selector:
app: wine-quality-predictor
ports:
- protocol: TCP
port: 80 # External port
targetPort: 5001 # Internal container port
type: LoadBalancer # Use NodePort for minikube, LoadBalancer for cloud K8s
Important Note for MLFLOW_TRACKING_URI:
- For Minikube: http://host.minikube.internal:5000 allows the container inside Minikube to access your host machine's MLflow UI.
- For Cloud Kubernetes: You would point MLFLOW_TRACKING_URI to a publicly accessible MLflow tracking server (e.g., hosted on Databricks, or a separate server you've deployed).
Deploy to Kubernetes:
kubectl apply -f deployment.yaml
Check deployment status:
kubectl get deployments
kubectl get pods
kubectl get services
Access the service (for Minikube):
minikube service wine-quality-predictor-service
This will open the service in your browser or provide a URL (e.g., http://192.168.49.2:30000). Replace localhost:5001 in your curl command with this URL.
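Before pointing traffic at the service, it helps to wait until the pod reports healthy. A small polling sketch against the /health endpoint defined in app.py; the service URL is an assumption, so substitute the one Minikube printed:

```python
# Poll the deployed service's /health endpoint until the model is loaded --
# a sketch; replace SERVICE_URL with the URL printed by minikube above.
import time
import requests

SERVICE_URL = "http://192.168.49.2:30000"  # example only; yours will differ

def wait_until_healthy(base_url, attempts=30, delay=2.0):
    """Return True once /health reports model_loaded, False after timeout."""
    for _ in range(attempts):
        try:
            resp = requests.get(f"{base_url}/health", timeout=5)
            if resp.status_code == 200 and resp.json().get("model_loaded"):
                return True
        except requests.RequestException:
            pass  # service not reachable yet; keep polling
        time.sleep(delay)
    return False

# Usage:
# assert wait_until_healthy(SERVICE_URL), "service never became healthy"
```

A check like this doubles as the readiness gate in the post-deployment test step of the CI/CD workflow below.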
Step 6: CI/CD Methodologies (Example with GitHub Actions)
CI/CD automates this entire process. Here's a conceptual GitHub Actions workflow. You'd typically store this in .github/workflows/main.yaml.
name: ML Pipeline CI/CD
on:
push:
branches:
- main # Trigger on pushes to the main branch
workflow_dispatch: # Allows manual triggering
jobs:
train_and_deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.9'
- name: Install dependencies
run: |
pip install mlflow scikit-learn pandas flask gunicorn
# Step 1: Set up MLflow Tracking (e.g., remote MLflow server for CI/CD)
# In a real scenario, you'd configure MLFLOW_TRACKING_URI to a shared MLflow server
# - name: Configure MLflow Tracking URI
# run: |
# echo "MLFLOW_TRACKING_URI=http://your-remote-mlflow-server:5000" >> $GITHUB_ENV
- name: Run ML Training and Log Model
run: |
python train_model.py
# You would need to ensure MLflow can connect to a remote tracking server
# For this example, we assume MLflow is running locally or a mock server is used.
# In a real pipeline, you'd configure MLFLOW_TRACKING_URI to a persistent MLflow server.
# Step 2: Build Docker Image
- name: Build Docker image
run: |
docker build -t wine-quality-predictor:${{ github.run_number }} .
# Step 3: Tag and Push Docker Image to a Container Registry (e.g., Docker Hub)
# This requires Docker Hub credentials configured as GitHub Secrets
# - name: Log in to Docker Hub
# uses: docker/login-action@v2
# with:
# username: ${{ secrets.DOCKER_USERNAME }}
# password: ${{ secrets.DOCKER_PASSWORD }}
# - name: Push Docker image
# run: |
# docker push wine-quality-predictor:${{ github.run_number }}
# Step 4: Deploy to Kubernetes
# This requires kubectl and Kubernetes cluster access configured
- name: Set up Kubeconfig (for minikube in CI/CD, usually a cloud K8s cluster)
run: |
# Example for minikube locally:
# minikube start --driver=docker # If not already running
# kubectl config use-context minikube
# For a cloud cluster, you'd use a service account key or OIDC with provider actions
# For a practical example with a cloud provider, you'd typically have:
# - uses: azure/aks-set-context@v1
# with:
# creds: ${{ secrets.AZURE_CREDENTIALS }}
# cluster-name: your-aks-cluster
# resource-group: your-resource-group
# Or for AWS EKS:
# - uses: aws-actions/configure-aws-credentials@v1
# with:
# aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
# aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
# aws-region: us-east-1
# - run: |
# aws eks update-kubeconfig --name your-eks-cluster --region us-east-1
echo "Kubeconfig setup for deployment (conceptual for local/minikube)"
- name: Deploy to Kubernetes
run: |
# For production, you'd update the deployment.yaml to use the new Docker image tag:
# kubectl set image deployment/wine-quality-predictor-deployment wine-quality-predictor-container=wine-quality-predictor:${{ github.run_number }}
# For this example, we'll re-apply the deployment, assuming the image is available.
# In a real CI/CD, you'd also need to ensure the image is pulled from a registry accessible by K8s.
kubectl apply -f deployment.yaml
# Step 5: Model Testing (Optional, but Recommended in CI/CD)
# You'd add a step here to run integration tests against the deployed model.
- name: Run Post-Deployment Tests
run: |
echo "Running post-deployment tests (e.g., curl to the service endpoint)..."
# This would involve waiting for the service to be ready and then
# sending a test request to the deployed model's /predict endpoint.
# Example:
# kubectl wait --for=condition=ready pod -l app=wine-quality-predictor --timeout=300s
# SERVICE_IP=$(minikube service wine-quality-predictor-service --url)
# curl -X POST -H "Content-Type: application/json" -d '[{"fixed acidity": 7.4, ...}]' $SERVICE_IP/predict
echo "Tests passed!" # Placeholder
Explanation of CI/CD concepts in the YAML:
- on: push: The workflow triggers automatically on code pushes to main.
- workflow_dispatch: Allows manual triggering from the GitHub Actions UI.
- jobs: Defines the workflow steps.
- runs-on: ubuntu-latest: The job runs on a fresh Ubuntu virtual machine.
- uses: actions/checkout@v3: Checks out your repository code.
- uses: actions/setup-python@v4: Sets up the Python environment.
- MLFLOW_TRACKING_URI: Crucial for connecting your CI/CD environment to your persistent MLflow Tracking Server and Model Registry. In a real setup, this would be a URL to your remote MLflow server.
- docker build / docker push: Builds the Docker image and pushes it to a container registry. You'd use GitHub Secrets for credentials.
- Kubernetes Deployment: The kubectl apply -f deployment.yaml command deploys your application to the Kubernetes cluster. In a real-world scenario, you'd manage kubeconfig securely (e.g., via OIDC or service account tokens for cloud providers).
Further Enhancements & MLOps Considerations
- Data Versioning (DVC): Manage and version your datasets alongside your code.
- Automated Retraining: Set up scheduled jobs (e.g., using Kubernetes CronJobs or Airflow) to automatically retrain models based on new data or performance degradation.
- Model Monitoring: Implement tools like Prometheus and Grafana to monitor model performance (e.g., prediction latency, error rates, data drift, concept drift) in production. This feeds back into the pipeline for potential retraining.
- A/B Testing/Canary Deployments: Safely roll out new model versions to a subset of users before full deployment using Kubernetes features like Services and Ingress.
- Feature Store: A centralized repository for managing and serving features, ensuring consistency between training and inference.
- Secrets Management: Securely manage credentials for MLflow, Docker registries, and Kubernetes (e.g., Kubernetes Secrets, Vault).
- Rollbacks: Define strategies to quickly revert to a previous, stable model version if issues arise.
- Logging: Implement comprehensive logging for model inference requests and responses.
This example provides a fundamental blueprint. The depth of implementation for each component can vary greatly based on your specific project needs and organizational maturity. The key is to automate as much as possible, track everything for reproducibility, and ensure your models are reliable and scalable in production.