ML week 1
Adapted from 1
Week 1: Introduction to ML & Deployment
Objective: Understand core ML concepts, deploy a simple model via an API, and identify deployment challenges.
Key Concepts Explained
- Machine Learning Basics
- What is ML? Algorithms that learn patterns from data to make predictions or decisions.
- Supervised Learning Example: Classify iris flowers into species (setosa, versicolor, virginica) using measurements (sepal length, petal width).
# Sample code: Load the Iris dataset from sklearn.datasets import load_iris iris = load_iris() X, y = iris.data, iris.target # Features (measurements) and labels (species)
- Model Deployment
- Definition: Making a trained model accessible to users/apps via an interface (e.g., API).
- Why Flask? A lightweight Python web framework for building APIs.
- API (Application Programming Interface)
- A set of rules allowing apps to communicate. Example: A mobile app sends data to your model via an API and receives predictions.
Mini Exercises
1. Train a Basic Model
from sklearn.linear_model import LogisticRegression
# Split data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Predict on test data
predictions = model.predict(X_test)
print("Accuracy:", model.score(X_test, y_test))
2. Create a Simple Flask Endpoint
from flask import Flask
app = Flask(__name__)
@app.route('/hello')
def hello():
return "Hello, World!"
if __name__ == '__main__':
app.run(port=5000)
- Run with
python app.py
, then visithttp://localhost:5000/hello
in your browser.
Project Walkthrough: Deploy an Iris Classifier
Step 1: Install Dependencies
pip install flask scikit-learn
Step 2: Save the Trained Model
Add this code to train_model.py
:
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
iris = load_iris()
model = LogisticRegression().fit(iris.data, iris.target)
joblib.dump(model, 'iris_model.pkl')
Run it: python train_model.py
(generates iris_model.pkl
).
Step 3: Build the Flask API
Create app.py
:
from flask import Flask, request, jsonify
import joblib
import numpy as np
app = Flask(__name__)
model = joblib.load('iris_model.pkl')
@app.route('/predict', methods=['POST'])
def predict():
data = request.json
features = np.array([data['sepal_length'], data['sepal_width'],
data['petal_length'], data['petal_width']]).reshape(1, -1)
prediction = model.predict(features)
return jsonify({'species': int(prediction[0])})
if __name__ == '__main__':
app.run(port=5000)
Step 4: Test the API
Send a POST request using curl
(or Postman):
curl -X POST -H "Content-Type: application/json" -d '{"sepal_length":5.1, "sepal_width":3.5, "petal_length":1.4, "petal_width":0.2}' http://localhost:5000/predict
Expected response: {"species":0}
(0 = setosa).
Step 5: Write the 1-Page Report
Template: Deployment Pitfalls Example
- Security: No input validation (e.g., someone could send text instead of numbers).
- Scalability: Flask’s default server can’t handle many users at once.
- Monitoring: No logging to track failed requests or model accuracy drift.
Questions
- What is supervised learning?
- What does an API endpoint do?
- Name one security risk in deploying a model.
Dictionary
- API: A bridge between software applications (e.g., your model and a mobile app).
- Deployment: Making a model usable by others via an interface like an API.
- Flask: A Python library for building web servers.
- Model: A file (e.g.,
.pkl
) containing learned patterns from data. - Pre-trained Model: A model already trained on data (e.g.,
iris_model.pkl
).
Resources
- Flask Quickstart: Official Documentation
- scikit-learn Iris Tutorial: scikit-learn Guide
- Postman API Testing Tool: Download Postman
Debugging
If stuck, debug by:
- Checking ports (is another app using port 5000?).
- Verifying input data shape matches the model’s expectations.
Answers to questions above
- Supervised learning uses labeled data to train models to predict outcomes.
- An API endpoint accepts input data, processes it (e.g., runs a model), and returns a result.
- Risk: Lack of input validation allows malicious data to crash the model.
Credits
-
Deepseek. ↩