
Build an Image Classifier on Azure: VGG16, Hymenoptera Data, and Flask Deployment
Learn how to build an end-to-end machine learning solution on Azure: train a VGG16 model on the Hymenoptera dataset for insect image classification, deploy it as an API endpoint, and create a Flask web application to consume the model. This tutorial covers data handling, model training, registration, deployment, and web integration—all within the Azure ecosystem.
Get the Data, Train and Register the Model
Launch Azure Machine Learning Studio and open a notebook. The basic setup needed for this tutorial is covered here:
Create a folder named Hymenoptera and, within it, create a file named hymenoptera.ipynb. Open the .ipynb file, select the compute instance you created in the previous tutorial, and choose the Python 3.8 – AzureML kernel.
Imports
import torch
import torchvision
import kagglehub
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms
import os
Get the data from Kaggle
We will download the dataset from Kaggle using the kagglehub package.
# Download latest version
path = kagglehub.dataset_download("ajayrana/hymenoptera-data")
print("Path to dataset files:", path)
Finding the right folder
from pathlib import Path
def get_folders(path):
    # Convert the input path to a Path object
    p = Path(path)
    # Check if the path exists; if not, raise an error
    if not p.exists():
        raise FileNotFoundError(f"The path {path} does not exist.")
    # Check if the path is a directory; if not, raise an error
    if not p.is_dir():
        raise NotADirectoryError(f"The path {path} is not a directory.")
    # Return a sorted list of folder names in the path
    return sorted(item.name for item in p.iterdir() if item.is_dir())
get_folders(path)
dataset_path = path + '/hymenoptera_data'
Define train and validation directories
# Define train and validation directories
train_dir = os.path.join(dataset_path, "train")
val_dir = os.path.join(dataset_path, "val")
print("Train Directory:", train_dir)
print("Validation Directory:", val_dir)
# Check if the directories exist
assert os.path.isdir(train_dir), "Train directory not found!"
assert os.path.isdir(val_dir), "Validation directory not found!"
print("Classes in train folder:", os.listdir(train_dir))
print("Classes in val folder:", os.listdir(val_dir))
Data Transformations
data_transforms = {
"train": transforms.Compose([
transforms.Resize((224, 224)),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
]),
"val": transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
]),
}
Data Creation
image_datasets = {
"train": datasets.ImageFolder(train_dir, data_transforms["train"]),
"val": datasets.ImageFolder(val_dir, data_transforms["val"]),
}
Data Loading
dataloaders = {
"train": torch.utils.data.DataLoader(image_datasets["train"], batch_size=32, shuffle=True),
"val": torch.utils.data.DataLoader(image_datasets["val"], batch_size=32, shuffle=False),
}
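Before training, it can help to pull a single batch from the train loader to confirm that shapes and labels look right — an optional check:
# Optional: fetch one batch to verify tensor shapes and labels
sample_inputs, sample_labels = next(iter(dataloaders["train"]))
print("Batch shape:", sample_inputs.shape)   # expected: torch.Size([32, 3, 224, 224])
print("First labels:", sample_labels[:8])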
Class Names
class_names = image_datasets["train"].classes
# The class names are extracted from the training dataset for later use (e.g., mapping predicted indices back to label names).
Device Selection
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Model Loading
# Load VGG16 model
model = models.vgg16(pretrained=True)
Classifier Modification
# Modify the classifier
num_ftrs = model.classifier[6].in_features
model.classifier[6] = nn.Linear(num_ftrs, len(class_names))
Device Assignment
model = model.to(device)
Loss and optimizer setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)
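Note that the optimizer above only updates the classifier parameters. Optionally, you can also freeze the convolutional backbone so that no gradients are computed for it at all, which saves memory and time during fine-tuning — a minimal sketch:
# Optional: freeze the convolutional feature extractor (only the new classifier is trained)
for param in model.features.parameters():
    param.requires_grad = False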
Training Loop
num_epochs = 5
for epoch in range(num_epochs):
    print(f"Epoch {epoch+1}/{num_epochs}")
    for phase in ["train", "val"]:
        if phase == "train":
            model.train()
        else:
            model.eval()
        running_loss, correct = 0.0, 0
        for inputs, labels in dataloaders[phase]:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            with torch.set_grad_enabled(phase == "train"):
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                if phase == "train":
                    loss.backward()
                    optimizer.step()
            running_loss += loss.item() * inputs.size(0)
            correct += (outputs.argmax(1) == labels).sum().item()
        epoch_loss = running_loss / len(image_datasets[phase])
        epoch_acc = correct / len(image_datasets[phase])
        print(f"{phase} Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}")
print("Training complete.")
Save the model
model_path = "vgg16_hymenoptera.pth"
torch.save(model.state_dict(), model_path)
print("Model saved.")
Register the model in the current workspace
from azureml.core import Workspace
from azureml.core.model import Model
ws = Workspace.from_config()
Model.register(
    workspace=ws,
    model_path=model_path,
    model_name="vgg16_hymenoptera",
    description="Fine-tuned VGG16 model for hymenoptera classification",
)
print("Model registered in Azure ML.")
Endpoint Creation
Create a new notebook named deploy_model.ipynb and add the following cells.
Score.py
This script (score.py) defines a model inference pipeline for deployment on Azure ML. It includes two main functions: init() to load a pre-trained VGG16 model fine-tuned for Hymenoptera classification (ants vs. bees), and run() to process image input and return predictions. The script supports JSON input with image data, preprocesses it, performs inference, and returns the predicted class label.
%%writefile score.py
import json
import torch
import torch.nn as nn
import torchvision.transforms as transforms
from torchvision import models
from PIL import Image
import os
import io
# Define class labels
CLASS_NAMES = ["ant", "bee"]
# Load model at initialization
def init():
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR", ""), "vgg16_hymenoptera.pth")
    print(f"AZUREML_MODEL_DIR: {os.getenv('AZUREML_MODEL_DIR')}")
    print(f"Model path: {model_path}")
    if not os.path.exists(model_path):
        print("Error: Model file not found!")
        return
    model = models.vgg16(pretrained=False)
    model.classifier[6] = nn.Linear(4096, 2)  # Adjust for 2 classes
    model.load_state_dict(torch.load(model_path, map_location=torch.device("cpu")))
    model.eval()
    print("Model loaded successfully.")
# Inference function
def run(raw_data):
    try:
        print("Received data for inference.")
        # Parse JSON input
        data = json.loads(raw_data)
        image_bytes = data.get("image")  # Expected: a JSON list of raw byte values
        if image_bytes is None:
            return json.dumps({"error": "No image found in request."})
        print("Image data received.")
        # Convert image bytes to PIL Image
        image = Image.open(io.BytesIO(bytearray(image_bytes))).convert("RGB")
        # Preprocess image
        transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ])
        img_tensor = transform(image).unsqueeze(0)  # Add batch dimension
        # Perform inference
        with torch.no_grad():
            outputs = model(img_tensor)
            _, predicted = torch.max(outputs, 1)
        # Return human-readable class label
        predicted_label = CLASS_NAMES[predicted.item()]
        print(f"Predicted label: {predicted_label}")
        return json.dumps({"prediction": predicted_label})
    except Exception as e:
        print(f"Error during inference: {e}")
        return json.dumps({"error": str(e)})
# Run test when executed as a script
if __name__ == "__main__":
print("Initializing model...")
init()
# Test inference with a sample image
test_image_path = "1.png" # Replace with an actual image file for testing
if os.path.exists(test_image_path):
with open(test_image_path, "rb") as f:
image_bytes = f.read()
print("Running inference test...")
response = run(json.dumps({"image": list(image_bytes)}))
print("Inference result:", response)
else:
print(f"Test image '{test_image_path}' not found. Please add a test image to verify inference.")
- Initialization (init): Loads the VGG16 model from a file (vgg16_hymenoptera.pth) in the Azure ML model directory, adjusts the classifier for two classes, and sets it to evaluation mode.
- Inference (run): Accepts JSON input with image bytes, preprocesses the image (resize, normalize), runs it through the model, and returns the predicted label (“ant” or “bee”) or an error message.
- Testing: Includes a test block to verify functionality with a local image file (1.png) when run as a script.
- Environment: Uses PyTorch, PIL, and standard libraries, designed for Azure ML endpoint deployment.
Put an image of a bee or an ant in the same folder (in the code above it is 1.png) and run the following command.
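From a notebook cell, this would be something like the following (assuming vgg16_hymenoptera.pth and the test image sit in the same folder as score.py):
!python score.py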
environment.yml
This environment.yml file defines a Conda environment named pytorch-env for a PyTorch-based Azure ML deployment. It specifies Python 3.8 and key dependencies like PyTorch, TorchVision, and Azure ML libraries, along with additional packages required for model inference, API creation, and deployment.
%%writefile environment.yml
name: pytorch-env
dependencies:
  - python=3.8
  - joblib
  - pip
  - pip:
      - torch==2.1.0
      - torchvision==0.16.0
      - pillow
      - azureml-sdk
      - azure-ai-ml  # Required for SDK v2
      - azureml-defaults  # Required for deployment
      - azureml-inference-server-http  # Required for scoring
      - fastapi  # Required for API
      - uvicorn  # Required for API server
- Base Environment: Uses Python 3.8 as the runtime.
- Core Dependencies: Includes joblib (via Conda) and PyTorch ecosystem packages (torch==2.1.0, torchvision==0.16.0, pillow) via pip for model training/inference and image processing.
- Azure ML Support: Installs azureml-sdk, azure-ai-ml (SDK v2), azureml-defaults, and azureml-inference-server-http for model registration, deployment, and scoring on Azure ML.
- API Requirements: Adds fastapi and uvicorn to enable a lightweight API server for the deployed endpoint.
deploy.py
This deploy.py script deploys a registered VGG16 model (vgg16_hymenoptera) as a web service on Azure Container Instances (ACI) using Azure ML. It loads the workspace, retrieves the model, sets up the environment from environment.yml, configures the inference script (score.py), and deploys the service, then prints the deployment status and endpoint URL.
%%writefile deploy.py
from azureml.core import Workspace, Model, Environment
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig
# Load the Azure ML workspace
ws = Workspace.from_config()
# Get registered model
model = Model(ws, name="vgg16_hymenoptera")
# Define environment
env = Environment.from_conda_specification(name="pytorch-env", file_path="environment.yml")
# Define inference config
inference_config = InferenceConfig(entry_script="score.py", environment=env)
# Define deployment config for Azure Container Instance (ACI)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=2)
# Deploy model as a web service
service = Model.deploy(
    workspace=ws,
    name="vgg16-serviceee",
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config
)
service.wait_for_deployment(show_output=True)
# Print deployment info
print(f"Service State: {service.state}")
print(f"Scoring URI: {service.scoring_uri}")
- Workspace & Model: Connects to an Azure ML workspace and retrieves the registered model “vgg16_hymenoptera”.
- Environment: Uses the pytorch-env Conda environment defined in environment.yml.
- Inference Config: Links the score.py script with the environment for scoring.
- Deployment Config: Specifies ACI with 1 CPU core and 2 GB of memory.
- Service Deployment: Deploys the model as “vgg16-serviceee”, waits for completion, and outputs the service state and scoring URI.
Run deploy.py
!python deploy.py
It will take some time to deploy.
After a successful deployment, you will see output similar to:
Running
2025-03-24 07:09:26+00:00 Registering the environment.
2025-03-24 07:09:32+00:00 Building image..
2025-03-24 07:23:32+00:00 Generating deployment configuration..
2025-03-24 07:23:34+00:00 Submitting deployment to compute..
2025-03-24 07:23:40+00:00 Checking the status of deployment vgg16-service..
2025-03-24 07:26:41+00:00 Checking the status of inference endpoint vgg16-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"
Service State: Healthy
Scoring URI: http://26f07f70-646a-40e6-bb4a-8419dcc7efc4.eastus2.azurecontainer.io/score
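Before building the web app, you can test the live endpoint directly with a short script. This is a minimal sketch: replace the URI with your own scoring URI from the deployment output, and test.jpg with any local ant or bee image.
import json
import requests

# Hypothetical values: substitute your own scoring URI and a local test image
scoring_uri = "http://<your-aci-endpoint>.eastus2.azurecontainer.io/score"
with open("test.jpg", "rb") as f:
    image_bytes = f.read()

# Same payload format that score.py expects: a JSON list of byte values
payload = json.dumps({"image": list(image_bytes)})
response = requests.post(scoring_uri, data=payload, headers={"Content-Type": "application/json"})
print(response.status_code, response.text)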
Flask Web Application
On your local machine, create a folder named FLASK_AZURE with the following files.

app.py
This Flask application creates a web interface to upload images and get predictions from an Azure ML endpoint hosting a VGG16 model. It handles file uploads, converts images to bytes, sends them to the endpoint, and displays the predicted class (e.g., “ant” or “bee”) on an HTML page.
from flask import Flask, request, render_template
import requests
import json
from PIL import Image
import io
app = Flask(__name__)
# Azure ML Endpoint URL
AZURE_ENDPOINT = "http://c6b3572a-9125-41ed-b730-28ffe7be14f6.eastus2.azurecontainer.io/score" # Replace with your actual endpoint URL
@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        if 'file' not in request.files:
            return render_template('index.html', prediction="No file uploaded.")
        file = request.files['file']
        if file.filename == '':
            return render_template('index.html', prediction="No file selected.")
        image = Image.open(file.stream).convert("RGB")
        img_bytes = io.BytesIO()
        image.save(img_bytes, format='JPEG')
        img_bytes = img_bytes.getvalue()
        # Send image to Azure ML endpoint
        headers = {'Content-Type': 'application/json'}
        payload = json.dumps({"image": list(img_bytes)})
        response = requests.post(AZURE_ENDPOINT, data=payload, headers=headers)
        try:
            response_json = response.json()  # Ensure we parse JSON safely
            if isinstance(response_json, dict):  # Check if it's a dictionary
                prediction = response_json.get("prediction", "Unknown")
            else:
                prediction = f"Unexpected response: {response_json}"
        except requests.exceptions.JSONDecodeError:
            prediction = f"Invalid JSON response: {response.text}"
        return render_template('index.html', prediction=f"Predicted: {prediction}")
    return render_template('index.html', prediction=None)

if __name__ == '__main__':
    app.run(debug=True)
- Flask Setup: Defines a Flask app with a single route (/) for both GET and POST requests.
- Image Processing: On POST, it checks for an uploaded file, opens it as a PIL Image, converts it to JPEG bytes, and prepares it as JSON payload.
- Endpoint Call: Sends the image bytes to the Azure ML scoring URI (AZURE_ENDPOINT) via a POST request and retrieves the prediction.
- Response Handling: Parses the JSON response safely, extracts the prediction, and renders it on index.html; handles errors gracefully.
- Run: Launches the app in debug mode for local testing.
- Replace the AZURE_ENDPOINT URL with the actual scoring URI from your deployed Azure ML service (e.g., from deploy.py output).
- Requires an index.html template (not shown) with a file upload form and a placeholder for the prediction result.
index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Image Classification</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            text-align: center;
            margin: 50px;
        }
        form {
            margin-top: 20px;
        }
        input[type="file"] {
            margin: 10px 0;
        }
        .result {
            margin-top: 20px;
            font-size: 20px;
            font-weight: bold;
            color: #333;
        }
    </style>
</head>
<body>
    <h1>Upload an Image for Classification</h1>
    <form action="/" method="post" enctype="multipart/form-data">
        <input type="file" name="file" accept="image/*" required>
        <br>
        <button type="submit">Upload and Predict</button>
    </form>
    {% if prediction %}
    <div class="result">{{ prediction }}</div>
    {% endif %}
</body>
</html>
Now, run the following command.
python app.py
You might need to install a few Python libraries first.
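The app imports Flask, Requests, and Pillow, so at a minimum something like:
pip install flask requests pillow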
The output will look something like:

Deploying Flask Application on Azure
Create a requirements.txt in same folder as app.py.
Flask
gunicorn
requests
pillow
Log in to Azure
az login
Create a resource group
az group create --name FlaskRGroup --location eastus
Create an Azure App Service plan
az appservice plan create --name FlaskPlan --resource-group FlaskRGroup --sku F1
Direct Deployment
az webapp up --name FlaskAppAzure --resource-group FlaskRGroup --runtime "PYTHON:3.8"
Your terminal output will look something like:

Go to the URL shown in the output:

You might want to stop the web app, or delete it completely, when it is not in use:
az webapp stop --name FlaskAppAzure --resource-group FlaskRGroup
az webapp start --name FlaskAppAzure --resource-group FlaskRGroup
az webapp delete --name FlaskAppAzure --resource-group FlaskRGroup
Also remember to delete the ACI endpoint and stop the compute instance when they are not in use.
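For example, from a notebook in your workspace you could do something like the following (a sketch, assuming the service name used earlier and a placeholder compute instance name):
from azureml.core import Workspace
from azureml.core.webservice import Webservice
from azureml.core.compute import ComputeInstance

ws = Workspace.from_config()

# Delete the ACI scoring endpoint created by deploy.py
Webservice(ws, name="vgg16-serviceee").delete()

# Stop the compute instance (replace with the name of your instance)
ComputeInstance(workspace=ws, name="<your-compute-instance>").stop()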