Running a Pre-trained Image Classification Model on IBM-AIU

IBM-AIU is optimized for inference and does not currently support model training. At this time, each user is limited to one AIU. This guide explains how to run a pre-trained model, such as ResNet50, on IBM-AIU hardware.

Background of ResNet50

ResNet50 is a deep convolutional neural network (CNN) architecture developed by Microsoft Research in 2015. It is a variant of the ResNet architecture, whose name is short for “Residual Network.” The “50” in the name refers to the depth of the network: 50 layers.

How it works

ResNet50 uses skip connections, which add the input of a group of stacked layers to that group's output. If the stacked layers contribute nothing useful, the block can fall back to the identity function, so adding more layers should not make the model perform worse than a shallower one.
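The skip connection described above can be sketched in plain Python, without any deep learning library. This is only an illustration of the idea y = x + F(x); the function names are hypothetical, and a real ResNet50 block uses convolutions, batch normalization, and ReLU in place of the toy functions here.

```python
def residual_block(x, stacked_layers):
    """Return x + F(x), where F is the function computed by the stacked layers."""
    fx = stacked_layers(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def zero_layers(x):
    # Stacked layers whose weights are all zero: F(x) = 0 for every input.
    return [0.0 for _ in x]


def scaling_layers(x):
    # Toy stand-in for layers that have learned something: F(x) = 0.5 * x.
    return [0.5 * xi for xi in x]


x = [1.0, -2.0, 3.0]

# With F(x) = 0 the block reduces to the identity function.
identity_out = residual_block(x, zero_layers)

# With a non-zero F the block outputs the input plus a learned correction.
learned_out = residual_block(x, scaling_layers)
```

Because the input is always added back, the stacked layers only need to learn a correction to the input rather than the whole mapping.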

How it is used

ResNet50 is often used in transfer learning, where a pre-trained model is fine-tuned on a smaller dataset for a specific task. Without any fine-tuning, a ResNet50 model pre-trained on ImageNet classifies images into 1,000 object categories, which is how it is used in this guide.

Benefits

ResNet50 delivers accurate results with reasonable processing time. Compared with plain CNNs of similar depth that have no skip connections, it converges more easily during training and learns more effectively.

Code

First, import the libraries required for the project.

    import pandas as pd
    import numpy as np
    import torch
    from torchvision import models, transforms
    from PIL import Image, UnidentifiedImageError
    import urllib
    import os
    from torch_sendnn import torch_sendnn  # Import the IBM AIU Compiler Backend

The libraries listed above are pre-installed in the IBM-AIU container. If you need additional libraries that aren't included, please install them as needed. You can install packages using pip after logging into the pod, or directly within your Python code. The following code provides an example of how to check if the library 'openpyxl' is installed and install it if necessary within the Python code.

    import subprocess
    import sys

    # Install the library using subprocess
    def install_package(package):
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])

    def check_and_install(package_name):
        try:
            # Works when the import name matches the package name
            __import__(package_name)
            print(f"{package_name} is already installed.")
        except ImportError:
            print(f"{package_name} is not installed. Installing now...")
            install_package(package_name)
            print(f"{package_name} installed successfully.")

    # Check and install 'openpyxl'
    check_and_install('openpyxl')

Next, we will load and preprocess the test images for inference. In this example, we will use images in JPG format, and we recommend using pictures of animals.

    # Image preprocessing transformation
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    # Function to load and preprocess multiple images
    def load_images(image_paths):
        images = []
        for image_path in image_paths:
            try:
                # Convert to RGB so grayscale or RGBA files also yield 3 channels
                image = Image.open(image_path).convert("RGB")
                images.append(preprocess(image))
            except (IOError, UnidentifiedImageError):
                print(f"Skipping non-image file: {image_path}")  # Skip non-image files
        if not images:
            raise ValueError("No readable images were found.")
        return torch.stack(images)

    # Path of the folder that contains the images to load
    image_folder = "Path to the folder of all images"

    # Collect all images' paths
    image_files = [os.path.join(image_folder, img) for img in os.listdir(image_folder)]

    # Load images and preprocess
    images = load_images(image_files)
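To show what the last two preprocessing steps do to the data, here is a small plain-Python sketch of the arithmetic for a single RGB pixel. It mirrors ToTensor (scale 0..255 to 0..1) followed by Normalize with the same mean and std values as above; the function name is hypothetical and the sketch ignores resizing and cropping.

```python
MEAN = [0.485, 0.456, 0.406]  # per-channel mean used in the pipeline above
STD = [0.229, 0.224, 0.225]   # per-channel standard deviation used above


def normalize_pixel(rgb):
    """Scale one 8-bit RGB pixel to [0, 1], then subtract the mean and divide by std."""
    scaled = [channel / 255.0 for channel in rgb]
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]


# A pixel close to the mean color maps to values near zero in every channel.
near_mean = normalize_pixel((124, 116, 104))

# A pure white pixel maps to clearly positive values.
white = normalize_pixel((255, 255, 255))
```

After normalization the inputs are centered around zero, which is the input range the pre-trained weights expect.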

We also need to download the ImageNet class labels to use with the ResNet50 model. The label file at https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt contains the class labels for the ImageNet dataset, a widely used benchmark in computer vision. Each line in the file holds one class label, and the line's position is the class index. When you run an image through a model pre-trained on ImageNet (like ResNet50), the model outputs one score per class, and the position of the highest score is the predicted class index. You can use that index to look up the corresponding class name in the label file.
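The lookup described above could be sketched as follows. The helper names (download_labels, read_labels, top_k_labels) and the local file name are assumptions made for illustration, not part of the original guide; the download helper needs network access from the pod.

```python
import urllib.request

LABELS_URL = "https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt"


def download_labels(dest="imagenet_classes.txt", url=LABELS_URL):
    """Download the label file to dest and return the local path."""
    urllib.request.urlretrieve(url, dest)
    return dest


def read_labels(path):
    """Return the class names in file order; the list position is the class index."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def top_k_labels(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in ranked[:k]]
```

A typical use would be labels = read_labels(download_labels()), followed by top_k_labels(scores, labels) where scores is the list of per-class scores for one image, for example obtained by calling .tolist() on one row of the model output.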

Now we can load the model and deploy it to the IBM-AIU for inference. In this guide, the backend parameter takes one of two values: 'sendnn', which loads the model onto the IBM-AIU, and 'inductor', which keeps the model on the CPU.
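A minimal sketch of this step is shown below. It assumes that the backend parameter is the one accepted by torch.compile and that importing torch_sendnn (as in the import list earlier) is what makes the 'sendnn' backend available; the helper names and the choice of the IMAGENET1K_V1 weights are also assumptions, so adapt them to your container.

```python
VALID_BACKENDS = ("sendnn", "inductor")


def pick_backend(use_aiu):
    """Return 'sendnn' to target the IBM-AIU, or 'inductor' to stay on the CPU."""
    return "sendnn" if use_aiu else "inductor"


def compile_resnet50(backend):
    """Load a pre-trained ResNet50 and compile it for the chosen backend."""
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}, got {backend!r}")

    # Heavy imports are kept inside the function so the helpers above
    # can be used without PyTorch installed.
    import torch
    from torchvision import models

    if backend == "sendnn":
        # Assumption: this import provides the AIU compiler backend.
        from torch_sendnn import torch_sendnn  # noqa: F401

    # Assumption: ImageNet weights from torchvision; this may download the weights.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    model.eval()  # inference mode: fixes batch-norm statistics
    return torch.compile(model, backend=backend)
```

For example, compile_resnet50(pick_backend(use_aiu=True)) would target the AIU, while passing use_aiu=False gives a CPU version that is useful for comparing results.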

Currently, the AIU requires initial data to start processing, and the amount of this startup data dictates the batch size for prediction. For instance, if one piece of startup data is used, predictions will process one at a time. If two pieces of startup data are used, predictions will process in pairs. In this example, we begin with a single data point.
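Since the startup data fixes the batch size, the inputs for prediction have to be grouped into batches of exactly that size. The plain-Python sketch below shows one way to do this; the helper name is hypothetical, and padding the last batch by repeating its final item is an assumption made here, not something the AIU documentation prescribes.

```python
def fixed_size_batches(items, batch_size):
    """Yield (batch, n_valid) pairs in which every batch has exactly batch_size items.

    The last batch is padded by repeating its final item; n_valid tells the
    caller how many leading entries are real inputs.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        chunk = list(items[start:start + batch_size])
        n_valid = len(chunk)
        chunk += [chunk[-1]] * (batch_size - n_valid)  # pad a short final batch
        yield chunk, n_valid


# Startup with two data points: predictions run in pairs.
pairs = list(fixed_size_batches(["a.jpg", "b.jpg", "c.jpg"], 2))

# Startup with one data point: predictions run one at a time.
singles = list(fixed_size_batches(["a.jpg", "b.jpg", "c.jpg"], 1))
```

When reading the predictions back, only the first n_valid results of each batch should be kept.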

Because we start with a single data point, the prediction step also processes one data point at a time.

We have now successfully executed the model. If you have additional requirements, adjust the code to suit your needs. Please note that uploading test data to the AIU is not yet permitted at the current stage.

More about ResNet50

The pre-trained ResNet50 model typically reaches about 76% to 77% top-1 accuracy and about 93% to 94% top-5 accuracy on the ImageNet dataset.

  • Top-1 Accuracy: This measures the percentage of images for which the model's top prediction (the single most likely class) matches the true label.

  • Top-5 Accuracy: This measures the percentage of images for which the true label is among the model's top five predictions.

These figures can vary slightly depending on the specific implementation and any fine-tuning or additional preprocessing steps that might be applied. Additionally, if the model is evaluated on different datasets or in different contexts, the accuracy may differ.
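The two metrics defined above can be computed with a few lines of plain Python. This is a sketch for small experiments: the function name is hypothetical, and score_rows is assumed to hold one list of per-class scores for each image.

```python
def top_k_accuracy(score_rows, true_indices, k):
    """Return the fraction of images whose true class is among the k highest scores."""
    hits = 0
    for scores, truth in zip(score_rows, true_indices):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_indices)


# Three images, three classes (toy numbers, not real model output).
score_rows = [
    [0.1, 0.6, 0.3],  # top prediction: class 1
    [0.5, 0.2, 0.3],  # top prediction: class 0, second: class 2
    [0.2, 0.3, 0.5],  # top prediction: class 2
]
true_indices = [1, 2, 2]

top1 = top_k_accuracy(score_rows, true_indices, k=1)
top2 = top_k_accuracy(score_rows, true_indices, k=2)
```

In this toy data the second image is wrong under top-1 but counted as correct under top-2, which is why top-5 accuracy is always at least as high as top-1 accuracy.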