The webpage "Running a Pre-trained Image Classification on IBM-AIU" demonstrates how to load and run a pre-trained model on IBM’s AIU for prediction. This page reviews the AIU’s performance in comparison with an A100 GPU on the DGX On-Prem cloud, to help users understand the speed differences between the two hardware setups.

Performance Comparison on DGX On-Prem and IBM-AIU

Natural Language Processing (NLP) Task: Question Answering

We ran the RoBERTa model from Hugging Face 100 times on two different hardware platforms, DGX On-Prem and IBM-AIU, to measure performance on a single question answering task (e.g., SQuAD tasks). RoBERTa, a robustly optimized BERT pretraining approach, is an extension of BERT that improves on its performance by training with larger mini-batches, longer sequences, and dynamically changing masking patterns. Each run used one A100 GPU on DGX On-Prem and one IBM-AIU resource, with the model configured to handle both tokenization and fine-tuning for the SQuAD dataset.

RoBERTa-Base-SQuAD2	1 GPU on DGX On-Prem	1 IBM-AIU
Total Runs	100	100
Average Inference Time with Warm-up (s)	0.0098	0.2629
Warm-up Time / Run #1 (s)	0.4642	25.8690
Average Inference Time without Warm-up (s)	0.0053	0.0043
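
The three timing rows above appear to relate as follows: the average "with warm-up" covers all 100 runs, the warm-up time is run #1 alone, and the average "without warm-up" covers runs 2 to 100. The benchmarking script itself is not shown on this page, so the following is only a minimal sketch of how such numbers could be collected. The names `summarize`, `benchmark`, and `run_once` are hypothetical, and `run_once` stands in for one model inference call.

```python
import time
from statistics import mean


def summarize(durations):
    """Reduce per-run durations (seconds) to the three timing rows of the table.

    Assumption: run #1 is the warm-up run.
    """
    if len(durations) < 2:
        raise ValueError("need at least 2 runs to separate warm-up from steady state")
    return {
        "total_runs": len(durations),
        "avg_with_warmup_s": mean(durations),          # all runs, including run #1
        "warmup_s": durations[0],                      # run #1 only
        "avg_without_warmup_s": mean(durations[1:]),   # runs 2..N
    }


def benchmark(run_once, total_runs=100):
    """Call `run_once` total_runs times and time each call with a monotonic clock."""
    durations = []
    for _ in range(total_runs):
        start = time.perf_counter()
        run_once()  # hypothetical: one inference on the platform under test
        durations.append(time.perf_counter() - start)
    return summarize(durations)
```

As a check on this reading, the IBM-AIU column gives (25.8690 + 99 × 0.0043) / 100 ≈ 0.2629 s, which matches the reported average with warm-up. On a GPU, where work may be queued asynchronously, `run_once` would also need to wait for the device to finish before returning, so that the clock covers the full inference.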

Image Classification

We ran the ResNet50 model 100 times on two different hardware platforms, DGX On-Prem and IBM-AIU, to measure performance on a single image prediction. Each run used one A100 GPU on DGX On-Prem and one IBM-AIU resource. The input image was preprocessed and rescaled to a size of (3,224,224). Here’s what each dimension refers to:

3: number of color channels (Red, Green, Blue).
224: height of the image in pixels.
224: width of the image in pixels.
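
To make the channel-first layout concrete, the sketch below counts the values in one input and locates a pixel in a flattened buffer. It assumes a contiguous row-major layout, which the page does not state; `flat_offset` is a hypothetical helper written for this illustration.

```python
from math import prod

IMAGE_SHAPE = (3, 224, 224)  # (channels, height, width), as listed above


def flat_offset(index, shape):
    """Row-major offset of `index` inside a contiguous array of `shape`.

    Assumption: the input is stored contiguously, last dimension varying fastest.
    """
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    offset = 0
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension of size {dim}")
        offset = offset * dim + i
    return offset


num_values = prod(IMAGE_SHAPE)                          # 3 * 224 * 224 = 150528
green_top_left = flat_offset((1, 0, 0), IMAGE_SHAPE)    # start of the Green plane: 50176
last_value = flat_offset((2, 223, 223), IMAGE_SHAPE)    # bottom-right Blue value: 150527
```

In this layout each color channel occupies one full 224 × 224 plane before the next channel begins.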

ResNet50 for 1 Image	1 GPU on DGX On-Prem	1 IBM-AIU
Total Runs	100	100
Average Inference Time with Warm-up (s)	0.0090	0.1289
Warm-up Time / Run #1 (s)	0.4200	12.6197
Average Inference Time without Warm-up (s)	0.0048	0.0028

Video Classification

We ran the ResNet3D model with weights R3D_18_Weights 100 times across two different hardware platforms, DGX On-Prem and IBM-AIU, to assess performance on a single video prediction. Each run used one A100 GPU on DGX On-Prem and one IBM-AIU resource. The input video was preprocessed and rescaled to (3,16,112,112), where each dimension represents:

3: color channels (RGB)
16: frames (time steps)
112: height of each frame in pixels
112: width of each frame in pixels
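
For a sense of scale, the clip shape can be compared with the (3,224,224) image input used in the image classification test. The sketch below does that arithmetic; the 4 bytes per value (float32) is an assumption, since the page does not state the data type, and `input_size` is a hypothetical helper.

```python
from math import prod

CLIP_SHAPE = (3, 16, 112, 112)   # (channels, frames, height, width), as listed above
IMAGE_SHAPE = (3, 224, 224)      # image input from the ResNet50 test
BYTES_PER_VALUE = 4              # assumption: float32 values


def input_size(shape, bytes_per_value=BYTES_PER_VALUE):
    """Return (number of values, size in bytes) for one input of `shape`."""
    count = prod(shape)
    return count, count * bytes_per_value


clip_values, clip_bytes = input_size(CLIP_SHAPE)    # 602112 values, 2408448 bytes
image_values, _ = input_size(IMAGE_SHAPE)           # 150528 values
ratio = clip_values / image_values                  # 4.0
```

One 16-frame clip therefore carries four times as many input values as one image, even though each frame is smaller.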

ResNet3D for 1 Video	1 GPU on DGX On-Prem	1 IBM-AIU
Total Runs	100	100
Average Inference Time with Warm-up (s)	0.0068	0.0731
Warm-up Time / Run #1 (s)	0.4727	4.4354
Average Inference Time without Warm-up (s)	0.0021	0.0291
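
The three tables can be read together with a little arithmetic: how the per-run times compare once warm-up is excluded, and how many runs it would take for a faster per-run time to make up for a longer warm-up. The sketch below uses only the values reported above and a simple linear cost model (total time = warm-up + runs × average time without warm-up), which is an assumption made for illustration; the function names are hypothetical.

```python
# (warm-up time of run #1 in s, average inference time without warm-up in s)
RESULTS = {
    "RoBERTa-Base-SQuAD2": {"gpu": (0.4642, 0.0053), "aiu": (25.8690, 0.0043)},
    "ResNet50": {"gpu": (0.4200, 0.0048), "aiu": (12.6197, 0.0028)},
    "ResNet3D": {"gpu": (0.4727, 0.0021), "aiu": (4.4354, 0.0291)},
}


def steady_state_ratio(task):
    """GPU time per run divided by AIU time per run (above 1 means the AIU is faster)."""
    return RESULTS[task]["gpu"][1] / RESULTS[task]["aiu"][1]


def break_even_runs(task):
    """Runs after warm-up at which the AIU's cumulative time equals the GPU's.

    Returns None when the AIU is not faster per run, so its longer
    warm-up is never recovered under this linear model.
    """
    gpu_warm, gpu_run = RESULTS[task]["gpu"]
    aiu_warm, aiu_run = RESULTS[task]["aiu"]
    saving_per_run = gpu_run - aiu_run
    if saving_per_run <= 0:
        return None
    return (aiu_warm - gpu_warm) / saving_per_run


for task in RESULTS:
    print(task, round(steady_state_ratio(task), 2), break_even_runs(task))
```

Under these assumptions the ratio is roughly 1.2 for RoBERTa and 1.7 for ResNet50 in the AIU's favor, and about 0.07 for ResNet3D, where the GPU is faster per run. The break-even estimates are around 25,400 runs for RoBERTa and 6,100 runs for ResNet50. Because the table values are rounded to four decimals, the RoBERTa figure in particular is only a rough indication.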
