IBM-AIU Performance Review

The page Running a Pre-trained Image Classification on IBM-AIU demonstrates how to load and run a pre-trained model on IBM’s AIU for prediction. This page reports the AIU’s performance in comparison with an A100 GPU on the DGX On-Prem cloud, to help users understand the differences in efficiency and speed between these hardware setups.

Performance Comparison on DGX On-Prem and IBM-AIU

Natural Language Processing (NLP) Task: Question Answering

We ran the RoBERTa model from Hugging Face 100 times across two different hardware platforms, DGX On-Prem and IBM-AIU, to measure performance on a single question-answering task (e.g., SQuAD tasks). RoBERTa (a Robustly Optimized BERT Pretraining Approach) is an extension of BERT that improves on its performance by training with larger mini-batches, longer sequences, and dynamically changing masking patterns. Each run used one A100 GPU on DGX On-Prem and one IBM-AIU resource, with the model configured to handle both tokenization and fine-tuning for the SQuAD dataset.
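The measurement procedure described above (100 timed runs, with run #1 reported separately as warm-up) can be sketched roughly as follows. This is a minimal illustration and not the script used for this page: `benchmark` and `fake_inference` are hypothetical names, and the stand-in workload replaces the real model call.

```python
import time
from statistics import mean


def benchmark(run_once, total_runs=100):
    """Time `run_once` total_runs times; treat run #1 as the warm-up."""
    if total_runs < 2:
        raise ValueError("need at least two runs to separate warm-up from steady state")
    durations = []
    for _ in range(total_runs):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return {
        "total_runs": total_runs,
        "avg_with_warmup_s": mean(durations),         # all runs, including run #1
        "warmup_s": durations[0],                     # run #1 only
        "avg_without_warmup_s": mean(durations[1:]),  # runs #2..N
    }


def fake_inference():
    # Hypothetical stand-in for a single model forward pass.
    sum(i * i for i in range(10_000))


stats = benchmark(fake_inference, total_runs=100)
for key, value in stats.items():
    print(key, value)
```

When timing a real GPU model, kernels are launched asynchronously, so a measurement of this kind would typically synchronize the device (for example with `torch.cuda.synchronize()`) before reading the clock.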

| RoBERTa-Base-SQuAD2 | 1 GPU on DGX On-Prem | 1 IBM-AIU |
|---|---|---|
| Total Runs | 100 | 100 |
| Average Inference Time with Warm-up (s) | 0.0098 | 0.2629 |
| Warm-up Time / Run #1 (s) | 0.4642 | 25.8690 |
| Average Inference Time without Warm-up (s) | 0.0053 | 0.0043 |
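The three timing rows for RoBERTa are related by simple arithmetic. The sketch below assumes that "Average Inference Time with Warm-up" is the mean over all 100 runs including run #1, and that "without Warm-up" is the mean over the remaining 99 runs. That reading is an assumption, although the reported numbers are consistent with it. The function name `average_with_warmup` is illustrative.

```python
def average_with_warmup(warmup_s, steady_avg_s, total_runs=100):
    """Mean over all runs, given run #1's time and the mean of the remaining runs."""
    return (warmup_s + (total_runs - 1) * steady_avg_s) / total_runs


# Values taken from the RoBERTa-Base-SQuAD2 results on this page.
gpu = average_with_warmup(warmup_s=0.4642, steady_avg_s=0.0053)
aiu = average_with_warmup(warmup_s=25.8690, steady_avg_s=0.0043)

print(f"GPU: {gpu:.4f} s (reported: 0.0098)")
print(f"AIU: {aiu:.4f} s (reported: 0.2629)")
```

The recomputed values match the reported ones to within the rounding of the inputs. The large AIU average with warm-up is therefore driven almost entirely by the single 25.8690 s first run.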

Image Classification

We ran the ResNet50 model 100 times across two different hardware platforms, DGX On-Prem and IBM-AIU, to measure performance on a single image prediction. Each run used one A100 GPU on DGX On-Prem and one IBM-AIU resource. The input image was preprocessed and rescaled to a size of (3, 224, 224). Here’s what each dimension refers to:

  • 3: number of color channels (Red, Green, Blue).

  • 224: height of the image in pixels.

  • 224: width of the image in pixels.
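The (3, 224, 224) shape above is a channels-first layout. This page does not describe how the image was loaded, but many image libraries decode to (height, width, channels), so a reordering step is a common part of preprocessing. The following pure-Python sketch illustrates only that reordering on a hypothetical constant-colour image; `hwc_to_chw` and `shape` are illustrative helpers, not part of any library.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from (height, width, channels) to (channels, height, width)."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


def shape(nested):
    """Return the dimensions of a regularly nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Hypothetical 224x224 RGB image where every pixel is (255, 128, 0).
hwc = [[[255, 128, 0] for _ in range(224)] for _ in range(224)]
chw = hwc_to_chw(hwc)

print(shape(hwc))  # (224, 224, 3)
print(shape(chw))  # (3, 224, 224)
```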

| ResNet50 for 1 Image | 1 GPU on DGX On-Prem | 1 IBM-AIU |
|---|---|---|
| Total Runs | 100 | 100 |
| Average Inference Time with Warm-up (s) | 0.0090 | 0.1289 |
| Warm-up Time / Run #1 (s) | 0.4200 | 12.6197 |
| Average Inference Time without Warm-up (s) | 0.0048 | 0.0028 |

Video Classification

We ran the ResNet3D model with weights R3D_18_Weights 100 times across two different hardware platforms, DGX On-Prem and IBM-AIU, to assess performance on a single video prediction. Each run used one A100 GPU on DGX On-Prem and one IBM-AIU resource. The input video was preprocessed and rescaled to (3,16,112,112), where each dimension represents:

  • 3: color channels (RGB)

  • 16: frames (time steps)

  • 112: height of each frame in pixels

  • 112: width of each frame in pixels
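To put the video input size in context, the short sketch below counts the elements in the video shape above and in the (3, 224, 224) image shape used for ResNet50. The byte figures assume 4-byte (float32) elements, which this page does not state.

```python
from math import prod

image_shape = (3, 224, 224)      # channels, height, width
video_shape = (3, 16, 112, 112)  # channels, frames, height, width

image_elements = prod(image_shape)
video_elements = prod(video_shape)
ratio = video_elements / image_elements

BYTES_PER_ELEMENT = 4  # assumption: float32 inputs

print(image_elements, image_elements * BYTES_PER_ELEMENT)  # 150528 602112
print(video_elements, video_elements * BYTES_PER_ELEMENT)  # 602112 2408448
print(ratio)  # 4.0
```

Under these assumptions, a single video clip carries four times as many input values as a single image.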

| ResNet3D for 1 Video | 1 GPU on DGX On-Prem | 1 IBM-AIU |
|---|---|---|
| Total Runs | 100 | 100 |
| Average Inference Time with Warm-up (s) | 0.0068 | 0.0731 |
| Warm-up Time / Run #1 (s) | 0.4727 | 4.4354 |
| Average Inference Time without Warm-up (s) | 0.0021 | 0.0291 |
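As a rough way to compare the steady-state results across the three tasks, the sketch below divides the GPU time by the AIU time for the "Average Inference Time without Warm-up" values reported on this page. A ratio above 1 means the AIU was faster per run after warm-up. The dictionary layout and function name are illustrative only.

```python
# (GPU seconds, AIU seconds) for "Average Inference Time without Warm-up".
results = {
    "RoBERTa-Base-SQuAD2": (0.0053, 0.0043),
    "ResNet50": (0.0048, 0.0028),
    "ResNet3D": (0.0021, 0.0291),
}


def gpu_to_aiu_ratio(gpu_s, aiu_s):
    """Ratio of GPU time to AIU time; values above 1 favour the AIU."""
    return gpu_s / aiu_s


ratios = {name: gpu_to_aiu_ratio(*times) for name, times in results.items()}

for name, ratio in ratios.items():
    faster = "IBM-AIU" if ratio > 1 else "A100 GPU"
    print(f"{name}: GPU/AIU time ratio {ratio:.2f} ({faster} faster per run)")
```

With these figures, the AIU is faster per run after warm-up for RoBERTa and ResNet50, while the A100 GPU is faster for ResNet3D. The warm-up run is longer on the AIU in all three cases.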