Programming Contest for Best Summary Report Generation for MLCommons Inference Results version 1.1
closed with the note: Everyone is busy

The official MLCommons inference results for Round 1.1 were released on September 22, in the form of spreadsheets. So this task is to generate the best summary report for the MLCommons results.

Just some background

An MLCommons inference submission is like the Olympics for AI hardware/software vendors, whereby they demonstrate their performance in a peer-reviewed manner.

Submission Divisions

  • Closed (More restrictions and guidelines to be followed and provides an opportunity for hardware comparisons)
  • Open (More relaxed and aimed at showcasing specific hardware/software optimizations)
  • Both the above are possible with and without power measurement

Submission Categories

  • Datacenter (Offline (throughput) and Server (load handling when inputs arrive following a Poisson distribution) scenarios are compulsory for the Closed division)
  • Edge (Offline and SingleStream (average latency for a single query) scenarios are compulsory for the Closed division)

For this task we will focus ONLY on Closed division.

  • Metric to be considered: “Samples/s” (Singlestream Latency need not be considered)

Benchmarks

  1. ResNet
  2. SSD-small (Edge only)
  3. SSD-Large
  4. 3D-UNet
  5. RNN-T
  6. BERT-99
  7. BERT-99.9 (Datacenter only)
  8. DLRM-99 (Datacenter only)
  9. DLRM-99.9 (Datacenter only)

So, 6 benchmarks in total for edge and 8 benchmarks in total for datacenter.

So, we’ll have the following comparisons (44 in total) for Samples/s:

  1. Datacenter Offline with Power * 8
  2. Datacenter Server with Power * 8
  3. Edge Offline with Power * 6
  4. Datacenter Offline without Power * 8
  5. Datacenter Server without Power * 8
  6. Edge Offline without Power * 6
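The six groups above can be enumerated programmatically. A minimal sketch (the benchmark lists come from the tables above; everything else is hypothetical naming):

```python
# Enumerate the 44 Samples/s comparisons for the Closed division.
DATACENTER = ["ResNet", "SSD-Large", "3D-UNet", "RNN-T",
              "BERT-99", "BERT-99.9", "DLRM-99", "DLRM-99.9"]
EDGE = ["ResNet", "SSD-small", "SSD-Large", "3D-UNet", "RNN-T", "BERT-99"]

comparisons = []
for power in ("with power", "without power"):
    for scenario in ("Offline", "Server"):  # 8 datacenter benchmarks each
        comparisons += [("Datacenter", scenario, power, b) for b in DATACENTER]
    comparisons += [("Edge", "Offline", power, b) for b in EDGE]  # 6 edge benchmarks

print(len(comparisons))  # 44
```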

More Info: https://github.com/mlcommons/inference_policies/blob/master/inference_rules.adoc

Additional Metrics to be Considered

  1. Performance per Watt (only under Closed-Power and so 22 comparisons in total)
  2. Server/Offline Ratio (8 with power and 8 without power)

Hence in total we’ll have 44 + 22 + 16 = 82 comparisons for Version 1.1
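The total can be double-checked with simple arithmetic:

```python
# Counts from the section above: 8 datacenter and 6 edge benchmarks.
samples_per_s  = 2 * (8 + 8 + 6)   # Samples/s: with & without power -> 44
perf_per_watt  = 8 + 8 + 6         # Closed-Power only               -> 22
server_offline = 2 * 8             # Datacenter, with & without power -> 16
total = samples_per_s + perf_per_watt + server_offline
print(total)  # 82
```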

Historical Best

For this we need to consider the results of all 4 rounds which have happened so far (0.5, 0.7, 1.0 and 1.1) together. Some scenarios changed across the rounds (the Server scenario was not present in v0.5, for example) and such cases can be ignored. So, this will also produce 82 comparisons, as for Version 1.1. Thus in total we’ll have 164 comparisons.
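One way to build the historical-best comparisons is to pool the rows from all four rounds before applying the same best-per-accelerator logic. A minimal sketch with hypothetical row dicts (rounds that lack a scenario simply contribute no rows):

```python
def pool_rounds(rounds):
    """Merge per-round result lists, tagging each row with its round version.

    `rounds` maps a version string ("0.5", "0.7", "1.0", "1.1") to a list of
    result dicts for one comparison.
    """
    return [dict(row, round=version)
            for version, rows in rounds.items()
            for row in rows]

pooled = pool_rounds({
    "1.0": [{"accelerator": "NVIDIA A100-PCIE-40GB", "samples_per_s": 100.0}],
    "1.1": [{"accelerator": "NVIDIA A100-PCIE-40GB", "samples_per_s": 120.0}],
})
print(len(pooled))  # 2
```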

Some constraints for the comparisons

  • For a given accelerator, we consider only the best score (whichever row gives the maximum metric value, which can change depending on the metric used for comparison) and ignore all other results using the same accelerator
  • For each comparison graph, we need to compare the (up to) top 10 values (top 10 accelerators), with the X-axis showing just the accelerator name (like NVIDIA A100-PCIE-40GB) but also linking to the submission ID like “1.1-006” (this can be in an additional table) so that other submission details can be looked up
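Both constraints reduce to a small grouping step once the spreadsheets are loaded into row dicts (e.g. via csv or pandas). A pure-Python sketch; the function and field names are hypothetical:

```python
def top_accelerators(rows, metric, n=10):
    """Keep each accelerator's best row by `metric`, then return the top n."""
    best = {}
    for row in rows:
        acc = row["accelerator"]
        # Keep only the maximum-metric row per accelerator.
        if acc not in best or row[metric] > best[acc][metric]:
            best[acc] = row
    return sorted(best.values(), key=lambda r: r[metric], reverse=True)[:n]

rows = [
    {"accelerator": "NVIDIA A100-PCIE-40GB", "submission_id": "1.1-006", "samples_per_s": 120.0},
    {"accelerator": "NVIDIA A100-PCIE-40GB", "submission_id": "1.1-007", "samples_per_s": 100.0},
    {"accelerator": "NVIDIA T4",             "submission_id": "1.1-008", "samples_per_s": 50.0},
]
print([r["submission_id"] for r in top_accelerators(rows, "samples_per_s")])
# ['1.1-006', '1.1-008']
```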

Presentation of Results 

  1. The results must be presented ordered by benchmark, then by category, then by with/without power. For example, for ResNet we will do as follows:
    1. Offline Samples/s for Datacenter w Power
    2. Performance per Watt for Datacenter Offline
    3. Server Samples/s for Datacenter w Power
    4. Performance per Watt for Datacenter Server
    5. Server/Offline for Datacenter w Power
    6. Offline Samples/s for Datacenter without Power
    7. Server Samples/s for Datacenter without Power
    8. Server/Offline for Datacenter without Power
    9. Offline Samples/s for Edge w Power
    10. Performance per Watt for Edge Offline
    11. Offline Samples/s for Edge without Power
  2. The results can be presented in either HTML or PDF form
  3. Scripts to produce the results should be made available on GitHub. Email your results, including the GitHub link and a README to reproduce them, to “[email protected] dot com”
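If the HTML route is taken, each comparison needs its chart plus the accelerator-to-submission-ID table mentioned in the constraints. A minimal stdlib sketch of the table part (function name and row keys are hypothetical; a real report would pair this with a charting library):

```python
def html_section(title, rows):
    """Render one comparison as an HTML table linking accelerators to submission IDs.

    `rows` are assumed sorted best-first and to carry the keys
    "accelerator", "submission_id", and "value" (the metric being compared).
    """
    body = "".join(
        f"<tr><td>{r['accelerator']}</td><td>{r['submission_id']}</td>"
        f"<td>{r['value']:.1f}</td></tr>"
        for r in rows
    )
    return (f"<h2>{title}</h2><table>"
            "<tr><th>Accelerator</th><th>Submission ID</th><th>Samples/s</th></tr>"
            f"{body}</table>")

section = html_section(
    "ResNet: Offline Samples/s, Datacenter, with power",
    [{"accelerator": "NVIDIA A100-PCIE-40GB",
      "submission_id": "1.1-006", "value": 120.0}],
)
```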

Evaluation

  1. Presentation of results will be given high consideration in contest evaluation
  2. Automation is key: if you try to do this task manually, you are likely to fail
  3. Unfortunately there’ll be only one winner, but this task will teach you a lot. The winner will get a Rs. 15,000 cash prize from GATE Overflow. If equally good submissions are made, the prize money will be shared
  4. For any clarification regarding the contest, please comment under this blog
  5. Deadline for submission: 9:00 pm IST, October 3

MLCommons v1.1 submission results can be viewed at the following links (use the dropdown for other rounds)

  1. https://mlcommons.org/en/inference-datacenter-11/
  2. https://mlcommons.org/en/inference-edge-11/
posted in From GO Admins Sep 24, closed Oct 4
1 Comment

Good to see that unemployment rate in India is 0 now 😀 Closing this post.
