New Benchmarks Measure AI Performance Across Hardware Systems

MLCommons, an artificial intelligence group, has introduced two new benchmarks designed to measure how quickly advanced hardware and software can run AI applications. The benchmarks come as chipmakers increasingly focus on systems that efficiently support AI tools following the rise of services like OpenAI’s ChatGPT. These tests are part of MLPerf, an industry-standard suite used to evaluate AI system speed. One benchmark uses Meta’s Llama 3.1, a large AI model with 405 billion parameters, and assesses tasks such as general question answering, math, and code generation. The test also evaluates how systems handle large queries and combine data from multiple sources to generate responses.

Nvidia submitted several of its chips for testing, including its latest Grace Blackwell servers, which showed a 2.8 to 3.4 times speed improvement over the previous generation, even when using fewer GPUs. Dell Technologies also participated, while Advanced Micro Devices did not submit results for the larger model test. The second benchmark, also based on a Meta open-source AI model, simulates real-world AI tools like chatbots, focusing on faster response times. MLCommons aims to use these benchmarks to reflect the growing demand for speed and efficiency in AI applications across various platforms.
