By Yuval Boger
Performance benchmarks provide users with a frame of reference to compare products in the same category. There are many popular classic benchmarking tools, such as MLPerf for machine learning, PassMark for CPUs, and 3DMark for GPUs. It stands to reason that users of quantum computers can benefit from similar tools.
Benchmarks are critical as users struggle to translate hardware characteristics such as gate fidelity, coherence times, and qubit connectivity into meaningful business insights. Ultimately, as fun as the underlying technology is, business users want to know how quickly they can get valuable results from a given computer, or more generally, which computer (if any) is best suited to solving a particular problem. Benchmarks are also useful for validating claims made by vendors, such as claims of gate fidelity or error correction effectiveness, and serve as internal development tools for such vendors.
In fact, several commercial, academic, and standards organizations have begun developing quantum benchmarks, including IBM, QED-C, Super.tech, the Unitary Fund, Sandia National Labs, and Atos.
These benchmarking suites typically fall into two categories: 1) system performance tests, which measure hardware-related parameters such as speed and noise, and 2) application performance tests, which compare simulated reference algorithm results with actual execution results on different hardware platforms.
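The second category can be illustrated with a minimal sketch: score a device run by comparing its measured output distribution against a classically simulated reference distribution. The circuit, counts, and scoring metric below are illustrative assumptions, not taken from any particular suite; Hellinger fidelity is one common choice of comparison metric.

```python
# Sketch of an "application performance" test: compare a device's measured
# output distribution with a classically simulated reference distribution.
from math import sqrt

def hellinger_fidelity(ideal: dict, measured: dict) -> float:
    """Hellinger fidelity between two outcome distributions.

    Both arguments map bitstrings to probabilities. A score of 1.0 means
    the device reproduced the reference distribution exactly.
    """
    outcomes = set(ideal) | set(measured)
    overlap = sum(sqrt(ideal.get(o, 0.0) * measured.get(o, 0.0)) for o in outcomes)
    return overlap ** 2

# Ideal output of a 3-qubit GHZ circuit (a common benchmark circuit):
reference = {"000": 0.5, "111": 0.5}

# Hypothetical noisy counts from a real device, normalized to probabilities:
shots = {"000": 480, "111": 460, "001": 30, "110": 30}
total = sum(shots.values())
device = {k: v / total for k, v in shots.items()}

print(f"Hellinger fidelity: {hellinger_fidelity(reference, device):.3f}")
```

Running the same reference circuit on several devices and comparing the resulting scores is, in essence, what application-oriented suites automate at scale.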
In my opinion, what is missing is a benchmark to determine the best hardware for a custom algorithm or program that an organization has developed. Some call this “predictive benchmarking”: taking into account the known or measured deficiencies of a given platform to predict and recommend the best platform for a given application. Such predictive benchmarking is interesting for two reasons: 1) there can be dramatic differences in execution quality between different quantum computers, and 2) since companies have access to multiple types of machines via quantum cloud providers, it would not be difficult to switch platforms if the results justified it.
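To make the idea concrete, here is a toy sketch of the simplest possible predictor: given published error rates for several backends and the gate counts of a target circuit, estimate a success probability and rank the backends. The vendor names and fidelity figures are invented for illustration, and real predictors would be far more sophisticated; in particular, correlated errors such as crosstalk make this kind of naive independent-error product unreliable in practice.

```python
# Toy "predictive benchmarking": rank hypothetical backends for a custom
# circuit using only their published gate fidelities (invented numbers).

# Hypothetical spec sheets: two-qubit (f2q) and single-qubit (f1q) gate fidelity.
backends = {
    "vendor_a": {"f2q": 0.990, "f1q": 0.9995},
    "vendor_b": {"f2q": 0.995, "f1q": 0.9990},
    "vendor_c": {"f2q": 0.985, "f1q": 0.9999},
}

def estimated_success(spec: dict, n_2q: int, n_1q: int) -> float:
    """Naive estimate: assume independent errors and multiply gate fidelities."""
    return spec["f2q"] ** n_2q * spec["f1q"] ** n_1q

# A custom circuit with 40 two-qubit gates and 120 single-qubit gates:
ranking = sorted(
    backends,
    key=lambda b: estimated_success(backends[b], 40, 120),
    reverse=True,
)
print("Recommended backend:", ranking[0])
```

Deep circuits are dominated by the two-qubit fidelity term, which is why the middle backend wins here despite its weaker single-qubit numbers.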
I recently had the opportunity to discuss benchmarking with Pranav Gokhale, VP of Quantum Software at ColdQuanta (and former CEO of Super.tech, acquired by ColdQuanta). Gokhale and his collaborators began working on benchmarking in mid-2021 and earlier this year released their suite of open-source benchmarks called SupermarQ, along with comparative measurements. SupermarQ includes application-oriented tests in areas such as logistics, finance, chemistry, and encryption, as well as a test to measure error correction performance. Pranav mentioned that a key design goal of their suite was to scale it to large numbers of qubits while maintaining a seemingly contradictory goal of classical verifiability.
I asked Pranav for market feedback on their product. He noted significant commercial and academic interest in benchmarking different algorithms and devices, as well as interest from hardware vendors using SupermarQ to track progress in their hardware development. Interestingly, Pranav reports that SupermarQ results often deviate significantly from predictions based solely on qubit coherence and gate fidelity numbers. He attributes this to imperfections that are often correlated, such as qubit crosstalk. Super.tech therefore believes its suite of benchmarks will help demystify the quantum market by providing real-world performance metrics for quantum computers.
Many hardware vendors might have legitimate claims about the inaccuracy of benchmarking suites. Vendors might claim that they can rewrite and optimize these test applications for their platforms using platform-specific features, native gates, or better configuration of their transpiler. As several recent hackathons and programming competitions have shown, there are numerous ways to implement a given algorithm, sometimes varying by orders of magnitude in efficiency.
In classical machine learning, AlexNet, winner of the ImageNet image-classification competition, revolutionized the field. Suppose quantum computing organizations made a similar effort, providing example datasets and searching for the best quantum solution. Vendors could then demonstrate the power of their quantum platforms with optimal algorithms and settings. Both end users and researchers would benefit from such efforts.
Benchmarks are important. Without them, we would be comparing the proverbial apples and oranges. But quantum benchmarking still seems to be in its infancy.
Yuval Boger is a senior executive in the field of quantum computing. Known as the original “Qubit Guy”, he most recently served as Chief Marketing Officer for Classiq.
October 24, 2022