Training NISQ QNNs

"Training Quantum Neural Networks on NISQ Devices"

Published by Kerstin Beer, Daniel List, Gabriel Müller, Tobias J. Osborne, Christian Struckmann (Leibniz Universität Hannover), 18th April 2021

Interest in combining quantum computing and machine learning (ML) has grown in recent years. On the one hand, classical machine learning tools are used to improve aspects of quantum computation; on the other hand, quantum algorithms are designed to enhance parts of machine learning strategies. While provable quantum speedups have been identified for some specific ML tasks, executing them in practice requires fault-tolerant quantum computers (FTQC), which are not yet available. A growing body of work is now exploring quantum machine learning models implemented as parameterized quantum circuits, whose parameters are variationally optimized in a hybrid classical-quantum feedback loop. Such variational algorithms are expected to be better suited to near-term, noisy intermediate-scale quantum (NISQ) computers. Among these architectures, quantum neural networks (QNNs) are some of the most prominent; they are used, for example, to learn unitaries, perform classification tasks, solve differential equations, and reduce the level of noise in quantum data.

As a side note, we would like to point out that, in principle, any architecture that combines the concepts of quantum computing and artificial neural networks at a high level can be identified as a quantum neural network. In early developments, QNNs were designed by directly translating each component of a classical neural network into a suitable quantum counterpart. However, this is not always feasible. The most notable example is the non-linear activation function common in classical NNs, which has no direct counterpart because regular unitary quantum dynamics is linear. Introducing non-linearities requires measurement, controlled decoherence, or circuit feedback. With such processes one may construct a ‘quantum perceptron’. Other architectures considered to fall within the QNN category include quantum Boltzmann machines and variationally parametrized circuits such as the Quantum Approximate Optimization Algorithm (QAOA). More recently, kernel methods and nonlinear quantum feature maps have come to be seen as interesting alternatives to perceptron-style QNNs. Whether any of these example architectures should be called a QNN is semantically interesting, but in the end it matters more whether they can solve ML tasks well.

Despite their many advantages, QNN architectures still face many limitations on NISQ devices. One commonly encountered limitation is the presence of barren plateaus in gradient-based training: the cost landscape becomes flat, the gradients vanish, and the algorithm cannot find a path towards the optimum. In addition, the high noise levels in deeper quantum circuits limit the accuracy with which costs and gradients can be computed.
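The flattening of the landscape can be illustrated with a small toy model. The sketch below is not one of the circuits studied in the paper: it assumes n qubits, each rotated by a single Ry gate, and a global cost equal to the expectation value of Z on every qubit, which gives C(theta) = cos(theta_1) * ... * cos(theta_n). For uniformly random angles the variance of one partial derivative is then 2^(-n), so the gradient signal shrinks exponentially with the number of qubits. The function name and sample size are illustrative choices.

```python
import numpy as np

def gradient_variance(n_qubits, n_samples=20000, seed=0):
    """Monte Carlo estimate of Var[dC/dtheta_0] for C = prod_i cos(theta_i)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, size=(n_samples, n_qubits))
    # dC/dtheta_0 = -sin(theta_0) * prod_{i>0} cos(theta_i)
    grad = -np.sin(theta[:, 0]) * np.prod(np.cos(theta[:, 1:]), axis=1)
    return float(np.var(grad))

# Estimated variances for a few system sizes; the analytic value is 0.5 ** n.
variances = {n: gradient_variance(n) for n in (2, 4, 6, 8)}
for n, v in variances.items():
    print(f"n = {n}: estimated {v:.5f}, analytic {0.5 ** n:.5f}")
```

In this toy setting, resolving a gradient of that size from a finite number of measurement shots quickly becomes impractical as n grows, which is the practical meaning of a barren plateau.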

In this work, the authors present a comparative analysis of two QNN architectures: the Dissipative Quantum Neural Network (DQNN), whose building block (a ‘perceptron’) is a completely positive map, and the QAOA. Both are implemented on IBM’s NISQ devices via Qiskit. The objective is to evaluate the performance of both methods on tasks such as learning an unknown unitary operator.
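To make the phrase "completely positive map" concrete, the following NumPy sketch builds a minimal dissipative perceptron under simplifying assumptions: one input qubit, one output qubit initialized in |0>, a random two-qubit unitary standing in for the trained perceptron, and a partial trace that discards the input qubit. The helper names random_unitary and dissipative_layer are hypothetical and do not come from the paper or from Qiskit.

```python
import numpy as np

def random_unitary(dim, rng):
    """Draw a random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    # Rescale each column by a unit phase; q stays unitary.
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases

def dissipative_layer(rho_in, unitary):
    """Map a 1-qubit input state to a 1-qubit output state.

    The output qubit starts in |0>, the unitary acts on both qubits,
    and the input qubit is then traced out (discarded).
    """
    zero = np.array([[1, 0], [0, 0]], dtype=complex)
    joint = np.kron(rho_in, zero)                 # ordering: input (x) output
    joint = unitary @ joint @ unitary.conj().T
    # Indices after reshape: (in, out, in', out'); trace over the input pair.
    joint = joint.reshape(2, 2, 2, 2)
    return np.einsum("iaib->ab", joint)

rng = np.random.default_rng(0)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho_in = np.outer(plus, plus.conj())
rho_out = dissipative_layer(rho_in, random_unitary(4, rng))
print(np.round(rho_out, 3))
```

Because the input qubit is discarded, the output is in general a mixed state even for a pure input: the map is completely positive and trace preserving, but not unitary.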

In the DQNN, perceptron maps act on layers of different qubits, whereas the QAOA defines them as a sequence of operations on the same qubits. The networks are implemented using 6 qubits for the DQNN and 4 qubits for the QAOA, including initialization and measurement. Training was executed in a hybrid manner: at each epoch, the cost was evaluated on the quantum device and then used to update the parameters classically. The cost is maximized during training, so higher values are better; in an ideal (noise-free) case, the training cost should increase monotonically for the chosen parameters. In this work, the DQNN is shown to reach higher validation costs than the QAOA. Another contrast is that the DQNN’s validation cost increases with the number of training pairs, while the QAOA’s validation cost is approximately uniformly distributed around its mean. The results show that both networks are capable of generalizing from the available information despite the high noise levels. However, the generalization capability of the DQNN is more reliable than that of the QAOA.
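The hybrid loop described above can be sketched on a single simulated qubit. This is a toy stand-in, not the 6- or 4-qubit networks of the paper: the three-angle ansatz, the fidelity cost, the learning rate of 0.5, and the use of the parameter-shift rule for the gradient are all assumptions made here for illustration, and the "quantum execution" is replaced by an exact NumPy calculation.

```python
import numpy as np

rng = np.random.default_rng(1)

def rz(angle):
    return np.array([[np.exp(-0.5j * angle), 0], [0, np.exp(0.5j * angle)]])

def ry(angle):
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def model_unitary(theta):
    # Hypothetical 3-parameter ansatz: Rz * Ry * Rz.
    return rz(theta[0]) @ ry(theta[1]) @ rz(theta[2])

def random_state(rng):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)

def cost(theta, pairs):
    """Mean fidelity between the network output and the target output.

    On hardware this would be estimated from measurement shots; here it is
    computed exactly as a stand-in for the quantum execution.
    """
    u = model_unitary(theta)
    return float(np.mean([abs(np.vdot(out, u @ inp)) ** 2 for inp, out in pairs]))

def gradient(theta, pairs):
    """Parameter-shift gradient, exact for gates of the form exp(-i*angle*P/2)."""
    grad = np.zeros_like(theta)
    for k in range(len(theta)):
        shift = np.zeros_like(theta)
        shift[k] = np.pi / 2
        grad[k] = 0.5 * (cost(theta + shift, pairs) - cost(theta - shift, pairs))
    return grad

# "Unknown" target unitary, used only to generate training and validation pairs.
target = model_unitary(rng.uniform(0, 2 * np.pi, size=3))

def make_pairs(n):
    return [(s, target @ s) for s in (random_state(rng) for _ in range(n))]

train_pairs, val_pairs = make_pairs(4), make_pairs(10)

theta = np.zeros(3)
train_costs = [cost(theta, train_pairs)]
for epoch in range(200):
    theta = theta + 0.5 * gradient(theta, train_pairs)  # classical update (ascent)
    train_costs.append(cost(theta, train_pairs))        # "quantum" cost evaluation

val_cost = cost(theta, val_pairs)
print(f"training cost: {train_costs[0]:.3f} -> {train_costs[-1]:.3f}")
print(f"validation cost on unseen pairs: {val_cost:.3f}")
```

With exact cost values and a sufficiently small step, the training cost in this sketch never decreases, which mirrors the ideal noise-free behaviour; on real hardware, shot noise and gate errors perturb every cost evaluation and break that guarantee. The validation cost is computed on input states that were not used for training.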

The authors further evaluate and compare the noise tolerance of the two methods. Of the two primary sources of noise, readout noise influences both networks in a similar manner. In the presence of gate noise, however, the DQNN is observed to have a higher identity cost, which results in higher training and validation costs than for the QAOA. This implies that the DQNN is less susceptible to gate noise.
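The different roles of the two noise sources can be seen in a deliberately simple model. The sketch below assumes a single qubit whose ideal output is |0>, gates that act as the identity followed by a depolarizing channel, and a symmetric readout error that flips the measured bit. The error model, the function names, and the parameter values are illustrative assumptions, not the noise characterization used by the authors.

```python
import numpy as np

def depolarize(rho, p):
    """Single-qubit depolarizing channel with error probability p."""
    return (1 - p) * rho + p * np.eye(2) / 2

def noisy_fidelity(depth, gate_error, readout_error):
    """Measured probability of the ideal outcome "0" after `depth` noisy gates.

    Each gate is the identity followed by depolarizing noise; the final
    measurement reports the wrong bit with probability readout_error.
    """
    rho = np.array([[1, 0], [0, 0]], dtype=complex)
    for _ in range(depth):
        rho = depolarize(rho, gate_error)
    p0 = rho[0, 0].real                      # probability of "0" with perfect readout
    return (1 - readout_error) * p0 + readout_error * (1 - p0)

for depth in (1, 5, 20):
    print(depth, round(noisy_fidelity(depth, gate_error=0.02, readout_error=0.03), 4))
```

In this model the readout error enters once, independent of circuit depth, while the gate error compounds with every gate: without readout error the result is 1/2 + (1 - p)^d / 2 for gate error p and depth d. This is one way to see why an architecture's sensitivity to gate noise depends on how many noisy gates its circuits contain.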

Overall, the work demonstrates that, although both architectures have high noise tolerance, the DQNN shows more potential than the QAOA in terms of reliability, accuracy, and lower susceptibility to noise when implemented on current NISQ devices. Improvements in DQNN performance are strongly tied to improvements in quantum hardware. As hardware becomes more reliable in the near future, with lower noise levels and with resettable qubits reducing the number of qubits required, DQNNs with multiple layers can be used. Such DQNNs could potentially explore problems involving higher-dimensional unitaries and non-unitary maps.