Quantum Kernel Bandwidth

"Importance of Kernel Bandwidth in Quantum Machine Learning"

Published by Ruslan Shaydulin & Stefan M. Wild, 11th November 2021


Quantum kernels have shown promise in quantum supervised machine learning, where an appropriately defined kernel may provide speedups over classical ML methods. In quantum kernel methods, data points are mapped to quantum states using a quantum feature map, and the value of the kernel between two data points is given by a similarity measure (such as fidelity) of the corresponding quantum states. The potential advantage over classical kernels lies in processing the data in an exponentially large Hilbert space and performing computations that are hard to carry out classically.
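To make the fidelity kernel concrete, the following is a minimal sketch in plain Python. It assumes a toy feature map that is not taken from the paper: each feature is encoded as a single-qubit RX rotation applied to |0>, with no entanglement between qubits. The names qubit_state, feature_state and fidelity_kernel are hypothetical and introduced only for illustration.

```python
import math
from functools import reduce


def qubit_state(angle):
    """Amplitudes of RX(angle)|0> = cos(angle/2)|0> - i sin(angle/2)|1>."""
    return [complex(math.cos(angle / 2), 0.0),
            complex(0.0, -math.sin(angle / 2))]


def kron(a, b):
    """Tensor product of two state vectors stored as flat lists."""
    return [ai * bi for ai in a for bi in b]


def feature_state(x):
    """Toy feature map (an assumption): one qubit per feature, no entanglement."""
    return reduce(kron, (qubit_state(xi) for xi in x))


def fidelity_kernel(x, y):
    """k(x, y) = |<phi(x)|phi(y)>|^2 for the toy feature map above."""
    a, b = feature_state(x), feature_state(y)
    overlap = sum(ai.conjugate() * bi for ai, bi in zip(a, b))
    return abs(overlap) ** 2


if __name__ == "__main__":
    x, y = [0.3, 1.2, 2.0], [0.4, 1.0, 2.5]
    print("k(x, x) =", fidelity_kernel(x, x))  # identical points overlap fully
    print("k(x, y) =", fidelity_kernel(x, y))
```

Because this toy map produces product states, it is easy to simulate classically and offers no quantum advantage; it only illustrates how a kernel value arises from the overlap of two encoded states.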

One major limitation of these methods is that the fidelity of two random quantum states decreases exponentially with the number of qubits. Kernel values therefore vanish exponentially, which makes learning impossible, because such small differences cannot be distinguished or used for training. One can overcome this by controlling the inductive bias of the quantum kernel method, by projecting the quantum state into a lower-dimensional subspace, which can be done via hyperparameter tuning. One such hyperparameter is the kernel’s ‘bandwidth’. In this work, the authors identify quantum kernel bandwidth as a centrally important hyperparameter for quantum kernel methods.
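The exponential decay can be illustrated numerically with a small sketch. It assumes a toy product-state feature map (one RX rotation per qubit, not the feature map used in the paper), for which the fidelity has the closed form of a product of cos^2((x_i - y_i)/2) terms. With angles drawn uniformly from [0, 2*pi), each factor averages 1/2, so the mean kernel value is 2^(-n) for n qubits. The function names and sample sizes are illustrative choices.

```python
import math
import random


def toy_kernel(x, y):
    """Closed-form fidelity of product states with one RX(x_i) rotation per qubit."""
    return math.prod(math.cos((xi - yi) / 2) ** 2 for xi, yi in zip(x, y))


def mean_random_kernel(n_qubits, n_pairs=2000, seed=0):
    """Average kernel value over random pairs of points with uniform angles."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x = [rng.uniform(0, 2 * math.pi) for _ in range(n_qubits)]
        y = [rng.uniform(0, 2 * math.pi) for _ in range(n_qubits)]
        total += toy_kernel(x, y)
    return total / n_pairs


if __name__ == "__main__":
    # Columns: qubit count, sampled mean kernel value, analytic mean 2^(-n).
    for n in (2, 4, 8, 16):
        print(n, mean_random_kernel(n), 2.0 ** -n)
```

Once typical kernel values are this small, distinguishing them requires a number of measurements that grows quickly with the qubit count, which is the practical obstacle the paragraph above describes.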

The work considers supervised learning, specifically the task of classification. Given a training dataset, the goal is to learn a map from data points to labels that agrees with the true map with high probability on an unseen test set. Each data point is encoded in a quantum state by a parameterized unitary, referred to as a Hamiltonian evolution quantum feature map. A kernel matrix is then obtained by computing the quantum kernel for all pairs of data points. Each kernel value can be computed on a quantum computer by measuring an appropriate observable on the prepared state. The kernel matrix is then used inside a Support Vector Machine (SVM) or another kernel method.
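Building the kernel matrix from pairwise kernel values can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the kernel is a closed-form toy fidelity for a product-state feature map, whereas on hardware each entry would be estimated from repeated measurements. The names toy_kernel and kernel_matrix are hypothetical.

```python
import math


def toy_kernel(x, y):
    """Stand-in for a quantum kernel: fidelity of a toy product-state encoding."""
    return math.prod(math.cos((xi - yi) / 2) ** 2 for xi, yi in zip(x, y))


def kernel_matrix(points, kernel=toy_kernel):
    """Gram matrix K[i][j] = kernel(points[i], points[j]) over all pairs."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # A fidelity kernel is symmetric, so each pair is evaluated once.
            K[i][j] = K[j][i] = kernel(points[i], points[j])
    return K


if __name__ == "__main__":
    train = [[0.1, 0.9], [0.2, 1.1], [2.5, 0.3]]
    for row in kernel_matrix(train):
        print(["%.3f" % v for v in row])
```

A matrix of this form can be handed to any kernel method that accepts a precomputed Gram matrix, for example scikit-learn's SVC with kernel="precomputed".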

The results demonstrate that varying the quantum kernel bandwidth, typically via the scaling factor and Trotter steps in the feature map, controls the expressiveness of the model. For the quantum feature maps considered, the bandwidth can be controlled by rescaling the data points. Numerical simulations were performed with multiple quantum feature maps and datasets using up to 26 qubits. The results show that a larger scaling factor leads to a narrow kernel for which the Support Vector Classifier can fit any labels, leading to overfitting. Conversely, too small a scaling factor leads to a wide kernel that is insufficiently expressive, leading to underfitting. Optimizing the bandwidth can improve the performance of quantum kernel methods as the qubit count grows. It was also observed that hardware limitations, such as the finite precision of controls and the variance introduced by sampling, do not significantly limit the overall performance.
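The effect of the scaling factor on kernel width can be seen in a small sketch. It assumes a toy product-state kernel in which a hypothetical scaling factor c multiplies every feature before encoding; this is a simplification and not the Hamiltonian evolution feature map from the paper. For small c the off-diagonal kernel values approach 1 (a wide kernel, with the Gram matrix close to all ones), and for large c they approach 0 (a narrow kernel, with the Gram matrix close to the identity).

```python
import math
import random


def scaled_kernel(x, y, c):
    """Toy fidelity kernel after rescaling the data by c (c sets the bandwidth)."""
    return math.prod(math.cos(c * (xi - yi) / 2) ** 2 for xi, yi in zip(x, y))


def mean_off_diagonal(points, c):
    """Average kernel value over all distinct pairs of points."""
    values = [scaled_kernel(points[i], points[j], c)
              for i in range(len(points))
              for j in range(i + 1, len(points))]
    return sum(values) / len(values)


def make_points(n_points=20, n_features=10, seed=0):
    """Random data in [0, 1]^n_features, used purely for illustration."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_features)] for _ in range(n_points)]


if __name__ == "__main__":
    pts = make_points()
    # Columns: scaling factor, mean off-diagonal kernel value.
    for c in (0.05, 0.5, 2.0, 20.0):
        print(c, mean_off_diagonal(pts, c))
```

In practice the scaling factor would be chosen like any other hyperparameter, for instance by cross-validation over a grid of candidate values.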

Overall, the work discusses the potential of controlling the inductive bias of quantum kernels by projecting them into a lower-dimensional subspace using hyperparameter operations. Combining this projection with bandwidth optimization leads to more precise control of the inductive bias of the model. Following these results, it would be interesting to explore more elaborate feature maps with more tunable hyperparameters. Optimizing the hyperparameters of quantum kernels may enable tuning the inductive bias of the model in a classically feasible way, and at the same time may bridge the gap between expensive, fully trainable quantum embeddings and fixed feature maps.