Quantum machine learning is still in its early stages, but many exciting research projects and applications are already emerging. Over the past few years, many algorithms have been developed by combining machine learning and quantum computing, encompassing both AI for quantum and quantum for AI. The primary objective of AI for quantum is to leverage machine learning to efficiently and resourcefully understand key properties of quantum systems. Meanwhile, the goal of quantum for AI is to harness the power of quantum computers to enhance the efficiency and accuracy of machine learning algorithms. As this field rapidly evolves, it calls for a deeper theoretical understanding of quantum machine learning models, as well as of how machine learning can be used to advance quantum science.
❓ Can quantum neural networks always advance classical learning models on multi-class classification? If not, what kinds of datasets can be learned by quantum neural classifiers with computational advantages?

💡 In this work, we unify a diverse range of quantum neural classifiers (QCs) into a single general framework and evaluate their capabilities depending on the task at hand. Our results suggest that for quantum data, QCs can be advantageous when only limited training examples are available. For classical data such as images, however, QCs are not as effective as classical algorithms. Notably, all of our results are established under the optimal setting, which implies fundamental limitations of QCs in multi-class classification beyond the limits of barren plateaus. In other words, we are now at the stage of creating new quantum learning models that can effectively learn classical data while providing proven benefits.

❓ What is the foremost reason for the different behavior of QCs on quantum and classical data?

💡 Our Theorem 1 indicates that even when the amount of training data is relatively limited, the risk curve of optimal QCs takes a U-shape, distinct from the double-descent risk curve of deep neural classifiers. These distinct risk curves provide evidence of the varied power of QCs and classical classifiers (CCs). Additionally, the different dynamics of the risk curves allow us to identify potential advantages of QCs: by fitting the risk dynamics of a QC and a CC on a specific learning task, we can compare the valley of the QC's curve with that of its classical counterpart to determine which model is superior.

❓ How is the U-shape risk curve of the optimal QCs derived?

💡 The (expected) risk of a QC is approached by separately analyzing the empirical risk and the generalization error. Lemmas 1-2 demonstrate that three conditions on the quantum feature states and the measurements are required to attain near-zero empirical risk: (i) feature states within the same class have vanishing variability; (ii) feature states from different classes have equal norm and are mutually orthogonal; (iii) each feature state is aligned with the measurement operator of its class. These conditions mirror the 'neural collapse' phenomenon found in deep neural classifiers. They are also related to the Helstrom bound and SIC-POVMs, indicating a strong link between QML and quantum information processing (QIP). In Lemma 4, we prove that deep QCs can never achieve a completely vanishing empirical risk, due to the concentration property of quantum observables. As for the generalization capability, we employ the concept of robustness to obtain a tight error bound for QCs (see Lemma 3); this is the first non-trivial bound in the over-parameterized regime. Combining the fact that deep QCs cannot reach zero empirical risk with the controllable generalization error yields the U-shape risk curve of QCs.
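To make conditions (i)-(iii) concrete, here is a minimal numpy sketch (the construction and all names are ours, purely illustrative) that builds one feature state per class satisfying all three conditions and checks that the induced classifier attains zero empirical risk on this toy set:

```python
import numpy as np

num_classes, dim = 3, 8  # toy setting: 3 classes, 8-dimensional feature space
rng = np.random.default_rng(0)

# (i)+(ii): one shared feature state per class, orthonormal across classes.
# The first columns of a random unitary give such an orthonormal set.
basis, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
features = [basis[:, k] for k in range(num_classes)]  # |phi_k>

# (iii): each measurement operator is the projector onto its class state.
measurements = [np.outer(phi, phi.conj()) for phi in features]

# Every in-class example is assigned to its class with probability 1,
# so the empirical risk of this toy classifier vanishes.
for k, phi in enumerate(features):
    probs = [np.real(phi.conj() @ M @ phi) for M in measurements]
    assert np.argmax(probs) == k and np.isclose(probs[k], 1.0)
print("conditions (i)-(iii) hold; empirical risk is zero on this toy set")
```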
📌 Background & Motivation: Exciting developments in quantum machine learning (QML) have shown provable advantages in learning quantum dynamics; in particular, incorporating entanglement into quantum operations and measurements improves query/sample complexity. A key aspect overlooked in QML is the impact of entangled data (each example in the training dataset is entangled with a reference system) on QML's ability to learn quantum dynamics. Yet understanding the role of data is crucial in machine learning: the emerging field of data-centric AI highlights the importance of high-quality datasets in enhancing AI model performance, and recent breakthroughs with large language models have demonstrated the power of well-engineered training datasets. Our study delves into this pivotal aspect, with the fascinating findings presented below.
❓ Building upon prior results, one might speculate that highly entangled data plays a pivotal role in enhancing the performance of QML models, e.g., that a single training example suffices to achieve optimal performance. But does this speculation hold true?

💡 In this work, we challenge this speculation and provide evidence of a dual effect of entangled datasets on prediction error, depending on the allowed number of measurements m. To quantify the power of data, we examine the performance of quantum learning models within the framework of the no-free-lunch (NFL) theorem. Leveraging information-theoretic tools, we establish a lower bound on the average prediction error of these models. Our analysis reveals an intriguing pattern: when m is sufficiently large, increasing the entanglement degree of the training data consistently reduces the prediction error; when m is small, increasing the entanglement degree amplifies it. These findings shed new light on the intricate relationship between entangled datasets and prediction accuracy. More importantly, contrary to the previous belief that entanglement primarily benefits QML in terms of sample/query complexity, our research uncovers the transition role of entanglement. Our findings pave the way for designing QML protocols, especially those tailored for execution on early-stage quantum computers with limited access to quantum resources.
❓ When and how can we harness the power of entangled data to unlock potential quantum advantages?

💡 The transition role of entangled data offers valuable insights and practical guidance for its effective utilization (see the sketch below):
(i) utilize highly entangled data when a sufficient number of measurements is allowed for learning quantum dynamics;
(ii) opt for low-entangled data when the number of allowed measurements is limited;
(iii) avoid entangled data altogether when the allowed query complexity to the target unitary is restricted.
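Read as a decision rule, the three guidelines might look like the following sketch; the thresholds `m_threshold` and `n_threshold` are illustrative placeholders, not values from our analysis:

```python
def suggest_entanglement(num_measurements: int, num_queries: int,
                         m_threshold: int = 10_000, n_threshold: int = 100) -> str:
    """Heuristic reading of guidelines (i)-(iii); thresholds are illustrative."""
    if num_queries < n_threshold:
        return "product (unentangled) training states"  # guideline (iii)
    if num_measurements >= m_threshold:
        return "highly entangled training states"       # guideline (i)
    return "low-entanglement training states"           # guideline (ii)

print(suggest_entanglement(num_measurements=50_000, num_queries=500))
```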
❓ What is the connection between these findings and prior research?

💡 The findings presented in Theorem 1 reveal that the minimum query complexity for incoherently learning quantum dynamics with zero prediction error is \(Nm = \Omega(2^n r c_1 c_2)\). This bound is tighter than the one achieved by Huang et al. in the same setting, documented in 'Information-Theoretic Bounds on Quantum Advantage in Machine Learning' [Link]. From the perspective of quantum state tomography, the lower bound derived in Theorem 1 is optimal under nonadaptive measurements with a constant number of outcomes, as supported by the results in 'Lower Bounds for Learning Quantum States with Single-Copy Measurements' [Link]. Furthermore, in the limit of an infinite number of measurements, our findings align with the results presented in 'Reformulation of the No-Free-Lunch Theorem for Entangled Datasets' [Link].
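As a rough illustration of how to read the bound \(Nm = \Omega(2^n r c_1 c_2)\): since \(\Omega\) hides constants, the helper below treats the unspecified constants \(c_1, c_2\) as 1 and returns only an order-of-magnitude estimate of the training-set size N required for a given measurement budget m:

```python
import math

def min_training_size(n_qubits: int, r: int, m: int,
                      c1: float = 1.0, c2: float = 1.0) -> int:
    """Training-set size N implied by N*m = Omega(2^n * r * c1 * c2).

    c1 and c2 stand in for the (unspecified here) constants of Theorem 1;
    setting them to 1 gives an order-of-magnitude reading only.
    """
    return math.ceil((2 ** n_qubits) * r * c1 * c2 / m)

# Example: 10 qubits, entanglement degree r = 4, m = 1000 measurements per state.
print(min_training_size(n_qubits=10, r=4, m=1000))  # -> 5
```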
📌 (Shadow) quantum tomography and classical simulators are popular methods for understanding the dynamics of quantum computers. However, these approaches face challenges on large-qubit systems due to their high computational costs, including extensive interaction with quantum devices and an inability to handle complex quantum states efficiently. Machine learning (ML) has recently demonstrated its potential for understanding the quantum universe; a seminal work in this area is "Provably efficient machine learning for quantum many-body problems", published in Science [Link]. Despite this progress in quantum many-body physics, how to leverage ML to advance the study of large-qubit quantum circuits remains elusive.
❓ This work seeks to bridge this gap by answering: can a classical ML model efficiently predict the linear properties of bounded-gate quantum circuits?

💡 A bounded-gate quantum circuit, as investigated here, consists of a limited number of non-Clifford gates, denoted by d, together with an arbitrary number of Clifford gates, denoted by G − d. The classical ML model assumes that quantum resources are accessible only during the training stage; at the inference stage, predictions must be made offline, with no access to quantum systems. ⚠️ We focus on this setting because it covers various applications, including quantum algorithm development, error correction, and system certification. Understanding how to efficiently learn and predict the behavior of these circuits provides a foundation for advancing quantum computing technologies.
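For concreteness, here is a minimal Qiskit sketch of such a bounded-gate circuit; the specific gate set (H, S, CNOT as the Clifford part, T as the non-Clifford part) and the random gate placement are our illustrative choices, not prescribed by this work:

```python
import numpy as np
from qiskit import QuantumCircuit

def bounded_gate_circuit(n_qubits: int, total_gates: int, d: int,
                         seed: int = 0) -> QuantumCircuit:
    """Random circuit with d non-Clifford T gates and (total_gates - d) Clifford gates."""
    rng = np.random.default_rng(seed)
    qc = QuantumCircuit(n_qubits)
    # Randomly choose which of the G = total_gates slots hold the T gates.
    t_slots = set(int(i) for i in rng.choice(total_gates, size=d, replace=False))
    for g in range(total_gates):
        if g in t_slots:
            qc.t(int(rng.integers(n_qubits)))           # non-Clifford
        else:
            kind = int(rng.integers(3))
            if kind == 0:
                qc.h(int(rng.integers(n_qubits)))       # Clifford
            elif kind == 1:
                qc.s(int(rng.integers(n_qubits)))       # Clifford
            else:
                a, b = rng.choice(n_qubits, size=2, replace=False)
                qc.cx(int(a), int(b))                   # Clifford
    return qc

circuit = bounded_gate_circuit(n_qubits=4, total_gates=30, d=3)
```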
❓ What is the learnability of classical learners predicting the linear properties of bounded-gate quantum circuits?

💡 We rigorously establish that, while sample-efficient learning is possible (the sample complexity scales polynomially with d), the computational complexity can be prohibitive. This result highlights the intrinsic difficulty of using classical learners for this task, echoing observations from other quantum learning theory studies.
❓ Can we mitigate the exponential gap in computational complexity under certain practical settings?

💡 Yes! We propose a novel machine-learning model that leverages classical shadows and truncated trigonometric monomial kernels. This approach effectively reduces the computational burden, achieving polynomial scaling in many practical scenarios and making the prediction of linear properties of bounded-gate circuits more feasible. We prove that in many practical scenarios, our proposal yields accurate predictions even when the kernel is aggressively truncated. As a result, its memory, sample, and runtime complexities scale polynomially with d and are independent of the qubit count.
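A heavily simplified sketch of the idea, assuming a stand-in truncated trigonometric kernel (a normalized Dirichlet-type kernel per parameter, truncated at frequency `Lam`) combined with kernel ridge regression; in our setting, the training labels would come from classical-shadow estimates collected during the training stage:

```python
import numpy as np

def truncated_trig_kernel(x: np.ndarray, y: np.ndarray, Lam: int = 2) -> float:
    """Product of per-parameter trigonometric kernels truncated at frequency Lam.

    A simplified stand-in for the truncated trigonometric monomial kernel:
    each factor keeps only the Fourier modes with frequency at most Lam.
    """
    factors = 1.0 + 2.0 * sum(np.cos(w * (x - y)) for w in range(1, Lam + 1))
    return float(np.prod(factors / (2 * Lam + 1)))

def fit_predict(train_X, train_y, test_X, lam_reg=1e-6, Lam=2):
    """Kernel ridge regression with the truncated kernel (illustrative only)."""
    K = np.array([[truncated_trig_kernel(a, b, Lam) for b in train_X] for a in train_X])
    alpha = np.linalg.solve(K + lam_reg * np.eye(len(train_X)), train_y)
    k_test = np.array([[truncated_trig_kernel(t, b, Lam) for b in train_X] for t in test_X])
    return k_test @ alpha

# Toy usage: learn a smooth function of circuit parameters from labelled examples,
# where the labels stand in for shadow-estimated expectation values.
rng = np.random.default_rng(1)
X = rng.uniform(0, 2 * np.pi, size=(40, 3))  # 40 circuits, 3 rotation angles each
y = np.cos(X).sum(axis=1)                    # stand-in for shadow estimates
print(fit_predict(X, y, X[:3], Lam=2))
```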
❓ How can the proposed ML predictor be applied to advance practical quantum computing problems?

💡 The proposed ML model has broad applications, ranging from quantum algorithm development to error correction and system certification. A concrete example is using the proposed model to pretrain variational quantum algorithms offline, significantly reducing the quantum resources required for quantum algorithm optimization (see the sketch below).
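A minimal sketch of this pretraining workflow; `surrogate_predict` is a hypothetical stand-in for the trained classical predictor, and the toy landscape it returns is purely illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def surrogate_predict(theta: np.ndarray) -> float:
    """Hypothetical stand-in for the trained classical predictor of the circuit's
    cost (e.g., an estimated energy); the real model would be the shadow/kernel
    predictor sketched above."""
    return float(np.cos(theta).sum())  # toy cost landscape

# Offline pretraining: minimize the surrogate entirely on a classical machine...
theta0 = np.zeros(3)
result = minimize(surrogate_predict, theta0, method="COBYLA")

# ...then hand result.x to the quantum device as a warm start, so the on-device
# optimization needs far fewer circuit executions.
print("warm-start parameters:", result.x)
```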