Qu&Co comments on this publication:
Reinforcement learning differs from supervised and unsupervised learning in that it uses a scalar parameter (the reward) to evaluate the input-output relation by trial and error. In this paper, Cardenas-Lopez et al. propose a protocol for generalized quantum reinforcement learning. They consider a range of scenarios for an agent, an environment, and a register that connects them, involving multi-qubit and multi-level systems as well as open-system dynamics, and they propose possible implementations of the protocol in trapped ions and superconducting circuits.