Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models

Research output: Contribution to journal › Conference article › Scientific › peer-review

Abstract

In the context of kernel machines, polynomial and Fourier features are commonly used to provide a nonlinear extension to linear models by mapping the data to a higher-dimensional space. Unless one considers the dual formulation of the learning problem, which renders exact large-scale learning infeasible, the exponential increase in the number of model parameters with the dimensionality of the data, caused by the tensor-product structure of the features, makes high-dimensional problems intractable. One possible approach to circumvent this exponential scaling is to exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network. In this paper we quantize, i.e. further tensorize, polynomial and Fourier features. Based on this feature quantization, we propose to quantize the associated model weights, yielding quantized models. We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension than their non-quantized counterparts, at no additional computational cost while learning from identical features. We verify experimentally how this additional tensorization regularizes the learning problem by prioritizing the most salient features in the data and how it provides models with increased generalization capabilities. Finally, we benchmark our approach on a large regression task, achieving state-of-the-art results on a laptop computer.
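
To make the idea of quantizing (further tensorizing) a feature map concrete, the following is a minimal sketch for a single scalar input, assuming a polynomial feature vector of length 2^K. It relies only on the standard identity that the monomials [1, x, ..., x^(2^K - 1)] equal the Kronecker product of K two-element vectors [1, x^(2^k)]. The function names are hypothetical and chosen for illustration; this is not the authors' implementation, and the exact feature definitions and ordering used in the paper may differ.

```python
from functools import reduce

import numpy as np


def polynomial_features(x, num_factors):
    """Monomials [1, x, x^2, ..., x^(2^K - 1)] of a scalar x, with K = num_factors."""
    return x ** np.arange(2 ** num_factors)


def quantized_polynomial_factors(x, num_factors):
    """K two-element factors [1, x^(2^k)], k = 0..K-1 (hypothetical helper)."""
    return [np.array([1.0, x ** (2 ** k)]) for k in range(num_factors)]


def kron_all(factors):
    """Kronecker product of all factors.

    np.kron(a, b) treats the index of `a` as the most significant digit,
    so the factor with the highest power (k = K-1) is placed first.
    """
    return reduce(np.kron, factors[::-1])


x, K = 0.7, 3
full = polynomial_features(x, K)            # 2^K = 8 entries
factors = quantized_polynomial_factors(x, K)  # K = 3 factors of 2 entries each
reconstructed = kron_all(factors)
print(np.allclose(full, reconstructed))
```

In this sketch the 2^K-entry feature vector is represented by only 2K numbers. The abstract's proposal is to impose a matching finer tensor structure on the model weights as well.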
Original language: English
Pages (from-to): 1261-1269
Number of pages: 16
Journal: Proceedings of Machine Learning Research
Volume: 238
Publication status: Published - 2024
Event: 27th International Conference on Artificial Intelligence and Statistics (AISTATS) - Valencia, Spain
Duration: 2 May 2024 to 4 May 2024
