SwePub
Search the SwePub database


Hit list for search "L773:2835 8856"

  • Results 1-16 of 16
Numbering / Reference / Cover image / Find
1.
  • Adib Yaghmaie, Farnaz, 1987-, et al. (authors)
  • Online Optimal Tracking of Linear Systems with Adversarial Disturbances
  • 2023
  • In: Transactions on Machine Learning Research. - 2835-8856. ; :04
  • Journal article (peer-reviewed). Abstract:
    • This paper presents a memory-augmented control solution to the optimal reference tracking problem for linear systems subject to adversarial disturbances. We assume that the dynamics of the linear system are known and that the reference signal is generated by a linear system with unknown dynamics. Under these assumptions, finding the optimal tracking controller is formalized as an online convex optimization problem that leverages memory of past disturbance and reference values to capture their temporal effects on performance. That is, a (disturbance, reference)-action control policy is formalized, which selects the control actions as a linear map of the past disturbance and reference values. The online convex optimization is then formulated over the parameters of this policy to optimize general convex costs. It is shown that our approach outperforms robust control methods and achieves a tight O(√T) regret bound, where the regret analysis benchmarks against the best linear policy.
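As a rough illustration of the (disturbance, reference)-action policy described in the abstract, the sketch below runs online gradient descent on a scalar system with a one-step tracking surrogate. All names and constants are illustrative assumptions; the paper's method optimizes general convex costs with memory and comes with a regret analysis, neither of which this toy loop reproduces.

```python
import numpy as np

# Minimal sketch: scalar system x_{t+1} = a*x_t + b*u_t + w_t with a
# linear policy on the last H disturbance and reference values, updated
# by online gradient descent on the one-step tracking error. Illustrative
# only; not the paper's algorithm.
rng = np.random.default_rng(0)
a, b, H, eta, T = 0.9, 1.0, 5, 0.05, 2000
k_w = np.zeros(H)                 # weights on past disturbances (assumed)
k_r = np.zeros(H)                 # weights on past reference values (assumed)
w_hist, r_hist = np.zeros(H), np.zeros(H)
x = 0.0
for t in range(T):
    r_next = np.sin(0.01 * t)                   # reference to track
    u = k_w @ w_hist + k_r @ r_hist             # linear policy on memory
    w = 0.1 * rng.standard_normal()             # unknown disturbance w_t
    x_next = a * x + b * u + w
    err = x_next - r_next                       # one-step tracking error
    k_w -= eta * 2.0 * err * b * w_hist         # gradient of err^2 wrt k_w
    k_r -= eta * 2.0 * err * b * r_hist
    w_hist = np.roll(w_hist, 1); w_hist[0] = w  # w_t recoverable post hoc
    r_hist = np.roll(r_hist, 1); r_hist[0] = r_next
    x = x_next
```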
2.
3.
  • Bånkestad, Maria, et al. (authors)
  • Variational Elliptical Processes
  • 2023
  • In: Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • We present elliptical processes—a family of non-parametric probabilistic models that subsumes Gaussian processes and Student's t processes. This generalization includes a range of new heavy-tailed behaviors while retaining computational tractability. Elliptical processes are based on a representation of elliptical distributions as a continuous mixture of Gaussian distributions. We parameterize this mixture distribution as a spline normalizing flow, which we train using variational inference. The proposed form of the variational posterior enables a sparse variational elliptical process applicable to large-scale problems. We highlight advantages compared to Gaussian processes through regression and classification experiments. Elliptical processes can supersede Gaussian processes in several settings, including cases where the likelihood is non-Gaussian or when accurate tail modeling is essential.
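A minimal sketch of the Gaussian scale-mixture representation underlying elliptical processes: conditionally on a mixing variable s, the path is Gaussian with covariance s·K. The kernel and the mixing sampler below are illustrative assumptions; the paper instead learns the mixing distribution as a spline normalizing flow trained with variational inference.

```python
import numpy as np

def sample_elliptical_process(x, kernel, sample_mixing, n_draws, rng):
    """Draw paths from an elliptical process represented as a continuous
    scale mixture of Gaussians: f | s ~ N(0, s * K). A sketch of the
    representation only, not the paper's variational method."""
    K = kernel(x[:, None], x[None, :])
    L = np.linalg.cholesky(K + 1e-8 * np.eye(len(x)))
    draws = []
    for _ in range(n_draws):
        s = sample_mixing(rng)          # mixing variable sets the tails
        draws.append(np.sqrt(s) * (L @ rng.standard_normal(len(x))))
    return np.array(draws)

rbf = lambda a, b: np.exp(-0.5 * (a - b) ** 2)   # assumed kernel
x = np.linspace(0.0, 5.0, 50)
rng = np.random.default_rng(0)
# inverse-gamma mixing gives a Student-t process; a point mass gives a GP
t_mix = lambda rng: 1.0 / rng.gamma(shape=2.0, scale=1.0)
paths = sample_elliptical_process(x, rbf, t_mix, n_draws=5, rng=rng)
```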
4.
  • Bökman, Georg, et al. (authors)
  • In search of projectively equivariant networks
  • 2023
  • In: Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • Equivariance of linear neural network layers is well studied. In this work, we relax the equivariance condition to only be true in a projective sense. Hereby, we introduce the topic of projective equivariance to the machine learning audience. We theoretically study the relation of projectively and linearly equivariant linear layers. We find that in some important cases, surprisingly, the two types of layers coincide. We also propose a way to construct a projectively equivariant neural network, which boils down to building a standard equivariant network where the linear group representations acting on each intermediate feature space are lifts of projective group representations. Projective equivariance is showcased in two simple experiments. Code for the experiments is provided in the supplementary material.
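For intuition, projective equivariance of a linear layer W means W·ρ_in(g) equals ρ_out(g)·W only up to a nonzero scalar λ(g). The helper below is hypothetical (not from the paper's supplementary code); it estimates that scalar and the residual numerically.

```python
import numpy as np

def projective_equivariance_defect(W, rho_in, rho_out):
    """Check W @ rho_in ≈ lambda * rho_out @ W for some scalar lambda,
    i.e. equivariance 'in a projective sense'. Returns the best-fit
    scalar and the residual norm. Hypothetical helper for intuition."""
    A = W @ rho_in                       # act on the input, then map
    B = rho_out @ W                      # map, then act on the output
    lam = np.vdot(B, A) / np.vdot(B, B)  # least-squares scale factor
    return lam, np.linalg.norm(A - lam * B)

# Example: a 90-degree rotation acting on input and output of the identity.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
lam, defect = projective_equivariance_defect(np.eye(2), R, R)
print(lam, defect)   # lambda = 1, defect = 0: ordinary (linear) equivariance
```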
5.
  • Coelho Mollo, Dimitri, et al. (authors)
  • Beyond the Imitation Game : Quantifying and extrapolating the capabilities of language models
  • 2023
  • In: Transactions on Machine Learning Research. ; :5
  • Journal article (peer-reviewed). Abstract:
    • Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting. 
6.
7.
  • Englesson, Erik, et al. (authors)
  • Logistic-Normal Likelihoods for Heteroscedastic Label Noise
  • 2023
  • In: Transactions on Machine Learning Research. - : Transactions on Machine Learning Research (TMLR). - 2835-8856. ; 8
  • Journal article (peer-reviewed). Abstract:
    • A natural way of estimating heteroscedastic label noise in regression is to model the observed (potentially noisy) target as a sample from a normal distribution, whose parameters can be learned by minimizing the negative log-likelihood. This formulation has desirable loss attenuation properties, as it reduces the contribution of high-error examples. Intuitively, this behavior can improve robustness against label noise by reducing overfitting. We propose an extension of this simple and probabilistic approach to classification that has the same desirable loss attenuation properties. Furthermore, we discuss and address some practical challenges of this extension. We evaluate the effectiveness of the method by measuring its robustness against label noise in classification. We perform enlightening experiments exploring the inner workings of the method, including sensitivity to hyperparameters, ablation studies, and other insightful analyses.
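A minimal sketch of the heteroscedastic Gaussian negative log-likelihood that the abstract's loss-attenuation argument rests on: the learned variance divides the squared error, so high-error examples contribute less. This shows only the regression starting point; the paper's logistic-normal classification extension is not reproduced here.

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Negative log-likelihood of y under N(mu, exp(log_var)). The
    1/variance factor attenuates the loss of high-error (likely noisy)
    examples; a sketch of the regression formulation only."""
    return 0.5 * (np.log(2.0 * np.pi) + log_var
                  + (y - mu) ** 2 * np.exp(-log_var))

# A noisy target with a large predicted variance contributes little loss:
print(gaussian_nll(y=5.0, mu=0.0, log_var=0.0))   # large penalty (~13.4)
print(gaussian_nll(y=5.0, mu=0.0, log_var=4.0))   # attenuated (~3.1)
```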
8.
  • Fay, Dominik, et al. (authors)
  • Adaptive Hyperparameter Selection for Differentially Private Gradient Descent
  • 2023
  • In: Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • We present an adaptive mechanism for hyperparameter selection in differentially private optimization that addresses the inherent trade-off between utility and privacy. The mechanism eliminates the often unstructured and time-consuming manual effort of selecting hyperparameters and avoids the additional privacy costs that hyperparameter selection otherwise incurs on top of that of the actual algorithm. We instantiate our mechanism for noisy gradient descent on non-convex, convex and strongly convex loss functions, respectively, to derive schedules for the noise variance and step size. These schedules account for the properties of the loss function and adapt to convergence metrics such as the gradient norm. When using these schedules, we show that noisy gradient descent converges at essentially the same rate as its noise-free counterpart. Numerical experiments show that the schedules consistently perform well across a range of datasets without manual tuning.
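A minimal sketch of noisy gradient descent with per-step schedules, assuming simple 1/√t decays as placeholders. The paper derives its schedules from properties of the loss and from convergence metrics such as the gradient norm, and accounts for the privacy budget; none of that is reproduced here.

```python
import numpy as np

def noisy_gradient_descent(grad, theta, n_steps, sigma0=1.0, eta0=0.5,
                           rng=np.random.default_rng(0)):
    """Gradient descent with Gaussian gradient noise. The 1/sqrt(t) decays
    below are hypothetical placeholders, not the paper's derived schedules,
    and no privacy accounting is performed."""
    for t in range(1, n_steps + 1):
        g = grad(theta)
        sigma = sigma0 / np.sqrt(t)          # assumed noise schedule
        eta = eta0 / np.sqrt(t)              # assumed step-size schedule
        theta = theta - eta * (g + sigma * rng.standard_normal(theta.shape))
    return theta

# Quadratic example: ends close to the noise-free minimizer at 0.
theta = noisy_gradient_descent(lambda th: 2.0 * th, np.ones(3), 500)
```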
9.
  • Gamba, Matteo, et al. (authors)
  • Deep Double Descent via Smooth Interpolation
  • 2023
  • In: Transactions on Machine Learning Research. - : Transactions on Machine Learning Research (TMLR). - 2835-8856. ; 4
  • Journal article (peer-reviewed). Abstract:
    • The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common intuition from polynomial regression suggests that overparameterized networks are able to sharply interpolate noisy data, without considerably deviating from the ground-truth signal, thus preserving generalization ability. At present, a precise characterization of the relationship between interpolation and generalization for deep networks is missing. In this work, we quantify sharpness of fit of the training data interpolated by neural network functions, by studying the loss landscape w.r.t. the input variable locally around each training point, over volumes around cleanly- and noisily-labelled training samples, as we systematically increase the number of model parameters and training epochs. Our findings show that loss sharpness in the input space follows both model- and epoch-wise double descent, with worse peaks observed around noisy labels. While small interpolating models sharply fit both clean and noisy data, large interpolating models express a smooth loss landscape, where noisy targets are predicted over large volumes around training data points, in contrast to existing intuition.
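One crude way to probe the input-space loss sharpness the abstract studies: measure the largest finite-difference slope of the loss over random directions in a small ball around a training point. The helper below is a hypothetical stand-in, not the paper's volume-based measure.

```python
import numpy as np

def input_sharpness(loss_fn, x, radius=0.05, n_probe=32,
                    rng=np.random.default_rng(0)):
    """Crude probe of how sharply the loss varies in input space around
    one training point: max |loss(x + d) - loss(x)| / radius over random
    directions d on a ball of the given radius. Hypothetical stand-in."""
    base = loss_fn(x)
    worst = 0.0
    for _ in range(n_probe):
        d = rng.standard_normal(x.shape)
        d *= radius / np.linalg.norm(d)      # step to the ball's surface
        worst = max(worst, abs(loss_fn(x + d) - base) / radius)
    return worst
```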
10.
11.
  • Hult, Ludvig, et al. (authors)
  • Diagnostic Tool for Out-of-Sample Model Evaluation
  • 2023
  • In: Transactions on Machine Learning Research. - : OpenReview. - 2835-8856. ; :10
  • Journal article (peer-reviewed). Abstract:
    • Assessment of model fitness is a key part of machine learning. The standard paradigm of model evaluation is analysis of the average loss over future data. This is often explicit in model fitting, where we select models that minimize the average loss over training data as a surrogate, but comes with limited theoretical guarantees. In this paper, we consider the problem of characterizing a batch of out-of-sample losses of a model using a calibration dataset. We provide finite-sample limits on the out-of-sample losses that are statistically valid under quite general conditions and propose a diagnostic tool that is simple to compute and interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyperparameter tuning.
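A sketch of the basic ingredient, assuming exchangeability between calibration and test losses: a finite-sample quantile of the calibration losses bounds a new out-of-sample loss with the stated probability. This is a split-conformal-style construction for intuition, not the paper's exact diagnostic tool.

```python
import numpy as np

def out_of_sample_loss_bound(cal_losses, alpha=0.1):
    """If calibration and test losses are exchangeable, a new loss exceeds
    the returned value with probability at most alpha. Split-conformal-style
    sketch, not the paper's tool."""
    n = len(cal_losses)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))    # finite-sample correction
    if k > n:
        return np.inf                            # too few calibration points
    return np.sort(cal_losses)[k - 1]

rng = np.random.default_rng(0)
bound = out_of_sample_loss_bound(rng.exponential(size=200), alpha=0.1)
```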
12.
  • Häggström, Henrik, 1997, et al. (authors)
  • Fast, accurate and lightweight sequential simulation-based inference using Gaussian locally linear mappings
  • 2024
  • In: Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • Bayesian inference for complex models with an intractable likelihood can be tackled using algorithms performing many calls to computer simulators. These approaches are collectively known as "simulation-based inference" (SBI). Recent SBI methods have made use of neural networks (NN) to provide approximate, yet expressive constructs for the unavailable likelihood function and the posterior distribution. However, the trade-off between accuracy and computational demand leaves much room for improvement. In this work, we propose an alternative that provides both approximations to the likelihood and the posterior distribution, using structured mixtures of probability distributions. Our approach produces accurate posterior inference when compared to state-of-the-art NN-based SBI methods, even for multimodal posteriors, while exhibiting a much smaller computational footprint. We illustrate our results on several benchmark models from the SBI literature and on a biological model of the translation kinetics after mRNA transfection.
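A sketch of posterior inference with mixtures of Gaussians in the same spirit: fit a joint mixture to simulated (theta, x) pairs with scikit-learn, then condition each component on the observed data in closed form. The paper's method uses Gaussian locally linear mappings rather than this generic joint fit.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mixture_posterior(theta_sim, x_sim, x_obs, n_comp=5):
    """Fit a joint Gaussian mixture to simulated (theta, x) pairs and
    condition on x_obs, giving a closed-form mixture posterior. A sketch
    in the spirit of structured mixtures, not the paper's GLLiM method."""
    d = theta_sim.shape[1]
    gm = GaussianMixture(n_components=n_comp, covariance_type="full").fit(
        np.hstack([theta_sim, x_sim]))
    logw, means, covs = [], [], []
    for m, C, p in zip(gm.means_, gm.covariances_, gm.weights_):
        m_t, m_x = m[:d], m[d:]
        Ctt, Ctx, Cxx = C[:d, :d], C[:d, d:], C[d:, d:]
        G = Ctx @ np.linalg.inv(Cxx)
        means.append(m_t + G @ (x_obs - m_x))     # conditional mean
        covs.append(Ctt - G @ Ctx.T)              # conditional covariance
        diff = x_obs - m_x                        # component evidence at x_obs
        logw.append(np.log(p) - 0.5 * (diff @ np.linalg.solve(Cxx, diff)
                    + np.linalg.slogdet(2.0 * np.pi * Cxx)[1]))
    w = np.exp(np.array(logw) - max(logw))
    return w / w.sum(), np.array(means), np.array(covs)
```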
13.
  • Osama, Muhammad, et al. (authors)
  • Online Learning for Prediction via Covariance Fitting : Computation, Performance and Robustness
  • 2023
  • In: Transactions on Machine Learning Research. - : Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • We consider the online learning of linear smoother predictors based on a covariance model of the outcomes. To control its degrees of freedom in an appropriate manner, the covariance model parameters are often learned using cross-validation or maximum-likelihood techniques. However, neither technique is suitable when training data arrives in a streaming fashion. Here we consider a covariance-fitting method to learn the model parameters, initially used in spectral estimation. We show that this results in a computationally efficient online learning method in which the resulting predictor can be updated sequentially. We prove that, with high probability, its out-of-sample error approaches the minimum achievable level at a root-n rate. Moreover, we show that the resulting predictor enjoys two different robustness properties. First, it minimizes the out-of-sample error with respect to the least favourable distribution within a given Wasserstein distance from the empirical distribution. Second, it is robust against errors in the covariate training data. We illustrate the performance of the proposed method in a numerical experiment.
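For intuition about sequential updating, the sketch below maintains a regularized linear predictor with O(d²) per-sample updates via the Sherman-Morrison identity. It is a generic recursive least-squares scheme, not the paper's covariance-fitting (SPICE-style) estimator, and carries none of its robustness guarantees.

```python
import numpy as np

class OnlineLinearSmoother:
    """Streaming update of a ridge-regularized linear predictor; each new
    sample costs O(d^2) via a rank-one inverse update. Generic sketch of
    sequential updating, not the paper's estimator."""
    def __init__(self, d, ridge=1.0):
        self.P = np.eye(d) / ridge      # inverse regularized Gram matrix
        self.w = np.zeros(d)
    def update(self, x, y):
        Px = self.P @ x
        gain = Px / (1.0 + x @ Px)      # Sherman-Morrison gain vector
        self.w += gain * (y - x @ self.w)
        self.P -= np.outer(gain, Px)
    def predict(self, x):
        return x @ self.w
```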
14.
  • Viset, Frida, et al. (authors)
  • Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices
  • 2024
  • In: Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • The Hilbert-space Gaussian Process (HGP) approach offers a hyperparameter-independent basis function approximation for speeding up Gaussian Process (GP) inference by projecting the GP onto M basis functions. These properties result in a favorable data-independent O(M³) computational complexity during hyperparameter optimization but require a dominating one-time precomputation of the precision matrix costing O(NM²) operations. In this paper, we lower this dominating computational complexity to O(NM) with no additional approximations. We can do this because we realize that the precision matrix can be split into a sum of Hankel-Toeplitz matrices, each having O(M) unique entries. Based on this realization we propose computing only these unique entries at O(NM) costs. Further, we develop two theorems that prescribe sufficient conditions for the complexity reduction to hold generally for a wide range of other approximate GP models, such as the Variational Fourier Feature (VFF) approach. The two theorems do this with no assumptions on the data and no additional approximations of the GP models themselves. Thus, our contribution provides a pure speed-up of several existing, widely used, GP approximations, without further approximations.
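A concrete instance of the Hankel-Toeplitz splitting, assuming the standard Hilbert-space GP sine basis on an interval [-L, L]: the product-to-sum identity makes each Gram-matrix entry a difference of a term indexed by i-j (Toeplitz) and one indexed by i+j (Hankel), so only O(M) cosine sums need computing at O(NM) total cost. A sketch of the structural idea; the paper's general theorems cover a much wider class of models.

```python
import numpy as np
from scipy.linalg import toeplitz, hankel

def gram_fast(x, M, L):
    """O(NM) assembly of Phi^T Phi for the basis
    phi_j(x) = sin(pi*j*(x+L)/(2L))/sqrt(L), j = 1..M, on [-L, L].
    sin(i*w)sin(j*w) = (cos((i-j)w) - cos((i+j)w))/2 splits the Gram
    matrix into a Toeplitz part (index i-j) and a Hankel part (index i+j),
    each with O(M) unique entries."""
    w = np.pi * (x + L) / (2.0 * L)                    # per-point phase
    c = np.array([np.cos(d * w).sum() for d in range(2 * M + 1)])  # O(NM)
    T = toeplitz(c[:M])                                # entries c[|i-j|]
    Hk = hankel(c[2:M + 2], c[M + 1:2 * M + 1])        # entries c[i+j]
    return (T - Hk) / (2.0 * L)

# Agrees with the naive O(NM^2) construction:
x, M, L = np.random.default_rng(0).uniform(-1, 1, 200), 8, 1.5
Phi = np.sin(np.pi * np.arange(1, M + 1) * (x[:, None] + L) / (2 * L)) / np.sqrt(L)
assert np.allclose(gram_fast(x, M, L), Phi.T @ Phi)
```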
15.
16.
  • Åkerblom, Niklas, 1987, et al. (authors)
  • A Combinatorial Semi-Bandit Approach to Charging Station Selection for Electric Vehicles
  • 2023
  • In: Transactions on Machine Learning Research. - 2835-8856.
  • Journal article (peer-reviewed). Abstract:
    • In this work, we address the problem of long-distance navigation for battery electric vehicles (BEVs), where one or more charging sessions are required to reach the intended destination. We consider the availability and performance of the charging stations to be unknown and stochastic, and develop a combinatorial semi-bandit framework for exploring the road network to learn the parameters of the queue time and charging power distributions. Within this framework, we first outline a method for transforming the road network graph into a graph of feasible paths between charging stations to handle the constrained combinatorial optimization problem in an efficient way. Then, for the feasibility graph, we use a Bayesian approach to model the stochastic edge weights, utilizing conjugate priors for the one-parameter exponential and two-parameter gamma distributions, the latter of which is novel to the multi-armed bandit literature. Finally, we apply combinatorial versions of Thompson Sampling, BayesUCB and Epsilon-greedy to the problem. We demonstrate the performance of our framework on long-distance navigation problem instances in large-scale country-sized road networks, with simulation experiments in Norway, Sweden and Finland.
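A minimal single-choice sketch of the Thompson Sampling ingredient, assuming exponentially distributed waiting times with conjugate Gamma priors on the rates. The paper applies this inside a combinatorial semi-bandit over feasible charging-station paths and also develops a two-parameter gamma model, neither of which is reproduced here.

```python
import numpy as np

class ExponentialThompson:
    """Thompson Sampling for arms with Exp(rate) observations (e.g. queue
    times) and conjugate Gamma(alpha, beta) priors on each rate. Single-
    choice sketch, not the paper's combinatorial path version."""
    def __init__(self, n_arms, alpha0=1.0, beta0=1.0,
                 rng=np.random.default_rng(0)):
        self.alpha = np.full(n_arms, alpha0)   # prior shape
        self.beta = np.full(n_arms, beta0)     # prior rate
        self.rng = rng
    def select(self):
        rates = self.rng.gamma(self.alpha, 1.0 / self.beta)  # posterior draw
        return int(np.argmin(1.0 / rates))     # minimize expected wait 1/rate
    def update(self, arm, wait_time):
        self.alpha[arm] += 1.0                 # conjugate update for Exp data
        self.beta[arm] += wait_time
```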
Publication type
journal article (16)
Content type
peer-reviewed (16)
Author/editor
Zachariah, Dave (3)
Schön, Thomas B., Pr ... (3)
Stoica, Peter, 1949- (3)
Sjölund, Jens, Biträ ... (2)
Azizpour, Hossein, 1 ... (2)
Englesson, Erik (2)
Kahl, Fredrik (1)
Kleyko, Denis, 1990- (1)
Johansson, Karl Henr ... (1)
Adib Yaghmaie, Farna ... (1)
Modares, Hamidreza (1)
Lindsten, Fredrik, 1 ... (1)
Coelho Mollo, Dimitr ... (1)
Johansson, Mikael (1)
Rodrigues, Pedro (1)
Gustafsson, Fredrik ... (1)
Haghir Chehreghani, ... (1)
Björkman, Mårten, 19 ... (1)
Magnússon, Sindri, 1 ... (1)
Kullberg, Anton, 199 ... (1)
Baumann, Dominik, Ph ... (1)
Trimpe, Sebastian (1)
Solowjow, Friedrich (1)
Danelljan, Martin (1)
Bånkestad, Maria (1)
Taghia, Jalil (1)
Bökman, Georg (1)
Flinth, Axel, PhD, 1 ... (1)
Srivastava, Aarohi (1)
Wu, Ziyi (1)
Forbes, Florence (1)
Åkerblom, Niklas, 19 ... (1)
Ek, Sofia (1)
Johansson, Fredrik D ... (1)
Mehrpanah, Amir (1)
Fay, Dominik (1)
Zimmermann, Heiko (1)
Hult, Ludvig (1)
Gamba, Matteo (1)
Häggström, Henrik, 1 ... (1)
Oudoumanessah, Geoff ... (1)
Picchini, Umberto, 1 ... (1)
Naesseth, Christian ... (1)
Osama, Muhammad (1)
Solin, Arno (1)
Viset, Frida (1)
Wesel, Frederiek (1)
Meent, Jan-Willem va ... (1)
Higher education institution
Uppsala universitet (7)
Linköpings universitet (3)
Umeå universitet (2)
Kungliga Tekniska Högskolan (2)
Chalmers tekniska högskola (2)
Göteborgs universitet (1)
Stockholms universitet (1)
RISE (1)
Language
English (16)
Research subject (UKÄ/SCB)
Natural sciences (15)
Engineering and technology (5)
Humanities (1)

Year


 