SwePub
Search the SwePub database


Results for the search "WFRF:(Hellström Fredrik 1993)"

Search: WFRF:(Hellström Fredrik 1993)

  • Results 1-8 of 8
1.
  • Catena, Riccardo, 1978, et al. (author)
  • New constraints on inelastic dark matter from IceCube
  • 2018
  • In: Journal of Cosmology and Astroparticle Physics. IOP Publishing. ISSN 1475-7516; 2018:10.
  • Journal article (peer-reviewed), abstract:
    • We study the capture and subsequent annihilation of inelastic dark matter (DM) in the Sun, placing constraints on the DM-nucleon scattering cross section from the null result of the IceCube neutrino telescope. We then compare such constraints with exclusion limits on the same cross section that we derive from XENON1T, PICO and CRESST results. We calculate the cross section for inelastic DM-nucleon scattering within an extension of the effective theory of DM-nucleon interactions which applies to the case of inelastic DM models characterised by a mass splitting between the incoming and outgoing DM particle. We find that for values of the mass splitting parameter larger than about 200 keV, neutrino telescopes place limits on the DM-nucleon scattering cross section which are stronger than the ones from current DM direct detection experiments. The exact mass splitting value for which this occurs depends on whether DM thermalises in the Sun or not. This result applies to all DM-nucleon interactions that generate DM-nucleus scattering cross sections which are independent of the nuclear spin, including the "canonical" spin-independent interaction. We explicitly perform our calculations for a DM candidate with mass of 1 TeV, but our conclusions qualitatively also apply to different masses. Furthermore, we find that exclusion limits from IceCube on the coupling constants of this family of spin-independent interactions are more stringent than the ones from a (hypothetical) reanalysis of XENON1T data based on an extended signal region in nuclear recoil energy. Our results should be taken into account in global analyses of inelastic DM models.
2.
  • Hellström, Fredrik, 1993, et al. (author)
  • Evaluated CMI Bounds for Meta Learning: Tightness and Expressiveness
  • 2022
  • In: Advances in Neural Information Processing Systems. ISSN 1049-5258. ISBN 9781713871088; vol. 35.
  • Conference paper (peer-reviewed), abstract:
    • Recent work has established that the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020) is expressive enough to capture generalization guarantees in terms of algorithmic stability, VC dimension, and related complexity measures for conventional learning (Harutyunyan et al., 2021, Haghifam et al., 2021). Hence, it provides a unified method for establishing generalization bounds. In meta learning, there has so far been a divide between information-theoretic results and results from classical learning theory. In this work, we take a first step toward bridging this divide. Specifically, we present novel generalization bounds for meta learning in terms of the evaluated CMI (e-CMI). To demonstrate the expressiveness of the e-CMI framework, we apply our bounds to a representation learning setting, with $n$ samples from $\hat n$ tasks parameterized by functions of the form $f_i \circ h$. Here, each $f_i \in \mathcal F$ is a task-specific function, and $h \in \mathcal H$ is the shared representation. For this setup, we show that the e-CMI framework yields a bound that scales as $\sqrt{ \mathcal C(\mathcal H)/(n\hat n) + \mathcal C(\mathcal F)/n} $, where $\mathcal C(\cdot)$ denotes a complexity measure of the hypothesis class. This scaling behavior coincides with the one reported in Tripuraneni et al. (2020) using Gaussian complexity.
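For orientation, the rate quoted in the abstract above can be displayed on its own. This is only a restatement of what the abstract already states, in its own notation ($\hat n$ tasks, $n$ samples per task, $\mathcal C(\cdot)$ a complexity measure of a hypothesis class):

$$ \sqrt{ \frac{\mathcal C(\mathcal H)}{n \hat n} + \frac{\mathcal C(\mathcal F)}{n} } $$

The complexity of the shared representation class $\mathcal H$ is thus amortised over all $n \hat n$ samples from all tasks, while the complexity of the task-specific class $\mathcal F$ is paid per task, which is the sense in which the e-CMI bound matches the Gaussian-complexity result of Tripuraneni et al. (2020) cited in the abstract.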
3.
  • Hellström, Fredrik, 1993, et al. (author)
  • Fast-Rate Loss Bounds via Conditional Information Measures with Applications to Neural Networks
  • 2021
  • In: IEEE International Symposium on Information Theory - Proceedings. ISSN 2157-8095; 2021-July, pp. 952-957.
  • Conference paper (peer-reviewed), abstract:
    • We present a framework to derive bounds on the test loss of randomized learning algorithms for the case of bounded loss functions. Drawing from Steinke and Zakynthinou (2020), this framework leads to bounds that depend on the conditional information density between the output hypothesis and the choice of the training set, given a larger set of data samples from which the training set is formed. Furthermore, the bounds pertain to the average test loss as well as to its tail probability, both for the PAC-Bayesian and the single-draw settings. If the conditional information density is bounded uniformly in the size n of the training set, our bounds decay as 1/n. This is in contrast with the tail bounds involving conditional information measures available in the literature, which have a less benign 1/√n dependence. We demonstrate the usefulness of our tail bounds by showing that they lead to nonvacuous estimates of the test loss achievable with some neural network architectures trained on MNIST and Fashion-MNIST.
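As a rough, schematic illustration of the rate improvement described in the abstract above (constants, the exact information measure, and any dependence on the training loss are omitted, so this is not the paper's precise statement): writing $I_n$ for the relevant conditional information quantity at training-set size $n$ and $\delta$ for the confidence parameter, the earlier tail bounds referred to in the abstract behave like

$$ \sqrt{ \frac{I_n + \log(1/\delta)}{n} } , $$

whereas a bound that decays as $1/n$ when $I_n$ is bounded uniformly in $n$ behaves like

$$ \frac{I_n + \log(1/\delta)}{n} . $$

The point of the comparison is the milder dependence on the sample size $n$ in the second form.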
4.
  • Hellström, Fredrik, 1993, et al. (author)
  • Generalization Bounds via Information Density and Conditional Information Density
  • 2020
  • In: IEEE Journal on Selected Areas in Information Theory. ISSN 2641-8770; 1:3, pp. 824-839.
  • Journal article (peer-reviewed), abstract:
    • We present a general approach, based on an exponential inequality, to derive bounds on the generalization error of randomized learning algorithms. Using this approach, we provide bounds on the average generalization error as well as bounds on its tail probability, for both the PAC-Bayesian and single-draw scenarios. Specifically, for the case of sub-Gaussian loss functions, we obtain novel bounds that depend on the information density between the training data and the output hypothesis. When suitably weakened, these bounds recover many of the information-theoretic bounds available in the literature. We also extend the proposed exponential-inequality approach to the setting recently introduced by Steinke and Zakynthinou (2020), where the learning algorithm depends on a randomly selected subset of the available training data. For this setup, we present bounds for bounded loss functions in terms of the conditional information density between the output hypothesis and the random variable determining the subset choice, given all training data. Through our approach, we recover the average generalization bound presented by Steinke and Zakynthinou (2020) and extend it to the PAC-Bayesian and single-draw scenarios. For the single-draw scenario, we also obtain novel bounds in terms of the conditional α-mutual information and the conditional maximal leakage.
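For context, a representative example of the information-theoretic bounds from the literature that, according to the abstract above, are recovered when the new bounds are suitably weakened is the average generalization bound of Xu and Raginsky (2017); the abstract does not name it explicitly, so it is quoted here only as an orientation point. If the loss $\ell(w, Z)$ is $\sigma$-sub-Gaussian under the data distribution for every hypothesis $w$, and $W$ denotes the output hypothesis trained on the $n$-sample training set $S$, then

$$ \big| \mathbb{E}\big[ L_\mu(W) - L_S(W) \big] \big| \le \sqrt{ \frac{2\sigma^2}{n} \, I(W;S) } , $$

where $L_S$ and $L_\mu$ are the training and population losses and $I(W;S)$ is the mutual information between the hypothesis and the training data, i.e., the expectation of the information density that the abstract refers to.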
5.
  • Hellström, Fredrik, 1993, et al. (author)
  • Generalization Error Bounds via mth Central Moments of the Information Density
  • 2020
  • In: IEEE International Symposium on Information Theory - Proceedings. ISSN 2157-8095; 2020-June, pp. 2741-2746.
  • Conference paper (peer-reviewed), abstract:
    • We present a general approach to deriving bounds on the generalization error of randomized learning algorithms. Our approach can be used to obtain bounds on the average generalization error as well as bounds on its tail probabilities, both for the case in which a new hypothesis is randomly generated every time the algorithm is used - as often assumed in the probably approximately correct (PAC)-Bayesian literature - and in the single-draw case, where the hypothesis is extracted only once. For this last scenario, we present a novel bound that is explicit in the central moments of the information density. The bound reveals that the higher the order of the information density moment that can be controlled, the milder the dependence of the generalization bound on the desired confidence level. Furthermore, we use tools from binary hypothesis testing to derive a second bound, which is explicit in the tail of the information density. This bound confirms that a fast decay of the tail of the information density yields a more favorable dependence of the generalization bound on the confidence level.
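The claim in the abstract above, that controlling a higher-order moment of the information density yields a milder dependence on the confidence level, follows the familiar Markov-inequality pattern. As a generic illustration only (not the paper's actual derivation): for any random variable $X$, any $\epsilon > 0$ and any $m \ge 1$,

$$ P\big( |X| \ge \epsilon \big) \le \frac{ \mathbb{E}\, |X|^m }{ \epsilon^m } , $$

so confidence $1 - \delta$ is reached at $\epsilon = \big( \mathbb{E}\,|X|^m / \delta \big)^{1/m}$, and the dependence on the confidence parameter is $\delta^{-1/m}$, which becomes milder as $m$ grows.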
6.
  • Hellström, Fredrik, 1993 (author)
  • Guaranteeing Generalization via Measures of Information
  • 2020
  • Licentiate thesis (other academic/artistic), abstract:
    • During the past decade, machine learning techniques have achieved impressive results in a number of domains. Many of the success stories have made use of deep neural networks, a class of functions that boasts high complexity. Classical results that mathematically guarantee that a learning algorithm generalizes, i.e., performs as well on unseen data as on training data, typically rely on bounding the complexity and expressiveness of the functions that are used. As a consequence of this, they yield overly pessimistic results when applied to modern machine learning algorithms, and fail to explain why they generalize. This discrepancy between theoretical explanations and practical success has spurred a flurry of research activity into new generalization guarantees. For such guarantees to be applicable for relevant cases such as deep neural networks, they must rely on some other aspect of learning than the complexity of the function class. One avenue that is showing promise is to use methods from information theory. Since information-theoretic quantities are concerned with properties of different data distributions and relations between them, such an approach enables generalization guarantees that rely on the properties of learning algorithms and data distributions. In this thesis, we first introduce a framework to derive information-theoretic guarantees for generalization. Specifically, we derive an exponential inequality that can be used to obtain generalization guarantees not only in the average sense, but also tail bounds for the PAC-Bayesian and single-draw scenarios. This approach leads to novel generalization guarantees and provides a unified method for deriving several known generalization bounds that were originally discovered through the use of a number of different proof techniques. Furthermore, we extend this exponential-inequality approach to the recently introduced random-subset setting, in which the training data is randomly selected from a larger set of available data samples. One limitation of the proposed framework is that it can only be used to derive generalization guarantees with a so-called slow rate with respect to the size of the training set. In light of this, we derive another exponential inequality for the random-subset setting which allows for the derivation of generalization guarantees with fast rates with respect to the size of the training set. We show how to evaluate the generalization guarantees obtained through this inequality, as well as their slow-rate counterparts, for overparameterized neural networks trained on MNIST and Fashion-MNIST. Numerical results illustrate that, for some settings, these bounds predict the true generalization capability fairly well, essentially matching the best available bounds in the literature.
7.
  • Hellström, Fredrik, 1993 (author)
  • Information-Theoretic Generalization Bounds: Tightness and Expressiveness
  • 2022
  • Doctoral thesis (other academic/artistic), abstract:
    • Machine learning has achieved impressive feats in numerous domains, largely driven by the emergence of deep neural networks. Due to the high complexity of these models, classical bounds on the generalization error---that is, the difference between training and test performance---fail to explain this success. This discrepancy between theory and practice motivates the search for new generalization guarantees, which must rely on other properties than function complexity. Information-theoretic bounds, which are intimately related to probably approximately correct (PAC)-Bayesian analysis, naturally incorporate a dependence on the relevant data distributions and learning algorithms. Hence, they are a promising candidate for studying generalization in deep neural networks. In this thesis, we derive and evaluate several such information-theoretic generalization bounds. First, we derive both average and high-probability bounds in a unified way, obtaining new results and recovering several bounds from the literature. We also develop new bounds by using tools from binary hypothesis testing. We extend these results to the conditional mutual information (CMI) framework, leading to results that depend on quantities such as the conditional information density and maximal leakage. While the aforementioned bounds achieve a so-called slow rate with respect to the number of training samples, we extend our techniques to obtain bounds with a fast rate. Furthermore, we show that the CMI framework can be viewed as a way of automatically obtaining data-dependent priors, an important technique for obtaining numerically tight PAC-Bayesian bounds. A numerical evaluation of these bounds demonstrates that they are nonvacuous for deep neural networks, but diverge as training progresses. To obtain numerically tighter results, we strengthen our bounds through the use of the samplewise evaluated CMI, which depends on the information captured by the losses of the neural network rather than its weights. Furthermore, we make use of convex comparator functions, such as the binary relative entropy, to obtain tighter characterizations for low training losses. Numerically, we find that these bounds are nearly tight for several deep neural network settings, and remain stable throughout training. We demonstrate the expressiveness of the evaluated CMI framework by using it to rederive nearly optimal guarantees for multiclass classification, known from classical learning theory. Finally, we study the expressiveness of the evaluated CMI framework for meta learning, where data from several related tasks is used to improve performance on new tasks from the same task environment. Through the use of a one-step derivation and the evaluated CMI, we obtain new information-theoretic generalization bounds for meta learning that improve upon previous results. Under certain assumptions on the function classes used by the learning algorithm, we obtain convergence rates that match known classical results. By extending our analysis to oracle algorithms and considering a notion of task diversity, we obtain excess risk bounds for empirical risk minimizers.
8.
  • Hellström, Fredrik, 1993, et al. (author)
  • New Family of Generalization Bounds Using Samplewise Evaluated CMI
  • 2022
  • In: Advances in Neural Information Processing Systems. ISSN 1049-5258; vol. 35.
  • Conference paper (peer-reviewed), abstract:
    • We present a new family of information-theoretic generalization bounds, in which the training loss and the population loss are compared through a jointly convex function. This function is upper-bounded in terms of the disintegrated, samplewise, evaluated conditional mutual information (CMI), an information measure that depends on the losses incurred by the selected hypothesis, rather than on the hypothesis itself, as is common in probably approximately correct (PAC)-Bayesian results. We demonstrate the generality of this framework by recovering and extending previously known information-theoretic bounds. Furthermore, using the evaluated CMI, we derive a samplewise, average version of Seeger's PAC-Bayesian bound, where the convex function is the binary KL divergence. In some scenarios, this novel bound results in a tighter characterization of the population loss of deep neural networks than previous bounds. Finally, we derive high-probability versions of some of these average bounds. We demonstrate the unifying nature of the evaluated CMI bounds by using them to recover average and high-probability generalization bounds for multiclass classification with finite Natarajan dimension.
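For context, the result that the abstract above refers to as Seeger's PAC-Bayesian bound is commonly stated in the following form (Maurer's refinement; quoted from the literature as an orientation point, not from the paper itself): with probability at least $1 - \delta$ over the draw of the $n$ training samples, simultaneously for all posteriors $\rho$,

$$ \mathrm{kl}\big( \hat L_S(\rho) \,\big\|\, L_\mu(\rho) \big) \le \frac{ \mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta} }{ n } , $$

where $\mathrm{kl}(\cdot\|\cdot)$ is the binary KL divergence, $\hat L_S(\rho)$ and $L_\mu(\rho)$ are the $\rho$-averaged training and population losses, and $\pi$ is a data-independent prior. According to the abstract, the paper's samplewise, average version controls the same binary-KL comparison through the samplewise evaluated CMI instead of a KL divergence to a fixed prior.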