SwePub
Search results for "WFRF:(Basu Debabrota)"


  • Results 1-10 of 12
1.
  • Arafat, Naheed Anjum, et al. (author)
  • Construction and random generation of hypergraphs with prescribed degree and dimension sequences
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer International Publishing. - 1611-3349 .- 0302-9743. ; 12392 LNCS, pp. 130-145
  • Conference paper (peer-reviewed)
    • We propose algorithms for construction and random generation of hypergraphs without loops and with prescribed degree and dimension sequences. The objective is to provide a starting point for as well as an alternative to Markov chain Monte Carlo approaches. Our algorithms leverage the transposition of properties and algorithms devised for matrices constituted of zeros and ones with prescribed row- and column-sums to hypergraphs. The construction algorithm extends the applicability of Markov chain Monte Carlo approaches when the initial hypergraph is not provided. The random generation algorithm allows the development of a self-normalised importance sampling estimator for hypergraph properties such as the average clustering coefficient. We prove the correctness of the proposed algorithms. We also prove that the random generation algorithm generates any hypergraph following the prescribed degree and dimension sequences with a non-zero probability. We empirically and comparatively evaluate the effectiveness and efficiency of the random generation algorithm. Experiments show that the random generation algorithm provides stable and accurate estimates of average clustering coefficient, and also demonstrates a better effective sample size in comparison with the Markov chain Monte Carlo approaches.
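The self-normalised importance sampling estimator mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, and the assumption that each sampled hypergraph comes with an unnormalised target weight and a known proposal probability, are hypothetical choices made for illustration only.

```python
def importance_ratios(target_weights, proposal_probs):
    """Unnormalised importance ratios w_i = target_i / proposal_i."""
    return [t / q for t, q in zip(target_weights, proposal_probs)]


def snis_estimate(values, target_weights, proposal_probs):
    """Self-normalised importance sampling estimate of a property.

    values[i] is the property (for example, a clustering coefficient)
    measured on the i-th sample. The target weights only need to be
    known up to a constant, because the ratios are normalised by their sum.
    """
    ratios = importance_ratios(target_weights, proposal_probs)
    return sum(r * v for r, v in zip(ratios, values)) / sum(ratios)


def effective_sample_size(target_weights, proposal_probs):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    ratios = importance_ratios(target_weights, proposal_probs)
    return sum(ratios) ** 2 / sum(r * r for r in ratios)
```

A larger effective sample size means the importance ratios are more even, which is the sense in which the abstract compares the estimator with Markov chain Monte Carlo approaches.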
2.
  • Arafat, Naheed Anjum, et al. (author)
  • Topological Data Analysis with ϵ-net Induced Lazy Witness Complex
  • 2019
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer International Publishing. - 1611-3349 .- 0302-9743. ; 11707 LNCS, pp. 376-392
  • Conference paper (peer-reviewed)
    • Topological data analysis computes and analyses topological features of the point clouds by constructing and studying a simplicial representation of the underlying topological structure. The enthusiasm that followed the initial successes of topological data analysis was curbed by the computational cost of constructing such simplicial representations. The lazy witness complex is a computationally feasible approximation of the underlying topological structure of a point cloud. It is built in reference to a subset of points, called landmarks, rather than considering all the points as in the Čech and Vietoris-Rips complexes. The choice and the number of landmarks dictate the effectiveness and efficiency of the approximation. We adopt the notion of ϵ-cover to define ϵ-net. We prove that ϵ-net, as a choice of landmarks, is an ϵ-approximate representation of the point cloud and the induced lazy witness complex is a 3-approximation of the induced Vietoris-Rips complex. Furthermore, we propose three algorithms to construct ϵ-net landmarks. We establish the relationship of these algorithms with the existing landmark selection algorithms. We empirically validate our theoretical claims. We empirically and comparatively evaluate the effectiveness, efficiency, and stability of the proposed algorithms on synthetic and real datasets.
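One simple way to obtain landmarks that cover a point cloud at scale ϵ is a greedy scan, sketched below in Python. This is a generic construction written for illustration, not necessarily one of the three algorithms proposed in the paper; the function name `greedy_epsilon_net` and the use of Euclidean distance are assumptions.

```python
import math


def greedy_epsilon_net(points, eps):
    """Greedily pick landmarks so that every point lies within eps of one.

    A point becomes a landmark only if it is farther than eps from all
    landmarks chosen so far. Consequently the landmarks are pairwise more
    than eps apart, and every skipped point is within eps of a landmark.
    """
    landmarks = []
    for p in points:
        if all(math.dist(p, q) > eps for q in landmarks):
            landmarks.append(p)
    return landmarks


cloud = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (2.2, 0.0), (5.0, 5.0)]
landmarks = greedy_epsilon_net(cloud, eps=1.0)
```

The result depends on the order in which points are scanned, and the scan costs on the order of (number of points) × (number of landmarks) distance evaluations.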
3.
  • Basu, Debabrota, 1992, et al. (author)
  • BelMan: An Information-Geometric Approach to Stochastic Bandits
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer International Publishing. - 1611-3349 .- 0302-9743. ; 11908 LNAI, pp. 167-183
  • Conference paper (peer-reviewed)
    • We propose a Bayesian information-geometric approach to the exploration–exploitation trade-off in stochastic multi-armed bandits. The uncertainty on reward generation and belief is represented using the manifold of joint distributions of rewards and beliefs. Accumulated information is summarised by the barycentre of joint distributions, the pseudobelief-reward. While the pseudobelief-reward facilitates information accumulation through exploration, another mechanism is needed to increase exploitation by gradually focusing on higher rewards, the pseudobelief-focal-reward. Our resulting algorithm, BelMan, alternates between projection of the pseudobelief-focal-reward onto belief-reward distributions to choose the arm to play, and projection of the updated belief-reward distributions onto the pseudobelief-focal-reward. We theoretically prove BelMan to be asymptotically optimal and to incur a sublinear regret growth. We instantiate BelMan to stochastic bandits with Bernoulli and exponential rewards, and to a real-life application of scheduling queueing bandits. Comparative evaluation with the state of the art shows that BelMan is not only competitive for Bernoulli bandits but in many cases also outperforms other approaches for exponential and queueing bandits.
4.
  • Carlsson, Emil, 1995, et al. (author)
  • Pure Exploration in Bandits with Linear Constraints
  • 2024
  • In: Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022. - 2640-3498. ; 238, pp. 334-342
  • Conference paper (peer-reviewed)
    • We address the problem of identifying the optimal policy with a fixed confidence level in a multi-armed bandit setup, when the arms are subject to linear constraints. Unlike the standard best-arm identification problem which is well studied, the optimal policy in this case may not be deterministic and could mix between several arms. This changes the geometry of the problem which we characterize via an information-theoretic lower bound. We introduce two asymptotically optimal algorithms for this setting, one based on the Track-and-Stop method and the other based on a game-theoretic approach. Both these algorithms try to track an optimal allocation based on the lower bound and computed by a weighted projection onto the boundary of a normal cone. Finally, we provide empirical results that validate our bounds and visualize how constraints change the hardness of the problem.
5.
  • Eriksson, Hannes, 1991, et al. (author)
  • Reinforcement Learning in the Wild with Maximum Likelihood-based Model Transfer
  • 2024
  • In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS. - 1548-8403 .- 1558-2914. ; 2024, pp. 516-524
  • Conference paper (peer-reviewed)
    • In this paper, we study the problem of transferring the available Markov Decision Process (MDP) models to learn and plan efficiently in an unknown but similar MDP. We refer to it as the Model Transfer Reinforcement Learning (MTRL) problem. First, we formulate MTRL for discrete MDPs and Linear Quadratic Regulators (LQRs) with continuous state actions. Then, we propose a generic two-stage algorithm, MLEMTRL, to address the MTRL problem in discrete and continuous settings. In the first stage, MLEMTRL uses a constrained Maximum Likelihood Estimation (MLE)-based approach to estimate the target MDP model using a set of known MDP models. In the second stage, using the estimated target MDP model, MLEMTRL deploys a model-based planning algorithm appropriate for the MDP class. Theoretically, we prove worst-case regret bounds for MLEMTRL both in realisable and non-realisable settings. We empirically demonstrate that MLEMTRL allows faster learning in new MDPs than learning from scratch and achieves near-optimal performance depending on the similarity of the available MDPs and the target MDP.
6.
  • Eriksson, Hannes, 1991, et al. (author)
  • Risk-Sensitive Bayesian Games for Multi-Agent Reinforcement Learning under Policy Uncertainty
  • 2022
  • Journal article (other academic/artistic)
    • In stochastic games with incomplete information, the uncertainty is evoked by the lack of knowledge about a player's own and the other players' types, i.e. the utility function and the policy space, and also the inherent stochasticity of different players' interactions. In existing literature, the risk in stochastic games has been studied in terms of the inherent uncertainty evoked by the variability of transitions and actions. In this work, we instead focus on the risk associated with the uncertainty over types. We contrast this with the multi-agent reinforcement learning framework where the other agents have fixed stationary policies and investigate risk-sensitiveness due to the uncertainty about the other agents' adaptive policies. We propose risk-sensitive versions of existing algorithms proposed for risk-neutral stochastic games, such as Iterated Best Response (IBR), Fictitious Play (FP) and a general multi-objective gradient approach using dual ascent (DAPG). Our experimental analysis shows that risk-sensitive DAPG performs better than competing algorithms for both social welfare and general-sum stochastic games.
7.
  • Eriksson, Hannes, 1991, et al. (author)
  • SENTINEL: Taming Uncertainty with Ensemble-based Distributional Reinforcement Learning
  • 2022
  • In: Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022. - 2640-3498. ; 180, pp. 631-640
  • Conference paper (peer-reviewed)
    • In this paper, we consider risk-sensitive sequential decision-making in Reinforcement Learning (RL). Our contributions are two-fold. First, we introduce a novel and coherent quantification of risk, namely composite risk, which quantifies the joint effect of aleatory and epistemic risk during the learning process. Existing works considered either aleatory or epistemic risk individually, or as an additive combination. We prove that the additive formulation is a particular case of the composite risk when the epistemic risk measure is replaced with expectation. Thus, the composite risk is more sensitive to both aleatory and epistemic uncertainty than the individual and additive formulations. Second, we propose an algorithm, SENTINEL-K, based on ensemble bootstrapping and distributional RL for representing epistemic and aleatory uncertainty respectively. The ensemble of K learners uses Follow The Regularised Leader (FTRL) to aggregate the return distributions and obtain the composite risk. We experimentally verify that SENTINEL-K estimates the return distribution better and, when used with composite risk estimates, demonstrates higher risk-sensitive performance than state-of-the-art risk-sensitive and distributional RL algorithms.
8.
  • Ghosh, Bishwamittra, et al. (author)
  • Justicia: A Stochastic SAT Approach to Formally Verify Fairness
  • 2021
  • In: 35th AAAI Conference on Artificial Intelligence, AAAI 2021. - 9781713835974 ; 35, pp. 7554-7563
  • Conference paper (peer-reviewed)
    • As a technology, ML is oblivious to societal good or bad, and thus, the field of fair machine learning has stepped up to propose multiple mathematical definitions, algorithms, and systems to ensure different notions of fairness in ML applications. Given the multitude of propositions, it has become imperative to formally verify the fairness metrics satisfied by different algorithms on different datasets. In this paper, we propose a stochastic satisfiability (SSAT) framework, Justicia, that formally verifies different fairness measures of supervised learning algorithms with respect to the underlying data distribution. We instantiate Justicia on multiple classification and bias mitigation algorithms, and datasets to verify different fairness metrics, such as disparate impact, statistical parity, and equalized odds. Justicia is scalable, accurate, and operates on non-Boolean and compound sensitive attributes unlike existing distribution-based verifiers, such as FairSquare and VeriFair. Being distribution-based by design, Justicia is more robust than the verifiers, such as AIF360, that operate on specific test samples. We also theoretically bound the finite-sample error of the verified fairness measure.
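For readers unfamiliar with the metrics named in this abstract, the Python sketch below computes statistical parity and disparate impact directly from a list of binary predictions. It is a sample-based illustration of the metric definitions only, that is, the style of check the abstract contrasts Justicia with; it is not Justicia's SSAT-based, distribution-level verification. All function names are hypothetical.

```python
def positive_rate(preds, groups, group):
    """Fraction of positive (1) predictions within one sensitive group."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    return sum(selected) / len(selected)


def group_rates(preds, groups):
    """Positive prediction rate for every sensitive group present."""
    return {g: positive_rate(preds, groups, g) for g in set(groups)}


def statistical_parity_difference(preds, groups):
    """Largest gap in positive rates between groups (0 means parity)."""
    rates = group_rates(preds, groups).values()
    return max(rates) - min(rates)


def disparate_impact(preds, groups):
    """Ratio of the lowest to the highest positive rate (1 means parity)."""
    rates = group_rates(preds, groups).values()
    highest = max(rates)
    # If no group receives positive predictions, treat the groups as equal.
    return 1.0 if highest == 0 else min(rates) / highest


preds = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
spd = statistical_parity_difference(preds, groups)
di = disparate_impact(preds, groups)
```

Because these values are computed on one specific sample, they change with the test set, which is the robustness concern the abstract raises about sample-based verifiers.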
9.
  • Grover, Divya, 1992, et al. (author)
  • Bayesian Reinforcement Learning via Deep, Sparse Sampling
  • 2020
  • In: Proceedings of Machine Learning Research. - 2640-3498. ; 108, pp. 3036-3045
  • Conference paper (peer-reviewed)
    • We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm to induce deeper and sparser exploration with a theoretical bound on its performance relative to the Bayes optimal as well as lower computational complexity. The main novelty is the use of a candidate policy generator, to generate long-term options in the planning tree (over beliefs), which allows us to create much sparser and deeper trees. Experimental results on different environments show that in comparison to the state-of-the-art, our algorithm is both computationally more efficient, and obtains significantly higher reward over time in discrete environments.
10.
  • Jorge, Emilio, 1992, et al. (author)
  • Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning
  • 2020
  • In: Proceedings of Machine Learning Research. - 2640-3498. ; 137, pp. 43-52
  • Conference paper (peer-reviewed)
    • Bayesian Reinforcement Learning (BRL) offers a decision-theoretic solution to the reinforcement learning problem. While “model-based” BRL algorithms have focused on maintaining a posterior distribution on models, BRL “model-free” methods try to estimate value function distributions but make strong implicit assumptions or approximations. We describe a novel Bayesian framework, inferential induction, for correctly inferring value function distributions from data, which leads to a new family of BRL algorithms. We design an algorithm, Bayesian Backwards Induction (BBI), with this framework. We experimentally demonstrate that BBI is competitive with the state of the art. However, its advantage relative to existing BRL model-free methods is not as great as we had expected, particularly when the additional computational burden is taken into account.
