SwePub
Search the SwePub database


Results list for the search "WFRF:(Geffner Héctor) srt2:(2020-2024)"

Search: WFRF:(Geffner Héctor) > (2020-2024)

  • Results 1-10 of 11
1.
  • Bonet, Blai, et al. (authors)
  • General Policies, Subgoal Structure, and Planning Width
  • 2024
  • In: The Journal of Artificial Intelligence Research. AI Access Foundation. ISSN 1076-9757, E-ISSN 1943-5037. Vol. 80, pp. 475-516
  • Journal article (peer reviewed), abstract:
    • It has been observed that many classical planning domains with atomic goals can be solved by means of a simple polynomial exploration procedure, called IW, that runs in time exponential in the problem width, which in these cases is bounded and small. Yet, while the notion of width has become part of state-of-the-art planning algorithms such as BFWS, there is no good explanation for why so many benchmark domains have bounded width when atomic goals are considered. In this work, we address this question by relating bounded width with the existence of general optimal policies that in each planning instance are represented by tuples of atoms of bounded size. We also define the notions of (explicit) serializations and serialized width that have a broader scope, as many domains have a bounded serialized width but no bounded width. Such problems are solved nonoptimally in polynomial time by a variant of the Serialized IW algorithm. Finally, the language of general policies and the semantics of serializations are combined to yield a simple, meaningful, and expressive language for specifying serializations in compact form in the form of sketches, which can be used for encoding domain control knowledge by hand or for learning it from examples. Sketches express general problem decompositions in terms of subgoals, and terminating sketches of bounded width express problem decompositions that can be solved in polynomial time.
  •  
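The IW procedure mentioned in this abstract is, at its core, a breadth-first search that prunes every state failing a novelty test over tuples of at most k atoms. A minimal sketch of that idea, assuming states are frozensets of ground atoms (strings) and that successors and is_goal are caller-supplied functions (hypothetical names, not from the paper):

    from collections import deque
    from itertools import combinations

    def iw(initial_state, successors, is_goal, k=1):
        """IW(k): breadth-first search that expands only states making some
        tuple of at most k atoms true for the first time (the novelty test)."""
        seen_tuples = set()  # atom tuples already made true by some generated state

        def is_novel(state):
            novel = False
            for size in range(1, k + 1):
                for tup in combinations(sorted(state), size):
                    if tup not in seen_tuples:
                        seen_tuples.add(tup)
                        novel = True
            return novel

        is_novel(initial_state)  # register the atoms of the initial state
        queue = deque([(initial_state, [])])
        while queue:
            state, plan = queue.popleft()
            if is_goal(state):
                return plan
            for action, next_state in successors(state):
                if is_novel(next_state):
                    queue.append((next_state, plan + [action]))
        return None  # no solution found within width k

Running iw with increasing k (IW(1), IW(2), ...) gives the iterated-width search that the width results above refer to: for an instance of width w, IW(w) is guaranteed to find a solution in time exponential in w rather than in the total number of atoms.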
2.
  • Bonet, Blai, et al. (authors)
  • On Policy Reuse: An Expressive Language for Representing and Executing General Policies that Call Other Policies
  • 2024
  • In: Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling (ICAPS 2024).
  • Conference paper (peer reviewed), abstract:
    • Recently, a simple but powerful language for expressing and learning general policies and problem decompositions (sketches) has been introduced in terms of rules defined over a set of Boolean and numerical features. In this work, we consider three extensions of this language aimed at making policies and sketches more flexible and reusable: internal memory states, as in finite state controllers; indexical features, whose values are a function of the state and a number of internal registers that can be loaded with objects; and modules that wrap up policies and sketches and allow them to call each other by passing parameters. In addition, unlike general policies that select state transitions rather than ground actions, the new language allows for the selection of such actions. The expressive power of the resulting language for policies and sketches is illustrated through a number of examples.
  •  
3.
  • Drexler, Dominik, 1993-, et al. (authors)
  • Expressing and Exploiting Subgoal Structure in Classical Planning Using Sketches
  • 2024
  • In: The Journal of Artificial Intelligence Research. AI Access Foundation. ISSN 1076-9757, E-ISSN 1943-5037. Vol. 80, pp. 171-208
  • Journal article (peer reviewed), abstract:
    • Width-based planning methods deal with conjunctive goals by decomposing problems into subproblems of low width. Algorithms like SIW thus fail when the goal is not easily serializable in this way or when some of the subproblems have a high width. In this work, we address these limitations by using a simple but powerful language for expressing finer problem decompositions introduced recently by Bonet and Geffner, called policy sketches. A policy sketch R over a set of Boolean and numerical features is a set of sketch rules C -> E that express how the values of these features are supposed to change. Like general policies, policy sketches are domain general, but unlike policies, the changes captured by sketch rules do not need to be achieved in a single step. We show that many planning domains that cannot be solved by SIW are provably solvable in low polynomial time with the SIW_R algorithm, the version of SIW that employs user-provided policy sketches. Policy sketches are thus shown to be a powerful language for expressing domain-specific knowledge in a simple and compact way and a convenient alternative to languages such as HTNs or temporal logics. Furthermore, they make it easy to express general problem decompositions and prove key properties of them like their width and complexity.
  •  
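A policy sketch in this sense is just a set of rules C -> E over Boolean and numerical features, and SIW_R chains IW searches, each of which ends when some applicable rule (or the goal) is closed. A small illustration of that structure; SketchRule, closes_subgoal, and the "misplaced" feature are hypothetical names for this sketch, not the paper's API:

    from dataclasses import dataclass
    from typing import Callable, Dict, List

    State = frozenset                  # a state as a set of ground atoms (assumption)
    FeatureValues = Dict[str, int]     # Boolean features as 0/1, numerical as ints

    @dataclass
    class SketchRule:
        """A sketch rule C -> E: `condition` looks at feature values in the state
        where the subproblem started; `effect` checks how the feature values are
        supposed to have changed in the state that is reached."""
        condition: Callable[[FeatureValues], bool]
        effect: Callable[[FeatureValues, FeatureValues], bool]

    def closes_subgoal(rules: List[SketchRule],
                       features: Dict[str, Callable[[State], int]],
                       start: State, reached: State) -> bool:
        """True if going from `start` to `reached` satisfies some applicable rule."""
        f0 = {name: f(start) for name, f in features.items()}
        f1 = {name: f(reached) for name, f in features.items()}
        return any(r.condition(f0) and r.effect(f0, f1) for r in rules)

    # Example: one rule saying that a numerical feature m (here, the number of
    # "misplaced" atoms in the state) should decrease whenever it is positive.
    features = {"m": lambda s: sum(1 for atom in s if atom.startswith("misplaced"))}
    rules = [SketchRule(condition=lambda f: f["m"] > 0,
                        effect=lambda f0, f1: f1["m"] < f0["m"])]

SIW_R then iterates: from the current state, run IW with increasing k until a state is found for which closes_subgoal holds or the goal is true, move there, and repeat; the sketch width bounds the k needed for each of these subsearches.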
4.
  • Drexler, Dominik, 1993-, et al. (authors)
  • Expressing and Exploiting the Common Subgoal Structure of Classical Planning Domains Using Sketches
  • 2021
  • In: 18th International Conference on Principles of Knowledge Representation and Reasoning, Hanoi, November 3-12, 2021. International Joint Conferences on Artificial Intelligence Organization (IJCAI Organization). ISBN 9781956792997
  • Conference paper (peer reviewed), abstract:
    • Width-based planning methods deal with conjunctive goals by decomposing problems into subproblems of low width. Algorithms like SIW thus fail when the goal is not easily serializable in this way or when some of the subproblems have a high width. In this work, we address these limitations by using a simple but powerful language for expressing finer problem decompositions introduced recently by Bonet and Geffner, called policy sketches. A policy sketch R over a set of Boolean and numerical features is a set of sketch rules that express how the values of these features are supposed to change. Like general policies, policy sketches are domain general, but unlike policies, the changes captured by sketch rules do not need to be achieved in a single step. We show that many planning domains that cannot be solved by SIW are provably solvable in low polynomial time with the SIW_R algorithm, the version of SIW that employs user-provided policy sketches. Policy sketches are thus shown to be a powerful language for expressing domain-specific knowledge in a simple and compact way and a convenient alternative to languages such as HTNs or temporal logics. Furthermore, they make it easy to express general problem decompositions and prove key properties of them like their width and complexity.
  •  
5.
  • Drexler, Dominik, 1993-, et al. (authors)
  • Learning Hierarchical Policies by Iteratively Reducing the Width of Sketch Rules
  • 2023
  • In: 20th International Conference on Principles of Knowledge Representation and Reasoning, Rhodes, Greece, September 2-8, 2023.
  • Conference paper (peer reviewed), abstract:
    • Hierarchical policies are a key ingredient of intelligent behavior, expressing the different levels of abstraction involved in the solution of a problem. Learning hierarchical policies, however, remains a challenge, as no general learning principles have been identified for this purpose, despite the broad interest and vast literature in both model-free reinforcement learning and model-based planning. In this work, we introduce a principled method for learning hierarchical policies over classical planning domains, with no supervision, from small instances. The method is based on learning to decompose problems into subproblems so that the subproblems have a lower complexity as measured by their width. Problems and subproblems are captured by means of sketch rules, and the scheme for reducing the width of sketch rules is applied iteratively until the final sketch rules have zero width and encode a general policy. We evaluate the learning method on a number of classical planning domains, analyze the resulting hierarchical policies, and prove their properties. We also show that learning hierarchical policies by learning and refining sketches iteratively is often more efficient than learning flat general policies in one shot.
  •  
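Read as pseudocode, the learning scheme described in this abstract has a simple top-down shape: learn sketch rules for the current subproblems and keep refining every rule whose width is still positive. A rough structural outline only, with learn_sketch, rule_width, and subproblem_of as hypothetical stand-ins for the paper's learner, width estimator, and decomposition:

    def learn_hierarchy(domain, instances, learn_sketch, rule_width,
                        subproblem_of, max_width=2):
        """Iteratively refine sketch rules until all leaf rules have width zero,
        i.e. until the leaves encode a general policy (rough sketch, not the
        paper's algorithm)."""
        hierarchy = {}
        pending = [("top", instances)]            # subproblems still to be refined
        while pending:
            name, probs = pending.pop()
            rules = learn_sketch(domain, probs, max_width)
            hierarchy[name] = rules
            for i, rule in enumerate(rules):
                if rule_width(rule) > 0:          # not yet a zero-width (policy) rule
                    pending.append((f"{name}/{i}", subproblem_of(rule, probs)))
        return hierarchy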
6.
  • Drexler, Dominik, 1993-, et al. (authors)
  • Learning Sketches for Decomposing Planning Problems into Subproblems of Bounded Width
  • 2022
  • In: Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS 2022).
  • Conference paper (peer reviewed), abstract:
    • Recently, sketches have been introduced as a general language for representing the subgoal structure of instances drawn from the same domain. Sketches are collections of rules of the form C -> E over a given set of features where C expresses Boolean conditions and E expresses qualitative changes. Each sketch rule defines a subproblem: going from a state that satisfies C to a state that achieves the change expressed by E or a goal state. Sketches can encode simple goal serializations, general policies, or decompositions of bounded width that can be solved greedily, in polynomial time, by the SIW_R variant of the SIW algorithm. Previous work has shown the computational value of sketches over benchmark domains that, while tractable, are challenging for domain-independent planners. In this work, we address the problem of learning sketches automatically given a planning domain, some instances of the target class of problems, and the desired bound on the sketch width. We present a logical formulation of the problem, an implementation using the ASP solver Clingo, and experimental results. The sketch learner and the SIW_R planner yield a domain-independent planner that learns and exploits domain structure in a crisp and explicit form.
  •  
7.
  • Geffner, Hector (author)
  • Target Languages (vs. Inductive Biases) for Learning to Act and Plan
  • 2022
  • In: Thirty-Sixth AAAI Conference on Artificial Intelligence / Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence / Twelfth Symposium on Educational Advances in Artificial Intelligence. Association for the Advancement of Artificial Intelligence. ISBN 9781577358763, pp. 12326-12333
  • Conference paper (peer reviewed), abstract:
    • Recent breakthroughs in AI have shown the remarkable power of deep learning and deep reinforcement learning. These developments, however, have been tied to specific tasks, and progress in out-of-distribution generalization has been limited. While it is assumed that these limitations can be overcome by incorporating suitable inductive biases, the notion of inductive biases itself is often left vague and does not provide meaningful guidance. In the paper, I articulate a different learning approach where representations do not emerge from biases in a neural architecture but are learned over a given target language with a known semantics. The basic ideas are implicit in mainstream AI where representations have been encoded in languages ranging from fragments of first-order logic to probabilistic structural causal models. The challenge is to learn from data the representations that have traditionally been crafted by hand. Generalization is then a result of the semantics of the language. The goals of this paper are to make these ideas explicit, to place them in a broader context where the design of the target language is crucial, and to illustrate them in the context of learning to act and plan. For this, after a general discussion, I consider learning representations of actions, general policies, and subgoals ("intrinsic rewards"). In these cases, learning is formulated as a combinatorial problem but nothing prevents the use of deep learning techniques instead. Indeed, learning representations over languages with a known semantics provides an account of what is to be learned, while learning representations with neural nets provides a complementary account of how representations can be learned. The challenge and the opportunity is to bring the two together.
  •  
8.
  • Rodriguez, Ivan D., et al. (authors)
  • FOND Planning with Explicit Fairness Assumptions
  • 2022
  • In: The Journal of Artificial Intelligence Research. AI Access Foundation; AAAI Press. ISSN 1076-9757, E-ISSN 1943-5037. Vol. 74, pp. 887-916
  • Journal article (peer reviewed), abstract:
    • We consider the problem of reaching a propositional goal condition in fully observable non-deterministic (FOND) planning under a general class of fairness assumptions that are given explicitly. The fairness assumptions are of the form A/B and say that state trajectories that contain infinite occurrences of an action a from A in a state s and finitely many occurrences of actions from B must also contain infinite occurrences of action a in s followed by each one of its possible outcomes. The infinite trajectories that violate this condition are deemed unfair, and the solutions are policies for which all the fair trajectories reach a goal state. We show that strong and strong-cyclic FOND planning, as well as QNP planning, a planning model introduced recently for generalized planning, are all special cases of FOND planning with fairness assumptions of this form, which can also be combined. FOND+ planning, as this form of planning is called, combines the syntax of FOND planning with some of the versatility of LTL for expressing fairness constraints. A sound and complete FOND+ planner is implemented by reducing FOND+ planning to answer set programs, and its performance is evaluated in comparison with FOND and QNP planners, and LTL synthesis tools. Two other FOND+ planners are introduced as well, which are more scalable but are not complete.
  •  
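Spelled out a bit more formally (my paraphrase of the condition stated in this abstract, with F(s, a) denoting the possible outcomes of action a in state s), a trajectory s_0, a_0, s_1, a_1, ... is fair under an assumption A/B when:

    % A/B fairness: if a in A is applied in s infinitely often while actions
    % from B occur only finitely often, then every outcome of a in s also
    % occurs infinitely often along the trajectory.
    \forall a \in A \ \forall s:\;
    \big(\, |\{ i : (s_i, a_i) = (s, a) \}| = \infty
        \ \wedge\ |\{ i : a_i \in B \}| < \infty \,\big)
    \ \Rightarrow\
    \forall s' \in F(s, a):\; |\{ i : (s_i, a_i, s_{i+1}) = (s, a, s') \}| = \infty

Infinite trajectories violating this implication are the unfair ones that a FOND+ policy is allowed to ignore: a solution only has to drive every fair trajectory to the goal.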
9.
  • Ståhlberg, Simon, 1988-, et al. (authors)
  • Learning General Optimal Policies with Graph Neural Networks : Expressive Power, Transparency, and Limits
  • 2022
  • In: Proceedings of the Thirty-Second International Conference on Automated Planning and Scheduling (ICAPS 2022). Palo Alto: AAAI Press. ISBN 9781577358749, pp. 629-637
  • Conference paper (peer reviewed), abstract:
    • It has been recently shown that general policies for many classical planning domains can be expressed and learned in terms of a pool of features defined from the domain predicates using a description logic grammar. At the same time, most description logics correspond to a fragment of k-variable counting logic (C_k) for k = 2, that has been shown to provide a tight characterization of the expressive power of graph neural networks. In this work, we make use of these results to understand the power and limits of using graph neural networks (GNNs) for learning optimal general policies over a number of tractable planning domains where such policies are known to exist. For this, we train a simple GNN in a supervised manner to approximate the optimal value function V*(s) of a number of sample states s. As predicted by the theory, it is observed that general optimal policies are obtained in domains where general optimal value functions can be defined with C_2 features but not in those requiring more expressive C_3 features. In addition, it is observed that the features learned are in close correspondence with the features needed to express V* in closed form. The theory and the analysis of the domains let us understand the features that are actually learned as well as those that cannot be learned in this way, and let us move in a principled manner from a combinatorial optimization approach to learning general policies to a potentially more robust and scalable approach based on deep learning.
  •  
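The supervised setup described here, regressing the optimal value V*(s) of sampled states with a GNN, can be illustrated with a minimal message-passing network in PyTorch. The toy architecture below is only a stand-in for the one used in the paper, and the graph encoding of a state (node embeddings x plus a list of directed edges) is assumed rather than taken from it:

    import torch
    import torch.nn as nn

    class TinyGNN(nn.Module):
        """Toy message-passing network mapping a state graph to a scalar value."""
        def __init__(self, dim=32, rounds=3):
            super().__init__()
            self.rounds = rounds
            self.msg = nn.Linear(2 * dim, dim)   # message from neighbor i to node j
            self.upd = nn.Linear(2 * dim, dim)   # node update from (old state, messages)
            self.readout = nn.Linear(dim, 1)     # graph-level value prediction

        def forward(self, x, edges):
            # x: (num_nodes, dim) node embeddings; edges: list of (i, j) index pairs.
            for _ in range(self.rounds):
                new_x = []
                for j in range(x.size(0)):
                    incoming = [torch.relu(self.msg(torch.cat([x[i], x[j]])))
                                for i, jj in edges if jj == j]
                    agg = torch.stack(incoming).sum(0) if incoming else torch.zeros(x.size(1))
                    new_x.append(torch.relu(self.upd(torch.cat([x[j], agg]))))
                x = torch.stack(new_x)
            return self.readout(x.sum(0))        # aggregate nodes, predict V(s)

    # Supervised regression of V*(s) over sampled states (data loading omitted):
    # model = TinyGNN()
    # opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # loss = (model(x, edges) - v_star).pow(2).mean(); loss.backward(); opt.step()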
10.
  • Ståhlberg, Simon, 1988-, et al. (authors)
  • Learning General Policies with Policy Gradient Methods
  • 2023
  • In: Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning. ISBN 9781956792027, pp. 647-657
  • Conference paper (peer reviewed), abstract:
    • While reinforcement learning methods have delivered remarkable results in a number of settings, generalization, i.e., the ability to produce policies that generalize in a reliable and systematic way, has remained a challenge. The problem of generalization has been addressed formally in classical planning where provable correct policies that generalize over all instances of a given domain have been learned using combinatorial methods. The aim of this work is to bring these two research threads together to illuminate the conditions under which (deep) reinforcement learning approaches, and in particular, policy optimization methods, can be used to learn policies that generalize like combinatorial methods do. We draw on lessons learned from previous combinatorial and deep learning approaches, and extend them in a convenient way. From the former, we model policies as state transition classifiers, as (ground) actions are not general and change from instance to instance. From the latter, we use graph neural networks (GNNs) adapted to deal with relational structures for representing value functions over planning states, and in our case, policies. With these ingredients in place, we find that actor-critic methods can be used to learn policies that generalize almost as well as those obtained using combinatorial approaches while avoiding the scalability bottleneck and the use of feature pools. Moreover, the limitations of the DRL methods on the benchmarks considered have little to do with deep learning or reinforcement learning algorithms, and result from the well-understood expressive limitations of GNNs, and the tradeoff between optimality and generalization (general policies cannot be optimal in some domains). Both of these limitations are addressed without changing the basic DRL methods by adding derived predicates and an alternative cost structure to optimize.
  •  
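The "policies as state transition classifiers" idea in this abstract amounts to scoring candidate successor states rather than ground actions. A minimal sketch of one policy step in PyTorch; score_net and successors are hypothetical stand-ins (in the paper the scoring is done with a GNN over the relational state):

    import torch

    def policy_step(state, successors, score_net):
        """Score the candidate successor states, sample one, and return it
        together with the log-probability needed for a policy-gradient update."""
        candidates = list(successors(state))               # [(action, next_state), ...]
        logits = torch.stack([score_net(s2).reshape(()) for _, s2 in candidates])
        dist = torch.distributions.Categorical(logits=logits)
        idx = dist.sample()
        action, next_state = candidates[idx.item()]
        return action, next_state, dist.log_prob(idx)      # actor term for actor-critic

An actor-critic update would combine this log-probability with a value baseline, for instance the state value predicted by a second network acting as the critic.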