A comprehensive survey of multi

... MARL algorithms described in the literature aim—either explicitly or implicitly—at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together wit ...

Optimal Bin Number for Equal Frequency Discretizations in

... discretization in O(n^2) time. However, the method can be optimized in O(n log n) time. Let us clearly specify the straightforward Equal Frequency discretization method for a given number of intervals I. After the instances are sorted according to their descriptive values, the interval boundaries mu ...
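The sort-then-cut idea in this excerpt can be sketched as follows. This is a minimal illustration, not the paper's method: the excerpt is truncated before it states how boundaries are placed, so the midpoint rule and the naive handling of tied values are assumptions, and the function name `equal_frequency_boundaries` is hypothetical.

```python
from typing import List


def equal_frequency_boundaries(values: List[float], num_intervals: int) -> List[float]:
    """Return cut points that split the sorted values into roughly equal-sized groups.

    Assumption: each boundary is the midpoint between the two neighbouring
    sorted values at the cut position. Ties are handled naively.
    """
    if num_intervals < 1:
        raise ValueError("num_intervals must be at least 1")

    ordered = sorted(values)  # the O(n log n) step
    n = len(ordered)
    boundaries: List[float] = []

    for k in range(1, num_intervals):
        idx = (k * n) // num_intervals  # first index of interval k
        if 0 < idx < n:
            cut = (ordered[idx - 1] + ordered[idx]) / 2.0
            # Skip repeated cut points, which arise when there are too few values.
            if not boundaries or cut > boundaries[-1]:
                boundaries.append(cut)
    return boundaries


if __name__ == "__main__":
    print(equal_frequency_boundaries([5, 1, 4, 2, 3, 6], 3))
```

With six distinct values and three intervals, each interval receives two values. Sorting dominates the cost, which is consistent with the O(n log n) bound mentioned in the excerpt.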

Artificial Intelligence chap2

... Not totally true (obviously) but more true than you might think. ...

A Case-Based Reasoning View of Automated Collaborative Filtering

... In dialog-based CBR there is recognition that a complete problem specification may not be available, and that the order in which the descriptors make themselves available may be important. An example of dialog-based CBR would be the CBRNET system (Doyle & Cunningham, 2000) where the online user is ...

Knowledge Acquisition and Learning by Experience

... approach. An example is the lack of robustness and flexibility in problem solving due to the narrow and tailored scope of the knowledge. Another example is the difficulties in maintaining and updating a system's knowledge over time, to cope with the normal development of the subject field and change ...

Survey of Applications Integrating Constraint Satisfaction and Case

... the possibility to support an ‘anytime’ algorithm: find the first solution quickly, and then refine it as time allows. Increased quality of solutions is another often stated motivation for using CBR in applications. That is, when the principles of a problem domain are ill-defined, or not well unders ...

Module 2

... easily handle them. The storage also presents another problem but searching can be achieved by hashing. The number of rules that are used must be minimised and the set can be produced by expressing each rule in as general a form as possible. The representation of games in this way leads to a state s ...

KBS - teachmath1729

... * GPS did not use specific info about problem at hand in selection of state transition * GPS examined all states leading to exponential time complexity * breakthrough in AI towards more specialised problem-solving system, i.e., Knowledge-based systems ...

Context-Dependent Incremental Intention Recognition through Bayesian Network Model Construction

... In this work, we use Bayesian Networks (BN) as the intention recognition model. The flexibility of BNs for representing probabilistic dependencies and the efficiency of inference methods for BN have made them an extremely powerful and natural tool for problem solving under uncertainty [16, 17]. We p ...

Framework and Complexity Results for Coordinating Non

... in the pre-execution phase, but are not dependent upon each other if the plan is executed. Typical problems in this category are monitoring and multi-modal transportation tasks. Finally, problems where coordination between agents is needed in all three phases are problems that can be solved by tight ...

Efficiently Gathering Information in Costly Domains

... and Krause [2009] and Krause and Guestrin [2005, 2007] suggest several models for selecting which variables to sample in a Bayesian network to minimize uncertainty in the network. Bilgic and Getoor [2007] use graphical models to minimize the cost of the acquisition of information for the purpose of ...

Central Limit Theorems for Conditional Markov Chains

... latent states form a Conditional Markov Chain. Asymptotic statistical properties of Conditional Random Fields and, more specifically, Conditional Markov Chains, were first investigated by Xiang and Neville (2011), and Sinn and Poupart (2011a). The main focus of this research was on asymptotic p ...

Artificial Intelligence Question Bank 2014

... What is Automated Reasoning? How are real-world problems handled? How does default logic help in reasoning? Explain. What problems occur in default reasoning? Define default reasoning and state all its features. What do we mean by the term Closed World Assumption? Elaborate with an example ...

Fuzzy Genetic Algorithms

... according to the Darwinian evolution rule. In addition, a chromosome with a higher fitness value has a greater probability of being selected again in the next generation. After several generations, the chromosome value converges to a certain value which is the best solution for the p ...
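The selection rule in this excerpt, where fitter chromosomes are more likely to be chosen again, is commonly realized as fitness-proportionate ("roulette wheel") selection. The sketch below assumes that scheme, since the excerpt does not name one, and it omits the fuzzy components entirely. The function names are hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def selection_probabilities(fitness: Sequence[float]) -> List[float]:
    """Probability of picking each chromosome, proportional to its fitness."""
    total = sum(fitness)
    if total <= 0 or any(f < 0 for f in fitness):
        raise ValueError("fitness values must be non-negative with a positive sum")
    return [f / total for f in fitness]


def select_parents(population: Sequence[T], fitness: Sequence[float],
                   count: int, rng: random.Random) -> List[T]:
    """Draw `count` chromosomes with replacement, weighted by fitness."""
    return rng.choices(population, weights=fitness, k=count)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    population = ["0101", "1100", "1111"]
    fitness = [1.0, 2.0, 5.0]
    print(selection_probabilities(fitness))
    print(select_parents(population, fitness, 4, rng))
```

Sampling with replacement lets a strong chromosome appear several times in the next generation, which is what drives the convergence described in the excerpt.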

Change the Plan - How Hard Can That Be?

... planning formalism employed by a mixed-initiative planning system should be similar to the one used by its users to ease communication and interaction. In this paper we study mixed-initiative planning systems that employ hierarchical task network (HTN) planning (Erol, Hendler, and Nau 1996). In H T ...

Human-Guided Tabu Search - Computer Science

... the search. All four applications provide a color-coded visualization of the users’ current mobility settings. This same mechanism can be used to display GTABU’s mobilities. We provide several different visualization modes that allow the user to step through the search one iteration at a time or to ...

Contract Types for Satisficing Task Allocation: Results

... supported profitable clustering also by allowing agents to withdraw from their bid, e.g., if an agent did not get the cluster that it strived for. If an agent withdrew, the item was opened for reauctioning. If the new highest bid was lower than the old one, the withdrawing agent had to pay the differ ...

Reinforcement Learning and Automated Planning

... Usually, in the description of domains, action schemas (also called operators) are used instead of actions. Action schemas contain variables that can be instantiated using the available objects and this makes the encoding of the domain easier. The choice of the language in which the planning problem ...

Adding Local Exploration to Greedy Best-First Search in

... uses random walks to explore quickly and deeply. ArvandLS (Xie, Nakhost, and Müller 2012) combines random walks with local greedy best-first search, while Roamer (Lu et al. 2011) adds exploration to LAMA-2008 by using fixed-length random walks. Nakhost and Müller’s analysis (2012) shows that while ra ...

On the complexity of case-based planning

... problems, not of family of solutions. On the other hand, case-based planning always requires solving some specific subproblems, such as, for example, the adaptation of a plan to a different problem. In this paper, we study the complexity of some problems that are related to case-based planning. The ...

Learning Abstract Planning Cases

... found in [1], while the indexing and evaluation mechanisms are reported in [5; 44]. The whole multi-strategy system including the various interactions of the described components will be the topic of a forthcoming article, while first ideas can already be found [2; 4]. However, as the target of this ...

ABSTRACT Title of Document: APPLICATION OF ANT COLONY OPTIMIZATION TO THE ROUTING AND

... In an optical network where there is no wavelength conversion or limited conversion ability, one encounters the RWA problem under the wavelength continuity constraint, which means that the same wavelength should be allocated on all the links along its route. The RWA problem, which is described in detail in the fol ...

Central Limit Theorems for Conditional Markov Chains

... latent states conditional on observable variables. In the special case of a linear-chain graph structure, the latent states form a Conditional Markov Chain. Asymptotic statistical properties of Conditional Random Fields and, more specifically, Conditional Markov Chains, were first investigated ...

Article 5 - Graduate Program in Neuroscience | UBC

... the behavioral reaction to be performed following the trigger (execution or withholding of movement) and predicting the type of reinforcer (liquid or sound). Each trial contained two delay periods, namely the instruction–trigger delay, during which the animal remembered the type of instruction and p ...

A New Operator for ABox Revision in DL-Lite Sibei Gao Guilin Qi

... the problem of incorporating a new ontology into an old one consistently. This problem is important as DLs underpin W3C standard Web ontology language OWL and ontologies may evolve during their construction. Existing revision operators in DLs are mostly generalizations of belief revision operators i ...


Multi-armed bandit

In probability theory, the multi-armed bandit problem (sometimes called the K- or N-armed bandit problem) is a problem in which a gambler at a row of slot machines (sometimes known as "one-armed bandits") has to decide which machines to play, how many times to play each machine, and in which order to play them. When played, each machine provides a random reward from a distribution specific to that machine. The objective of the gambler is to maximize the sum of rewards earned through a sequence of lever pulls.

Robbins, realizing the importance of the problem, constructed convergent population selection strategies in 1952 in "Some aspects of the sequential design of experiments". The Gittins index, a theorem first published by John C. Gittins, gives an optimal policy in the Markov setting for maximizing the expected discounted reward.

In practice, multi-armed bandits have been used to model the problem of managing research projects in a large organization, like a science foundation or a pharmaceutical company. Given a fixed budget, the problem is to allocate resources among the competing projects, whose properties are only partially known at the time of allocation, but which may become better understood as time passes.

In early versions of the multi-armed bandit problem, the gambler has no initial knowledge about the machines. The crucial trade-off the gambler faces at each trial is between "exploitation" of the machine that has the highest expected payoff and "exploration" to get more information about the expected payoffs of the other machines. The trade-off between exploration and exploitation is also faced in reinforcement learning.
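The exploration and exploitation trade-off can be made concrete with a small simulation. The sketch below uses an epsilon-greedy rule, which is one simple heuristic and not the Gittins index policy mentioned above. The function name `epsilon_greedy`, its parameters, and the Bernoulli payoff rates in the demo are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple


def epsilon_greedy(pull: Callable[[int], float], num_arms: int, num_pulls: int,
                   epsilon: float, rng: random.Random) -> Tuple[float, List[int], List[float]]:
    """Play `num_pulls` rounds; return (total reward, pulls per arm, mean reward per arm)."""
    counts = [0] * num_arms
    means = [0.0] * num_arms
    total = 0.0
    for _ in range(num_pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(num_arms)  # explore: pick any machine at random
        else:
            # exploit: pick the machine with the best estimate (first one on ties)
            arm = max(range(num_arms), key=lambda a: means[a])
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental average
        total += reward
    return total, counts, means


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed only to make the demo repeatable
    payoff_rates = [0.2, 0.5, 0.8]  # hypothetical, unknown to the gambler

    def pull(arm: int) -> float:
        # Bernoulli machine: pays 1 with the arm's probability, else 0.
        return 1.0 if rng.random() < payoff_rates[arm] else 0.0

    total, counts, means = epsilon_greedy(pull, len(payoff_rates), 1000, 0.1, rng)
    print(total, counts, [round(m, 2) for m in means])
```

Setting `epsilon` to 0 gives pure exploitation, which can lock onto the first machine tried and never discover a better one. Setting it to 1 gives pure exploration, which never uses what has been learned.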

