Neural Mechanisms of Reward in Insects - Chittka Lab

... rewarding properties themselves. Because rewards are so pivotal to the operation of behavior, it is hardly surprising that the study of how rewards are processed by the brain has become a prominent neuroscience discipline. Most research has been performed with vertebrate model systems, but insect be ...

Using Reinforcement Learning to Spider the Web Efficiently

... finite amount of crawling time. Our work is also driven by the WebKB project [Craven et al., 1998]. Here the focus is on automatically populating a knowledge base with information that is available on the World Wide Web. The system is given an ontology consisting of object classes and relations of i ...

A Review of Population-based Meta-Heuristic

... Many meta-heuristic algorithms have been proposed so far as shown in Table 1. Genetic Algorithm (GA) as a population-based meta-heuristic algorithm was suggested by Holland [42]. In the algorithm, a population of strings called chromosomes encodes candidate solutions for optimization problems. Simul ...

High reward expectancy during methylphenidate depresses the

... studies defined the reward-related BOLD response as the difference between the least and the most appealing reward condition, obscuring the parametric relation of the BOLD signal to the RPE. Instead, we created a full-range parametric variation of the RPE from unexpected high punishment to unexpecte ...

Disjunctive Temporal Planning with Uncertainty

... Following the STPU formalism [Vidal and Fargier, 1999], in a DTPU we divide the time-point variables into two classes: the controllable decision variables Vd and the uncontrollable parameters Vu . In an STPU, this induces a partition of the temporal relation constraints C into two types. A constrain ...

PDF

... comparing some of these techniques. These comparisons help us reason about what techniques are most suitable for dynamic scheduling. Advantages and disadvantages of these techniques are provided by previous published work. Dispatching rules are easy and can find reasonable solution rapidly. However, ...

A Framework for Average Case Analysis of Conjunctive Learning

... (Langley, 1989). Some attempt to understand learning algorithms by testing the algorithms on a variety of problems (e.g., Fisher, 1987; Minton, 1987). Others perform formal mathematical analysis of algorithms to prove that a given class of concepts is learnable from a given number of training exampl ...

Solution Manual Artificial Intelligence a Modern Approach

... initial locations), this agent cleans the squares at least as fast as any other agent. This is trivially true when there is no dirt. When there is dirt in the initial location and none in the other location, the world is clean after one step; no agent can do better. When there is no dirt in the init ...

PDF

... non-heavy-tailed distributions characterized by exponential decay. Related to heavy-tailedness is fat-tailedness. The notion of fat-tailedness may be introduced using the concept of kurtosis,2 and comparing the kurtosis of a given distribution with the kurtosis of the standard normal distribution. T ...

PDF

... Backdoor variables are related to the notion of independent variables [20, 5]. In fact, a set independent variables of a problem also forms a backdoor set (provided the propagation mechanism of the SAT solver, after setting the independent variables, can effectively uncover the remaining variable de ...

PNBA*: A Parallel Bidirectional Heuristic Search Algorithm

... allows a simpler and more efficient version of A*, in which a node needs to be expanded at most once. From now on, the heuristic function is assumed to be consistent. Two data structures are generally used during the search to manage node expansions: open and closed lists. The open list contains all ...

Maximising overlap score in DNA sequence assembly problem by

... up the searching process and maximised the similarity or overlaps between given fragments [8]. Alba and Luque presented several methods, including genetic algorithm, a CHC method, scatter search algorithm, and simulated annealing to solve accurately DNA Assembly problem in 2005 [13]. They also propo ...

A New Approach to Classification with the Least Number of Features

... selection techniques try to approximate the optimal feature set, e.g. by Bayesian inference, gradient descent, genetic algorithms, or various numerical optimisation methods. Commonly, these methods are divided into two classes: filter and wrapper methods. First, filter methods completely separate th ...

What Is Approximate Reasoning?

... First of all, we have to come up with a general and generic formalization of the notion of a reasoning task. Intuitively, this is just a question (or query) posed to a system that manages a knowledge base, which is supposed to deliver an answer after some processing time. The (maybe gradual) validit ...

PDF

... distribution of these scores was less shifted than in shams (Fig. 3f versus Fig. 3i; Mann-Whitney U test, P < 0.001) and did not differ from zero (Fig. 3i; Wilcoxon signed-rank test, P = 0.12). There was no significant inverse correlation between changes in firing in response to unexpected reward an ...

Text - ETH E

... as a signal that is one during presentation of this stimulus and zero otherwise. The temporal stimulus representation of this stimulus u(t ) consists of a series of phasic signals x1 ðtÞ; x2 ðtÞ; x3 ðtÞ; … that cover trial duration (only three components are shown). Each component of this temporal r ...

Multi-Agent System

... • E.g., performance measure of a vacuum-cleaner agent could be amount of dirt cleaned up, amount of time taken, amount of electricity ...

Winstanley et al. - Rudolf Cardinal

... The orbitofrontal cortex (OFC) and basolateral nucleus of the amygdala (BLA) share many reciprocal connections, and a functional interaction between these regions is important in controlling goal-directed behavior. However, their relative roles have proved hard to dissociate. Although injury to thes ...

Distributed Stochastic Search for Constraint Satisfaction and Optimization:

... are above a certain threshold at a sensor location will interfere with each other. This constraint, therefore, requires that the signaling actions of two overlapping ping nodes be synchronized so that no interfering signals will be generated at a sensor location at any time. We are developing a larg ...

An Extension of the ICP Algorithm Considering Scale Factor

... Images with Different Scanning Resolutions,” Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, pp, 1495-1500, Oct, 2000. [8] T. Zinßer, J. Schmidt, and H. Niemann, “Point Set Registration with Integrated Scale Estimation,” Int. Conf. on Pattern Recognition and Information Processing, pp, 116-119, ...

A Case-Based Approach To Imitation Learning in Robotic Agents

... “true” learning by imitation to entail not only mimicking a sequence of actions, but also learning about the intentions of the teacher or the leader (sometimes called the model). It is unclear whether animal species other than human have the capacity for “true” learning by imitation. Meltzoff [14] f ...

Automated Modelling and Solving in Constraint Programming

... set, which is given, for instance, as a set of examples of its solutions and non-solutions. This kind of learning is called constraint acquisition (Bessiere et al. 2005). The motivations for constraint acquisition are many. For example, in order to solve partially defined constraints more efficient ...

A Case for a Situationally Adaptive Many

... Using the choices described earlier, we formulate the proposed situational scheduler that caters for software and hardware using input situations. Various inputs and architectural combinations combine into an extremely large number of possible configurations. Given such a high complexity problem, it ...

Universal Artificial Intelligence: Practical Agents and Fundamental

... A good image of a UAI agent is that of a newborn baby. Knowing nothing about the world, the baby tries different actions and experiences various sensations (percepts) as a consequence. Note that the baby does not initially know about any states of the world—only percepts. Learning is essential for i ...

Bayesian Vote Manipulation: Optimal Strategies

... have adopted two assumptions that greatly diminish their practical import. First, it is usually assumed that the manipulators have full knowledge of the votes of the nonmanipulating agents. Second, analysis tends to focus on the probability of manipulation rather than its impact on the social choice ...

< 1 ... 3 4 5 6 7 8 9 10 11 ... 27 >

Multi-armed bandit

In probability theory, the multi-armed bandit problem (sometimes called the K- or N-armed bandit problem) is a problem in which a gambler at a row of slot machines (sometimes known as ""one-armed bandits"") has to decide which machines to play, how many times to play each machine and in which order to play them. When played, each machine provides a random reward from a distribution specific to that machine. The objective of the gambler is to maximize the sum of rewards earned through a sequence of lever pulls.Robbins in 1952, realizing the importance of the problem, constructed convergent population selection strategies in ""some aspects of the sequential design of experiments"".A theorem, the Gittins index published first by John C. Gittins gives an optimal policy in the Markov setting for maximizing the expected discounted reward.In practice, multi-armed bandits have been used to model the problem of managing research projects in a large organization, like a science foundation or a pharmaceutical company. Given a fixed budget, the problem is to allocate resources among the competing projects, whose properties are only partially known at the time of allocation, but which may become better understood as time passes.In early versions of the multi-armed bandit problem, the gambler has no initial knowledge about the machines. The crucial tradeoff the gambler faces at each trial is between ""exploitation"" of the machine that has the highest expected payoff and ""exploration"" to get more information about the expected payoffs of the other machines. The trade-off between exploration and exploitation is also faced in reinforcement learning.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Multi-armed bandit