Real-Time Decision Problems: An Operational Research Perspective
Author(s): R. Seguin, J-Y. Potvin, M. Gendreau, T. G. Crainic, P. Marcotte
Source: The Journal of the Operational Research Society, Vol. 48, No. 2 (Feb., 1997), pp. 162-174
Published by: Palgrave Macmillan Journals on behalf of the Operational Research Society
Stable URL: http://www.jstor.org/stable/3010356
Accessed: 19/12/2008 08:39

Journal of the Operational Research Society (1997) 48, 162-174 © 1997 Operational Research Society Ltd. All rights reserved.
0160-5682/97 $12.00

Real-time decision problems: an operational research perspective

R Séguin¹, J-Y Potvin¹,², M Gendreau¹,², TG Crainic¹,³ and P Marcotte¹,²

¹Centre de Recherche sur les Transports, ²Université de Montréal, and ³Université du Québec à Montréal, Canada

This paper is concerned with a class of dynamic and stochastic problems known as real-time decision problems. The objective is to provide responses of a required quality in a continuously evolving environment, within a prescribed time frame, using limited resources and information that is often incomplete or uncertain. Furthermore, the outcome of any particular decision may also be uncertain. This paper provides an overview of this class of problems, reviews the relevant Artificial Intelligence literature, proposes a dynamic programming framework, and assesses the potential usefulness of Operational Research approaches for their solution. Throughout the paper, a vehicle dispatching application illustrates the relevant concepts.

Keywords: real-time decision problems; artificial intelligence; heuristics; dynamic programming; vehicle dispatching

Introduction

When studying real-time decision systems (RTDS), the first thing that comes to mind is probably the notion of quick reaction to external events. Accordingly, the main characteristic of a RTDS is its ability to analyze an irregular flow of incoming data that describes a dynamically evolving environment and to take appropriate actions under resource limitations such as time availability, hardware specifications or other problem specific features¹. Furthermore, the information available is often incomplete or uncertain and the outcome of any particular action may also be uncertain. Thus, real-time decision problems (RTDP) typically include stochastic as well as dynamic components.
In this context, system correctness depends not only on the appropriateness of the response but also on timeliness². Ultimately, a tradeoff between the quality of the response and its computation time must be achieved.

A precise and concise taxonomy of RTDPs can hardly be provided due to their extreme diversity. However, it is possible to sketch a list of attributes of RTDPs which may lead to a classification scheme. This list should help to further understand the nature of RTDPs and provide practitioners with insights about the requirements of their application. Note that the elements shown in Table 1 may not apply, or only partially, to some problems or specific cases of a problem. Most of these elements will be discussed in the remainder of the paper.

Table 1 List of attributes of RTDPs

Desired solution: Optimal or approximate
Objective: Simple, multiple, conflicting
Response time: Extremely fast to loose; variable or unique
Planning and/or forecast horizon: Long, medium, short term
Course of action: Unique or multiple (reactive, incremental, deliberative)
Action outcome: Certain or uncertain
Input data: Certain or uncertain; complete or incomplete; numerous or scarce
Failure recovery mechanisms: Required or not
Future events: Extrapolated, simulated, predicted, ignored
Environment and working conditions: Time invariant to highly time-variant; highly predictable to highly unstable

Correspondence: Dr J-Y Potvin, Centre de recherche sur les transports, Université de Montréal, CP 6128, succursale Centre-ville, Montréal, Canada H3C 3J7

The role of RTDSs is to support, assist or replace human operators for real-time decision making. In this paper, the scope is limited to semi-autonomous systems that rely on operators for general guidance and major decisions. Real-time decision problems are routinely found in a growing number of complex applications that arise in manufacturing, medicine, communication, aeronautics, robotics, transportation and the military. Several of these applications involve technologies, such as global positioning systems, that generate inputs at a very fast rate. The efficient use and treatment of these data put stress on the decision-making process and are dependent on the availability of highly qualified personnel, such as dispatchers or air traffic controllers, that are both expensive and increasingly difficult to hire. Consequently, companies are looking more and more towards computer assistance. One good example is the use of simulation. It is not a solution approach but it may be useful to calibrate models and parameters, to test algorithms, and to determine the specification, performance or quality of solution methods and prototypes. In fact, every RTDS should be evaluated at some point, if not continuously, through simulation.

In most applications, quick execution time is a prerequisite. Accordingly, heuristic or approximate methods will most likely be the class of algorithms to rely upon. In some cases, time can be so limited that only extremely fast solution methods can be considered; these include greedy heuristics, precompiled default solutions and queries into knowledge bases. Usually, this urge for quick response is due to short deadlines. Although it is highly desirable to meet these deadlines, deliberation time may be available. The system must be able to decide by itself whether there is time to deliberate and if so, to what extent. To facilitate this process, a desirable feature of a decision algorithm is its ability to provide, within the required time frame, a guaranteed level of performance.

Other important characteristics of RTDS are the ability to:
* Realize that the state of the world has changed.
* Focus its attention on key issues.
* Dynamically adapt itself to changing priorities and resources.
* Provide and implement a response despite minor failures of the system.
* Handle interrupts to accept new and important inputs that could change the execution priorities of tasks; typically, RTDSs are 'connected' to the external world (via sensors) and must also interact with human operators to gather inputs and provide outputs that help improve the operators' decision-making abilities.
* Cooperate with other systems.

The purpose of this paper is to present an overview of real-time decision problems and show how operational research can be applied in this context. The presentation is organized as follows. The next section illustrates a RTDP in the transportation domain, namely a vehicle dispatching application. The third section describes the functional components of RTDSs. The fourth section reviews recent and relevant developments achieved by the Artificial Intelligence (AI) community in the field of RTDPs. The fifth section introduces a mathematical framework that will help to better understand the nature of RTDPs by formalizing their description. This section also describes generic problem-solving methodologies developed within the field of Operational Research (OR) that may yield promising avenues for solving RTDPs. The final section provides concluding remarks.

An illustration: vehicle dispatching in a courier service company

In this section, we illustrate the main features of a RTDP by introducing a specific transportation application, namely the real-time operation of a courier service company. This illustrative application will be used as a 'running' example throughout the paper.

Let us consider a courier operator that receives calls for pick-up and delivery of priority mail. Each request consists of a location and a preferred or due date for the pick-up and the delivery. Since most customers want quick service, dispatching and scheduling of the requests must be done in real time.
At every instant, requests are divided into two sets: a set of serviced requests, which are no longer considered, and a set of requests that have been assigned to vehicles and are either waiting to be picked up or heading to their delivery location. In this context, the drivers' planning schedules are known. Each new request must be dispatched to a particular vehicle and scheduled in the route of this vehicle. The objective is to achieve a tradeoff between operating costs and customer satisfaction (service quality). In allocating a new request, the following factors must be considered: the current location, planned route and schedule of each vehicle, the location and due date of the pick-up and delivery points of the new request, the travel time between the locations, the topology of the underlying road network and the service policy of the company. Other types of events must also be considered, such as unexpected service delays due to road congestion or vehicle breakdowns. Also, additional considerations or constraints are often associated with each company's specific operations. A more detailed analysis will be given later, when a general mathematical model will be adapted to the dispatching problem.

Functional components of a real-time decision system

Real-time decision systems typically involve numerous functional components. Each component represents a distinct problem-solving phase which is either related to the analysis of the problem or to the decision process itself. In the following, we provide a description of the basic functional components of a RTDS and discuss the nature of planning under uncertainty.

1) Information management and data fusion
In order to react appropriately to any new situation, a real-time system must be fed with a continuous flow of information about the current state of the outside world. This information is obtained through sensors, radar, global positioning systems (GPS), etc., and can be both incomplete and uncertain.
Since each sensor has a very limited view of the world, different data sets issued from different sensors are often merged to yield usable and hopefully unambiguous statements about the current state of the world.

2) Situation assessment
The occurrence (or non-occurrence) of a particular event does not necessarily call for a reaction from the system. The significance of a particular event must be interpreted with respect to the current state of the world, as well as its past history and predicted future. For example, if service is delayed at a given customer in a vehicle dispatching system, this does not necessarily imply that customers already assigned to this route, but not serviced yet, must be reassigned to other vehicles. The situation must be assessed by evaluating the magnitude of the current delay and its potential impact on the remainder of the route.

3) Evaluation of alternatives
The system evaluates the available alternatives for responding to the situation that has been identified at the previous step. The feasibility of each action is determined by preconditions, while postconditions describe the state of the world that is likely to prevail after the execution of a given action. Preconditions and postconditions are an integral part of the evaluation process.

4) Decision
A decision consists in either reacting to the current situation or doing nothing. Usually, decisions concerning resource allocation are taken, such as assigning a vehicle to a service request. If the decision is to react, one must decide how and when to react. Accordingly, a decision does not necessarily result in an immediate action.
Rather, the action is incorporated into a plan of actions that extends over a rolling time period known as the planning horizon. The plan specifies all actions that must take place as well as their possible outcomes over the horizon, using information provided by the current state of the world and projections made over the future (for example, estimation of the travel times between two service points in the vehicle dispatching example).

Broadly speaking, a planner can provide three different levels of responses to a specific situation:

a) Reactive planning
This lower level is concerned with short-term actions, whose effects are local in nature. These actions may be adequate if the modifications in the state of the world are small. Quite often, it is driven by precompiled stimulus-response knowledge.

b) Incremental planning
This level involves slightly more elaborate modifications of the current plan and is appropriate when the current state of the world does not depart too much from its expected state at the time the plan was first devised.

c) Deliberative planning
This higher level calls for a mandatory revision of the initial plan whenever the current state of the world departs significantly from its predicted state, hence making the initial plan less effective or even inapplicable. Actions that are planned but have not been implemented yet must be revised. This revision may imply that actions will be cancelled or rescheduled.

Since the outcome of any particular action is uncertain, even a short term plan can lead to many different states of the world at the end of its execution. By associating a probability with each possible state (using the probabilities associated with the outcome of each action leading to this state) and by considering a utility function for evaluating the quality of each state, a natural objective is to determine a plan that provides the maximum expected utility at the end of its execution. However, deliberations will vary according to the time and computational resources available, so that an optimal plan is unlikely to be found. Note also that the quality of the proposed plan is expected to increase with the amount of time allocated for its determination.

The effectiveness of a plan depends on the level of uncertainty about the external world. For planning purposes, the environment in which the plan is executed must be predictable to some extent. Planning is ineffective in highly uncertain situations, where an agent can only blindly react to events. Hence, replanning over a short horizon makes more sense if a new plan is likely to be needed very shortly. For example, in the vehicle dispatching problem, if a vehicle breaks down or is significantly delayed due to road congestion, the requests that have not been serviced yet by this vehicle may have to be reassigned to other vehicles. If such a situation occurs frequently (highly uncertain environment), the planner should focus on requests having a due date close to the current time.

In the following, the four functional components of the generic framework are illustrated for the vehicle dispatching problem. Only a few examples are provided for each component.

1) Information management and data fusion
Information is provided via on-board tracking systems for automatic location of vehicles. Sensor devices located along the roads allow the evaluation of the congestion levels of the network and the estimation of travel times between service points.

2) Situation assessment
Several elements could be considered: the priority class of each new request, the proximity of vehicles to request locations, the movement of vehicles towards request locations, the position of idle vehicles and the impact of service delays due to road congestion.
3) Evaluation of alternatives
In an uncertain environment, it may be wise to delay the assignment of requests until further information about the state of the world is available (new service requests, for instance). This might result in economies due to request pooling. If a request priority is high, possible insertion locations for this request along the planned routes must be evaluated.

4) Decision
An example of each of the three levels of response is given.
a) Reactive: Order a vehicle to modify its current destination to service a new request whose location is in the vicinity of the vehicle's current position (diversion).
b) Incremental: Assign a new request to some vehicle, and insert this request at a particular location within the planned route and schedule of this vehicle.
c) Deliberative: Improve the value of the utility function by exchanging requests between routes and by reordering the requests within the routes.

The artificial intelligence perspective

In reviewing the AI literature, we observe that research on RTDPs is roughly divided into two streams: the first stream explores different alternatives for structuring the problem space and the search process in order to reduce problem-solving variability and, at the same time, produce satisfactory solutions within the available time frame. In the second stream, researchers have identified desirable properties and characteristics of potential real-time algorithms, thus defining new classes of algorithms. In the current section, we review both approaches and their related concepts.

Another subject of interest to the AI community is the design of conceptual architectures, either of general applicability or aimed at specific applications. The last part of this section is devoted to the description of an architecture of general applicability and provides a possible specialization to the vehicle dispatching problem. The interested reader is referred to the survey of Strosnider and Paul² for more details on the first stream of research, and to the survey of Garvey and Lesser³ for the second stream. The survey on knowledge based systems¹ constitutes another source of relevant information. The next two subsections are inspired from these documents.

The problem space and the search process

The first approach is based on the 'formulation' of Newell and Simon⁴ where problem-solving is defined as a search in a problem space. The authors define a problem space as a set of states and a set of moves (operators) between states. An instance of a specific problem is a problem space with an initial state and some terminal states or goals. In the context of real-time decision problems, it corresponds to time-constrained searches in a sequence of problem spaces. Given time restrictions, the objective is then to structure the problem spaces and the search processes in order to produce satisfactory solutions, namely solutions whose values exceed some predetermined threshold quality or solutions that improve with elapsed time.

Each problem is partitioned into a set of subproblems that are in turn decomposed into simpler problems, until the level of basic subproblems is reached. Higher level subproblems are more complex and describe the problem more accurately than the lower-level subproblems. Cross-subproblem operators (such as branching) control the transitions between the subproblems. For example, 'AND/OR' graphs may be used to represent the problem-solving strategies. In these graphs, a node represents a problem or a subproblem, and the search strategy exploits two types of links: (i) 'OR' links that represent alternative approaches to handle the parent node from which they emanate and (ii) 'AND' links that connect a problem node to its subproblems⁵. For a problem to be solved, all subproblems that are connected via an 'AND' link must be solved, as opposed to a single subproblem for an 'OR' link.

Usually, some knowledge of the problem space is available and this knowledge generally increases as the search progresses. This knowledge can be used to reduce the number of states generated in the worst cases, thus making the algorithm's running time more predictable. Another option consists in structuring the search space and the search process by using appropriate knowledge for pruning, ordering, approximating and scoping purposes. A brief explanation of these terms is provided below.

* Pruning is a technique that uses knowledge to eliminate parts of the problem space that are known, or are likely, to be useless to the search process. Pruning can be static if a priori knowledge is available, or dynamic if the knowledge becomes available only at run time. This technique is also exploited by operational research practitioners within branch-and-bound methods, where a bound is computed at each node so as to determine whether this node is promising or can be safely eliminated (fathomed). Pruning is the only technique that guarantees a reduction of worst-case execution times without compromising solution quality. However, it relies on some form of a priori perfect knowledge. An example of pruning for the vehicle dispatching problem is to eliminate solutions that assign to the same vehicle two requests that are located far away from each other and require service almost simultaneously.
* Ordering is a technique that first looks at states and/or subproblems that are more likely to lead quickly to a terminal state (goal). It corresponds, within the branch-and-bound context, to the classical best-first approach, where the node achieving the best objective value is the next one to be expanded (explored). The well known A* algorithm⁵ in AI is yet another illustration of the ordering concept. Ordering may be used with imperfect knowledge only but it will reduce only the average-case execution time.
It has no effect on worst-case performance. In the vehicle dispatching problem, an ordering strategy might be to examine first those solutions in which a request is assigned to the nearest vehicle.
* Approximating allows for a decrease in the evaluation quality of any solution characteristic during the search process. The problem definition is implicitly modified when approximations are used. Potentially less accurate solutions are deemed acceptable in exchange for a reduction in variability and lower worst- and average-case execution times. Approximating in vehicle dispatching can be used, for example, to estimate the travel times. In branch-and-bound schemes, one form of approximation could be the use of upper or lower bounds that are not strictly valid (namely they may be wrong in some cases) but that can be computed very quickly.
* Scoping controls how far (in time and space) the process is allowed to explore the search space in order to select the next move; the wider the exploration in space, the smaller the time interval allocated to each search direction. Scoping determines the fraction of the search space to be explored, based on available time. Scoping is useful when some knowledge about the current situation is available; in highly unpredictable environments the exploration should be limited while in fairly stable situations, it may be a lot more involved. This technique plays a key role in problems with infinite planning horizons by establishing a threshold for the acceptable degree of lookahead. In vehicle dispatching, scoping may be used to control how far in time we try to predict future requests, or how much in advance we wish to assign requests to vehicles.

As mentioned earlier, every algorithm, method or solution concept proposed for solving RTDPs makes use (to some extent) of the aforementioned techniques. These are common tools that help practitioners to cope with extremely difficult problems. They may present themselves under various names but the techniques are essentially the same.
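As a rough illustration of how the four techniques described above might combine when a new courier request arrives, the following Python sketch picks a vehicle for the request. It is not the authors' method: the `Request` and `Vehicle` records, the constant `SPEED`, the straight-line travel-time estimate and the `max_candidates` and `budget_s` parameters are all hypothetical choices made for this example.

```python
import math
import time
from dataclasses import dataclass


@dataclass
class Request:
    x: float
    y: float
    due: float  # latest acceptable pick-up time, in minutes from now


@dataclass
class Vehicle:
    name: str
    x: float
    y: float
    free_at: float = 0.0  # minutes until the vehicle finishes its current work


SPEED = 0.5  # assumed distance units per minute (illustrative value)


def approx_travel_time(vehicle, request):
    # Approximating: a straight-line estimate replaces a road-network query.
    return math.hypot(vehicle.x - request.x, vehicle.y - request.y) / SPEED


def choose_vehicle(vehicles, request, max_candidates=3, budget_s=0.05):
    deadline = time.monotonic() + budget_s
    # Pruning: drop vehicles that are unlikely to reach the pick-up in time.
    feasible = [v for v in vehicles
                if v.free_at + approx_travel_time(v, request) <= request.due]
    # Ordering: look at the nearest vehicles first.
    feasible.sort(key=lambda v: approx_travel_time(v, request))
    best, best_arrival = None, math.inf
    # Scoping: examine at most max_candidates vehicles within the time budget.
    for v in feasible[:max_candidates]:
        arrival = v.free_at + approx_travel_time(v, request)
        if arrival < best_arrival:
            best, best_arrival = v, arrival
        if time.monotonic() > deadline:
            break  # at least one candidate has been evaluated
    return best


fleet = [Vehicle("A", 0, 0), Vehicle("B", 10, 0), Vehicle("C", 1, 1, free_at=30)]
chosen = choose_vehicle(fleet, Request(2, 0, due=20))
print(chosen.name if chosen else "no feasible vehicle")
```

Because the pruning step here relies on an approximate travel time, it can discard a vehicle that would in fact have been feasible; this is the kind of quality-for-speed trade the text associates with approximating rather than with pruning under perfect knowledge.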
Korf⁶ and Ishida and Korf⁷ use 'time constrained node expansion' for limiting the degree of lookahead when evaluating a node in real-time search algorithms. Using approximation functions to evaluate node quality, Chu and Wah⁸ reduce the search space by eliminating nodes that are far from the current best solution. The same authors (Chu and Wah⁹) propose methods that restrict the breadth and depth of the search process by limiting the list of active nodes. Winston¹⁰ and Korf¹¹ propose different forms of time scoping that fall under the name Progressive Deepening and where the search proceeds to depth 1, depth 2, and so on until a deadline is reached.

Other problem-solving approaches

In their literature survey, Strosnider and Paul² consider different classes of algorithms and explain how the four techniques of the preceding subsection are exploited within each class. Garvey and Lesser³ also propose a classification of their own. Quite often, the different class descriptions overlap. Accordingly, we will restrict ourselves to a brief and general description of the most important classes of algorithms found in the AI literature.

(1) Wright et al¹² describe a technique called Progressive Reasoning which involves many levels of reasoning. Problem solving becomes more time-consuming from one level to the next as the information becomes more difficult to retrieve and process. Thus the method does not waste much time in extra analysis at the lower level. Progressive Deepening, as introduced in the previous subsection, can be recognized as a special case of Progressive Reasoning.

(2) Multiple Methods, sometimes referred to as Approximate Processing, involves a set of methods that achieve trade-offs between execution time and solution quality. Compromises can be made on completeness, accuracy or reliability of the solutions. Hence, certain problem objectives may not be completely fulfilled.
One advantage of Multiple Methods is that they do not rely on the existence of iterative refinement algorithms that must monotonically improve solution quality. Furthermore, these methods can be associated with completely different solution approaches to the problem. Their behavior and usage may be linked to specific environmental situations. At each step, a single method within the set is selected. The choice depends on the tasks' status, the resources, the current goal, etc. This requires that the behavior of the various methods be fairly predictable. Contributions to this area are found, under the term Approximate Processing, in Lesser et al¹³ and Decker et al¹⁴,¹⁵. These authors use a mixture of data, knowledge and control approximation to achieve a satisfactory level of quality. Also related to Multiple Methods is the concept of Design-to-Time. Design-to-Time is a generalization of Approximate Processing where the existence of various task-specific methods is assumed and where methods are chosen in relation to available time. Accordingly, success is highly dependent on the predictability of the time and quality performance of these methods¹⁶⁻¹⁸.

(3) Chung et al¹⁹ have introduced the technique known as Imprecise Computation, where solutions of poorer quality are deemed acceptable when there is not enough deliberation time to achieve the desired quality level. It is usually assumed that tasks are composed of a mandatory part and an optional part. The result can only be accepted if the mandatory part is successfully completed. Whenever time permits, fulfillment or partial fulfillment of the optional part provides improvement to the solution²⁰⁻²².

(4) The class of Anytime Algorithms, introduced by Dean and Boddy²³, involves algorithms that always produce a result, no matter how much deliberation time is available. Within this class, solution quality is expected to increase when more time is allocated (up to a certain limit).
Typically, only one algorithm per task is used, as opposed to multiple methods. This reduces the design and coding effort, and thus potential errors. A performance profile that maps solution quality versus run time is associated with each Anytime Algorithm. These profiles, along with the computational requirements of each algorithm, are used to select the algorithm which is best suited to a specific situation. Note that the performance profiles may be difficult to generate when large variances in execution times are involved. The above definition of Anytime Algorithms is broad and includes algorithms that have been extensively studied outside the AI community, where they are probably known under different names. In the AI community alone, Anytime Algorithms occur under different forms and names, such as Deliberation Scheduling²⁴. This latter approach involves the explicit assignment of resources to tasks to optimize an objective function over some time interval. In this context, a task corresponds to solving a problem, or part of it, with an Anytime Algorithm. Deliberation Scheduling assumes that knowledge about future events is available. Also, the tasks are often assumed to be independent. Furthermore, some time has to be set aside for scheduling the resources (before problem-solving effectively takes place). The technique known as Compilation of Anytime Algorithms is based on managing a set of Anytime Algorithms whose results are used as inputs to another Anytime Algorithm²⁵,²⁶. The aim of the approach is to allocate time to individual algorithms so as to maximize the expected value of the 'master' Anytime Algorithm.
Here, dependencies between tasks may be explicitly represented. The term Contract Anytime Algorithm is used to denote methods that produce solutions whose quality increases with run time, and where the amount of available time must be known in advance, as in Design-to-Time algorithms. If less time is given to the algorithm, it may halt without returning a solution; this contrasts with interruptible algorithms, which can be stopped at any time and yet provide an answer. Obviously, contract algorithms are easier to design than interruptible ones. Note also that interruptible algorithms may be constructed from contract algorithms, but more computation time may then be needed to achieve the same solution quality. Finally, Horvitz²⁷,²⁸ uses Anytime Algorithms, called Flexible Computations, in decision theory settings where he seeks to assign utility values to the incomplete results of an Anytime Algorithm. He is also interested in designing Anytime Algorithms that will reason about the action to be performed next, and in finding an optimal time balance between the 'reasoning' and 'problem solving' phases²⁹,³⁰.

The problem-solving approaches described above are all potentially useful for designing algorithms aimed at solving RTDPs. Multiple Methods, Imprecise Computation and Anytime Algorithms are well suited for RTDPs, but may be difficult to apply in practice. For example, the behavior of Multiple Methods needs to be fairly predictable and this can be hard to achieve. Furthermore, in Imprecise Computation, the amount of deliberation time may be so small that even the fulfillment of the mandatory part of the solution process may be impossible. The necessity of always providing a result with an Anytime Algorithm, no matter how much time is available, represents a very hard requirement. As pointed out by Garvey and Lesser³, real-time AI has made significant progress, but a lot of work still needs to be done.
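The distinction drawn above between contract and interruptible algorithms can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the contract algorithm is a budget-limited 2-opt improvement of a single vehicle route (a stand-in chosen for this example, not taken from the paper), the budget counts candidate moves rather than seconds so that the behaviour is reproducible, and the doubling schedule is one common way of wrapping a contract algorithm. The names `contract_two_opt` and `interruptible_from_contract` are hypothetical.

```python
import math


def route_length(route):
    # Total Euclidean length of a route given as a list of (x, y) stops.
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


def contract_two_opt(stops, budget):
    """Contract algorithm: the budget (number of candidate moves examined)
    is fixed in advance and a route is returned only when the run ends."""
    route = list(stops)
    examined = 0
    improved = True
    while improved and examined < budget:
        improved = False
        for i in range(1, len(route) - 1):  # the first stop stays fixed
            for j in range(i + 1, len(route)):
                if examined >= budget:
                    return route
                examined += 1
                # Reverse the segment between positions i and j.
                candidate = route[:i] + route[i:j + 1][::-1] + route[j + 1:]
                if route_length(candidate) < route_length(route) - 1e-12:
                    route = candidate
                    improved = True
    return route


def interruptible_from_contract(stops, total_budget):
    """Run the contract algorithm with budgets 1, 2, 4, ... and keep the
    last completed answer. total_budget stands in for the (unknown)
    moment at which the computation is interrupted."""
    best = list(stops)
    spent, budget = 0, 1
    while spent + budget <= total_budget:
        best = contract_two_opt(stops, budget)
        spent += budget
        budget *= 2
    return best, spent


stops = [(0, 0), (2, 0), (1, 0), (3, 0)]
route, spent = interruptible_from_contract(stops, total_budget=7)
print(route_length(stops), route_length(route), spent)
```

Each restart repeats work already done under the smaller budgets, which is the extra computation time the text mentions as the price of obtaining an interruptible algorithm from a contract one.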
One possible research direction may be the combination of these different approaches to take advantage of their strengths and reduce the effects of their weaknesses. This strategy looks superior to algorithm designs that stick to a single philosophy. One way to achieve this goal is to design layered architectures, where different approaches are used depending on the layer and/or the current situation.

A general high level conceptual architecture

The design of conceptual architectures for large applications involving many real-time components has been an area of active research in the real-time AI community. The objective is to achieve overall real-time performance, even though some elements of the system might not really be concerned with this issue. A system architecture is a framework that describes the interconnections between the various components of a system designed to solve complex problems. The architecture usually combines extremely fast reactive elements with more complex deliberative elements. Architectures are generally organized in layers related to the level of abstraction of their components. Each specific architecture is closely linked to a particular application, although it is possible to design high level architectures of wide applicability. Figure 1 is adapted from Chalmers et al [31] and Morin et al [32]. It illustrates the flow of information between the components of a high level architecture. Let us briefly describe its components (note that the four functional components of RTDSs, as described in the previous section, may be found in this description):

* The characterizer detects significant events (such as new requests, vehicle breakdowns or weather conditions, in the vehicle dispatching example) that are physically measurable. It is through the characterizer that human operators can interact with the system and send information to the relevant components.
* The verbalizer analyses the information gathered by the characterizer and the effector (see below) through information management and data fusion, performs situation assessment and triggers the projector or selector (see below) when required.
* The projector performs extrapolations, simulations, predictions or estimations related to uncertain events in order to gather additional knowledge that could impact on the decision process.
* The planner evaluates alternatives and generates plans or actions using the information gathered by the lower layers.
* The selector controls the global strategy, based on the user's inputs, and selects the actions (decisions) adapted to specific situations.
* The effector monitors the environment (via the characterizer) for possible fast reactive actions, coordinates plans received from the selector and sends orders for their implementation. It also sends information to the verbalizer.

The architecture shown in Figure 1 is a three-layer architecture. The interface layer consists of the characterizer, the verbalizer and the effector, all of which are closely related to the external environment, which includes human operators. The choice layer consists of the projector and the selector and is concerned with the decision-making process. Finally, the generation layer consists of a planner that issues plans or actions. The loop consisting of the environment, the characterizer and the effector is used when a fast reactive action is needed. The larger loop that also includes the verbalizer and the selector is used when the analyses of the verbalizer reveal that the current plan of actions can still address the current situation in a satisfactory manner. Finally, the largest loop that includes all components is used when deliberation is necessary.

Figure 1 Framework of a general architecture. (Generation layer: PLANNER. Choice layer: PROJECTOR, SELECTOR. Interface layer: VERBALIZER, EFFECTOR, CHARACTERIZER, linked to the ENVIRONMENT.)

In each component, some actions or decisions have to be made.
The level of reasoning and the time pressure vary greatly from one component to the other and different techniques may be used in their design. For example, knowledge-based and rule-based modules are often used for this purpose. Many examples of knowledge-based or rule-based expert systems for real-time applications may be found in the book of Waterman [33] and in the papers of Laffey et al [1] and Garvey and Lesser [3]. Such an architecture for the vehicle dispatching problem is illustrated in Figure 2. It provides a more detailed description of each component and shows what may be expected from each of them.

Figure 2 An architecture for the vehicle dispatching problem.

PLANNER
* Evaluate the opportunity of delaying the assignment of a new request.
* Evaluate the insertion places of a new request along a route.
* Evaluate the possibility of dispatching a new vehicle, rescheduling the requests, or exchanging requests between routes.

PROJECTOR
* Extrapolate vehicle movements and work loads.
* Predict congestion and future requests.
* Estimate travel times.

SELECTOR
* Control the overall strategy and select the best plan in relation with the current situation.

VERBALIZER
* Analyze the information: is there anything new? what is the vehicles' status? is efficiency decreasing? are deadlines approaching? how many requests to schedule? what is their priority? what is the influence of breakdowns and congestion on the planning schedules?
* Perform situation assessment.
* Trigger the projector and/or the planner.

EFFECTOR
* Monitor the environment for reactive actions: ask a vehicle to serve a request "on the fly".
* Coordinate the plans: send orders to vehicles, follow scheduled actions and check deadlines.

CHARACTERIZER
* Detect requests, incidents, congestion, arrival of a vehicle at a pick-up or a delivery point.
* Get instructions from operators.

ENVIRONMENT

Potential operational research contribution

Recently, the OR community has directed its attention towards real-time aspects of decision problems. Their nature suggests the use of heuristic methods for their solution. Fortunately, in the last decade or so, the OR community has developed a class of powerful and versatile metaheuristics, like simulated annealing, tabu search, genetic algorithms and GRASP methods (to be described later), which have all been applied successfully to the solution of real-world problems. Prior to the metaheuristics' description, we want to emphasize the role of mathematical modeling, which is crucial in understanding the fundamentals of a problem. It provides insights and establishes a rigorous framework for its algorithmic solution. Furthermore, efficient procedures for generating lower or upper bounds on the objective value can frequently be designed at the modeling stage. These bounds can be used to assess the quality of a solution method; this is useful when facing difficult problems such as RTDPs, for which the optimal solution can hardly be generated in reasonable computation time.

A mathematical model

In this subsection, a mathematical model for vehicle dispatching will be presented within the framework of dynamic programming. Its formulation can be adapted to a wide range of specific applications. At this stage, its purpose is purely descriptive, and we do not intend to propose algorithms for its solution. Due to its generality, detailed features of the model have been omitted. At a given instant $t$ of the planning horizon, we observe the nature of the outside world, $w_t$. The symbol $w_t$, usually a vector in n-space, embodies a description of the external inputs and includes information about travel times, as provided by some external system that takes into account traffic congestion, weather conditions, accidents and time of the day.
At time $t$ there exists a set $A_t$ of scheduled actions, either in progress or pending, and a set $R_t$ of active requests; the sets $A_t$ and $R_t$ may be empty. Scheduled actions result from previous decisions and correspond to the set of vehicles, while requests (the set of pick-up and delivery locations that have not been serviced yet) are issued by the outside world. A status is associated with each action and request. The vector of scheduled actions' status is denoted by $s_t^a = (s_t^a(a_i))$, where $a_i$ is a member of $A_t$. The status $s_t^a(a_i)$ of action (vehicle) $a_i$ consists of the vehicle's schedule (planned route) and current location at time $t$, together with any relevant information such as driver's efficiency, vehicle capacity, driver's lunch hours, work load, etc. In a similar fashion, we denote by $s_t^r = (s_t^r(r_j))$ the vector of requests' status for each request $r_j$. The status $s_t^r(r_j)$ of request $r_j$ consists of the priority level, the pick-up and delivery locations, the scheduled service time and the time windows for pick-up and delivery. Note that $s_t^a(a_i)$ and $s_t^r(r_j)$ may also be vectors, as an action or request may be characterized by many components, each one having its own status. The above vectors provide a full description of the actions and requests, whereas the sets $A_t$ and $R_t$ are only used for identification purposes. We also include in the vehicle dispatching model the resources' status $c_t$ that describes the available computer resources; this symbol usually denotes a vector, since more than one resource is likely to be involved in other applications. At time $t$, the state of the system is entirely defined by $S_t = (w_t, s_t^a, s_t^r, c_t)$, which concatenates the outside world vector and the status vectors and contains all relevant information about the outside world, the vehicles, the requests and the resources. Situation assessment is performed using the information held in $S_t$.
Each component of $S_t$, especially the vectors of scheduled actions and active requests, must contain all necessary data. These vectors may also include past actions or requests whose inactivity status is recent, because this information may also be helpful in assessing the current situation. Thus, the situation assessment can be represented as a function $f(S_t)$ of the current state $S_t$. The resulting situation $\sigma_t = f(S_t)$ will usually belong to a set of predetermined situations. It may indicate that no new event has occurred, that one or more new requests have been issued and must be dealt with, that a vehicle is stuck in a traffic jam or is down, that productivity is too low, that the work load of some vehicle is too high, that a given request has been picked up or delivered, etc. Once the situation assessment has been completed, a decision $d_t$ is taken from a set $D_t$ of available decisions. The alternatives may include: do nothing, book a new request for future dispatch, reoptimize the current routes, assign and schedule a new request. The set $D_t$ depends on the current situation $\sigma_t$ and the resources status vector $c_t$. The problems that we investigate are probabilistic in nature. Indeed, the state of the world and the outcomes of actions are not known with certainty, requests arise in a random fashion, and accidents on the network, travel time modifications or vehicle breakdowns may happen. Assuming that the state space is finite, we will denote by $P(S_t, S_{t'}, d_t)$ the probability that the system currently in state $S_t$ at time $t$ evolves to state $S_{t'}$ at time $t'$, given that a decision $d_t$, or absence thereof, has been taken at time $t$. This probability is difficult to calibrate, as it combines many uncertain parameters, and is frequently estimated through simulation. In other applications, some components of the state vector $S_t$ may be continuous; in this case we replace the probability mass function, as described by the probabilities $P(S_t, S_{t'}, d_t)$, by a probability distribution function defined on the (measurable) system state space.
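Since the transition probabilities are said to be frequently estimated through simulation, the following sketch shows one possible way of doing so for a finite state space, using relative frequencies over repeated simulated transitions. It is only an illustration under stated assumptions: the simulator `toy_step`, the state labels `"on_time"` and `"late"`, the decision labels and the numerical probabilities are hypothetical and do not come from the paper.

```python
import random
from collections import Counter
from typing import Callable, Dict, Hashable

State = Hashable
Decision = Hashable
Simulator = Callable[[State, Decision, random.Random], State]


def estimate_transitions(simulate_step: Simulator, state: State, decision: Decision,
                         n_samples: int = 10_000, seed: int = 0) -> Dict[State, float]:
    """Estimate the transition probabilities out of `state` under `decision`.

    Each call to `simulate_step` draws one possible next state; the
    estimate is the relative frequency of each next state.
    """
    rng = random.Random(seed)
    counts = Counter(simulate_step(state, decision, rng) for _ in range(n_samples))
    return {next_state: n / n_samples for next_state, n in counts.items()}


def toy_step(state: State, decision: Decision, rng: random.Random) -> State:
    """Hypothetical one-step simulator for a single vehicle.

    The vehicle becomes 'late' with a probability that depends only on
    the decision taken (made-up numbers, for illustration).
    """
    p_late = 0.1 if decision == "reoptimize" else 0.3
    return "late" if rng.random() < p_late else "on_time"


if __name__ == "__main__":
    for d in ("do_nothing", "reoptimize"):
        print(d, estimate_transitions(toy_step, "on_time", d))
```

The accuracy of such an estimate depends on the number of samples and on how faithful the simulator is, which is consistent with the remark that this probability is difficult to calibrate.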
We attach to each pair $(\sigma_t, d_t)$ a random utility $\mu_{\sigma_t, d_t}$ that depends on the outcomes of actions still in progress, on future actions, and on new requests that have not been handled yet (note that the information about future events is contained in the transition probabilities). The objective, at time $t$, is to select the best decision $d_t$, based on the current system state and the expected future. This decision must maximize the expected utility over the set of available decisions, namely, $d_t = \arg\max_{d \in D_t} E[\mu_{\sigma_t, d}]$. As noted earlier, this model provides a high level of abstraction and may be generalized to other applications. While, for a specific application, the variables (decisions) of the problems can frequently be agreed upon, this is not necessarily the case for some parameters of the problem, such as transition probabilities, that may be extremely difficult to evaluate when the system is highly unpredictable.

Generic OR algorithms

In recent years, operational researchers have invested much time and effort into the study and design of metaheuristics. To be successful, these methods must be tailored to each specific application. This task may or may not be straightforward. Some insight and knowledge of a particular problem greatly help in designing efficient heuristics issued from a generic class. In addition to mathematical modeling expertise and specialized mathematical programming algorithms, generic methods are surely the most promising contributions that OR can offer in the context of RTDPs. In the sequel, we will briefly review some of these generic methods. Various extensions and customized versions can be found in the literature. The metaheuristics described in this subsection provide general principles to direct the evolution of the search in the solution space. Typically, these methods exploit local search heuristics, which can be described as follows:
(1) Generate an initial solution. Set the current solution to the initial solution.
(2) Generate the neighborhood of the current solution through local modifications.
(3) Select the best solution in this neighborhood. If this solution is better than the current solution, the selected solution becomes the new current solution. Otherwise, exit the process with the current solution.
(4) Return to step 2.

In the following, we proceed with a brief description of each method.

(1) Simulated annealing (Kirkpatrick et al [34]) has been inspired by the metallurgic annealing process, which consists in elevating the temperature of a metal to reach the melting point, and then slowly reducing the temperature to achieve a thermodynamic equilibrium (stable solid state) while avoiding the weak structure states (local optima). Necessary conditions for its successful realization are a high initial temperature and a slow cooling schedule. When simulated annealing is applied to an optimization problem, a solution of low quality is progressively improved through a process similar to metallurgic annealing. To this end, a temperature parameter is defined, whose value is slowly reduced through a cooling schedule that mimics the physical process. A move that improves the current solution is always accepted, while a move to a neighboring solution of lower quality can be accepted with some positive probability (to escape from bad local optima). The probability of accepting a degradation is higher when the temperature is higher and the magnitude of the degradation is smaller. Accordingly, by slowly reducing the value of the temperature parameter, degradation will occur more frequently at the start but will be unlikely at the end of the algorithmic process. The method will halt with an optimum, hopefully a global one. Simulated annealing in its basic version is a very simple and robust methodology that has produced fairly good results for very difficult problems. Another appealing aspect of the method is the existence of a theoretical proof of convergence.
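The acceptance rule and cooling schedule described above can be sketched as follows, for a minimization problem. This is a minimal sketch and not the authors' implementation: the function name `simulated_annealing`, the geometric cooling factor `alpha`, the parameter values and the toy integer problem are all assumptions made for illustration. The acceptance probability `exp(-delta / temp)` is the usual Metropolis criterion, and the sketch additionally remembers the best solution visited.

```python
import math
import random
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def simulated_annealing(initial: S,
                        cost: Callable[[S], float],
                        random_neighbor: Callable[[S, random.Random], S],
                        t_start: float = 10.0, t_end: float = 1e-3,
                        alpha: float = 0.95, moves_per_temp: int = 50,
                        seed: int = 0) -> Tuple[S, float]:
    """Minimize `cost`, starting from `initial` (illustrative sketch)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temp = t_start
    while temp > t_end:
        for _ in range(moves_per_temp):
            candidate = random_neighbor(current, rng)
            delta = cost(candidate) - current_cost
            # Improving moves are always accepted; a degradation of size
            # delta is accepted with probability exp(-delta / temp), which
            # grows with the temperature and shrinks with delta.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temp *= alpha  # geometric cooling, one common (assumed) schedule
    return best, best_cost


if __name__ == "__main__":
    # Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
    solution, value = simulated_annealing(
        0, lambda x: (x - 3) ** 2, lambda x, rng: x + rng.choice((-1, 1)))
    print(solution, value)
```

The cooling parameters control the trade-off between running time and solution quality, so in practice they would have to be tuned to the application.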
Unfortunately, running times are usually long and many successes rely on fine tuning and design decisions that are not found in the basic version [35].

(2) Tabu search (Glover [36,37]) also accepts neighboring solutions of lower quality. The algorithm uses a local search heuristic to detect and select the best neighbor of the current solution, even if it results in a degradation of the current solution. To avoid cycling, moves that are likely to lead to previously visited solutions are forbidden during a certain number of iterations. These forbidden moves (tabu moves) give the name to the method. A fundamental characteristic of the method concerns the systematic gathering of information about previously explored areas. Intensification and diversification strategies are among the most widespread techniques based on this information. Intensification is aimed at focusing the search on the neighborhood of the best solutions visited during the search, while diversification is aimed at driving the search into a new, yet unexplored, region of the search space. Tabu search is a powerful and versatile method that has produced good results on many kinds of problems. Its strength lies in the mechanisms available for designing algorithms tailored to specific applications. Conversely, this can be seen as a disadvantage of tabu search since it usually implies a fair amount of fine tuning and problem knowledge.

(3) The metaheuristic known as GRASP (Feo and Resende [38] and Feo et al [39]), for Greedy Randomized Adaptive Search Procedure, combines greedy heuristics, randomization, and local search techniques. It proceeds in two distinct phases. In the first phase, feasible solutions are constructed. Each move, or step towards the construction of a solution, is evaluated with an adaptive greedy function that takes into account the current state of the partial solution. A move is then randomly selected from a restricted list of candidate moves.
In the second phase, a local search heuristic is applied to the solution constructed by the greedy heuristic. The entire process is repeated several times. The second phase may be applied to the whole population of solutions, or only to a subset of selected solutions obtained at the end of multiple runs of the first phase. GRASP methods are very simple, easy to implement and provide many alternative solutions. They need very efficient (fast) construction algorithms and good local search heuristics. They work well if the feasible solutions constructed during the first phase are of good quality [40].

(4) Genetic algorithms (Holland [41] and Goldberg [42]) rely on an analogy with the evolution of species, namely the 'survival of the fittest'. It is expected that a population can be improved by favoring the reproduction of its fittest individuals, while allowing for occasional mutations. As opposed to most search heuristics, genetic algorithms work with a population of solutions. At each iteration (or 'generation'), a new population is created from the preceding one by applying operators inspired by genetics to the fittest individuals in the population. These operators are recombination (or crossover) and mutation. The first operator combines two solutions (individuals) by merging their best components (genes). The mutation operator stochastically alters small parts of a solution. Through the action of recombination and mutation, it is expected that an initial population of randomly generated solutions will progressively improve (on average) from one generation to the next. Basic versions of genetic algorithms are context or domain independent. This means that a single computer program may be applied to different kinds of problems. Thus they are very robust: efficient and generally applicable, even for complex multi-modal objective functions. Furthermore, they are readily amenable to parallelization, since they work on a population of solutions.
On the other hand, solutions must be encoded in a suitable form (typically, bit strings) to fit the general program. Also, specialized methods often outperform genetic algorithms on specific problems, so that specialized crossover and mutation operators must be developed for genetic algorithms to be competitive [43].

The above methods possess (to some extent) some of the characteristics of the problem-solving classes found in the AI literature, as described earlier. Each of the four methods exhibits the behavior of Anytime Algorithms since solution quality increases with time. However, strictly speaking, they cannot be considered as Anytime Algorithms, not even contract ones, since they may fall short of producing a solution if the allocated computing time is insufficient. In some specific situations though, they may be designed to possess the Contract Anytime Algorithm property. These methods can also be specialized or designed to fit the class of Multiple Methods. For example, they may include a solution construction algorithm, a solution improvement algorithm, an algorithm that seeks feasibility, etc. One may also claim that they possess properties associated with Imprecise Computation, if one defines the mandatory part as finding an initial feasible solution and the optional part (if time permits) as the improvement phase, using a local search algorithm. Elements of the Progressive Reasoning approach can also be recognized when the search process is divided into phases that require increasing computation time. For example, such phases could be associated with neighborhoods of increasing sizes. The above metaheuristic approaches can easily be adapted to real-time decision problems. For example, in the case of a new service request for a vehicle dispatching system, we can envision the quick insertion of a new request, based on the optimization of a simple cost measure.
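The quick insertion mentioned above can be sketched, under simplifying assumptions, as a cheapest-insertion step. In this illustration a request is reduced to a single stop, the simple cost measure is the added Euclidean travel distance, and time windows, capacities and pick-up/delivery pairing are ignored. The function name `cheapest_insertion` and the coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance, used here as a stand-in for travel time."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def cheapest_insertion(route: Sequence[Point], new_stop: Point) -> Tuple[List[Point], float]:
    """Insert `new_stop` between two consecutive stops of `route`.

    The first and last stops are kept fixed (for instance the vehicle's
    current position and its final destination). Returns the new route
    and the added distance of the cheapest insertion position.
    """
    if len(route) < 2:
        raise ValueError("route needs at least a start and an end stop")
    best_pos, best_extra = 1, math.inf
    for i in range(len(route) - 1):
        a, b = route[i], route[i + 1]
        extra = dist(a, new_stop) + dist(new_stop, b) - dist(a, b)
        if extra < best_extra:
            best_pos, best_extra = i + 1, extra
    new_route = list(route[:best_pos]) + [new_stop] + list(route[best_pos:])
    return new_route, best_extra


if __name__ == "__main__":
    planned = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
    print(cheapest_insertion(planned, (5.0, 0.0)))
```

With several vehicles, the same evaluation could be repeated over every planned route and the cheapest position overall retained, which yields an initial solution quickly.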
Tabu search or simulated annealing can then be applied to this initial solution, in the hope that a better solution will be found by exchanging requests within a route or between routes. Since the current best solution is always kept in memory, the quality of the final solution can only improve with the amount of computational resources available. Genetic algorithms can be applied in a similar fashion by considering an initial population of solutions produced through the quick insertion of a new request at many different locations within the planned routes. Different recombination and mutation operators for vehicle routing problems can be applied to modify this initial population [44,45]. GRASP can also be appropriate in this context since its success relies on the quick generation of many feasible solutions, each one being processed by a local search heuristic.

The methodology of neural networks (Rumelhart and McClelland [46]), which is quite different from the preceding metaheuristics, is increasingly popular in OR and can also be applied to RTDPs. Neural networks are composed of layers of simple interconnected processing units, and weights are assigned to the connections. During the learning phase, the network is fed with several (input, desired output) pairs for training purposes, and the weights are adjusted to minimize the error between the current outputs and the desired outputs. These models can be used to learn different behaviors. For example, in a time constrained environment, they may provide reactive solutions that are based on knowledge acquired off-line during the learning phase. If the time constraint is less critical, neural network models may hint at initial solutions that will be used as starting points for more sophisticated solution techniques.
Experiments with neural networks have already been performed for the vehicle dispatching problem [47]. In this work, the learning module suggested allocation decisions to a human dispatcher. Neural networks, like genetic algorithms, are readily amenable to parallelization. However, a suitable encoding of solutions is necessary, as well as an appropriate choice of error function [48].

Potential benefits of parallelism

In a real-time environment it is not only natural, but often necessary, to use the technology of parallel computing to speed up computations. Hence the importance of developing fast parallel implementations of heuristic search techniques. This development becomes even more significant if we wish to apply these methods concurrently in order to select or combine the best solutions produced. Such a parallelization is not always straightforward. For instance, tabu search is inherently a sequential algorithm. However, synchronous and asynchronous parallel implementations of tabu search on Multiple Instruction Multiple Data (MIMD) computers or on a network of workstations have already been realized [49-51]. Parallelism can be exploited by allowing each processor to explore only a small fraction of the current solution's neighborhood. In this scheme, parallelism can also be exploited through the simultaneous application of different modifications to the current solution at a given iteration. Powerful asynchronous implementations can also be obtained by running different tabu search processes independently on processors that communicate through a common memory space. Whenever the best overall solution is improved by one of the processors, the new best solution is stored in the common memory. Conversely, a processor that has been unable to improve the best known solution for a long period of time can use the best solution stored in the common memory to restart its own search process.
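The asynchronous scheme with a common memory can be illustrated with the following sketch. It is not the implementation of the cited works: a simple randomized descent stands in for each tabu search process, Python threads only illustrate the communication pattern (in standard CPython they do not run bytecode in parallel), and the names `SharedBest`, `worker`, `parallel_search`, the `patience` threshold and the toy cost function are assumptions.

```python
import random
import threading
from typing import Sequence, Tuple

Solution = Tuple[int, ...]


def cost(sol: Solution) -> int:
    """Toy objective (hypothetical): sum of squares, minimized at all zeros."""
    return sum(x * x for x in sol)


class SharedBest:
    """Common memory holding the best overall solution, guarded by a lock."""

    def __init__(self, sol: Solution) -> None:
        self._lock = threading.Lock()
        self._sol, self._cost = sol, cost(sol)

    def offer(self, sol: Solution, c: int) -> None:
        with self._lock:
            if c < self._cost:
                self._sol, self._cost = sol, c

    def get(self) -> Tuple[Solution, int]:
        with self._lock:
            return self._sol, self._cost


def worker(shared: SharedBest, start: Solution, seed: int,
           iterations: int = 2000, patience: int = 200) -> None:
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    shared.offer(current, current_cost)
    stalled = 0
    for _ in range(iterations):
        i = rng.randrange(len(current))
        cand = current[:i] + (current[i] + rng.choice((-1, 1)),) + current[i + 1:]
        c = cost(cand)
        if c < current_cost:                      # keep improving moves only
            current, current_cost, stalled = cand, c, 0
            shared.offer(current, current_cost)   # publish to common memory
        else:
            stalled += 1
        if stalled >= patience:                   # no recent progress:
            current, current_cost = shared.get()  # restart from shared best
            stalled = 0


def parallel_search(starts: Sequence[Solution], seed: int = 0) -> Tuple[Solution, int]:
    shared = SharedBest(starts[0])
    threads = [threading.Thread(target=worker, args=(shared, s, seed + k))
               for k, s in enumerate(starts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.get()


if __name__ == "__main__":
    print(parallel_search([(5, -4, 3, 2), (-5, 5, -5, 5), (1, 1, 1, 1)]))
```

Here a worker restarts when it has not improved its own current solution for `patience` iterations, which is a simplification of the restart condition described in the text.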
Similar ideas also apply to the parallelization of simulated annealing. Genetic algorithms concurrently follow different search paths through a population of solutions and consequently lend themselves easily to parallel implementations [52-55]. It is also possible to envision hybrid approaches. For example, assume that several tabu search processes run concurrently on different processors. It would be possible to take the best solutions found by two different tabu search processes and merge them into a new solution. This new solution could be used as a starting point for another tabu search process that has not made any recent progress. In a sense, this recombination of two solutions, in a manner reminiscent of genetic algorithms, introduces an element of diversification by allowing the search to move into a new region of the search space. Ideas along this line have been exploited by Rochat and Taillard [56]. Finally, parallelism may also be beneficial to GRASP by allowing the parallel generation of many different solutions during the solution construction phase.

Conclusion

The study of RTDPs is not yet well established in the OR community, although agencies, companies and governments are facing increasing pressure to work and to react in real time. This area of research will expand in the near future, as observed in the transportation domain with the spread of real-time vehicle guidance systems and the need to provide drivers with updated instructions. Real-time military and telecommunication applications are also numerous: missile guidance, message routing, etc. A clear understanding of the dynamics of RTDPs is important. We have to keep in mind that RTDPs require different problem-solving approaches, as compared to problems which do not have the same resource constraints. Promising avenues exist and are just waiting to be explored and exploited.
Concepts which may help to formalize the solution process of RTDPs have been reported in the AI literature. Practically, we think that the metaheuristics developed for complex operational research problems may prove useful in solving RTDPs as well. Most of these solution methods are readily amenable to parallelization. This constitutes an important asset in the context of RTDPs.

Acknowledgements: This work was partly supported by the Natural Sciences and Engineering Research Council of Canada under grant CRD 177440 and by Loral Canada Inc. Thanks are also due to Dale Blodgett of Loral Canada and Bruce A. Chalmers of the Defense Research Establishment, Valcartier for their valuable comments.

References

1 Laffey TJ, Cox PA, Schmidt JL, Kao SM and Read JV (1988). Real-time knowledge-based systems. AI Magazine 9 (1): 27-45.
2 Strosnider JK and Paul CJ (1993). A structured view of real-time problem solving. AI Magazine 14 (2): 45-66.
3 Garvey A and Lesser V (1993). A survey of research in deliberative real-time artificial intelligence. Department of Computer Science, University of Massachusetts, Report 93-84.
4 Newell A and Simon H (1972). Human Problem Solving. Prentice-Hall: Englewood Cliffs, NJ.
5 Pearl J (1985). Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley: Reading, MA.
6 Korf RE (1990). Real-time heuristic search. Artificial Intelligence 42 (2-3): 189-211.
7 Ishida T and Korf RE (1991). Moving Target Search. Proceedings of the Twelfth International Joint Conferences on Artificial Intelligence, pp 204-210. International Joint Conferences on Artificial Intelligence, San Mateo, CA.
8 Chu L and Wah B (1991). Optimization in Real Time. Proceedings of the Twelfth Real-Time System Symposium, pp 150-159. Washington, DC: IEEE Computer Society.
9 Chu L and Wah B (1992). Solution of Constrained Optimization Problems in Limited Time. Proceedings of the IEEE Workshop on Imprecise and Approximate Computation, pp 40-44. Washington, DC: IEEE Computer Society.
10 Winston PH (1984). Artificial Intelligence, 2nd edn. Addison-Wesley: Reading, MA.
11 Korf RE (1985). Depth-first iterative-deepening: An optimal admissible tree search. Artificial Intelligence 27 (1): 97-109.
12 Wright M, Green M, Fiegl G and Cross P (1986). An expert system for real-time control. IEEE Software March: 16-24.
13 Lesser VR, Pavlin J and Durfee E (1988). Approximate processing in real-time problem solving. AI Magazine 9 (1): 49-62.
14 Decker KS, Lesser VR and Whitehair RC (1990). Extending a blackboard architecture for approximate processing. J Real-Time Systems 2 (1/2): 47-79.
15 Decker KS, Garvey AJ, Humphrey MA and Lesser VR (1993). A real-time control architecture for an approximate processing blackboard system. Int J Pattern Recognition and Artificial Intelligence 7 (2): 265-284.
16 Garvey A and Lesser V (1992). Scheduling Satisficing Tasks with a Focus on Design-to-Time Scheduling. Proceedings of the IEEE Workshop on Imprecise and Approximate Computation, pp 25-29. Washington, DC: IEEE Computer Society.
17 Garvey A and Lesser V (1993). Design-to-time real-time scheduling. IEEE Trans. Sys. Man Cybernet. 23 (6): 1491-1502.
18 Garvey A, Humphrey M and Lesser V (1993). Task Interdependencies in Design-to-Time Real-Time Scheduling. Proceedings of the Eleventh National Conference on Artificial Intelligence, pp 580-585. American Association for Artificial Intelligence, Menlo Park, CA.
19 Chung J-Y, Liu JWS and Lin KJ (1990). Scheduling periodic jobs that allow imprecise results. IEEE Trans on Computers 39 (9): 1156-1174.
20 Liu JWS et al (1991). Algorithms for scheduling imprecise computations. Computer 24 (5): 58-68.
21 Shih W-K, Liu JWS and Chung J-Y (1991). Algorithms for scheduling imprecise computations with timing constraints. SIAM J Comput 20 (3): 537-552.
22 Dey JK, Kurose J and Towsley D (1993). On-line processor scheduling for a class of IRIS (increasing reward with increasing time) real-time tasks. CS Technical Report 93-09, University of Massachusetts.
23 Dean T and Boddy M (1988). An Analysis of Time-Dependent Planning. Proceedings of the Seventh National Conference on Artificial Intelligence, pp 49-54. Menlo Park, CA: American Association for Artificial Intelligence.
24 Boddy M and Dean T (1994). Deliberation scheduling for problem solving in time-constrained environments. Artificial Intelligence 67: 245-285.
25 Zilberstein S (1993). Operational rationality through compilation of anytime algorithms. Ph.D. Dissertation, Department of Computer Science, University of California at Berkeley, Berkeley, CA.
26 Zilberstein S and Russell SJ (1992). Constructing Utility-Driven Real-Time Systems Using Anytime Algorithms. Proceedings of the IEEE Workshop on Imprecise and Approximate Computation, pp 6-10. Phoenix, AZ: IEEE Computer Society.
27 Horvitz EJ (1988). Reasoning Under Varying and Uncertain Resource Constraints. Proceedings of the Seventh National Conference on Artificial Intelligence, pp 111-116. American Association for Artificial Intelligence, Menlo Park, CA.
28 Horvitz EJ (1989). Reasoning About Beliefs and Actions Under Computational Resource Constraints. In: Kanal LN, Levitt TS and Lemmer JF (eds). Uncertainty in Artificial Intelligence 3, Elsevier Science Publishers: Amsterdam, pp 301-324.
29 Horvitz EJ, Cooper GF and Heckerman DE (1989). Reflection and Action Under Scarce Resources: Theoretical Principles and Empirical Study. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, pp 1121-1127. American Association for Artificial Intelligence, Menlo Park, CA.
30 Horvitz EJ and Rutledge G (1991). Time-Dependent Utility and Action Under Uncertainty. Proceedings of the Sixth Conference on Uncertainty in Artificial Intelligence, Los Angeles, CA.
31 Chalmers BA, Da Ponte P and Qiu K (1994). Generating Missile Defense Actions for an Adaptive Planner in a Layered Real-Time Architecture for a Naval Engagement Manager. Proceedings of the 22nd Meeting of Subgroup W, Technical Panel W-6 Generic Weapon System Effectiveness, The Technical Cooperation Program (TTCP) Elgin Air Force Base: USA.
32 Morin M, Nadjm-Tehrani S, Osterling P and Sandewall E (1992). Real-time hierarchical control. IEEE Software September: 51-57.
33 Waterman DA (1986). A Guide to Expert Systems. Addison-Wesley: Reading, MA.
34 Kirkpatrick S, Gelatt CD Jr and Vecchi MP (1983). Optimization by simulated annealing. Science 220 (4598): 671-680.
35 Dowsland KA (1993). Simulated Annealing. In: Reeves CR (ed). Modern Heuristic Techniques for Combinatorial Problems. Halsted Press: John Wiley, New York, NY.
36 Glover F (1989). Tabu search, Part I. ORSA J Computing 1: 190-206.
37 Glover F (1990). Tabu search, Part II. ORSA J Computing 2: 4-32.
38 Feo TA and Resende MGC (1984). A probabilistic heuristic for a computationally difficult set covering problem. Opns Res Lett 8: 67-71.
39 Feo TA, Venkatraman K and Bard JF (1991). A GRASP for a difficult single scheduling problem. Comps and Opns Res 18 (8): 635-643.
40 Kontoravdis G and Bard JF (1995). A GRASP for the vehicle routing problem with time windows. ORSA J. Computing 7 (1): 10-23.
41 Holland JH (1975). Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, MI. Reprinted by MIT Press.
42 Goldberg DE (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley: Reading, MA.
43 Reeves CR (1993). Genetic Algorithms. In: Reeves CR (ed). Modern Heuristic Techniques for Combinatorial Problems. Halsted Press: John Wiley, New York, NY.
44 Potvin J-Y (1996). Genetic algorithms for the traveling salesman problem. Ann. of Opns Res 63: 339-370.
45 Potvin J-Y and Bengio S (1996). The vehicle routing problem with time windows, part II: Genetic search. INFORMS J. Computing 8 (2): 165-172.
46 Rumelhart DE and McClelland JL (1986). Parallel Distributed Processing. Explorations in the Microstructure of Cognition, vols. 1 and 2. The MIT Press: Cambridge, MA.
47 Shen Y, Potvin J-Y, Roy S and Rousseau J-M (1993). A computer assistant for vehicle despatching with learning capabilities. Ann. of Opns Res 61: 189-211.
48 Peterson C and Soderberg B (1993). Artificial Neural Networks. In: Reeves CR (ed). Modern Heuristic Techniques for Combinatorial Problems. Halsted Press: John Wiley, New York, NY.
49 Garcia BL, Potvin J-Y and Rousseau J-M (1994). A parallel implementation of the tabu search heuristic for vehicle routing problems with time window constraints. Comps and Opns Res 21 (9): 1025-1033.
50 Crainic TG, Toulouse M and Gendreau M (1995). Parallel asynchronous tabu search for multicommodity location-allocation with balancing requirements. Ann. of Opns Res 60: 277-299.
51 Crainic TG, Toulouse M and Gendreau M (1995). Synchronous tabu search parallelization strategies for multicommodity location-allocation with balancing requirements. OR Spektrum 17 (2/3): 113-123.
52 Braun H (1991). On solving traveling salesman problems by genetic algorithms. In: Schwefel HP and Manner R (eds). Parallel Problem-Solving from Nature. Lecture Notes in Computer Science 496, Springer-Verlag: Berlin, pp 129-133.
53 Muhlenbein H, Gorges-Schleuter M and Kramer O (1987). New solutions to the mapping problem of parallel systems: The evolution approach. Parallel Computing 4: 269-279.
54 Muhlenbein H (1991). Evolution in time and space: The parallel genetic algorithm. In: Rawlins GJE (ed). Foundations of Genetic Algorithms. Morgan Kaufmann: San Mateo, CA, pp 316-337.
55 Whitley D, Starkweather T and Shaner D (1990). Traveling salesman and sequence scheduling: Quality solutions using genetic edge recombination. In: Davis L (ed). Handbook of Genetic Algorithms. Van Nostrand Reinhold: New York, NY, pp 350-372.
56 Rochat Y and Taillard E (1995). Probabilistic diversification and intensification in local search for vehicle routing. J. Heuristics 1: 147-167.

Received November 1995; accepted August 1996 after one revision