* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download 2. Model for Composition Analysis - The University of Texas at Dallas
Survey
Document related concepts
Factorization of polynomials over finite fields wikipedia , lookup
Theoretical computer science wikipedia , lookup
Natural computing wikipedia , lookup
Inverse problem wikipedia , lookup
Gene expression programming wikipedia , lookup
Multidimensional empirical mode decomposition wikipedia , lookup
Data assimilation wikipedia , lookup
Numerical continuation wikipedia , lookup
Algorithm characterizations wikipedia , lookup
Simplex algorithm wikipedia , lookup
Expectation–maximization algorithm wikipedia , lookup
Multi-objective optimization wikipedia , lookup
Mathematical optimization wikipedia , lookup
Transcript
A Model and Methodology for Composition QoS Analysis of Embedded Systems Hui Ma, I-Ling Yen, Dongfeng Wang, Farokh Bastani (hxm012600, ilyen, dxw018000, bastani)@utdallas.edu Department of Computer Science The University of Texas at Dallas Abstract Component-based development (CBD) techniques have been widely used to enhance the productivity and reduce the cost for software systems development. However, applying CBD techniques to embedded software development faces additional challenges. For embedded systems, it is crucial to consider the Quality of Service (QoS) attributes, such as timeliness, memory limitations, output precision, battery constraints, etc. Frequently, multiple components implementing the same functionality with different QoS properties can be used to compose a system. Also, software components may have parameters that can be configured to satisfy different QoS requirements. Composition analysis, which is used to determine the most suitable component selections and parameter settings to best satisfy the system QoS requirement, is very important in embedded software development process. In this paper, we present a model and the methodologies to facilitate composition analysis. We define QoS requirements as constraints and objectives. Composition analysis is performed based on the QoS properties and requirements to find solutions (component selections and parameter settings) that can optimize the QoS objectives while satisfying the QoS constraints. We use a multi-objective concept to model the composition analysis problem and use an evolutionary algorithm to efficiently determine the Pareto-optimal solutions. Keywords: Embedded software systems, Component composition, Pareto-optimal, Quality of service (QoS), Evolutionary Algorithm. 1. Introduction Recent advances in hardware technology have dramatically improved hardware productivity and made it economically feasible to extend the reach of automation to a wide variety of services, such as intelligent vehicles, patient monitoring systems, handheld devices, sensor networks, etc. However, the lack of commensurate gains in software productivity is a major hurdle in developing more sophisticated embedded applications. This is unfortunate since software is crucial to the successful realization of -1- many application systems. Domain-specific knowledge is usually embodied in the software. Also, software is frequently expected to enhance the robustness of application systems by monitoring the environment and adapting the system to tolerate hardware failures, network congestion, and security attacks. To enhance the productivity of developing complex applications, software technology is rapidly shifting away from low-level programming issues to automated code synthesis and the integration of systems from components. Component-based development (CBD) techniques can significantly reduce software development time and cost, which can benefit the software development process for embedded systems as well as other application domains [15]. However, CBD approaches for embedded software systems face additional challenges due to the stringent QoS requirements for these systems. For example, it is crucial to consider real-time, security, reliability, and resource and power constraints in embedded systems. Thus, the integration of embedded systems from components must consider the satisfaction of the functional requirements as well as the QoS requirements. Frequently, multiple components with different QoS tradeoffs can be used to achieve the same functionality. Also, components may be configurable, i.e., some of the program parameters of a component can be configured to achieve different QoS tradeoffs [5]. It can be computationally intensive to determine the most suitable set of components to use with the best parameter settings. For example, consider a small system consisting of 10 program units. Assume that there are two possible components that match the functional requirements of each program unit. Also, assume that each component has a single parameter with 10 potential settings to achieve various QoS tradeoffs. In an exhaustive search, there are (2*10)10 choices to be considered in order to find out the choices for a satisfactory QoS property of the system. Thus, an efficient and effective decision making mechanism is needed for QoS analysis in a CBD approach. In this paper, we present a model for composition analysis based on synchronous data flow (SDF) [14]. In our model, the system specification includes the modules that compose the system, the data flow among the modules, and the QoS requirements of the system. A module is a virtual unit defined by functional requirements. Each module can be instantiated by some configurable components that all satisfy the functional requirement of the module. Based on the system specification, the composition analysis determines which components can be used and how to configure the selected components to optimize the QoS objectives. We formulate the composition analysis as a multi-objective optimization problem and use an evolutionary algorithm to solve it. Pareto-optimal solutions are obtained to determine suitable component selections and parameter settings. The remainder of this paper is organized as follows. In the next section, we introduce the system model that forms the basis of composition analysis. In Section 3, the composition analysis problem, formulated as a multiobjective problem with Pareto-optimal solutions, is presented. Section 4 discusses the evolutionary algorithm used for composition analysis. An example and the experimental study are then presented in Section 5. Section 6 finally states the conclusion of the paper. 2. Model for Composition Analysis We are developing tools and techniques to assist embedded software development. The architecture of a repository-based embedded software development platform is shown in Fig. 1. Each block in the diagram involves a major tool set and technique. An online repository for embedded software (ORES) forms the foundation of the system. It provides effective component retrieval and sophisticated components descriptions, including functional and QoS properties of the components. Many components in the repository are configurable components. Also, we have developed a Component Parameterizer Tool Set [5] which can be used to parameterize the components in ORES. The QoS properties in terms of various settings of the configurable parameters of a component are measured and stored in the repository for future analysis. When developing an application, the designers interact with the Composition Interface to prepare the system specification, which includes the modules, the data flow among modules, and QoS requirements of the system. Based on the system specification, various components satisfying the functional requirements are identified by the Component Identifier. The Composition Analyzer tool set performs QoS analysis based on the QoS requirement specification and the QoS properties of the individual components. The set of components to be used and the settings of their configurable parameters are determined from composition analysis so that the system requirements can be satisfied. Finally, the code generator generates the glue code and composes the system from the selected -2- components. In this paper, we focus on the Composition Analyzer. System Designer Composition Interface (modules, data flow, QoS requirements) Component Composition Code Component Identifier Analyzer Generator Parameterizer ORES (Components) Figure 1. Repository based embedded software development platform. 2.1 System Specification Model We consider a synchronous dataflow (SDF) model for specifying an embedded system from existing components [14]. Dataflow models handle regular computations that operate on streams, which is popular in signal processing system specifications. Each process in a dataflow model is constructed as a sequence of atomic actors [15]. An actor has an interface, which includes communication ports and parameters that are used to configure the function of the actor. With the synchronous feature, the SDF model is predictable and can be scheduled statically. Thus, it is extremely useful for the formalism of embedded real-time software. Our composition specification is based on the SDF model. We replace the notion of actors by a virtual functional unit, called a module. A module is specified by its functional requirement and the descriptions for each data input and output. It has to be instantiated by an actual component in the repository. Since there may exist multiple components that can satisfy the functional requirement of a module, the module can be instantiated by any of them based on the desired QoS properties or other considerations. Using a module instead of the actual component in the composition specification yields flexibility and reusability of the system specification and provides the potential for system reconfiguration. Definition 1. [System Specification Model] The system specification model is based on the SDF model and defined as G = (, , ) where: = {l | 1 l L} is the set of L modules (actors). Each module l in is defined by a functional requirement with which the actual components in the repository can be selected to instantiate it [24]. is the set of directed edges that denote the data flows among the modules and the flow directions. The number of data blocks produced or consumed by each I/O port (input/output port of the data flow) is also defined on E. is the set of overall QoS requirements of the system, which will be elaborated later in Definition 2. A system designer can prepare the system specification G that defines how to compose the components in the SDF model to achieve a goal function. The components can be selected from the repository and associated with the modules in . Given the specification of a system G, the set of components Cl = {cl,m | for all m} are identified, where cl,m is the m-th component that can satisfy the functional requirement of module l in . We define the set of all components that instantiate the system as C = {Cl | 1 l L}. Note that dependency between some components may affect component selection and configurable parameter setting. For example, an encryption function may have to be used with its corresponding decryption function together. ORES allows the specification of component dependencies [24], and these dependency constraints will be observed during component selection. 2.2 QoS Attributes and Properties We consider each measurable system property as a QoS attribute. For example, in an audio processing system, there may exist attributes such as execution time, memory requirement, and voice quality, etc. Let A = (a1, a2, …, aN) denote the vector of N QoS attributes for a system specified by G, where each ai in A is a measurable QoS attribute that has quantifiable values. A component can be described by its functional specifications and QoS properties. The QoS property of a component is defined by the measurements of its QoS attributes. Let Ac denote the set of QoS attributes of a component c. Assume that c is a component selected to instantiate a module in G. Generally, Ac is a subset of A. Some QoS attributes of the system may have nothing to do with some of the individual components that compose the system. Also, a QoS attribute that is not relevant to G is not useful for any component in G. In this paper, we consider Ac = A for simplicity. The QoS property of component c can be represented by a vector Vc = (vc1, vc2, …, vcN), where vci is the QoS measurements of c in terms of attribute ai. If the component c does not have a certain measurable attribute ai, a NULL value is assigned to vci. We assume that a module can only be instantiated by one component. The QoS property V of a module is defined to be the QoS properties Vc of the component c that instantiates it. In a similar way, the QoS properties V of the system G are defined to be the measurements of the QoS attributes in A. The QoS properties of a system can be derived from the QoS properties of the components that compose the system. The QoS measurements of a system G can be determined by the execution environment, input domain, and configurable parameters. A configurable parameter is -3- a parameter in a component that, when adjusted, can impact the measurements of one or more of the QoS attributes. We assume that the execution environment and the input domain are given. Different QoS properties of a component and the composed system can be obtained by tuning the configurable parameters. The configurable parameters of each component form a K-dimensional parameter set X = (x1, x2, …, xK) (where K is the total number of configurable parameters). The QoS property vci for attribute ai of a component c can be measured and plotted against the K-dimensional parameter space. Let fci(Xc) denote a property function of component c in term of QoS attribute ai, where Xc is the parameter set of the component c. A property function fci(Xc) is the relation that maps the values of the parameter set Xc to a unique value of vci for attribute ai. The property function vector Fc(Xc) of a component c is defined as a vector of property functions for all QoS attributes (fc1(Xc), fc2(Xc), …, fcN(Xc)). For a system G, we define property function set as F = { Fc | c C } which is the set of property functions of all components that comprise G. 2.3 QoS Requirements, Constraints Objectives, and As described in Definition 1, a system specification includes the QoS requirements specification . For embedded systems, it is crucial to consider the QoS requirements in terms of the QoS attributes. For example, the memory constraint can be defined on the memory attribute of the system, and the requirement to maximize the computation precision is defined on the precision attribute of the system. We divide into two types of requirements: constraints and objectives, according to the requirements for being satisfied or optimized respectively. In our model, each objective or constraint is based on one or more QoS attributes. They are defined in the following. Definition 2. [QoS requirements] For a system specification G, the set of QoS requirements is defined as = (O, R), where O = (o1, o2, …, oJ) is the set of objectives, and R = (r1, r2, …, rI) is the set of constraints. The objective o j, 1 j J, is an optimization function denoted as j j(V) where j is an optimization operator, such as maximize or minimize, and j(V) is an objective function over QoS property V. Each QoS objective is to optimize the system QoS properties over one or more QoS attributes. A constraint ri is an inequality defined over V specifying the bound for the system QoS property. It can be expressed as follows: i(V) i i (1 i I), (1) where i is the constraint function of G in terms of QoS properties V, i is the comparison operator, and i is the bound for i(V). The goal of the system is to satisfy all the QoS constraints while obtaining optimal solutions in terms of the QoS objectives. All objective functions and constraint functions of the system specification G are in terms of QoS attributes A of G and can be computed from the QoS properties V of G. Example 1. Consider an IP phone system. Assume that the execution time bound and the memory limitation are the constraints that must be satisfied. This system may include components such as a voice codec and echo canceller. The voice quality and echo canceling quality are the QoS objectives of the system. The QoS attributes for a voice codec may be execution time, required memory, and perceptual speech-quality measurement (PSQM). Here PSQM is defined in ITU-T Recommendation P.861 to measure the clarity of the voice. The QoS attributes for echo canceller may include the execution time, required memory, and Terminal Coupling Loss (TCL), an echo canceling quality measure. So they have different QoS attribute sets although they are in the same system. We can define the QoS attribute set for the system as A = (a1, a2, a3, a4) = (execution time, required memory, PSQM, TCL). The QoS attribute sets for the voice codec and echo canceller are the same but the TCL value will be set as NULL for voice codec and the PSQM value will be set as NULL for echo canceller. 2.4 Aggregation The system QoS properties are the cumulative effect of the QoS properties of the components that instantiate the modules of the system. Here we express QoS properties of a system G in terms of the QoS properties of the glue code cg and the selected components {cl | 1 l L}, where cl instantiates the module l of the system. Definition 3. [Aggregate Operator] The Aggregate Operator i is for computing the composition effect of the properties of multiple modules of the system specification G. It is presented as vi = i(vgi, v1i, v2i, …, vLi), 1 i N, where L is the number of the modules, N is the number of the attributes, vi is the QoS property of G in terms of attribute ai, vgi is the property of the glue code in terms of attribute ai, and vl i is the property of module l in terms of attribute ai. For a system specification G, we define aggregate operator set as Σ = { i | 1 i N }. Aggregate operator computes the QoS properties of a system from the QoS properties of the modules. Notice that a module is a virtual unit. Its QoS properties are determined by the component that instantiates it. In most situations, we can assume the effect of the glue code is small and can be omitted, so the aggregate operator can be represented as vi = i(v1i, v2i, …, vLi). Aggregate operator is highly application dependent. For some common models, standard aggregate operators can be defined. For example, in the SDF model, the aggregate operator involves the computation of a schedule. Based on -4- the selected components, a schedule can be determined by the algorithm discussed in [3]. With a fixed schedule the aggregate system QoS properties, such as end-to-end latency and memory requirement, can be computed. 3. Composition Specification Analysis Problem In general, an embedded system has multiple QoS objectives and multiple constraints. As shown in Example 1, the IP phone system has requirements to optimize voice and echo canceling quality, and the constraints on the execution time and the required memory. Thus, during the composition analysis, the system configuration problem is mapped to a constrained multi-objective optimization. Given the specification of a system G = (, , ), the set of components C that can satisfy the functional requirements of each module in are identified. We assume that each component c C is parameterized and has configurable parameter set Xc. Special mechanisms to make a component parameterizable have been discussed [5]. The configurable parameters of the entire system can be defined as X = {Xc | c C}. Let F be the set of property functions and the aggregate operators of G. Notice that C, X, F, are known factors of G for the composition analysis. Let T denote all these known factors of G, where T = (C, X, F, ). The composition analyzer can analyze the composition based on the QoS constraints R and objectives O of G with given T. The goal of composition analysis is to determine the best selection of components to instantiate the modules and find the best configurable parameter settings to the components to optimize the QoS properties of the system G. A feasible solution has to satisfy constraints in R. Based on these feasible solutions, the goal is to get the optimal benefits in terms of QoS objectives in O. Here we discuss the multi-objective problem of the composition analysis. 3.1 Multi-Objective Optimization Problem for Composition Analysis Here we formerly defined the multi-objective optimization problem for composition analysis for a given system specification G. Definition 4. [Multi-Objective Optimization for Composition Analysis] optimize O = (o1, o 2, …, o J) (J 2) subject to QoS property set: V = (v1, v2, …, vN ) Constraints R: i(V) i i (1 i I) Objectives O: o j = j j(V) (1 j J) Each QoS property: vi = i(v1i, v2i, …, vLi) Component selection: 1 sel(l) Mlwhere sel( l) is the index of the selected component Module QoS property: vli = m=1Ml((sel( l) − m) * fl,mi(Xl,m)) where i f x=0 x 0,otherwise Our goal is to optimize all the objectives O of G. Note that O is the objective set of the system with J objectives and G has I constraints. As defined in Section 2.3, both objectives and constraints of the system are in term of the QoS attributes A. As described in Section 2.2, QoS property of the components vl,m i can be derived from the property function fl,mi and configurable parameters Xl,m. Then the module QoS property vli can be determined by component selection bl and the QoS property of the selected component. As defined in Section 2.4, the QoS property of the system vi can be computed from the QoS properties of the modules with the aggregate operator i. 3.2 Pareto-optimal Solutions Optimizing a system with multiple objectives may result in conflicts [20]. The tradeoff between the multiobjectives must be considered and a decision can be made based on it. Frequently the decision can be left to the system designer with the help of the composition analysis in a form of a solution set. Pareto-optimal solutions [20] [4] are commonly used model for multi-objective optimization problems. In our model, the goal for composition analysis is to find the feasible Pareto-optimal solutions. For a problem with J objectives (o1, o 2, …, o J), a solution s = (os1, os2, …, osJ) dominates another solution s′ = (os′1, os′2, …, os′J) if both of the following conditions are satisfied: s is no worse than s′ in any attributes, s is strictly better than s′ in at least one attribute. It can be denoted as s ≻ s′ or s′ ≺ s. A solution s is defined as covering another solution s′ if s is no worse than s′ in any attributes. It can be denoted as s ≽ s′ or s′ ≼ s. If a solution s cannot be dominated by another solution s′, it can be said that s is non-dominated by s′. If a solution s is non-dominated by all other solutions in a solution set , it is called the Pareto-optimal solution in . The set of all the non-dominated solutions of is called the Paretoset of . Recall that the first goal of the composition analysis mentioned in Section 3.1 is to satisfy the QoS constraints of the system specification and then to optimize the QoS attributes, so the goal of the composition analysis is to find the Pareto-set within the feasible solutions. 4. Evolutionary Algorithm for Composition Analysis In this section, we discuss how to apply evolutionary algorithm to the composition analysis problem discussed in Sections 2 and 3. In Section 4.1, we survey and compare several multi-objective evolutionary algorithms and -5- discuss the advantages of NSGA-II, which is selected as the basis algorithm for composition analysis. Mapping of the composition analysis problem into the evolutionary algorithm model is discussed in Section 4.2. Finally in Section 4.3, we introduce our NSGA-CAA algorithm for composition analysis. 4.1 Multi-Objective Evolutionary Algorithms As described in Section 3.2, for composition analysis, we need to find Pareto-optimal solutions of the embedded system based on multiple QoS objectives and constraints. Some features of the composition can make the problem very hard, which include the conflicting multiple objectives, the exponentially increasing search space, and the mix of continuous and discrete configurable parameters of the components. The classical search algorithms, such as linear programming and gradient search, are not efficient for the multi-objective problems [7]. Randomized algorithms can be used to deal with the problems, such as local minimal trap, but they are not always efficient. An evolutionary algorithm is a partially randomized exploratory procedure based on an analogy with the biological evolution. Multi-Objective Evolutionary Algorithms (MOEAs) have been shown to converge quickly to the true Pareto-optimal solutions with a widely spread solution set. Multi-Objective Genetic Algorithm (MOGA) [9], Non-dominated Sorting Genetic Algorithm (NSGA) [18], Niched Pareto Genetic Algorithm (NPGA) [12] are early MOEAs using non-dominated sorting with a niching mechanism to improve the diversity of the solutions. Elitism has been shown to be an important factor for improvement on the MOEAs [26]. Elitism is a strategy to preserve better solutions in an external archive. Subsequent algorithms, such as, the improved version of Non-dominated Sorting Genetic Algorithm (NSGA-II) [8], Pareto-archived Evolution Strategy (PESA) [6], and the improved version of Strength-Pareto Evolutionary Algorithm (SPEA2) [27], have incorporated strategies to realize elitism to gain significant performance improvements. An elitism algorithm NSGA-II has been introduced [8] that yields significant improvement in time complexity, diversity preservation, and constraint satisfaction. It is shown that the algorithm outperforms PESA and SPEA [25] in terms of finding a diverse set of solutions and in converging near the true Pareto-optimal set. V. Khare et al. [13] have compared the scalability of the algorithms with respect to the number of objectives (two to eight): NSGAII, PESA, and SPEA2. The result shows that PESA is the best algorithm in terms of converging to the Paretooptimal front, but is poor in diversity maintenance. SPEA2 and NSGA-II perform equally well on convergence and diversity maintenance, but NSGA-II converges faster than SPEA2. Thus, we use NSGA-II as the basis algorithm for the composition analysis and make improvements to it for the specific problem we have defined. 4.2 Mapping the Composition Analysis Problem to the Evolutionary Algorithm The goal of the composition analysis is to find Paretooptimal solutions of the system based on multiple objectives O while satisfying the QoS constraints R (as defined in Definition 2). Here we map this problem to the evolutionary algorithm paradigm. 4.2.1 Individual Representation Each individual in the population represents a solution of the embedded system, which includes the component selections and the configurable parameter settings. Fig. 2 illustrates how an individual solution is represented in a hierarchical view. The vector consists of rep( l), for all l, which represents the characteristics of the set of modules that comprise the system (as shown in Fig. 2a). Each rep( l) is further represented by sel( l) and rep(cl,m), 1 m Ml, where rep(cl,m) is the characteristics of the component cl,m and sel( l) is the index of the component selected to instantiate the module (as shown in Fig. 2b). The value for sel( l) is in the range [1, Ml] where Ml is the number of components. If there is only one component to instantiate l, then sel( l) is not needed and rep( l) is the same as rep(cl,1). If the module has more than one candidate component, then rep(cl,m), for all m, are listed one by one after sel( l). A component representation, rep(cl,m), is further characterized by its configurable parameters xi(cl,m) (as shown in Fig. 2c). Configurable parameters may have different data types, such as float, integer, and discrete values. For the convenience of mutation operator described later in Section 4.2.2, we require each configurable parameter to have a fixed range. For configurable parameters without lower and upper bounds, a relatively large range is assigned. Due to the inherited limitations imposed by hardware on the data values, these extended bounds have no significant impact on the evolutionary process. Note that all the component selection parameters sel( l) and configurable parameters xi(cl,m) are basic elements in an individual representation and are called the genes of the individual. 4.2.2 Mutation In a mutation process, a new individual is generated from a randomly picked individual by modifying it slightly. The probability of mutation of one individual is controlled by a mutation rate, which is normally set as 1/K where K is the population size. Here we set it as 0.01 for our experimental study. For the individual to be mutated, the mutation point is chosen randomly and the gene of the mutation point is modified. We use a random value in a predetermined range to replace the existing gene. An individual has -6- sel( l) and configurable parameters as its genes. sel( l) has clear upper and lower bounds. As described in Section 4.2.1, each configurable parameter has a fixed range. Thus, we define the mutation function mu(x) as follows: x = mu(x) = U(xl, xu), (2) where x is a gene in the individual, x is the new gene value, xl and xu are the lower and upper bounds of x, respectively, and U(xl, xu) is a uniform random value between xl and xu. Since the mutation point is chosen randomly, it may correspond to a configurable parameter of a component that is not selected. In this case, the mutation operator is a. Whole individual representation of the system with L modules l (1 l L): rep(1) rep(2) … rep(L) b. The representation of module rep(l) (1 l L): [sel(l)] rep(cl,1) [rep(cl,2)] … [rep(cl,Ml)] c. The representation of component rep(cl,m) (1 l L, 1 m Ml): x1(cl,m) x2(cl,m) … xK(cl,m) Figure 2. Individual representation. ineffective. We call this problem selection redundancy problem. It is caused by the multiple candidate components for instantiating one module. Our solution is to change the component selection as well. When the mutation point locates on the representation of component cl,i but sel(l) i, then we set sel( l) = i. 4.2.3 Recombination While mutation generates a new individual from one parent, recombination process exchanges the genes between more than one parent to reproduce new individuals. We use one-point recombination due to its effectiveness and simplicity. Similar to the mutation process, the crossover point for recombination of two individuals is generated randomly. The one-point recombination process exchanges the genes of two parents on and after the crossover point to reproduce two offsprings. The selection redundancy problem introduced in Section 4.2.2 also occurs in the recombination process. When the crossover point locates on the component selection value sel( l), the genes for module l of two parents are exchanged totally. When the crossover point is on a configurable parameter of a component cl,m (1 m Ml), the situations are different for m sel(l) and m > sel( l). The exchange of the genes has no effect on module l in the second situation. In the situation m > sel( l) , we set sel( l) = m. We improve the one-point recombination operator to One-Point Composition Analysis Recombination (OPCAR) operator to take the selection redundancy problem account. The algorithm for OPCAR operator is given as follows. Algorithm OPCAR Input: s1, s2: two parents Output: s1′, s2′: two offspring 1 s1′ := s1 2 s2′ := s2 3 set j as length of s1 4 i := U(1, j) 5 let i be in the representation of module l 6 if s1′.sel(l) < I 7 s1′.sel(l) := I 8 if s2′.sel(l) < I 9 s2′.sel(l) := I 10 while (i j) 11 s1′(i) = s2(i) 12 s2′(i) = s1(i) 13 i := i + 1 The probability of recombination of two individuals is generally bounded. We set it to 0.8 in our model. 4.3 NSGA-CAA Algorithm First, we discuss the special techniques used in NSGAII [8]. NSGA-II uses the crowding pick approach. It first uses non-dominated sort to give each solution a priority. Then the crowding distance sort is used to further sort the solutions with the same priority decided by the nondominated sort. Non-dominated sort first gives the nondominated solutions of a set the highest priority, then removes them and gives the non-dominated solutions of the remaining set the second highest priority, and so on, until all the solutions are given priorities. The crowding distance computation first sorts the same priority solutions according to each objective. The solutions with highest or lowest value are then assigned an infinity distance value. The distance values of all other intermediate solutions are the amount of the absolute normalized differences between the objective values of two adjacent solutions for each objective. The solution with a larger distance value gets a higher priority. With non-dominated sort and crowding distance sort, we can pick K highest priority solutions from the union of previous population and elite set as new elite set. Elite set is an external set different with the population. The algorithm maintains it to store a fixed number of solutions that are non-dominated among all solutions ever generated. For the population with size K of a J-objective problem, NSGA-II has a storage requirement O(K 2) and time complexity O(JK2). The algorithm NSGA-CAA based on NSGA-II for composition analysis is shown in the following. Algorithm NSGA-CAA Input: K: population and initial elite set size T: generation number Output: Pr: result non-dominated set 1 Initialize population P0 2 fast-non-dominated-sort(P0) -7- 3 Create empty elite set P0′ 4 t := 0 5 while (t < T) do 6 P t+1′ := NSGA-crowding-pick (Pt, Pt′, K) 7 Select Pt+1 from P t+1′ 8 Recombine Pt+1 9 Mutate Pt+1 10 fast-non-dominated-sort(Pt+1, Pt+1′) 11 t := t + 1 12 end while 13 Pr := non-dominated-set (Pt′) The Evolutionary Algorithm utilizes a learning process on the population (the collective of the individuals that represent the solutions of a problem). Each individual consists of genes that are the decision variables of the solution. In NSGA-CAA, an elite set is maintained in each generation by picking the first N individuals from the population and the elite set of previous generations. In steps 7, 8, and 9 of the algorithm, we use binary tournament selection, one-point recombination, and onepoint mutation to generate the new population. By binary tournament selection (BTS), in each generation, we select K solutions from elite set. Each time two solutions are picked from the elite set randomly and one solution with higher priority according to non-dominated sort and crowding distance sort is selected. It continues till we get K solutions. The BTS algorithm is shown as follows. Algorithm BTS Input: K: population size P: elite set Output: P′: result new population 1 k := 0 2 while (k < K) do 3 randomly pick up two individual s1, s2 from P 4 if s1 has higher priority 5 Put s1 in P′ 6 else 7 Put s2 in P′ 8 end while In the system requirements, there are constraints (R) as well as objectives (O). Thus, we need to also consider R in the evolutionary process. We use the constraint handling approach discussed in [8]. The approach handles constraints without destroying the modularity in NSGA-II. It modifies the definition of “dominate” to “constraineddominate”. A solution s is said to constrained-dominate a solution s′ if any of the following conditions are true: (a) solution s is feasible and solution s′ is not, (b) solutions s and s′ are both infeasible, but solution s has a smaller overall constraint violation, or (c) solution s and s′ are feasible and solution s dominates solution s′. We replace the operation “dominate” by “constrained-dominate” in NSGA-CAA. The overall constraint violation of a solution in condition (b) is computed by adding the normalized distance between solution value and constraint for each violated constraint. With this constraint handling approach, the result solutions may converge to satisfy the QoS constraints (R). But in some cases, the evolutionary algorithm cannot obtain feasible solutions. The algorithm then presents the solutions nearest to the constraints bounds and provides options for user to choose, such as rerun the algorithm with more generation rounds, relax the QoS constraints, or insert more candidate components, etc. When the solutions satisfy the QoS constraints, the result of the composition analysis is a small set of feasible Pareto-optimal solutions. Comparing to the original exponential exploration space, the result set is much smaller. This set can be further reduced to a fixed number of solutions using the crowding-distance rating. On each round, the algorithm removes the individual with the smallest crowding-distance. This step can be repeated until the preset size of result is reached. 5. Case Study In this section, we go through the whole composition analysis process by analyzing an example system G composed of 9 modules l, 1 l 9. The system G has two QoS objectives O = (o1, o2) and two constraints R = (r1, r2) in terms of four QoS attributes A = (a1, a2, a3, a4). We simulate property functions of each component using test functions for multi-objective evolutionary algorithm. These two-objective functions are introduced or cited by Zitzler et al. [26], [27] to compare the evolutionary algorithms. We choose some of them, listed in Table 1, which include general situations of the twoobjective optimization problem. These functions are both for minimization problems and each one has two property functions, which are given by fi, 1 i 2. In Table 1, the domain column presents the range of the parameters and K denotes the number of parameters. Note that K is not fixed for each test function. Here the variables xi, for all i, simulate the configurable parameters of the components. The characteristics column lists the Pareto-optimal front characteristic of each function. We use these test functions to represent the property functions of the components. The number of configurable parameters |Xl,m| and property functions fl,mj (1 j 4) of six components cl,m are shown in Table 2. Due to page -8- limitation, we do not show all of them here. The ranges of the configurable parameters are the same as the domain of the functions assigned to the components. Table 3 shows the aggregate operators of the system G based on the nine modules l, 1 l 9. The objective function j of o j, 1 j 2, and the constraint function i of r i, 1 i 2, for G are presented in Table 4. We set both QoS objectives as minimization problems and each QoS constraint has an upper bound. After preparing the system specification G, we set NSGA-CAA to analyze the composition of G. Here the population size is set to 100 and generation number 200. The result set can be reduced by removing the individuals with the smallest crowding-distance. That is, in each round an individual with the smallest distance to its neighbors is found and removed from the set. It keeps on running until the result set is reduced to a fixed number, which is set to 20 here. The result of composition analysis is shown in Fig. 3. Figures 3a and 3b show the solutions in the objective and constraint space, respectively. Each small red rectangle in the figures represents a solution. There are 20 solutions. Fig. 3a demonstrates the tradeoff between objectives o1 and o2. Fig. 3b shows the same 20 solutions, but in the constraint space. In Sections 4.2.2 and 4.2.3, we improved the general mutation and recombination operators of the evolutionary algorithm to resolve the selection redundancy problem. The effect of the improvement was studied by comparing the Pareto-optimal solutions obtained by the algorithm NSGA-CAA with different operators. Fig. 3 shows that improved operators can provide better convergence to the real Pareto-optimal front and yield better constraint. 6. Related Work Previous research work regarding software component selection for satisfying QoS specification is limited. An NFR-Assistant tool has been presented [22] that assists Table 1. Test problems for simulating the property functions. Name KUR Type min Domain [5, 5]K QV min [5, 5]K ZDT1 min [0, 1]K ZDT2 min [0, 1]K ZDT3 min [0, 1]K ZDT6 min [0, 1]K Test functions f1 = i=1K ( |xi|0.8 + 5sin3(xi)) + K f2 = i=1K 1(1 exp(0.2 (xi2 + xi+12)1/2 )) f1 = ( (1/n) · i=1K (xi2 10cos(2xi) + 10) )0.25 f2 = ( (1/n) · i=1K ((xi 1.5)2 10cos(2(xi 1.5)) + 10) )0.25 f1 = x1 g = 1 + 9·i=2K xi / (K 1) f2 = g · (1 (f1 / g) ½) f1 = x1 g = 1 + 9·i=2K xi / (K 1) f2 = g · (1 (f1 / g) 2) f1 = x1 g = 1 + 9·i=2K xi / (K 1) f2 = g · (1 (f1 / g) ½ (f1 / g) · sin(10f1)) + 1 f1 = 1 exp(4 x1) · sin6(6x1)) g = 1 + 9 · ((i=2m xi ) / (m 1))0.25 f2 = g · (1 (f1 / g)2 ) Characteristics Both convex and non-convex Non-convex front Convex front Non-convex front Noncontiguous convex Non-uniformly, lower density Table 2. Configurable parameters and attribute functions of the components. (a) Comparison in the objective space. (b) Comparison in the constraint space. Figure 3. NSGA-CAA with improved operators vs. normal operators. -9- cl,m |Xl,m| c1,1 c1,2 c1,3 c2,1 c2,2 c3,1 3 4 3 4 5 4 fl,m1 ZDT3. f1 ZDT1. f1 ZDT1. f1 ZDT3. f1 ZDT3. f1 QV. f1 Property functions for cl,m fl,m2 fl,m3 fl,m4 ZDT3. f2 ZDT6. f1 ZDT6. f2 ZDT1. f2 ZDT2. f1 ZDT2. f2 ZDT1. f2 ZDT2. f1 ZDT2. f2 ZDT3. f2 ZDT6. f1 ZDT6. f2 ZDT3. f2 ZDT6. f1 ZDT6. f2 QV. f2 KUR. f1 KUR. f2 with the nonfunctional (QoS) requirement analysis and exploration of design alternatives through a graphical interface. The approach can be used for automated design decision-making, but it is based on exhaustive search and only considers one nonfunctional attribute at a time. The real-time requirement has been addressed [21] and tools have been provided for the selection of components satisfying real-time constraints. The approach uses an exhaustive search and only considers real-time aspects. The trade-off problem for a software agent pipeline system has been formulated [23] and the expression for optimized time-quality attributes has been derived. However, a continuous quality and time function is considered which is not always applicable for the selection of components and their parameter settings. Along another direction, code synthesis techniques have been used to facilitate the component integration process. Ptolemy [15] provides a framework for simulating Table 3. Aggregate operators of the system. QoS attributes a1 Aggregate function v1 = 2.0 * v11 + v21 + v31 + 2.0 * v41 + v51 + v61 + 2.0 * v71 + v81 + v91 v2 = v12 * v22 * v52 3 v = max(v33 , v43 , v63 , v73, v83, v93) v4 = v14 + 0.8 * v24 + 2.0 * v34 + 2.0 * v44 + v54 + 0.5 * v64 + v74 a2 a3 a4 component based embedded software development. The component composition analysis of the system is based on the QoS requirements of the system and can provide a set of non-dominated solutions, which indicate the component selections and parameter settings of the system. Evolutionary algorithm is used to search in the solution space and obtain solutions with good performance in terms of multiple QoS objectives. We use an example to go through all the steps of composition analysis and provide a result set of feasible solutions. Table 4. Objective / constraint functions of the system. QoS requirements o1 2 o r1 r2 Objective / constraints functions 1 = v1 + v4 2 = v2 + v3 1 = v3 2 = v4 Operators Min Min 14 9 and synthesizing embedded systems. A user can compose a system from components. Code generation in Ptolemy focuses on buffer management and component scheduling [1] [16]. Chinook [2], a successor of Ptolemy, supports code synthesis for multiple platforms and various component interfaces. These approaches are successful in modeling embedded software at an abstract level to facilitate code synthesis for various platforms, but they do not address the problem of selection of components to satisfy the QoS requirements of the system. Users have to specify specific components that are used to compose a system or a subsystem. VEST [19] is a toolset for constructing component-based embedded systems by analyzing the component dependencies and nonfunctional requirements. However, this toolset does not provide optimization search on the feasible configurations and does not consider the components with parameters that can be configured to achieve different QoS properties. Several research works consider QoS issues in component-based software development. Most of them focus on prediction of system QoS properties. Hissam et al. [11] presented a prototype prediction-enabled component technology (PECT) to integrate a software component technology with one or more analysis technologies. The prediction of end-to-end latency of an assembly has been provided as an example. The theoretical and empirical validities of prediction show that the composition analysis is reasonable. The prediction of other properties, such as resource consumption [17], reliability [10], also has been presented. These works form a solid foundation of composition analysis and can be embedded in our model. 7. Conclusion We have developed a composition analysis model for - 10 - References [1] S.S. Bhattacharyya, P.K. Murthy, and E.A. Lee, "Synthesis of Embedded Software from Synchronous Dataflow," Journal of VLSI Signal Processing Systems, vol. 21, no. 2, June 1999. [2] G. Borriello, P. Chou and R. Ortega, “Embedded System Co-Design Towards Portability and Rapid Integration,” Hardware/Software Co-Design: Proc. of the 1995 NATO Advanced Study Institute, M. Sami and G.D. Micheli, eds., Kluwer Academic Publishers, pp. 243-264, 1995. [3] J.T. Buck, “Scheduling Dynamic Dataflow Graphs with Bounded Memory Using the Token Flow Model,” PhD thesis, Univ. of California, Berkeley, 1993. [4] V. Chankong and Y.Y. Haimes, Multiobjective Decision Making Theory and Methodology, New York: NorthHolland, 1983. [5] K. Cooper, J. Zhou, H. Ma, I-L. Yen, and F. B. Bastani, “Code Parameterization for Satisfaction of QoS Requirements in Embedded Software,” Proc. of the Int’l Conf. on Eng. of Reconfigurable Systems and Algorithms, pp. 58-64, June 2003. [6] D.W. Corne, J.D. Knowles, and M.J. Oates, “The Pareto Envelop-based Selection Algorithm for Multiobjective Optimization,” Parallel Problem Solving from Nature – PPSN VI, M. Schoenauer et al., eds, Berlin. Springer, pp. 839-848, 2000. [7] K. Deb, Optimization for Engineering Design: Algorithms and Examples, New Delhi: Prentice-Hall, 1995 [8] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. “A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II,” IEEE Trans. Evolutionary Computation, vol. 6, no. 2, pp. 182-197, Apr. 2002. [9] C.M. Fonseca and P.J. Fleming, “Multiobjective Optimization and Multiple Constraint Handling with Evolutionary Algorithms—Part I: A Unified Formulation,” IEEE Trans.Systems, Man, and Cybernetics-Part A: Systems and Humans, vol. 28, no. 1, pp 26-37, Jan. 1998. [10] D. Hamlet, D. Mason, and D. Woit, “Theory of Software Reliability Based on Components,” Proc. of the 23rd Int’l Conf. on Software Eng., pp. 361-370, May 2001. [11] S. A. Hissam, G. A. Moreno, J. Stafford, K. C. Wallnau, “Packaging Predictable Assembly with Prediction-Enabled Component Technology,” CMU/SEI-2001-TR-024, Nov. 2001. [12] J. Horn, N. Nafpliotis, and D.E. Goldberg, “A Niched Pareto Genetic Algorithm for Multiobjective [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] Optimization,” Proc. of the First IEEE Conf. on Evolutionary Computation, vol. 1, pp. 82-87, 1994. V. Khare, X. Yao, and K. Deb, “Performance Scaling of Multi-objective Evolutionary Algorithms,” Proc. Of the Second Int’l Conf. on Evolutionary Multi-Criterion Optimization, pp. 376-390, 2003. E.A. Lee and D.G. Messerschmitt, “Synchronous Data Flow,” Proc. of the IEEE, vol. 75, no. 9, pp. 1235-1245, Sept. 1987. E.A. Lee, "Overview of the Ptolemy Project," Technical Memorandum UCB/ERL M01/11, University of California, Berkeley, Mar. 2001. T. Miyazaki and E.A. Lee, "Code Generation by Using Integer-Controlled Dataflow Graph," Proc. of ICASSP 97, pp.703-706, Apr. 1997. J. Muskens, M. Chaudron, “Prediction of Run-Time Resource Consumption in Multi-task Component-Based Software Systems,” Proc. of the 7th Int’l Symposium on Component-Based Software Eng., May 2004. N. Srinivas, K. Deb, “Multiobjective Optimization Using Nondominated Sorting in Genetic Algorithms,” Evolutionary Computation, vol. 2, no. 3, pp. 221-248, 1994. J.A. Stankovic, “VEST – A Toolset for Constructing and Analyzing Component Based Embedded Systems,” Proc. of the First Int’l Workshop on Embedded Software, pp. 390-402, Oct. 2001. E. Steuer, Multiple Criteria Optimization: Theory, Computation, and Application, Wiley, 1986. R.A. Steigerwald, “Reusable Component Retrieval for Real-time Applications,” Proc. IEEE Workshop on RealTime Applications, pp. 118-120, May 1993. Q. Tran and L. Chung, “NFR-Assistant: Tool Support for Achieving Quality,” IEEE Symp. Application-Specific Systems and Software Engineering and Technology, pp. 284-289, Mar. 1999. I-L. Yen and I.-R. Chen, “Reliability Assessment of Multiple-agent Cooperating Systems,” IEEE Trans. Reliability, Sep. 1997. I-L. Yen, L. Khan, B. Prabhakaran, F.B. Bastani, J. Linn, “An On-line Repository for Embedded Software,” 13th IEEE Int’l Conf. on Tools with Artificial Intelligence, pp. 314-320, Nov. 2001. E. Zitzler and L. Thiele, “Multiobjective Evolutionary Algorithms: A Comparative Case Study and the Strength Pareto Approach,” IEEE Trans. Evolutionary Computation, vol. 3, no. 4, pp. 257-271, 1999. E. Zitzler, K. Deb, and L. Thiele. “Comparison of Multiobjective Evolutionary Algorithms: Empirical Results,” Evolutionary Computation, vol. 8, no. 2, pp. 173195, 2000. E. Zitzler, M. Laumanns, and L. Thiele, “SPEA2: Improving the Strength Pareto Evolutionary Algorithm for Multiobjective Optimization,” Proc. of the EUROGEN2001 Conference, pp. 95-100, 2001. - 11 -