Institut f. Statistik u. Wahrscheinlichkeitstheorie
1040 Wien, Wiedner Hauptstr. 8-10/107, AUSTRIA
http://www.statistik.tuwien.ac.at

Fuzzy data and statistical modeling
R. Viertl
Research Report SM-2007-1, March 2007
Contact: [email protected]

Reinhard Viertl
Department of Statistics and Probability Theory
Vienna University of Technology
Wiedner Hauptstraße 8-10, 1040 Wien, Austria
Phone: +43 1 58801-10720
email: [email protected]

Abstract

Data are frequently not precise numbers but more or less non-precise, also called fuzzy. Before such data can be analyzed, a mathematical description of fuzzy data is necessary. This is possible using fuzzy models. Based on this description, descriptive data analysis as well as statistical modeling have to be adapted. Basic methods for this are described in this contribution.

1 Introduction

Statistical modeling is used to describe the variability of quantities and errors in observations. But these models assume the observations to be numbers or vectors. This assumption is often not realistic, because measurement results of continuous quantities are never precise numbers but always more or less non-precise. This kind of uncertainty is different from errors and variability. Whereas errors and variability can be modelled by stochastic variables and probability distributions, imprecision is another kind of uncertainty, called fuzziness. For a quantitative description of such data the most up-to-date method is to use fuzzy numbers and fuzzy vectors, which are special fuzzy models (compare [1]). Based on this data description, data analysis has to be adapted. This is possible and is explained in this paper.

Another kind of fuzziness in statistical modeling is uncertainty of a-priori information in Bayesian analysis.
Based on histograms for fuzzy data, called fuzzy histograms, which generalize classical histograms, so-called fuzzy probability distributions can be used to model a-priori information in a more realistic way.

Last but not least it is necessary to have corresponding software available. A package for related procedures, called AFD (analysis of fuzzy data), is under development. Procedures for descriptive statistical analysis are already available in the programming language C++.

2 Fuzzy Data, Fuzzy Numbers, and Fuzzy Vectors

One-dimensional fuzzy data are obtained as measurement results of continuous quantities like length, volume, time, mass, concentration, and so on. The results of such measurement procedures are not precise numbers but more or less fuzzy. The simplest case is a result given as a decimal number with a finite number of digits; mathematically this is an interval $[\underline{x}_0, \overline{x}_0]$, which is a subset of $\mathbb{R}$.

On an oscilloscope, results are light "points" on a screen. These "points" are characterized by light intensities. In one dimension this light intensity is a function $f(\cdot)$ of one real variable. Normalizing this function, i.e.

\[ g(x) := \frac{f(x)}{\max\{f(x) : x \in \mathbb{R}\}} \quad \forall\, x \in \mathbb{R}, \]

a function $g(\cdot)$ is obtained obeying

(1) $0 \le g(x) \le 1 \quad \forall\, x \in \mathbb{R}$
(2) $\forall\, \delta \in (0, 1]$ the so-called $\delta$-cut $C_\delta[g(\cdot)] := \{x \in \mathbb{R} : g(x) \ge \delta\} \neq \emptyset$.

Therefore $g(\cdot)$ is a generalization of indicator functions. Such functions were used by K. Menger in 1951 to define generalized sets, later called fuzzy sets by L. Zadeh. Special fuzzy subsets of $\mathbb{R}$ are called fuzzy numbers.

Remark 1: This concept is more general than fuzzy numbers in the current fuzzy set literature. But this generality is necessary for a realistic description of fuzzy data.

The formal definition of fuzzy numbers is the following:

Definition 1: A fuzzy number $x^*$ is determined by its so-called characterizing function $\xi(\cdot)$, which is a real function of one real variable $x$ obeying the following:

(1) $\xi : \mathbb{R} \to [0, 1]$
(2) $\forall\, \delta \in (0, 1]$ the so-called $\delta$-cut $C_\delta(x^*) := \{x \in \mathbb{R} : \xi(x) \ge \delta\}$ is a finite union of compact intervals $[a_{\delta,j}, b_{\delta,j}]$, i.e. $C_\delta(x^*) = \bigcup_{j=1}^{k_\delta} [a_{\delta,j}, b_{\delta,j}] \neq \emptyset$
(3) The support of $\xi(\cdot)$, defined by $\mathrm{supp}[\xi(\cdot)] := \{x \in \mathbb{R} : \xi(x) > 0\}$, is bounded.

Definition 2: A special kind of fuzzy numbers are so-called fuzzy intervals, i.e. fuzzy numbers for which all $\delta$-cuts are compact intervals.

Remark 2: A precise number $x_0 \in \mathbb{R}$ is represented by its characterizing function $\xi(\cdot) = I_{\{x_0\}}(\cdot)$, i.e. a one-point indicator function. Data in the form of an interval $[\underline{x}, \overline{x}]$ are represented by the indicator function $I_{[\underline{x}, \overline{x}]}(\cdot)$.

The following lemma is basic for storing fuzzy data in data bases.

Lemma 1: For the characterizing function $\xi(\cdot)$ of a fuzzy number $x^*$ the following holds:

\[ \xi(x) = \max\{\delta \cdot I_{C_\delta(x^*)}(x) : \delta \in [0, 1]\} \quad \forall\, x \in \mathbb{R}. \]

The proof is given in [4].

Remark 3: In applications a finite number of $\delta$-cuts is stored in data bases.

For continuous vector quantities, realistic data are also not precise vectors. Examples are positions on radar screens, or results of two-dimensional quantities on oscilloscopes. In this case the resulting "point" is a light point with fuzzy boundary, whose light intensity $h(x, y)$ is a function of two real variables $x$ and $y$. Similar to the one-dimensional case, normalizing this function, i.e.

\[ g(x, y) := \frac{h(x, y)}{\max\{h(x, y) : (x, y) \in \mathbb{R}^2\}} \quad \forall\, (x, y) \in \mathbb{R}^2, \]

yields a function characterizing the fuzzy light point. This is a special so-called fuzzy vector $\mathbf{x}^* = (x, y)^*$. In general, for $k \in \mathbb{N}$, so-called fuzzy vectors are defined in the following way.

Definition 3: A $k$-dimensional fuzzy vector $\mathbf{x}^*$ is determined by its so-called vector-characterizing function $\zeta(\cdot, \dots, \cdot)$, which is a real function of $k$ real variables $x_1, \dots, x_k$ obeying the following:

(1) $\zeta : \mathbb{R}^k \to [0, 1]$
(2) The support of $\zeta(\cdot, \dots, \cdot)$ is a bounded set
(3) $\forall\, \delta \in (0, 1]$ the so-called $\delta$-cut $C_\delta(\mathbf{x}^*) := \{\mathbf{x} \in \mathbb{R}^k : \zeta(\mathbf{x}) \ge \delta\}$ is non-empty, bounded, and a finite union of simply connected and closed sets.

3 Observation Space, Sample Space, and Combined Fuzzy Samples

Let $X$ be a random variable and $M_X$ its observation space, i.e. $M_X$ is the set of all possible values for $X$. For a sample $X_1, \dots, X_n$ of $X$ the set of possible values for the sample is the Cartesian product $M_X \times \dots \times M_X = M_X^n$ of $n$ copies of the observation space $M_X$. The set $M_X^n$ is called sample space. In standard statistics the combination of $n$ observations $x_1, \dots, x_n$ with $x_i \in M_X$ into an element $(x_1, \dots, x_n) \in M_X^n$ of the sample space is trivial.

Statistical functions like estimators or test functions are functions defined on the sample space $M_X^n$, i.e. measurable functions $\vartheta : M_X^n \to N$ for a measurable space $(N, \mathcal{A})$. For a fuzzy sample $x_1^*, \dots, x_n^*$ the value of the generalized function, i.e. $\vartheta(x_1^*, \dots, x_n^*)$, becomes a fuzzy element in $N$. In order to obtain the characterizing function of this fuzzy value, the so-called extension principle from fuzzy set theory is applied (compare [2]). For this purpose the fuzzy sample $x_1^*, \dots, x_n^*$, which is a vector of fuzzy numbers, first has to be combined into a fuzzy element $(x_1, \dots, x_n)^*$ of the sample space. Contrary to standard statistics this combination is not trivial for fuzzy samples, because a vector of characterizing functions is not a vector-characterizing function. The combination is possible using the so-called minimum-t-norm. Let $\xi_1(\cdot), \dots, \xi_n(\cdot)$ be the characterizing functions of the fuzzy observations $x_1^*, \dots, x_n^*$.
In order to obtain the vector-characterizing function $\zeta(\cdot, \dots, \cdot)$ of the so-called combined fuzzy sample $\mathbf{x}^*$, which is an $n$-dimensional fuzzy vector, the minimum-t-norm is used:

\[ \zeta(x_1, \dots, x_n) := \min\{\xi_i(x_i) : i = 1, \dots, n\} \quad \forall\, (x_1, \dots, x_n) \in \mathbb{R}^n. \]

This combined fuzzy sample, whose imprecision is described by the vector-characterizing function $\zeta(\cdot, \dots, \cdot)$, is the basis for the propagation of the fuzziness in the sample to estimates of parameters or other characteristic quantities of the underlying statistical model. This propagation of fuzziness is provided by the so-called extension principle from fuzzy set theory.

Extension principle: Let $f : M \to N$ be an arbitrary function. For a fuzzy subset $x^*$ of $M$ with membership function $\zeta(\cdot)$, the membership function $\eta(\cdot)$ of the fuzzy value $f(x^*)$ is defined by

\[ \eta(y) = \begin{cases} \sup\{\zeta(x) : f(x) = y\} & \text{if } f^{-1}(\{y\}) \neq \emptyset \\ 0 & \text{if } f^{-1}(\{y\}) = \emptyset \end{cases} \quad \forall\, y \in N. \]

4 Estimation based on Fuzzy Samples

Let $X \sim f(\cdot \mid \theta)$, $\theta \in \Theta$, be a statistical model with parameter space $\Theta$ and observation space $M_X$. In standard statistics, point estimators as well as confidence sets are used to estimate the true parameter. In order to generalize classical point estimators $\hat{\theta} = \vartheta(x_1, \dots, x_n)$ to the situation of fuzzy data $x_1^*, \dots, x_n^*$, the vector-characterizing function $\zeta(\cdot, \dots, \cdot)$ and the extension principle from section 3 are applied. The characterizing function $\eta(\cdot)$ of the generalized (fuzzy) estimate $\hat{\theta}^* = \vartheta(x_1^*, \dots, x_n^*)$ is given in the following way: using the notation $\mathbf{x} = (x_1, \dots, x_n) \in \mathbb{R}^n$, the values $\eta(\theta)$ of $\eta(\cdot)$ are obtained by

\[ \eta(\theta) = \begin{cases} \sup\{\zeta(\mathbf{x}) : \vartheta(\mathbf{x}) = \theta\} & \text{if } \vartheta^{-1}(\{\theta\}) \neq \emptyset \\ 0 & \text{if } \vartheta^{-1}(\{\theta\}) = \emptyset \end{cases} \quad \forall\, \theta \in \Theta. \]

Remark 4: For a continuous function $\vartheta(\cdot, \dots, \cdot)$ and fuzzy observations $x_1^*, \dots, x_n^*$ which are fuzzy intervals, the obtained fuzzy value $\hat{\theta}^*$ is also a fuzzy interval.

An example of a fuzzy sample and the resulting estimate of the expectation is given in figure 1.
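
The combination by the minimum-t-norm and the propagation of fuzziness to a point estimate can be illustrated by a small sketch. It is not taken from the AFD package; the names `Trapezoid`, `zeta`, `fuzzy_mean_cut` and `eta_from_cuts` are hypothetical, and trapezoidal characterizing functions are assumed purely for illustration. The estimator is the sample mean. The sketch relies on the standard fact that, for a continuous estimator and fuzzy intervals, the $\delta$-cut of the fuzzy estimate is the image of the $\delta$-cut of the combined sample, which under the minimum-t-norm is the Cartesian product of the individual $\delta$-cuts.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Trapezoid:
    """Fuzzy interval with a trapezoidal characterizing function (assumed shape).

    xi rises linearly on [a, b], equals 1 on [b, c] and falls linearly on [c, d].
    """
    a: float
    b: float
    c: float
    d: float

    def xi(self, x: float) -> float:
        """Characterizing function, with values in [0, 1]."""
        if self.b <= x <= self.c:
            return 1.0
        if self.a < x < self.b:
            return (x - self.a) / (self.b - self.a)
        if self.c < x < self.d:
            return (self.d - x) / (self.d - self.c)
        return 0.0

    def delta_cut(self, delta: float) -> Tuple[float, float]:
        """delta-cut {x : xi(x) >= delta}, a compact interval for delta in (0, 1]."""
        if not 0.0 < delta <= 1.0:
            raise ValueError("delta must lie in (0, 1]")
        return (self.a + delta * (self.b - self.a),
                self.d - delta * (self.d - self.c))


def zeta(sample: List[Trapezoid], x: List[float]) -> float:
    """Vector-characterizing function of the combined sample (minimum-t-norm)."""
    return min(obs.xi(value) for obs, value in zip(sample, x))


def fuzzy_mean_cut(sample: List[Trapezoid], delta: float) -> Tuple[float, float]:
    """delta-cut of the fuzzy sample mean.

    The mean is continuous and increasing in every argument, so its range over
    the box of individual delta-cuts is spanned by the means of the endpoints.
    """
    cuts = [obs.delta_cut(delta) for obs in sample]
    n = len(cuts)
    return (sum(lo for lo, _ in cuts) / n, sum(hi for _, hi in cuts) / n)


def eta_from_cuts(theta: float, cuts: Dict[float, Tuple[float, float]]) -> float:
    """Approximate eta(theta) from finitely many stored delta-cuts (cf. Lemma 1)."""
    return max((d for d, (lo, hi) in cuts.items() if lo <= theta <= hi),
               default=0.0)


# Illustrative fuzzy sample of three observations (made-up values).
sample = [Trapezoid(0.0, 1.0, 2.0, 3.0),
          Trapezoid(1.0, 2.0, 3.0, 4.0),
          Trapezoid(2.0, 3.0, 4.0, 5.0)]

# Store a finite number of delta-cuts of the fuzzy mean, as in Remark 3.
stored = {d: fuzzy_mean_cut(sample, d) for d in (0.25, 0.5, 0.75, 1.0)}

print(zeta(sample, [1.0, 2.5, 4.5]))   # membership of one precise sample point
print(stored[1.0], stored[0.5])        # delta-cuts of the fuzzy mean
print(eta_from_cuts(1.6, stored))      # coarse value of eta at theta = 1.6
```

With only finitely many stored $\delta$-cuts, `eta_from_cuts` returns the largest stored level whose cut contains $\theta$, so it approximates $\eta(\theta)$ from below; a finer grid of levels gives a closer approximation. For estimators that are not monotone in each argument, the endpoints of each $\delta$-cut would have to be found by minimizing and maximizing the estimator over the box instead.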
Figure 1: Characterizing functions of fuzzy data and fuzzy estimator. [Figure not reproduced; upper panel: $\xi_i(x)$ against $x$, lower panel: $\eta(\theta)$ against $\theta$.]

The generalization of confidence sets for parameters $\theta$ in statistical models based on fuzzy data is also possible. This generalization is not based on the extension principle but on a more general concept. Let $\kappa(X_1, \dots, X_n)$ be a confidence function for $\theta$ with confidence level $1 - \alpha$. Then for every concrete sample $x_1, \dots, x_n$ a subset $\kappa(x_1, \dots, x_n) \subseteq \Theta$ of the parameter space $\Theta$ is obtained. A generalization of confidence sets to the situation of fuzzy samples has to reproduce the classical confidence sets in the case of precise samples. This generalization is based on the vector-characterizing function $\zeta(\cdot, \dots, \cdot)$ of the combined fuzzy sample $\mathbf{x}^*$ from section 3.

Definition 4: Let $X \sim f(\cdot \mid \theta)$, $\theta \in \Theta$, be a statistical model and $\kappa : M_X^n \to \mathcal{P}(\Theta)$ be a confidence function for $\theta$ with confidence level $1 - \alpha$. Then for a fuzzy sample $x_1^*, \dots, x_n^*$ with combined fuzzy sample $\mathbf{x}^*$, whose vector-characterizing function is $\zeta(\cdot, \dots, \cdot)$, the generalized confidence set $\kappa(x_1^*, \dots, x_n^*)$ is the fuzzy subset $\Theta^*_{1-\alpha}$ of $\Theta$ whose membership function $\varphi(\cdot)$ is given by

\[ \varphi(\theta) := \begin{cases} \sup\{\zeta(\mathbf{x}) : \theta \in \kappa(\mathbf{x})\} & \text{if } \exists\, \mathbf{x} \in M_X^n : \theta \in \kappa(\mathbf{x}) \\ 0 & \text{if } \nexists\, \mathbf{x} \in M_X^n : \theta \in \kappa(\mathbf{x}) \end{cases} \quad \forall\, \theta \in \Theta. \]

Remark 5: Fuzzy confidence sets are typical examples of fuzzy sets. Figure 2 shows an example of a fuzzy confidence set for the two-dimensional parameter $\theta = (\tau, \beta)$ of the Weibull distribution $Wei(\tau, \beta)$.

Figure 2: Fuzzy sample and membership function of a fuzzy confidence set. [Figure not reproduced; upper panel: characterizing functions $\xi_{x_i^*}(x)$ against $x$, lower panel: $\varphi(\tau, \beta)$ over the $(\tau, \beta)$ plane.]

5 Generalized Bayesian Models

Bayesian statistical models also consider the parameter $\theta$ as a stochastic quantity $\tilde{\theta}$ with corresponding a-priori distribution $\pi(\cdot)$.
For classical precise data $x_1, \dots, x_n$ of $X \sim f(\cdot \mid \theta)$, $\theta \in \Theta$, the a-posteriori density $\pi(\cdot \mid x_1, \dots, x_n)$ of the parameter is obtained using the likelihood function

\[ \ell(\theta; x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i \mid \theta) \]

by Bayes' theorem

\[ \pi(\theta \mid x_1, \dots, x_n) = \frac{\pi(\theta) \cdot \ell(\theta; x_1, \dots, x_n)}{\int_{\Theta} \pi(\theta) \cdot \ell(\theta; x_1, \dots, x_n)\, d\theta}. \]

For fuzzy data Bayes' theorem has to be generalized. The generalization has to take care of the sequential nature of the updating procedure in Bayes' theorem. Moreover, precise a-priori densities are a critical point in Bayesian inference. This can be overcome by using so-called fuzzy probability distributions $\pi^*(\cdot)$ as a-priori distributions. For details compare [3]. The resulting a-posteriori distributions are also fuzzy probability distributions.

Such fuzzy probability distributions $P^*$ on measurable spaces $(M, \mathcal{A})$ are generalized probability distributions obeying the following:

(1) $P^*(A)$ is a fuzzy interval for every $A \in \mathcal{A}$
(2) $P^*(\emptyset) = 0$ and $P^*(M) = 1$
(3) Denoting the $\delta$-cut of $P^*(A)$ by $C_\delta[P^*(A)] = [\underline{P}_\delta(A), \overline{P}_\delta(A)]$ $\forall\, \delta \in (0, 1]$, for disjoint events $A$ and $B$ the following holds: $\overline{P}_\delta(A \cup B) \le \overline{P}_\delta(A) + \overline{P}_\delta(B)$ and $\underline{P}_\delta(A \cup B) \ge \underline{P}_\delta(A) + \underline{P}_\delta(B)$

Remark 6: Fuzzy probability distributions are justified by histograms based on fuzzy data. For details compare [5]. Moreover, using fuzzy a-priori distributions makes Bayesian statistics more attractive and better justified for applications.

References

[1] H. Bandemer: Mathematics of Uncertainty - Ideas, Methods, and Applications, Springer, Berlin, 2006
[2] G. Klir, B. Yuan: Fuzzy Sets and Fuzzy Logic - Theory and Applications, Prentice Hall, Upper Saddle River, N.J., 1995
[3] R. Viertl: Univariate statistical analysis with fuzzy data, Computational Statistics & Data Analysis 51 (2006) 133-147
[4] R. Viertl: Statistical Methods for Non-Precise Data, CRC Press, Boca Raton, Florida, 1996
[5] R. Viertl, D. Hareter: Beschreibung und Analyse unscharfer Information - Statistische Methoden für unscharfe Daten, Springer, Wien, 2006