Principle of Maximum Entropy

The principle of maximum entropy states that, subject to precisely stated prior data (such as a proposition that expresses testable information), the probability distribution which best represents the current state of knowledge is the one with the largest entropy. Another way of stating this: take precisely stated prior data or testable information about a probability distribution function, and consider the set of all trial probability distributions that would encode the prior data. Of those, the one with maximal information entropy is the proper distribution, according to this principle.

History

The principle was first expounded by E. T. Jaynes in two papers in 1957, where he emphasized a natural correspondence between statistical mechanics and information theory. In particular, Jaynes offered a new and very general rationale for why the Gibbsian method of statistical mechanics works. He argued that the entropy of statistical mechanics and the information entropy of information theory are essentially the same thing. Consequently, statistical mechanics should be seen as a particular application of a general tool of logical inference and information theory.

Overview

In most practical cases, the stated prior data or testable information is given by a set of conserved quantities (average values of some moment functions) associated with the probability distribution in question. This is the way the maximum entropy principle is most often used in statistical thermodynamics. Another possibility is to prescribe some symmetries of the probability distribution. The equivalence between conserved quantities and corresponding symmetry groups implies a similar equivalence for these two ways of specifying the testable information in the maximum entropy method.

The maximum entropy principle is also needed to guarantee the uniqueness and consistency of probability assignments obtained by different methods, statistical mechanics and logical inference in particular.

The maximum entropy principle makes explicit our freedom in using different forms of prior data. As a special case, a uniform prior probability density (Laplace's principle of indifference, sometimes called the principle of insufficient reason) may be adopted. Thus, the maximum entropy principle is not merely an alternative way to view the usual methods of inference of classical statistics; it represents a significant conceptual generalization of those methods. It means that thermodynamic systems need not be shown to be ergodic to justify treatment as a statistical ensemble.

In ordinary language, the principle of maximum entropy can be said to express a claim of epistemic modesty, or of maximum ignorance. The selected distribution is the one that makes the least claim to being informed beyond the stated prior data, that is to say, the one that admits the most ignorance beyond the stated prior data.

Testable information

The principle of maximum entropy is useful explicitly only when applied to testable information. Testable information is a statement about a probability distribution whose truth or falsity is well-defined. For example, the statements "the expectation of the variable x is 2.87" and "p2 + p3 > 0.6" (where p2 and p3 are probabilities of events) are statements of testable information.
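To make the expectation constraint concrete, here is a small Python sketch that finds the maximum entropy distribution matching E[x] = 2.87, the figure quoted above. The finite support {1, ..., 6} is an assumption chosen purely for this example. For a single mean constraint, the maximizing distribution has the exponential form p_i proportional to exp(lam * x_i), and lam can be found by bisection because the implied mean is strictly increasing in lam.

    import math

    # Illustrative sketch: maximum entropy distribution over a finite support
    # subject to one mean constraint, E[x] = 2.87 (the value quoted in the
    # text). The support {1, ..., 6} is an assumption made for this example.
    # For a single mean constraint, the maximum entropy solution has the
    # exponential form p_i = exp(lam * x_i) / Z(lam); we solve for lam by
    # bisection, since the mean is strictly increasing in lam.

    support = [1, 2, 3, 4, 5, 6]
    target_mean = 2.87

    def mean_for(lam):
        weights = [math.exp(lam * x) for x in support]
        z = sum(weights)
        return sum(x * w for x, w in zip(support, weights)) / z

    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid

    lam = (lo + hi) / 2
    weights = [math.exp(lam * x) for x in support]
    z = sum(weights)
    probs = [w / z for w in weights]
    entropy = -sum(p * math.log(p) for p in probs)
    print(probs, entropy)

An inequality constraint such as p2 + p3 > 0.6 would instead restrict the set of trial distributions over which the entropy is maximized.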
Given testable information, the maximum entropy procedure consists of seeking the probability distribution which maximizes information entropy subject to the constraints of the information. This constrained optimization problem is typically solved using the method of Lagrange multipliers; a sketch of the resulting derivation is given at the end of this section. Entropy maximization with no testable information respects only the universal "constraint" that the sum of the probabilities is one. Under this constraint alone, the maximum entropy discrete probability distribution is the uniform distribution.

Applications

The principle of maximum entropy is commonly applied in two ways to inferential problems:

Prior probabilities

The principle of maximum entropy is often used to obtain prior probability distributions for Bayesian inference. Jaynes was a strong advocate of this approach, claiming the maximum entropy distribution represented the least informative distribution. A large amount of literature is now dedicated to the elicitation of maximum entropy priors and to links with channel coding.

Maximum entropy models

Alternatively, the principle is often invoked for model specification: in this case the observed data itself is taken to be the testable information. Such models are widely used in natural language processing. An example of such a model is logistic regression, which corresponds to the maximum entropy classifier for independent observations; a small code sketch appears at the end of this section.
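As a brief sketch of the Lagrange multiplier calculation mentioned above (the constraint functions f_k and target values F_k are generic placeholders, not quantities from the text): to maximize the entropy H(p) = -\sum_i p_i \ln p_i subject to normalization and a set of moment constraints \sum_i p_i f_k(x_i) = F_k, one forms the Lagrangian

    \mathcal{L}(p, \lambda_0, \lambda) = -\sum_i p_i \ln p_i + \lambda_0 \Big( \sum_i p_i - 1 \Big) + \sum_k \lambda_k \Big( \sum_i p_i f_k(x_i) - F_k \Big)

Setting \partial \mathcal{L} / \partial p_i = 0 gives

    p_i = \frac{1}{Z(\lambda)} \exp\Big( \sum_k \lambda_k f_k(x_i) \Big), \qquad Z(\lambda) = \sum_i \exp\Big( \sum_k \lambda_k f_k(x_i) \Big)

an exponential-family (Gibbs) distribution. With no moment constraints the exponent is constant, so p_i = 1/n: exactly the uniform distribution claimed above.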
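For concreteness, here is a minimal Python sketch of a maximum entropy classifier in its binary logistic regression form, trained by gradient ascent on the log-likelihood. The toy data, learning rate, and iteration count are assumptions made purely for illustration. At the optimum, the model's expected feature counts match the observed counts, which is the maximum entropy condition.

    import math

    # Minimal sketch of a maximum entropy (logistic regression) classifier
    # for binary labels, trained by gradient ascent on the log-likelihood.
    # The toy data below is hypothetical, chosen only to make this runnable.
    X = [[1.0, 0.2], [0.9, 0.4], [0.1, 0.9], [0.2, 0.8]]  # feature vectors
    y = [1, 1, 0, 0]                                       # class labels

    w = [0.0, 0.0]   # weights
    b = 0.0          # bias
    lr = 0.5         # learning rate (an illustrative choice)

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    for _ in range(1000):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Gradient of the log-likelihood: at the optimum the expected
            # feature counts equal the observed counts, i.e. the maximum
            # entropy constraints are satisfied.
            for j in range(len(w)):
                grad_w[j] += (yi - p) * xi[j]
            grad_b += yi - p
        w = [wj + lr * gj for wj, gj in zip(w, grad_w)]
        b += lr * grad_b

    predictions = [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]
    print(w, b, [round(p, 3) for p in predictions])

In practice one would add regularization and use an off-the-shelf solver; the point here is only the correspondence between the maximum entropy constraints and the logistic form.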