Entropy: Measure of Diversity?
David Lee Baker, David E. Booth, William Acar
Management & Information Systems Working Paper MIS2007-08:1 (Do not cite without permission)

Entropy—In Management Strategy I
• Managers are interested in knowing how diversified the firm's lines of business are
• It has been argued (and much discussed and hotly debated) that more diverse businesses are more profitable
  – This may not always be the case, though

Entropy—In Management Strategy II
• In diversification we need to distinguish related diversification (similar to the firm's core business) from unrelated (dissimilar) diversification
• In related diversification the firm's several lines of business, even though distinct, still possess some kind of "fit"
• In unrelated diversification there is no common linkage or element of fit among the firm's lines of business
  – In this sense unrelated diversification may be considered "pure" diversification

Diversity—An Example
• If we take a beaker of water (H2O) and add a drop of very concentrated red food coloring, we see the red color diffuse throughout the water: we go from concentration to diversification
• Economists, as well as chemists and physicists, want to define concentration vs. diversification
• Concentration and diversification are the two ends of a spectrum

Entropy—Background & History I
• Ludwig Boltzmann (1896) introduced a mysterious entity called the H-function, or statistical entropy
  – Defined as the mean value of the logarithm of a probability density function (p.d.f.)
• It measured the amount of uncertainty about the possible states of a physical system
• Because there was still disagreement about the existence of atoms, his statistical entropy generated much debate

Entropy—Background & History II
• Claude E. Shannon (1948) generalized Boltzmann's entropy to information theory and proved that it has the properties that allow it to be taken as the average amount of information conveyed by one discrete random variable about another
• Mathematicians have further refined the Shannon entropy, and new tools, such as the relative and conditional entropies, have been developed
• Norbert Wiener and Claude E. Shannon, along with others, extended Boltzmann's earlier theories to more general cases
  – Shannon had studied under Wiener at MIT in the late 1930s, graduating with both a master's and a doctorate in mathematics

Entropy—Background & History III
• Mathematically this reduces to the amount of uncertainty contained in a probabilistic experiment A, as measured by the function:
  H_m(p_1, …, p_m) = −∑_{i=1}^{m} p_i log p_i

Entropy—Background & History IV
• The Entropy (inverse) measure of industry concentration weights each p_i by the logarithm of 1/p_i, i.e.:
  E = ∑_{i=1}^{n} p_i log(1/p_i)
• Notice that we have replaced H by E and used the fact that −log(A) = log(1/A)

Herfindahl's Measure
• Herfindahl's contribution to diversification measures was the suggestion that the share of each firm be weighted by itself, i.e. (using H for Herfindahl):
  H = ∑_{i=1}^{n} p_i · p_i = ∑_{i=1}^{n} p_i²

Decomposability of Entropy
• Entropy is a decomposable measure (Khinchin's Decomposition Theorem)
• Herfindahl is decomposable because, in fact, Herfindahl is an approximation to Entropy
• The 2- and 4-digit SIC code structure is compatible with these decompositions, but is NAICS?
• Further research is needed
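As a numerical illustration of the Entropy and Herfindahl measures defined above, and of the additive decomposition guaranteed by Khinchin's theorem, the following Python sketch computes both measures for a hypothetical firm whose sales shares fall into two made-up industry groups. The segment shares, the grouping, and the use of natural logarithms are assumptions for illustration only, not figures or conventions taken from the paper.

```python
# Illustrative computation of the Entropy (E) and Herfindahl (H) diversification
# measures, and of the additive decomposition of entropy into a between-group
# term plus share-weighted within-group terms (Khinchin's Decomposition Theorem).
# All shares below are hypothetical.
from math import log

def entropy(shares):
    """E = sum_i p_i * log(1/p_i); natural log, zero shares ignored."""
    return sum(p * log(1.0 / p) for p in shares if p > 0)

def herfindahl(shares):
    """H = sum_i p_i * p_i, i.e. each share weighted by itself."""
    return sum(p * p for p in shares)

# Hypothetical firm: two industry groups (think 2-digit codes), each split
# into finer segments (think 4-digit codes).  Shares sum to 1.
groups = {
    "group A": [0.30, 0.20, 0.10],
    "group B": [0.25, 0.15],
}
all_shares = [p for segment in groups.values() for p in segment]

total = entropy(all_shares)

# Decomposition: total entropy = between-group entropy
#   + sum over groups of (group share) * (within-group entropy).
group_share = {g: sum(seg) for g, seg in groups.items()}
between = entropy(group_share.values())
within = sum(
    group_share[g] * entropy([p / group_share[g] for p in seg])
    for g, seg in groups.items()
)

print(f"Total entropy             : {total:.5f}")
print(f"Between + weighted within : {between + within:.5f}")  # equals the total
print(f"Herfindahl (sum of p_i^2) : {herfindahl(all_shares):.5f}")
```

In the 2-digit/4-digit reading discussed later in the deck, the between-group term measures diversification across industry groups and the weighted within-group terms measure diversification inside each group; the two components must add up exactly to the total, which is the property the anomaly table on the next slide appears to violate.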
Diversification-Score Anomalies
• Based on the Entropy decomposition proposed in this paper; the published scores probably violate the Decomposition Theorem
• Note that columns 17 & 18 do not sum to column 16
• Source: Ragunathan (1995), Journal of Management, 21(5), June, excerpts of p. 992

Corporate Diversification—Correct Totals
• Note that columns 8 & 9 add up to column 7, as they should
• Source: Jacquemin & Berry (1979), The Journal of Industrial Economics, 27(4), June, p. 362

Triangular Numbers I
• The first 100 triangular numbers [table not reproduced in this transcript]
• Source: http://www.mathematische-basteleien.de/triangularnumber.htm

Triangular Numbers II
• The name "triangular number" can be illustrated by a drawing of balls arranged in a triangle [drawing not reproduced in this transcript]
• Source: http://www.mathematische-basteleien.de/triangularnumber.htm

Sample Triangular Load Distribution—Graph
[Figure: triangular load distribution for n = 5, with points (1, 0.3333), (2, 0.2667), (3, 0.2000), (4, 0.1333), (5, 0.0667)]

Triangular Distributions—Examples
• Position in Pascal's triangle: Pascal's triangle contributes to many fields of number theory; the red numbers in the source figure are the triangular numbers [figure not reproduced in this transcript]
• You can even find the sum of the triangular numbers easily. Example: 1 + 3 + 6 + 10 + 15 = 35
• You can express the triangular numbers as binomial coefficients
• Source: http://www.mathematische-basteleien.de/triangularnumber.htm

Triangular Distributions—Analyses I
Figure # | Range of Values | n
3.1 | 5/15, 4/15, 3/15, 2/15, 1/15 | 5
3.2 | 10/55, 9/55, 8/55, 7/55, . . ., 1/55 | 10

Triangular Distributions—Analyses II
Figure # | Range of Values | n | Uncalibrated Herfindahl | Calibrated Herfindahl | Uncalibrated Entropy | Calibrated Entropy | Calibrated A1 (Acar-Bhatnagar) | Calibrated A2 (Acar-Troutt Single-Sum Formula)
3.1 | 5/15, 4/15, 3/15, 2/15, 1/15 | 5 | 0.75556 | 0.94444 | 1.48975 | 0.92563 | 0.50000 | 0.66667
3.2 | 10/55, 9/55, 8/55, 7/55, . . ., 1/55 | 10 | 0.87273 | 0.96970 | 2.15128 | 0.93429 | 0.51279 | 0.66667

Concluding Remarks and Future Directions II
• In their seminal article, Jacquemin and Berry (1979) specified how the decomposition of the Entropy measure can be related to SIC codes by breaking the diversity measurement down between the 2-digit and the 4-digit codes. We now need to see whether that still holds for NAICS
  – We will be following up on their work and further examining its statistical properties

Triangular Number Theory
• A triangular number is the sum of the natural numbers from 1 to n. Triangular numbers are so called because they count the balls that can be arranged in a triangle.
• The nth triangular number is given by the following formula:
  T_n = ∑_{k=1}^{n} k = 1 + 2 + 3 + … + (n−2) + (n−1) + n = n(n+1)/2 = (n² + n)/2 = C(n+1, 2)
• As shown by the rightmost term of this formula, every triangular number is a binomial coefficient: the nth triangular number is the number of distinct pairs that can be selected from n + 1 objects. In this form it solves the "handshake problem" of counting the number of handshakes if each person in a room shakes hands once with every other person.
• The sequence of triangular numbers (sequence A000217 in OEIS) for n = 1, 2, 3, … is: 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, …
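To connect the triangular-number formula above with the values reported in the analyses table for figures 3.1 and 3.2, the following Python sketch recomputes the uncalibrated and calibrated Herfindahl and Entropy figures. It assumes, based on the reported numbers rather than on formulas stated in the paper, that the Herfindahl diversity value is taken as 1 − ∑ p_i², that natural logarithms are used, and that calibration divides each measure by its maximum attainable value (1 − 1/n for Herfindahl, ln n for Entropy); the A1 and A2 measures are not recomputed.

```python
# Recompute the triangular-number formula and the diversity measures reported
# for the triangular samples (figures 3.1 and 3.2).  Calibration by the maximum
# attainable value is an assumption inferred from the reported figures, not a
# formula quoted from the paper.
from math import log

def triangular(n):
    """n-th triangular number: T_n = n(n+1)/2 = C(n+1, 2)."""
    return n * (n + 1) // 2

def triangular_shares(n):
    """Triangular sample of size n: shares k / T_n for k = n, n-1, ..., 1."""
    t = triangular(n)
    return [k / t for k in range(n, 0, -1)]

def herfindahl_diversity(p):
    """Herfindahl taken as a diversity value: 1 - sum of squared shares."""
    return 1.0 - sum(x * x for x in p)

def entropy(p):
    """E = sum_i p_i * log(1/p_i), natural logarithm."""
    return sum(x * log(1.0 / x) for x in p if x > 0)

for fig, n in (("3.1", 5), ("3.2", 10)):
    shares = triangular_shares(n)
    h = herfindahl_diversity(shares)
    e = entropy(shares)
    print(f"Figure {fig} (n={n}): "
          f"Herfindahl {h:.5f} (calibrated {h / (1 - 1 / n):.5f}), "
          f"Entropy {e:.5f} (calibrated {e / log(n):.5f})")

# Printed values match the table above:
#   3.1: Herfindahl 0.75556 (0.94444 calibrated), Entropy 1.48975 (0.92563 calibrated)
#   3.2: Herfindahl 0.87273 (0.96970 calibrated), Entropy 2.15128 (0.93429 calibrated)
```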
Thermodynamics
• The first law of Thermodynamics states that energy is neither created nor destroyed
• The second law says that entropy always tends to increase in a closed system, directing us to a world where usable energy is lost and forecasting a universe that is constantly winding down
• The tension between the first and second laws runs like a recurring theme through turn-of-the-century cultural formations

NAICS vs. SIC Codes
• The North American Industry Classification System (NAICS) has replaced the U.S. Standard Industrial Classification (SIC) system. NAICS will reshape the way we view our changing economy.
• NAICS was developed jointly by the U.S., Canada, and Mexico to provide new comparability in statistics about business activity across North America.

NAICS
• The official 2007 US NAICS Manual, North American Industry Classification System--United States, 2007, includes definitions for each industry, tables showing the correspondence between 2007 NAICS and 2002 NAICS for codes that changed, and a comprehensive index (features also available on the NAICS web site). To order the 1,400-page 2007 Manual in print, call NTIS at (800) 553-6847 or (703) 605-6000, or check the NTIS web site.
• The 2002 Manual, showing the correspondence between 2002 NAICS and 1997 NAICS, and the 1997 Manual, showing the correspondence between 1997 NAICS and 1987 SIC, are also available.