Homework 1
F0612006 陈旻
5061209173
1. Load data “cgss05.csv” into EViews. Obtain descriptive statistics for “income”,
“edu”, and “expr”. The statistics should include number of observations, min, max,
mean, median, std, skewness, kurtosis, quantile(0.25), quantile(0.75).
Solution:
Step1: Select “file”→“new”→“workfile range”.
Step2: Select “file”→“import”→“read text-lotus-excel” and open “cgss05.csv”.
Step3: Run the commands “income.hist”, “edu.hist”, and “expr.hist”. This
gives the following graphs and statistics:
(1) Income:
[Figure: histogram of INCOME, horizontal axis 0 to 400000]
Series: INCOME    Sample: 1 5778    Observations: 5778
Mean         10165.39
Median       6000.000
Maximum      400000.0
Minimum      30.00000
Std. Dev.    14259.29
Skewness     7.549850
Kurtosis     132.1436
Jarque-Bera  4070135.
Probability  0.000000
From the graph, we can see that the skewness of income is 7.549850, much
greater than 0, and the kurtosis of income is 132.1436, much greater than 3.
The distribution is strongly skewed to the right with very heavy tails, so
income does not follow a normal distribution.
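As a check on the reasoning above, the Jarque-Bera statistic can be recomputed from the reported skewness and kurtosis. The sketch below assumes the standard formula JB = n/6 * (S^2 + (K - 3)^2 / 4); the function name is chosen here for illustration and is not an EViews command.

```python
def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness, and kurtosis.

    Assumes the standard form n/6 * (S^2 + (K - 3)^2 / 4).
    """
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Inputs copied from the INCOME output above (n = 5778).
jb_income = jarque_bera(5778, 7.549850, 132.1436)
print(round(jb_income))  # compare with the reported Jarque-Bera value 4070135
```

The result agrees with the reported value up to rounding of the inputs. Under normality the statistic would be close to 0, so a value this large rejects normality.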
(2) Education:
[Figure: histogram of EDU, horizontal axis 0 to 20]
Series: EDU    Sample: 1 5778    Observations: 5778
Mean         8.696262
Median       9.000000
Maximum      20.00000
Minimum      0.000000
Std. Dev.    4.234267
Skewness     -0.386088
Kurtosis     2.563698
Jarque-Bera  189.3775
Probability  0.000000
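From the graph, we can see that the skewness of education is -0.386088 < 0,
and the kurtosis of education is 2.563698 < 3. Both values are fairly close
to the normal benchmarks of 0 and 3, so education roughly follows a normal
distribution, although the Jarque-Bera probability of 0.000000 rejects exact
normality. It is skewed left, and because the kurtosis is below 3 it is
flatter than the normal distribution rather than peaked.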
(3) Experience:
[Figure: histogram of EXPR, horizontal axis 0 to 45]
Series: EXPR    Sample: 1 5778    Observations: 5778
Mean         20.14503
Median       20.00000
Maximum      47.00000
Minimum      1.000000
Std. Dev.    10.40013
Skewness     0.075313
Kurtosis     2.214287
Jarque-Bera  154.0881
Probability  0.000000
From the graph, we can see that the skewness of experience is 0.075313 > 0,
and the kurtosis of experience is 2.214287 < 3. The distribution is nearly
symmetric, with a slight skew to the right, and is roughly normal in shape,
although the Jarque-Bera test again rejects exact normality. Because the
kurtosis is below 3, it is flatter than the normal distribution rather
than peaked.
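The question also asks for quantile(0.25) and quantile(0.75), which the histogram output does not show. The full list of statistics can be sketched as below. The sample values are hypothetical and do not come from “cgss05.csv”, the moments use divisor n, and the quartile rule (“inclusive” interpolation) is an assumption that may differ from the one EViews uses.

```python
import statistics


def describe(values):
    """Return the descriptive statistics requested in Question 1."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments with divisor n (an assumed convention).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    q25, _, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": n,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample std, divisor n - 1
        "skewness": m3 / m2 ** 1.5,       # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,         # 3 for a normal distribution
        "q25": q25,
        "q75": q75,
    }


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data for illustration
stats = describe(sample)
for name, value in stats.items():
    print(name, value)
```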
2. Generate a new variable that is the log of income, say, “logy”. Obtain histograms,
densities, and QQ-plots (versus the standard normal distribution) for both “income”
and “logy”. Compare the distributions of these two variables.
Solution:
Step1: Run the command “genr logy = log(income)”.
Step2: Select “view”→“descriptive statistics”→“histogram and stats” for
both “income” and “logy”.
Step3: Select “view”→“distribution graphs”→“kernel density” for both
“income” and “logy”.
Step4: Select “view”→“distribution graphs”→“quantile-quantile plot” for
both “income” and “logy”.
This gives the following graphs:
(1) Histograms:
(i) Income
[Figure: histogram of INCOME, horizontal axis 0 to 400000]
Series: INCOME    Sample: 1 5778    Observations: 5778
Mean         10165.39
Median       6000.000
Maximum      400000.0
Minimum      30.00000
Std. Dev.    14259.29
Skewness     7.549850
Kurtosis     132.1436
Jarque-Bera  4070135.
Probability  0.000000
From the graph, we can see that the skewness of income is 7.549850, much
greater than 0, and the kurtosis of income is 132.1436, much greater than 3.
So income does not follow a normal distribution.
(ii) Logy
[Figure: histogram of LOGY, horizontal axis 3.75 to 12.50]
Series: LOGY    Sample: 1 5778    Observations: 5778
Mean         8.636962
Median       8.699515
Maximum      12.89922
Minimum      3.401197
Std. Dev.    1.130683
Skewness     -0.196602
Kurtosis     2.833913
Jarque-Bera  43.86347
Probability  0.000000
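From the graph, we can see that the skewness of logy is -0.196602 < 0, and
the kurtosis of logy is 2.833913 < 3. Both values are close to the normal
benchmarks of 0 and 3, so logy roughly follows a normal distribution and is
far closer to normal than income. It is slightly skewed left and, with a
kurtosis below 3, slightly flatter than the normal distribution rather than
peaked.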
(2) Densities:
(i) Income
[Figure: kernel density of INCOME (Epanechnikov, h = 2498.5); horizontal
axis 0 to 400000, vertical axis 0.00000 to 0.00008]
(ii) Logy
[Figure: kernel density of LOGY (Epanechnikov, h = 0.3984); horizontal
axis 4 to 12, vertical axis 0.0 to 0.4]
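The density graphs use the Epanechnikov kernel with bandwidth h. A minimal sketch of such an estimator is given below, assuming the textbook form f(x) = 1/(n h) * sum K((x - x_i)/h) with K(u) = 0.75 (1 - u^2) for |u| <= 1. EViews may scale the bandwidth differently, and the data values are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, data, h):
    """Kernel density estimate at point x with bandwidth h."""
    return sum(epanechnikov((x - xi) / h) for xi in data) / (len(data) * h)


data = [8.0, 8.5, 8.7, 9.1, 10.2]  # hypothetical logy values
density = kernel_density(8.7, data, h=0.3984)
print(density)
```

A larger h spreads each observation over a wider interval and gives a smoother curve, which is why the bandwidth for income (2498.5) is so much larger than the one for logy (0.3984).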
(3) QQ-plots:
(i) Income
[Figure: QQ-plot of INCOME; horizontal axis INCOME, 0 to 500000; vertical
axis Normal Quantile, -4 to 4]
We can see that the points in the graph do not lie along a straight line.
So income does not follow a normal distribution.
(ii) Logy
[Figure: QQ-plot of LOGY; horizontal axis LOGY, 2 to 14; vertical axis
Normal Quantile, -4 to 4]
We can see that the points in the graph lie roughly along a straight line.
So logy approximately follows a normal distribution.
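The QQ-plot comparison can be sketched numerically: sort the sample, pair each value with a standard normal quantile, and measure how close the pairs are to a straight line. The plotting positions (i + 0.5)/n and the correlation measure are assumptions made for this sketch, and the sample is a hypothetical log-normal one, not the survey data.

```python
import math
from statistics import NormalDist


def qq_points(values):
    """Pair each sorted value with the matching standard normal quantile."""
    n = len(values)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i + 0.5) / n), v)
            for i, v in enumerate(sorted(values))]


def qq_correlation(values):
    """Correlation of the QQ points; a value near 1 means a straight line."""
    points = qq_points(values)
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical incomes whose logarithms are normally distributed.
grid = [NormalDist(8.6, 1.1).inv_cdf((i + 0.5) / 200) for i in range(200)]
incomes = [math.exp(v) for v in grid]

r_income = qq_correlation(incomes)
r_logy = qq_correlation([math.log(v) for v in incomes])
print(r_income, r_logy)
```

In this constructed example the logged values give a correlation very close to 1, while the raw values give a clearly smaller one, which mirrors the curved and straight patterns described above.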
3. Obtain scatter plots between “logy” and “edu”, between “logy” and “expr”.
Solution:
Step1: Select “quick”→“graph”→“logy edu” →“scatter diagram”.
Step2: Select “quick”→“graph”→“logy expr” →“scatter diagram”.
Then we can get the scatter diagrams as follows:
(1) Between “logy” and “edu”
[Figure: scatter plot with LOGY on the horizontal axis (2 to 14) and EDU
on the vertical axis (0 to 25)]
The graph shows an upward trend: subjects with more years of education tend
to have a larger logy. That is to say, the longer the subjects are educated,
the more income they tend to earn.
(2) Between “logy” and “expr”
[Figure: scatter plot with LOGY on the horizontal axis (2 to 14) and EXPR
on the vertical axis (0 to 50)]
We cannot see any obvious trend in the graph. That is to say, experience does
not appear to be directly related to logy (or to income).
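A trend read off a scatter plot can be backed up with a correlation coefficient. The sketch below computes the Pearson correlation for two small hypothetical samples; it is not a result from the survey data, and the function name is chosen here for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only.
edu = [0.0, 6.0, 9.0, 12.0, 16.0]
logy = [7.0, 7.6, 7.9, 8.2, 8.6]   # rises with edu
expr = [30.0, 5.0, 40.0, 10.0, 25.0]

print(pearson(edu, logy))   # close to 1: clear upward trend
print(pearson(expr, logy))  # much weaker: no clear trend
```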
4. Select males from the sample. Obtain descriptive statistics and graphs for the
subsample. (Note: you can use menu: Sample in the Workfile window to do the
sample selection.)
Solution:
Step1: Select “sample” and enter “female = 0” in the “if condition” box.
Step2: Select “view”→“descriptive statistics”→“histogram and stats” for
each of “income”, “edu”, and “expr”. This gives the following graphs:
(1) Income:
[Figure: histogram of INCOME for males, horizontal axis 0 to 200000]
Series: INCOME    Sample: 1 5778    Observations: 2937
Mean         11961.98
Median       8000.000
Maximum      200000.0
Minimum      100.0000
Std. Dev.    14441.47
Skewness     4.680376
Kurtosis     41.24098
Jarque-Bera  189680.8
Probability  0.000000
From the graph, we can see that the skewness of income is 4.680376, much
greater than 0, and the kurtosis of income is 41.24098, much greater than 3.
So the income of males does not follow a normal distribution.
(2) Education:
[Figure: histogram of EDU for males, horizontal axis 0 to 20]
Series: EDU    Sample: 1 5778    Observations: 2937
Mean         9.433776
Median       9.000000
Maximum      20.00000
Minimum      0.000000
Std. Dev.    3.769200
Skewness     -0.390185
Kurtosis     2.966716
Jarque-Bera  74.65924
Probability  0.000000
From the graph, we can see that the skewness of education is -0.390185 < 0,
and the kurtosis of education is 2.966716, just below 3. So the education of
males roughly follows a normal distribution, although the Jarque-Bera
probability of 0.000000 rejects exact normality. It is skewed left, and its
peakedness is almost the same as that of the normal distribution.
(3) Experience:
[Figure: histogram of EXPR for males, horizontal axis 0 to 45]
Series: EXPR    Sample: 1 5778    Observations: 2937
Mean         20.29213
Median       21.00000
Maximum      47.00000
Minimum      1.000000
Std. Dev.    10.46353
Skewness     0.033172
Kurtosis     2.154916
Jarque-Bera  87.93481
Probability  0.000000
From the graph, we can see that the skewness of experience is 0.033172 > 0,
and the kurtosis of experience is 2.154916 < 3. So the experience of males
is almost symmetric, with a very slight skew to the right, and is roughly
normal in shape. Because the kurtosis is below 3, it is flatter than the
normal distribution rather than peaked.
5. Do the same things for the female sample.
Solution:
Step1: Select “sample” and enter “female = 1” in the “if condition” box.
Step2: Select “view”→“descriptive statistics”→“histogram and stats” for
each of “income”, “edu”, and “expr”. This gives the following graphs:
(1) Income:
[Figure: histogram of INCOME for females, horizontal axis 0 to 400000]
Series: INCOME    Sample: 3 5777    Observations: 2841
Mean         8308.094
Median       4600.000
Maximum      400000.0
Minimum      30.00000
Std. Dev.    13827.69
Skewness     11.29411
Kurtosis     256.2755
Jarque-Bera  7653971.
Probability  0.000000
From the graph, we can see that the skewness of income is 11.29411, much
greater than 0, and the kurtosis of income is 256.2755, much greater than 3.
So the income of females does not follow a normal distribution.
(2) Education:
[Figure: histogram of EDU for females, horizontal axis 0 to 20]
Series: EDU    Sample: 3 5777    Observations: 2841
Mean         7.933826
Median       9.000000
Maximum      20.00000
Minimum      0.000000
Std. Dev.    4.543048
Skewness     -0.248314
Kurtosis     2.182541
Jarque-Bera  108.2987
Probability  0.000000
From the graph, we can see that the skewness of education is -0.248314 < 0,
and the kurtosis of education is 2.182541 < 3. So the education of females
is only roughly normal in shape. It is skewed left, and because the kurtosis
is below 3 it is flatter than the normal distribution rather than peaked.
(3) Experience:
[Figure: histogram of EXPR for females, horizontal axis 0 to 45]
Series: EXPR    Sample: 3 5777    Observations: 2841
Mean         19.99296
Median       20.00000
Maximum      46.00000
Minimum      1.000000
Std. Dev.    10.33383
Skewness     0.119463
Kurtosis     2.282415
Jarque-Bera  67.71220
Probability  0.000000
From the graph, we can see that the skewness of experience is 0.119463 > 0,
and the kurtosis of experience is 2.282415 < 3. So the experience of females
is roughly normal in shape. It is slightly skewed right, and because the
kurtosis is below 3 it is flatter than the normal distribution rather than
peaked.
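The sample selection used in Questions 4 and 5 amounts to filtering the rows on the “female” variable. A minimal sketch outside EViews is given below. The column names follow the variables used in this homework, but the rows are hypothetical and the real layout of “cgss05.csv” is an assumption.

```python
import csv
import io
import statistics

# Hypothetical rows standing in for the real file.
raw = """income,edu,expr,female
8000,9,21,0
4600,9,20,1
12000,12,15,0
3000,6,30,1
"""

rows = list(csv.DictReader(io.StringIO(raw)))


def subsample(rows, female_flag):
    """Keep only the rows whose 'female' column equals female_flag."""
    return [row for row in rows if int(row["female"]) == female_flag]


males = subsample(rows, 0)    # same condition as "female = 0"
females = subsample(rows, 1)  # same condition as "female = 1"

male_mean_income = statistics.fmean([float(r["income"]) for r in males])
female_mean_income = statistics.fmean([float(r["income"]) for r in females])
print(len(males), male_mean_income)
print(len(females), female_mean_income)
```

The two subsamples together cover the whole sample, just as the 2937 male and 2841 female observations add up to the 5778 observations in the full data.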