Download Using Excel to do Statistics

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Types of artificial neural networks wikipedia , lookup

Foundations of statistics wikipedia , lookup

History of statistics wikipedia , lookup

Time series wikipedia , lookup

Misuse of statistics wikipedia , lookup

Transcript
Statistics: Using Excel with Assessing Data
Excel: About statistical analysis tools
Analysis ToolPak add-in Microsoft Excel provides a set of data analysis tools — called the
Analysis ToolPak — that you can use to save steps when you develop complex statistical or
engineering analyses. You provide the data and parameters for each analysis; the tool uses the
appropriate statistical or engineering macro functions and then displays the results in an output
table. Some tools generate charts in addition to output tables.
Available tools To view a list of available analysis tools, click Data Analysis on the Tools
menu. If the Data Analysis command is not on the Tools menu, you need to install the Analysis
ToolPak. The instructions for loading the ToolPak is available online here.
Related worksheet functions Excel provides many other statistical, financial, and engineering
worksheet functions. Some of the statistical functions are built-in and others become available
when you install the Analysis ToolPak.
Descriptive Statistics It is easy to analyze data using descriptive statistics in Excel because
Excel includes all common statistics such as mean, median, mode, standard deviation, etc.
1. Open Excel, then open the 800Buffer.xls and Bobolink_comps.xls [or whatever you have
named the] files.
2. Click once on a blank cell.
3. Click on Tools, then on Data Analysis. When you do this, you will see the following
screen.
4. Click once on Descriptive Statistics, then click on OK. You will see the following screen.
5. Enter the cell reference for the range of data you want to analyze. The reference must
consist of two or more adjacent ranges of data arranged in columns or rows. For our
example, we are going to choose the values for Bobolink Lane properties. You should
type the range, (example $B$3:$G$3), into the Input Range box. Or, you can select the
range from your spreadsheet by dragging the mouse to highlight the values. Do not
include the header (which includes text). Note: Make sure the values are right-justified
and are ‘numbers’.
6. Next, you need to indicate whether the data in the input range is arranged in rows or in
columns, click Rows or Columns.
If the first row of your input range contains labels, select the Labels in First Row check box. If
the labels are in the first column of your input range, select the Labels in First Column check
box. This check box is clear if your input range has no labels; Microsoft Excel generates
appropriate data labels for the output table.
7. For our example, we will check the box Grouped by Rows.
8. Select if you want to include a row in the output table for the confidence level of the
mean. In the box, enter the confidence level you want to use. For example, a value of 95
percent calculates the confidence level of the mean at a significance of 5 percent. For this
example, we are going to use a confidence level of 95. With the choices we have made
thus far, you should see the following screen.
9. Kth Largest. Select if you want to include a row in the output table for the kth largest
value for each range of data. In the box, enter the number to use for k. If you enter 1, this
row contains the maximum of the data set. We will leave this blank.
10. Kth Smallest. Select if you want to include a row in the output table for the kth smallest
value for each range of data. In the box, enter the number to use for k. If you enter 1, this
row contains the minimum of the data set. We will leave this blank.
11. Output Range. Enter the reference for the upper-left cell of the output table. This tool
produces two columns of information for each data set. The left column contains statistics
labels, and the right column contains the statistics. Microsoft Excel writes a two-column
table of statistics for each column or row in the input range, depending on the Grouped
By option selected. If you don't enter an output range, Excel might overwrite your data
table. For this example, we will enter $J$3 into the Output Range box.
Option: New Worksheet Ply
Click to insert a new worksheet in the current workbook and paste the results starting at cell A1
of the new worksheet. To name the new worksheet, type a name in the box.
Option: New Workbook
Click to create a new workbook and paste the results on a new worksheet in the new workbook.
12. Click on Summary statistics to have Microsoft Excel produce one field for each of the
following statistics in the output table: Mean, Standard Error (of the mean), Median,
Mode, Standard Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum,
Sum, Count, Largest (#), Smallest (#), and Confidence Level.
13. When you are finished with your choices, this option block should look like the
following.
14. Now, for the last step, click on OK. After some calculation time, a table will appear in
your spreadsheet with all your descriptive statistics. Open MS Word, cut and paste the
summary statistics the print (only) the table which looks like this for both spreadsheets.
Compare the statistics and discuss the impact of using boundary data versus comparable
specific data to establish real estate and/or assessing values.
Column1
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
Confidence
Level(95.0%)
113976.9
2313.874
116000
107400
14450.14
2.09E+08
1.115168
-0.17015
75500
75300
150800
4445100
39
4684.194
END OF EXERCISE
Source: http://www.mastep.sjsu.edu/learn/descriptivestatistics.htm; updated and modified by Michelle M.
Thompson ©2004.