Solution:

a. Data analysis using descriptive statistics

1) Calculate the measures of central tendency, dispersion, and skew for your data

The descriptive statistics for each variable are given below. All were computed with the MegaStat add-in for Microsoft Excel 2007.

Descriptive statistics for Price, Size and Distance

Statistic                          Price            Size              Distance
count                              105              105               105
mean                               221.103          2,223.81          14.63
sample variance                    2,218.919        61,831.50         23.75
sample standard deviation          47.105           248.66            4.87
minimum                            125              1600              6
maximum                            345.3            2900              28
range                              220.3            1300              22
sum                                23,215.800       233,500.00        1,536.00
sum of squares                     5,363,847.300    525,690,000.00    24,940.00
deviation sum of squares (SSX)     230,767.589      6,430,476.19      2,470.51
population variance                2,197.787        61,242.63         23.53
population standard deviation      46.881           247.47            4.85
standard error of the mean         4.597            24.27             0.48
skewness                           0.474            0.32              0.40
kurtosis                           -0.277           0.60              -0.17
coefficient of variation (CV)      21.30%           11.18%            33.32%
1st quartile                       187.000          2,100.00          11.00
median                             213.600          2,200.00          15.00
3rd quartile                       251.400          2,400.00          18.00
interquartile range                64.400           300.00            7.00
mode                               188.300          2,100.00          16.00

Descriptive statistics for Bedrooms, Twnship and Baths

Statistic                          Bedrooms         Twnship           Baths
count                              105              105               105
minimum                            2                1                 1.5
maximum                            8                5                 3
range                              6                4                 1.5
1st quartile                       3.00             2.00              2.000
median                             4.00             3.00              2.000
3rd quartile                       5.00             4.00              2.000
interquartile range                2.00             2.00              0.000
mode                               4.00             4.00              2.000

2) Display your descriptive statistical data using graphic and tabular techniques

a) Line graph
[Figure: line diagram for the bedrooms in the given data]

b) Bar graph
[Figure: bar diagram for the price of the house]
[Figure: bar diagram for the size of the house]

3) Based on your measures of central tendency and graphs, discuss the best measures of central tendency and dispersion of your data. Justify your selection

No further justification is needed for the measures of central tendency, because the calculations that matter for each variable are already reported in the descriptive statistics part.

b.
Conclusions

Discuss whether your research findings answered your problem statement (research question), or whether more research might be necessary

Regression analysis is a valid and appropriate test for our research, where

The dependent variable is Y = Price
The independent variables are
X1: Bedrooms
X2: Size
X3: Pool
X4: Distance
X5: Twnship
X6: Garage
X7: Baths

The corresponding regression coefficients are β1, β2, β3, β4, β5, β6 and β7.

Hypothesis:
Null hypothesis: H0: The regression coefficients of Bedrooms, Size, Pool, Distance, Twnship, Garage and Baths are all equal to zero
Alternative hypothesis: H1: At least one of the regression coefficients of Bedrooms, Size, Pool, Distance, Twnship, Garage and Baths is not equal to zero

The multiple regression equation can be found using Microsoft Excel. The steps are:
Data -> Data Analysis -> Regression

The Microsoft Excel output is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.730476
R Square              0.533595
Adjusted R Square     0.499937
Standard Error        33.31064
Observations          105

ANOVA
              df      SS          MS          F           Significance F
Regression    7       123136.5    17590.93    15.85341    1.01E-13
Residual      97      107631.1    1109.599
Total         104     230767.6

              Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept     62.24869       40.91404         1.52145    0.1314     -18.9544    143.451
Bedrooms      7.375498       2.590021         2.84765    0.00537    2.235023    12.5159
Size          0.038627       0.014755         2.61795    0.01026    0.009343    0.06791
Pool          -19.1114       7.126553         -2.6817    0.00861    -33.2557    -4.9672
Distance      -1.01267       0.741385         -1.3659    0.17512    -2.48411    0.45877
Twnship       -1.73901       2.699416         -0.6442    0.52095    -7.0966     3.61858
Garage        35.49802       7.675838         4.62464    1.16E-0    20.2636     50.7324
Baths         23.09255       9.058308         2.54932    0.01236    5.114313    41.0707

The linear regression equation is

Y = 62.24869 + 7.375498 X1 + 0.038627 X2 - 19.1114 X3 - 1.01267 X4 - 1.73901 X5 + 35.49802 X6 + 23.09255 X7

Conclusion: Since the p-value of the overall F test (Significance F = 1.01E-13) is far below the conventional 0.05 significance level, we reject the null hypothesis. Hence we conclude that at least one of the regression coefficients of Bedrooms, Size, Pool, Distance, Twnship, Garage and Baths is not equal to zero.
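
The figures in the Excel summary output can be cross-checked by hand. The sketch below is only an illustration in Python, not part of the original Excel workflow. It recomputes the mean squares, the F statistic, R Square, Adjusted R Square, the standard error and the t statistics from the reported sums of squares and coefficients. It then evaluates the fitted equation for one house. The function name predict_price and the example house are hypothetical, and the coding of Pool and Garage as 0/1 indicators is an assumption, since the source does not state how they are coded.

```python
# Values copied from the Excel SUMMARY OUTPUT reported in the text.
n_obs = 105            # Observations
k = 7                  # number of independent variables
ss_regression = 123136.5
ss_residual = 107631.1
ss_total = 230767.6

# ANOVA quantities: MS = SS / df, F = MS_regression / MS_residual.
df_residual = n_obs - k - 1
ms_regression = ss_regression / k
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Goodness-of-fit measures derived from the same sums of squares.
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
std_error = ms_residual ** 0.5

# (coefficient, standard error) pairs from the coefficient table.
coefficients = {
    "Intercept": (62.24869, 40.91404),
    "Bedrooms": (7.375498, 2.590021),
    "Size": (0.038627, 0.014755),
    "Pool": (-19.1114, 7.126553),
    "Distance": (-1.01267, 0.741385),
    "Twnship": (-1.73901, 2.699416),
    "Garage": (35.49802, 7.675838),
    "Baths": (23.09255, 9.058308),
}

# Each t statistic is the coefficient divided by its standard error.
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}


def predict_price(house):
    """Evaluate the fitted regression equation for one house.

    `house` maps each predictor name to its value. Hypothetical helper;
    Pool and Garage are ASSUMED to be 0/1 indicators.
    """
    price = coefficients["Intercept"][0]
    for name, value in house.items():
        price += coefficients[name][0] * value
    return price


# Hypothetical example house, chosen near the sample medians.
example_house = {
    "Bedrooms": 4, "Size": 2200, "Pool": 1, "Distance": 15,
    "Twnship": 3, "Garage": 1, "Baths": 2,
}
predicted = predict_price(example_house)

print(f"F = {f_stat:.4f}, R^2 = {r_squared:.6f}, adj R^2 = {adj_r_squared:.6f}")
print(f"standard error = {std_error:.5f}")
for name, t in t_stats.items():
    print(f"t({name}) = {t:.4f}")
print(f"predicted price for the example house = {predicted:.2f}")
```

The recomputed values should agree with the Excel output up to rounding, which confirms that the reconstructed table is internally consistent. The prediction is in the same units as Price in the data set.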