The Scientific Method
Lecture 3: Data Recording and Transformation

Recording Data

Field or experimental data must be recorded in a planned way.

Variables under investigation may be:
  direct measurements, e.g. weight, length, amount, pH
  category frequencies, e.g. numbers of a species, length range, colour
  derived measurements, e.g. %, numbers / area, amount / time, amount / area / time

Data logging tables must:
  record all values needed to derive the value of a variable
  permit derivation calculations to be recorded alongside the raw data
  have clear headings and units for each value

Frequency data are normally recorded as a tally, with each fifth stroke crossing out the previous four for quick mental addition.

Areas to be sampled can be measured out on a grid system (e.g. 0.5 m intervals for 0.5 x 0.5 m quadrats), with sample quadrat positions chosen from random number tables.

Map references are given as Eastings (read along the horizontal axis) first, then Northings (read up the vertical axis) second.

Fixed independent variables (e.g. temperature, humidity, container size) which have the potential to affect the value of the dependent variable must also be recorded. Preliminary experiments may be needed to set values for these, e.g. a temperature at which a bacterial culture will grow well.

Laboratory instruments must be calibrated before recording variable values, e.g.:
  pH meters are checked / reset against buffer solutions
  spectrophotometers must be zeroed against a blank solution containing reagents but no product, then read against a range of known concentrations of the product; these readings are used to plot a calibration curve for the instrument

Consistent rounding of decimal numbers (up or down!)
and correct choice of significant figures to reflect the accuracy of measurements is very important. Rounding up is conventional in scientific work.

Data Transformation

Transformation summarises and highlights trends in the data, e.g.:
  Totals – sum all the data values for a variable; useful for comparison and other purposes
  Percentages – describe the proportion of data falling into particular categories
  Rates – show how a variable changes with time and allow comparison of data recorded over different time periods
  Reciprocals (1 ÷ variable) – reverse the magnitude of a variable and can help data interpretation
  Relative values – express data in relation to a standard value, providing context or helping application, e.g. egg output per 1000 hens per month, energy requirement per kg body weight

Now attempt the two data transformation exercises in your workbook!

Descriptive Statistics

Three important mathematical descriptions of the distribution of data:
  Empirical frequency distributions
  Measures of location
  Measures of dispersion

Frequency Distributions

Frequency distributions show the frequency of occurrence of observations in a data set.

Qualitative, non-numerical and discrete data (for at least one variable) are usually depicted in a bar chart. The data are discontinuous, so the bars do not touch. Data values may be entered on or above the bars, and multiple data sets can be displayed using different coloured/hatched bars side by side.

[Figure: bar chart, "Mean frequency of vole jawbones in owl pellets". X axis: Habitat; Y axis: 0% to 10%. Bar labels: Woodland 4%, Grassland 7%, Cornfield 2%, Riverbank 9%.]

A pie diagram displays the relative frequency of data in each category (shown numerically above the bars in the bar chart). Absolute values may be entered alongside the
segments.

[Figure: pie diagram of the vole jawbone data. Segment labels: Woodland, 4%; Cornfield, 2%; Grassland, 7%; Riverbank, 9%.]

Continuous data is usually depicted in a histogram. The bars touch to depict continuity. The X (horizontal) axis usually records the class interval.

[Figure: histogram, "Size Frequency in a Sample of Perch". X axis: class intervals from 0–1.9 to 14.0–15.9; Y axis: frequency, 0 to 60.]

This distribution is skewed to the left.

Class intervals must be even and clearly defined such that an observation can fall INTO ONE CLASS ONLY, e.g. 0 – 0.99, 1.00 – 1.99, 2.00 – 2.99, 3.00 – 3.99.

It is sometimes helpful, when comparing two or more frequency distributions where the total numbers of observations differ, to calculate relative frequency or cumulative relative frequency distributions. A plot of a cumulative frequency distribution is called an ogive.

Measures of Location (Averages)

"Average" refers to several measures of the central tendency of a data set.

► arithmetic mean ($\bar{x}$) – if x is a continuous variable and there are n observations in the sample, then the sample mean is $\bar{x} = \frac{\sum x}{n}$, where $\sum$ (sigma) = "sum of".

The mean is a good measure of central tendency when the data is distributed symmetrically, but it will be distorted by a few excessively small or large values of x (outliers).

► median – the central value in a set of n observations arranged in rank order, with as many observations above it as below it. If n is an odd number, the median = the $\frac{n + 1}{2}$th observation, counting from the smallest. If n is an even number, the median is half-way between the central two values.

► mode – the most commonly occurring observation in a data set.
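The mean, median and mode defined above can be checked with a short Python sketch. This is only an illustration: the data set `lengths_cm` is hypothetical (it is not from the lecture), and the function names are made up for this example. The deliberately large value 21 shows how a single outlier pulls the mean upwards while leaving the median and mode unchanged.

```python
from collections import Counter


def arithmetic_mean(values):
    """Sample mean: the sum of the observations divided by n."""
    return sum(values) / len(values)


def median(values):
    """Central value of the ranked observations."""
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th observation, which is index n // 2
        # because Python counts from 0.
        return ranked[mid]
    # Even n: half-way between the two central values.
    return (ranked[mid - 1] + ranked[mid]) / 2


def modes(values):
    """All values that occur most often (there may be more than one)."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical sample; 21 is an outlier.
lengths_cm = [4, 5, 5, 6, 7, 8, 21]
without_outlier = lengths_cm[:-1]

print("mean with outlier:   ", arithmetic_mean(lengths_cm))
print("mean without outlier:", arithmetic_mean(without_outlier))
print("median (odd n):      ", median(lengths_cm))
print("median (even n):     ", median(without_outlier))
print("mode(s):             ", modes(lengths_cm))
```

Returning a list from `modes` is a design choice for this sketch: a data set can have more than one most common value, and a single number would hide that.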
The modal class is the group or class into which most observations fall in a histogram.

In a perfectly symmetrically distributed data set, mean, median and mode have the same value.

Measures of Dispersion

Four main expressions of the spread of data:

► Range – the difference between the largest and the smallest observations

► Interquartile range – the range of values enclosing the central 50% of the observations when they are arranged in order of magnitude (ranked)

► Variance – based on the deviation of each observation from the arithmetic mean. The variance is a very useful measure of data dispersion. Because some of the deviations will be negative, they are squared to make them all positive, and the sample variance ($s^2$) is calculated as:

  $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

$s^2$ is used to denote the sample variance and distinguish it from the population variance, which is given the symbol $\sigma^2$ and calculated by dividing by n rather than (n – 1).

An equivalent form for calculation is $s^2 = \frac{\sum x^2 - (\sum x)^2 / n}{n - 1}$. With a divisor of n, this becomes the mean of the squares minus the square of the mean.

► Standard deviation (s or SD) – the square root of the variance and the most popular measure of dispersion. It represents the typical size of the deviations of the observations from the arithmetic mean.

The population standard deviation ($\sigma$) is calculated by using n rather than (n – 1), in the same way that $\sigma^2$ represents the population variance. The population mean is given the symbol µ (mu).
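The four measures of dispersion can be worked through in a short Python sketch. The `observations` list is hypothetical, and the helper names are assumptions made for this example. Quartiles can be defined in several slightly different ways, so the interquartile range below uses the default method of the standard library's `statistics.quantiles` and may differ a little from a hand calculation that follows another convention.

```python
import math
import statistics


def sample_variance(values):
    """s^2: sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """sigma^2: the same sum of squared deviations, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def shortcut_sample_variance(values):
    """Calculation form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


# Hypothetical sample of eight observations.
observations = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(observations) - min(observations)

# Quartile cut points; the convention used is the library default.
q1, _q2, q3 = statistics.quantiles(observations, n=4)
iqr = q3 - q1

s2 = sample_variance(observations)
sigma2 = population_variance(observations)
s = math.sqrt(s2)          # sample standard deviation
sigma = math.sqrt(sigma2)  # population standard deviation

print("range:", data_range)
print("interquartile range:", iqr)
print("sample variance s^2:", s2, " SD s:", s)
print("population variance sigma^2:", sigma2, " SD sigma:", sigma)
```

For this sample the sum of squared deviations is 32, so dividing by n = 8 gives a population variance of 4, while dividing by (n – 1) = 7 gives the slightly larger sample variance.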