Wednesday, 13 January 2016

Unit 4 sec 4.3 standard deviation

12 December 2015
23:32

The best known measure of spread is the standard deviation, or SD. The bad news is that, using pencil and paper, it is hard work to calculate the standard deviation, particularly with large datasets. The good news is that, these days, once the data have been keyed in, a calculator or computer can work out the standard deviation in a flash. But before becoming totally reliant on a machine, it is a good idea to perform one or two pencil and paper calculations of the standard deviation using very simple datasets.
  An alternative name for the standard deviation is the RMS deviation - in full, the root mean squared deviation. Literally, it is the (square) root of the mean of the squared deviations. This complicated name will make more sense when you follow through the steps involved in the calculation.

Strategy to find the standard deviation of a dataset
1.  Find the mean of the dataset.
2.  Find the difference of each value from the mean - these are the deviations, often labelled as the d values.
3.  Square each deviation - this gives the d² values.
4.  Find the mean of these squared deviations - this number is the mean squared deviation, better known as the variance.
5.  Take the square root of the variance to get the root mean squared deviation - that is, the standard deviation.
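
The five steps above can be sketched in Python. This is a minimal illustration, not part of the module materials: the function name standard_deviation and the sample dataset are assumptions chosen for the example. Like the strategy, it divides by the number of values in step 4.

```python
import math


def standard_deviation(values):
    """Standard deviation of a dataset, following the five steps above.

    The function name is hypothetical, chosen for this illustration.
    """
    n = len(values)
    mean = sum(values) / n                    # step 1: the mean
    deviations = [x - mean for x in values]   # step 2: the d values
    squared = [d ** 2 for d in deviations]    # step 3: the d² values
    variance = sum(squared) / n               # step 4: the variance
    return math.sqrt(variance)                # step 5: the standard deviation


# Illustrative dataset (an assumption, not taken from the notes).
print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))
```

For this dataset the mean is 5, the squared deviations add up to 32, the variance is 32/8 = 4, and so the standard deviation is 2.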

Example 5 finding a standard deviation
Find the standard deviation of the following dataset.
1, 2, 4, 6, 7
Solution
Find the mean.
Mean = (1 + 2 + 4 + 6 + 7)/5 = 20/5 = 4.
Subtract the mean from each data value to find the deviations.
The deviations are -3, -2, 0, 2, 3.
Square the deviations.
The squared deviations are 9, 4, 0, 4, 9.
Calculate the mean of the squared deviations to find the variance.
The variance is (9 + 4 + 0 + 4 + 9)/5 = 26/5 = 5.2.
The standard deviation is the square root of the variance.
So the standard deviation is √5.2 = 2.3 (to 1 d.p.).
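
The result of Example 5 can be checked on a computer. The sketch below uses Python's standard statistics module, which is an assumption of this illustration and not the tool used in the module. The functions pvariance and pstdev divide by the number of values, as in the steps above. The similarly named stdev divides by one less than the number of values, so it gives a different answer.

```python
import statistics

data = [1, 2, 4, 6, 7]  # the dataset from Example 5

variance = statistics.pvariance(data)  # mean of the squared deviations
sd = statistics.pstdev(data)           # square root of that variance

print(variance)
print(round(sd, 1))

# For comparison: stdev divides by n - 1 instead of n,
# so it does not reproduce the pencil and paper answer.
sample_sd = statistics.stdev(data)
print(round(sample_sd, 1))
```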

Students often find the calculation of the standard deviation rather complicated, and the steps hard to remember. It can be helpful to think about some of the ideas visually.
 You may have wondered why it is necessary to square the deviations in step 3 of the calculation. In order to see the point of this, consider what would have happened if you had not squared them.
  If the deviations from Example 5 are averaged without squaring, the calculation is (-3 + -2 + 0 + 2 + 3)/5 = 0/5 = 0. As you can see, the positive and negative deviations have cancelled each other out, leaving a numerator of 0. So the value of the mean deviation is 0. This will always be true of the mean deviation: the positive and negative deviations will always cancel each other out, leaving an answer of 0. You may like to try this yourself with some other examples.
 It is to avoid this problem that the deviations are squared in step 3 (making them positive), and this is then undone by taking the square root in step 5.
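
The cancellation described above can be tried out on other datasets with a short Python sketch. The function name mean_deviation and the second dataset are assumptions made for this illustration. Exact fractions are used so that rounding inside the computer does not leave a tiny non-zero remainder.

```python
from fractions import Fraction


def mean_deviation(values):
    """Mean of the unsquared deviations from the mean (hypothetical helper)."""
    exact = [Fraction(v) for v in values]   # exact arithmetic, no rounding
    mean = sum(exact) / len(exact)
    deviations = [x - mean for x in exact]
    return sum(deviations) / len(exact)


print(mean_deviation([1, 2, 4, 6, 7]))   # the dataset from Example 5
print(mean_deviation([3, 10, 11, 50]))   # an arbitrary illustrative dataset
```

Both results are 0, whatever values are supplied, which is why the unsquared deviations cannot be used to measure spread.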
  Although calculating the standard deviation by pencil and paper is quite hard work, rest assured that these days it is normally done on a calculator or computer. As you will see later, the module resource Dataplotter calculates and displays it and other statistical summaries automatically. There are a number of reasons why the standard deviation is a useful measure of spread, and here are two of the main ones.

Reasons for using the standard deviation as a measure of spread
The standard deviation is the best known and most commonly used measure of spread.
 All the values in the dataset are included in its calculation.

 (However, unlike the interquartile range, its value can be to some extent distorted by outliers.)
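
The effect of an outlier on the two measures can be seen in a small Python sketch. The two datasets and the function name spread_summary are assumptions made for this illustration. The quartiles come from statistics.quantiles (Python 3.8 or later), whose convention for locating quartiles may differ from the one used in the module, so the interquartile range values are indicative only.

```python
import statistics


def spread_summary(values):
    """Return (standard deviation, interquartile range) for a dataset.

    Hypothetical helper; quartiles follow the default convention of
    statistics.quantiles, which may differ from the module's method.
    """
    sd = statistics.pstdev(values)
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return sd, q3 - q1


typical = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # largest value replaced by an outlier

sd_a, iqr_a = spread_summary(typical)
sd_b, iqr_b = spread_summary(with_outlier)

print("SD: ", round(sd_a, 1), "->", round(sd_b, 1))
print("IQR:", iqr_a, "->", iqr_b)
```

In this illustration the interquartile range is unchanged by the outlier, while the standard deviation becomes many times larger.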
