SAMANWAYAM: January 2017

MEASURES OF VARIABILITY

INTRODUCTION

The three measures of central tendency (the mean, the median, and the mode) each give a single value that is typical or representative of a set of scores as a whole. The next step is to find some measure of the variability of the scores, that is, of the “scatter” or “spread” of the separate scores around their central tendency. If a group is homogeneous, that is, made up of individuals of nearly the same ability, most of its scores will fall around the same point on the scale, the range will be relatively short and the variability small. But if the group contains individuals of widely differing capacities, scores will be strung out from high to low, the range will be relatively wide and the variability large.

Four measures have been devised to indicate the variability or dispersion within a set of scores. These are (1) the range, (2) the quartile deviation or Q, (3) the standard deviation or SD, and (4) the average deviation. Here we focus on three of them: the range, the quartile deviation, and the standard deviation.

CALCULATION OF MEASURES OF DISPERSION

1) THE RANGE

The range may be defined as the interval between the highest and the lowest scores. The range is the most general measure of spread or scatter, and is computed when we wish to make a rough comparison of two or more groups for variability. The range takes account of only the extremes of the series of scores, and is unreliable when N is small or when there are large gaps (zero f’s) in the frequency distribution. Suppose that the highest score in a distribution is 120, that there is a gap of 20 points before we reach the next score of 100, and that the lowest score is 60. The single high score of 120 then increases the range from 40 (100 – 60) to 60 (120 – 60).
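The effect of a single extreme score on the range can be checked with a few lines of Python. This is only a sketch: the function name score_range and the list of scores are illustrative assumptions, chosen so that the lowest score is 60, the next-to-highest is 100 and the highest is 120.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical scores (not from the source): lowest 60, a gap between 100 and 120.
scores = [60, 72, 85, 90, 100, 120]

print(score_range(scores))       # 60, with the single high score of 120 included
print(score_range(scores[:-1]))  # 40, once the score of 120 is set aside
```

One score out of six changes the range by half its value, which is why the range is treated as a rough measure only.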

USE OF RANGE

a) When the data are too scant or too scattered to justify the computation of a more precise measure of variability.

b) When a knowledge of extreme scores or of total spread is all that is wanted.

MERITS OF RANGE

It can be easily calculated and understood.

DEMERITS OF RANGE

a) It helps us to make only a rough comparison of two or more groups with respect to the variability of the scores concerned.

b) It is greatly affected by fluctuations in sampling, so its value is unstable from one sample to the next.

c) It takes into account only the two extreme end scores of a distribution and is unreliable when N is small or when there are large gaps in the frequency distribution.

d) The range does not take into account the composition of a group or the nature of the distribution of the scores within the extremes. The range of a symmetrical and of an asymmetrical distribution can be identical.

2) THE QUARTILE DEVIATION

The quartile deviation or Q is one-half the scale distance between the 75th and 25th percentiles in a frequency distribution. The 25th percentile or Q₁ is the first quartile on the score scale, the point below which lie 25% of the scores. The 75th percentile or Q₃ is the third quartile on the score scale, the point below which lie 75% of the scores. The quartile deviation is expressed by the formula:

Q = (Q₃-Q₁)/2

To find Q, we must first compute the 75th and 25th percentiles. These statistics are found in exactly the same way as the median, which is the 50th percentile or Q₂. The only difference is that ¼ of N is counted off from the low end of the distribution to find Q₁ and ¾ of N is counted off to find Q₃. The formulas are:

Q₁= l + [i (N/4 – Cum f₁)]/ f_q

Q₃= l + [i (3N/4 – Cum f₁)] / f_q

Where l = the exact lower limit of the interval in which the quartile falls,

i = the length of the interval,

Cum f₁ = the cumulative f up to (that is, below) the interval which contains the quartile, and

f_q = the f on the interval containing the quartile.

The quartile deviation of ungrouped data can be calculated as follows:

Scores: 12, 12, 15, 17, 20, 25, 25, 26, 33, 37, and 4

Q = (Q₃ - Q₁)/2

Since there are 11 scores, Q₁ is the (N + 1)/4 = 3rd score and Q₃ is the 3(N + 1)/4 = 9th score, counting from the low end. So,

Q = (9th term – 3rd term)/2

= (33 – 15)/2

= 9
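The same counting can be sketched in Python. The function name quartile_deviation is an illustrative assumption, and the sketch only handles the simple case where N + 1 is a multiple of 4, so that Q₁ and Q₃ fall exactly on scores. The list above prints its last score as 4; because the worked example counts the 3rd and 9th scores of the list as it stands, the sketch assumes the scores are in ascending order and that the last score was 40. That value is an assumption, not taken from the source; any final score of 37 or more gives the same Q.

```python
def quartile_deviation(scores):
    """Return (Q1, Q3, Q), taking Q1 and Q3 as the (N+1)/4-th and 3(N+1)/4-th ordered scores."""
    ordered = sorted(scores)
    n = len(ordered)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch needs N + 1 to be a multiple of 4")
    q1 = ordered[(n + 1) // 4 - 1]       # 1-based position converted to 0-based index
    q3 = ordered[3 * (n + 1) // 4 - 1]
    return q1, q3, (q3 - q1) / 2

# The final score (40) is ASSUMED; the source prints it as 4.
scores = [12, 12, 15, 17, 20, 25, 25, 26, 33, 37, 40]
q1, q3, q = quartile_deviation(scores)
print(q1, q3, q)  # 15 33 9.0
```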

Quartile deviation from grouped data can be calculated as follows:

Class	Frequency	Cumulative frequency
120 – 139	50	1000
100 – 119	150	950
80 – 99	500	800
60 – 79	250	300
40 – 59	50	50

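N = 1000, so N/4 = 250, which falls in the interval 60 – 79, and 3N/4 = 750, which falls in the interval 80 – 99.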

Q₁ = l + [i (N/4 – Cum f₁)]/ f_q

= 59.5 + [20 (250 – 50)] / 250

= 75.5

Q₃ = l + [i (3N/4 – Cum f₁)] / f_q

= 79.5 + [20 (750 – 300)] / 500

= 97.5

Q = (Q₃ – Q₁) / 2

= (97.5 – 75.5) / 2

= 11
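The grouped-data calculation can be checked with a short Python sketch. The function name grouped_quantile and the layout of the classes list are illustrative assumptions; the scores are assumed to be whole numbers, so that the exact lower limit of an interval lies 0.5 below its lowest score (59.5 for the interval 60 – 79).

```python
def grouped_quantile(classes, fraction):
    """Point on the score scale below which `fraction` of the N scores lie.

    classes: (lowest score, highest score, frequency) per interval, lowest interval first.
    """
    n = sum(f for _, _, f in classes)
    target = fraction * n              # N/4 for Q1, 3N/4 for Q3
    cum_below = 0                      # Cum f1: cumulative f below the current interval
    for low, high, f in classes:
        if f > 0 and cum_below + f >= target:
            exact_lower = low - 0.5    # l, the exact lower limit
            width = high - low + 1     # i, the length of the interval
            return exact_lower + width * (target - cum_below) / f
        cum_below += f
    raise ValueError("fraction must lie between 0 and 1")

# Frequency table from the text, listed from the lowest class upward.
classes = [(40, 59, 50), (60, 79, 250), (80, 99, 500), (100, 119, 150), (120, 139, 50)]

q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
print(q1, q3, (q3 - q1) / 2)  # 75.5 97.5 11.0
```

Calling the same function with a fraction of 0.5 gives the median, since the median is found in the same way as the quartiles.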

USE OF QUARTILE DEVIATION

a) When the median is taken as a measure of central tendency.

b) When the distribution is incomplete at either end, so that the details of the extreme scores are not available.

c) When there are scattered or extreme scores which would influence the standard deviation disproportionately.
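Point (c) can be illustrated with a small Python sketch. The two lists of scores are hypothetical, invented only for this comparison, and the helper name quartile_deviation is an assumption. The sketch uses statistics.quantiles (available from Python 3.8) for the quartiles and statistics.pstdev for the SD computed with N in the denominator.

```python
import statistics

def quartile_deviation(scores):
    """Q = (Q3 - Q1) / 2, with the quartiles taken from statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # returns [Q1, Q2, Q3]
    return (q3 - q1) / 2

# Hypothetical scores, invented for illustration only.
typical = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]
with_extreme = typical[:-1] + [130]   # top score replaced by one extreme score

for label, scores in (("typical", typical), ("with extreme score", with_extreme)):
    sd = statistics.pstdev(scores)
    print(label, "Q =", quartile_deviation(scores), "SD =", round(sd, 2))
```

Q is the same for both lists because it depends only on Q₁ and Q₃, while the SD of the second list is several times larger because the one extreme deviation is squared.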

MERITS OF QUARTILE DEVIATION

a) It is a more representative and trustworthy measure of variability than the range.

b) It is a good index of score density at the middle of the distribution.

c) It is useful in indicating the skewness of a distribution.

d) Like the median, it is applicable to open-end distributions.

DEMERITS OF QUARTILE DEVIATION

a) It is not capable of further algebraic treatment.

b) It is possible for two distributions to have equal quartile deviation, but quite dissimilar variability at the lower and upper 25% scores. This may lead to incorrect conclusions.

c) It is unduly affected by a considerable clustering of scores at any one end of a distribution.

3) THE STANDARD DEVIATION

The standard deviation or SD is the most stable index of variability and is customarily employed in experimental work and in research studies. The SD differs from the average deviation in several respects. In computing the average deviation, we disregard signs and treat all deviations as positive, whereas in finding the SD we avoid the difficulty of signs by squaring the separate deviations. Again, the squared deviations used in computing the SD are always taken from the mean, never from the median or mode. The conventional symbol for the SD is the lowercase Greek letter sigma (σ).

The formula for calculating the SD of a few ungrouped scores, where x is the deviation of a score from the mean and N is the number of scores, is

σ = √ (∑x² / N)

Let us find the SD from the following scores:

16, 18, 20, 22, 24

The mean of the scores = (16+18+20+22+24)/5= 20

Scores	Mean	Deviation from the mean (x)	Square of deviation (x²)
16	20	-4	16
18	20	-2	4
20	20	0	0
22	20	2	4
24	20	4	16
N = 5			∑x²= 40

σ = √ (∑x²/N)

= √ (40/5)

= 2.83
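The steps in the table (find the mean, take each deviation, square it, average the squares and take the square root) can be sketched in Python as follows. The function name standard_deviation is an illustrative assumption. Like the formula above, the sketch divides by N rather than N – 1.

```python
import math

def standard_deviation(scores):
    """SD = square root of the mean of the squared deviations from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    sum_of_squares = sum((score - mean) ** 2 for score in scores)  # sum of x squared
    return math.sqrt(sum_of_squares / n)

print(round(standard_deviation([16, 18, 20, 22, 24]), 2))  # 2.83
```

Python's standard library offers the same calculation as statistics.pstdev.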

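The SD of grouped data can be obtained by the formula below, where f is the frequency of a class interval and x is the deviation of its midpoint X from the mean: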

σ = √ (∑fx²/N)

Class	Midpoint (X)	Frequency (f)	fX	Deviation from mean (x)	fx	fx²
0-2	1	1	1	-10	-10	100
3-5	4	3	12	-7	-21	147
6-8	7	1	7	-4	-4	16
9-11	10	2	20	-1	-2	2
12-14	13	3	39	2	6	12
15-17	16	3	48	5	15	75
18-20	19	2	38	8	16	128
		N=15	∑fX =165			∑fx² =480

Mean = ∑fX / N

= 165 / 15 = 11

σ = √ (∑fx²/N)

= √ (480/15)

= 5.66
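The table above can be checked with a short Python sketch that works from the class midpoints and frequencies. The function name grouped_sd is an illustrative assumption.

```python
import math

def grouped_sd(midpoints, freqs):
    """Return (mean, SD) for grouped data, using class midpoints X and frequencies f."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n                # sum(fX) / N
    sum_fx2 = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))   # sum(f x^2)
    return mean, math.sqrt(sum_fx2 / n)

midpoints = [1, 4, 7, 10, 13, 16, 19]   # midpoints of 0-2, 3-5, ..., 18-20
freqs = [1, 3, 1, 2, 3, 3, 2]

mean, sd = grouped_sd(midpoints, freqs)
print(mean, round(sd, 2))  # 11.0 5.66
```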

If the frequencies are large, this procedure may involve lengthy calculations. A shortcut method can then be used: the midpoint of an interval near the centre of the distribution is taken as an assumed mean, and the deviation X of each interval from it is expressed in units of the class interval i. The SD is then given by the formula:

σ = i × √ [(∑fX²/N) – (∑fX/N)²]

Class	Frequency (f)	Deviation (X)	fX	fX²
50-54	3	4	12	48
45-49	4	3	12	36
40-44	5	2	10	20
35-39	8	1	8	8
30-34	10	0	0	0
25-29	6	-1	-6	6
20-24	4	-2	-8	16
15-19	4	-3	-12	36
10-14	3	-4	-12	48
5-9	3	-5	-15	75
i = 5	N = 50		∑fX = 42 – 53 = –11	∑fX² = 293

σ = i × √ [(∑fX²/N) – (∑fX/N)²]

= 5 × √ [(293/50) – (–11/50)²]

= 5 × √5.81 = 12.05
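The shortcut calculation can be sketched in Python as below. The function name shortcut_sd is an illustrative assumption; the deviations are the whole-number steps from the interval 30-34 shown in the table, and the frequencies are those of the table read from the top row down.

```python
import math

def shortcut_sd(step_deviations, freqs, interval):
    """SD from deviations expressed in class-interval units, scaled back by the interval i."""
    n = sum(freqs)
    sum_fd = sum(f * d for d, f in zip(step_deviations, freqs))       # sum of fX
    sum_fd2 = sum(f * d * d for d, f in zip(step_deviations, freqs))  # sum of fX squared
    return interval * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

freqs = [3, 4, 5, 8, 10, 6, 4, 4, 3, 3]
step_deviations = [4, 3, 2, 1, 0, -1, -2, -3, -4, -5]

print(round(shortcut_sd(step_deviations, freqs, interval=5), 2))  # 12.05
```

The second term under the square root corrects for the fact that the deviations are taken from an arbitrary starting interval rather than from the true mean, so the result does not depend on which interval is given the deviation 0.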

USE OF STANDARD DEVIATION

a) When the statistic having the greatest stability is sought.

b) When extreme deviations should exercise a proportionally greater effect upon variability.

c) When coefficients of correlation and other statistics are subsequently to be computed.

MERITS OF STANDARD DEVIATION

a) It is well defined and its value is always definite.

b) It is based on all the scores in the data.

c) It is amenable to algebraic treatment and possesses many useful mathematical properties, which is why it is used in many advanced statistical studies.

d) It is less affected by fluctuations in sampling than most other measures of variability.

DEMERITS OF STANDARD DEVIATION

a) Statistical interpretation using SD is comparatively difficult.

b) It gives more weightage to extreme scores and less to those which are near the mean, because the squares of the deviations are taken. These squares will become very large as the deviations increase.

SUMMARY

There is a tendency for data to be dispersed, scattered or to show variability around the average or the central value. This tendency is known as dispersion or variability. The range is the simplest but a very rough measure of variability. It is the difference between the highest and the lowest scores of the series; because it depends only on the two extreme scores, it is not reliable.

The quartile deviation, also called the semi-interquartile range, is computed by the formula,

Q = (Q₃ – Q₁) / 2

where Q₁ and Q₃ represent the first and third quartiles of the distribution. It is more stable than the range, but it also fails to take into account the fluctuations of all the items in the series. The standard deviation, denoted by the symbol σ, is the square root of the arithmetic average of the squared deviations of the scores from the mean of the distribution. It is regarded as the most stable and reliable measure of variability.

For ungrouped data, the formula for the SD is σ = √ (∑x²/N),

for grouped data, the formula for the SD is σ = √ (∑fx²/N), and

the shortcut formula is σ = i × √ [(∑fX²/N) – (∑fX/N)²].

When further statistics are to be computed from a measure of dispersion, the standard deviation (SD) is preferred to all other measures of variability.

REFERENCES

a) Henry E. Garrett - Statistics in Psychology and Education.

b) S.K. Mangal - Statistics in Psychology and Education.


Tuesday, 3 January 2017
