

Histogram

Grades of copper from 5 soil samples were measured: 24, 28, 22, 23 and 24 ppm. What is the distribution of the values of these grades? A histogram could look as follows:

Figure 2.1: 5 Grades of Copper in ppm. [histogram not reproduced]
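The counting behind a histogram such as Figure 2.1 can be sketched in a few lines of Python. This is only an illustration using the five values quoted above; the helper name `text_histogram` and the choice of one class per integer ppm are assumptions, not part of the original text.

```python
from collections import Counter

# The five copper grades (ppm) quoted in the text.
grades = [24, 28, 22, 23, 24]

# Absolute frequency of each observed value.
counts = Counter(grades)

def text_histogram(counts):
    """Return one line per integer class from the smallest to the largest
    observed value, with one '#' per observation (hypothetical helper)."""
    lines = []
    for value in range(min(counts), max(counts) + 1):
        lines.append(f"{value:>3} | {'#' * counts.get(value, 0)}")
    return lines

for line in text_histogram(counts):
    print(line)
```

Classes without observations (25 to 27 ppm here) are printed as empty rows so that the gap between 24 and 28 ppm stays visible.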

Figure 2.1 is already informative, but a histogram becomes more so when more data are at hand. For example, 80 sample values are shown in Figure 2.2.

Figure 2.2: Grades of Copper in ppm. [two histogram panels not reproduced]

From the histograms, the average of all the values is seen to be approximately 22 ppm.

The measured variable (grade of copper) is continuous. In practice, however, the values are recorded in a discrete manner, and to display frequencies in a histogram the values have to be grouped into classes anyway. The width of the classes is an open question. If the grade is measured to a resolution of $\frac{1}{2}$ ppm, a simple histogram might look like Figure 2.3.

Figure 2.3: Grades of Copper in ppm. [histogram not reproduced]
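Grouping values into classes of width 1/2 ppm could be done roughly as follows. The readings below are invented purely for illustration (they are not the data behind Figure 2.3), and the function name `class_lower_limit` as well as the half-open class convention [lower, lower + width) are assumptions.

```python
from collections import Counter

CLASS_WIDTH = 0.5  # ppm, the measurement resolution mentioned in the text

# Hypothetical readings, invented for illustration only.
readings = [21.5, 22.0, 22.5, 22.5, 23.0, 22.0, 22.5]

def class_lower_limit(value, width=CLASS_WIDTH):
    """Lower limit of the class [k*width, (k+1)*width) containing value.

    Floor division is exact here because 0.5 is a power of two; for widths
    such as 0.1, integer or decimal arithmetic would be safer.
    """
    k = int(value // width)
    return k * width

# Absolute frequency per class, keyed by the lower class limit.
frequencies = Counter(class_lower_limit(v) for v in readings)

for lower in sorted(frequencies):
    upper = lower + CLASS_WIDTH
    print(f"[{lower:4.1f}, {upper:4.1f}) {'#' * frequencies[lower]}")
```

Changing `CLASS_WIDTH` shows how strongly the appearance of the histogram depends on the chosen class width.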

In theory, one could keep increasing the number of classes as the number of sample values grows, so that the contours of the histograms tend to a smooth curve, as shown in Figure 2.4.

Figure 2.4: Ideal Limiting Distribution of Grades of Copper. [curve not reproduced]

The frequencies in the histograms were given as absolute values. It is also sensible to show relative frequencies, which are obtained by dividing each absolute frequency by the total number of samples.

Sometimes a data value is equal to a class limit, which forces a decision about which class the value should belong to. In our example, the "observer" (the person who analyses the grade) was told to record each value to a tenth of a ppm and, whenever a value was very close to a half ppm, to append a + or - sign to indicate whether it was just above or just below, respectively. The registered values are reproduced in Table 2.1.



Table 2.1: Values of Grades of Copper in ppm.
28.3 25.4 27.0 25.5- 20.9 24.0 25.1 22.2 24.8 25.1
23.5- 24.6 26.1 24.7 26.5- 27.5- 25.6 22.9 25.5+ 23.5-
23.9 26.5+ 24.5- 24.3 24.5+ 27.0 27.3 25.0 22.8 23.5
26.4 27.1 23.4 24.1 26.7 24.9 23.5+ 27.4 25.5+ 22.8
25.1 24.7 26.3 21.8 23.2 24.3 24.5- 26.0 24.1 27.5
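As a sketch of how the signs in Table 2.1 resolve ties at class limits, the following Python snippet assigns each recorded value to a class and computes absolute and relative frequencies. Several points are assumptions made only for this illustration: classes of width 1 ppm centred on whole numbers, that is [k - 0.5, k + 0.5); the function name `class_centre`; and the convention that an unsigned value ending in .5 (such as 23.5) goes to the upper class.

```python
from collections import Counter
from decimal import Decimal

# Values of Table 2.1, copied verbatim including the +/- markers.
TABLE_2_1 = """
28.3 25.4 27.0 25.5- 20.9 24.0 25.1 22.2 24.8 25.1
23.5- 24.6 26.1 24.7 26.5- 27.5- 25.6 22.9 25.5+ 23.5-
23.9 26.5+ 24.5- 24.3 24.5+ 27.0 27.3 25.0 22.8 23.5
26.4 27.1 23.4 24.1 26.7 24.9 23.5+ 27.4 25.5+ 22.8
25.1 24.7 26.3 21.8 23.2 24.3 24.5- 26.0 24.1 27.5
"""

def class_centre(token):
    """Integer centre k of the assumed class [k - 0.5, k + 0.5) ppm."""
    sign = token[-1] if token[-1] in "+-" else ""
    # Work in exact tenths of a ppm to avoid floating-point surprises.
    tenths = int(Decimal(token.rstrip("+-")) * 10)
    centre = (tenths + 5) // 10  # a value on a limit goes to the upper class
    if sign == "-" and tenths % 10 == 5:
        centre -= 1  # "just below" the limit: use the lower class instead
    return centre

tokens = TABLE_2_1.split()
absolute = Counter(class_centre(t) for t in tokens)
n = sum(absolute.values())

# Relative frequency = absolute frequency / total number of samples.
relative = {k: absolute[k] / n for k in sorted(absolute)}

for k, rel in relative.items():
    print(f"{k - 0.5:4.1f} to {k + 0.5:4.1f} ppm: {absolute[k]:2d}  {rel:.2f}")
```

With a different choice of class limits or tie convention the counts would change, which is exactly the decision the text describes.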

In practice, with pencil and scratch paper, one would not draw a histogram but rather produce a list of frequencies by "tallying": putting a slash (a simple stroke per item) for each value in its class, or using up to 10 symbols (dots, box lines, crossed lines) (see Tukey, 1977[22]). The displays presented in the following sections avoid the complication of rounding up or down: a number is simply truncated to find the class it belongs to. At the same time, they attempt to show as much information as possible in each class.
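Tallying by truncation can be sketched as follows, using only the first row of Table 2.1 to keep the example short. The helper name `truncate` and the use of "/" as the tally mark are assumptions for this illustration.

```python
from collections import defaultdict

# First row of Table 2.1, copied verbatim including the +/- marker.
first_row = "28.3 25.4 27.0 25.5- 20.9 24.0 25.1 22.2 24.8 25.1".split()

def truncate(token):
    """Integer part of a recorded value; the +/- marker plays no role here."""
    return int(token.rstrip("+-").split(".")[0])

# One stroke per value in the class given by its integer part.
tally = defaultdict(str)
for token in first_row:
    tally[truncate(token)] += "/"

for k in range(min(tally), max(tally) + 1):
    print(f"{k} | {tally.get(k, '')}")
```

Because every value is cut off rather than rounded, a value such as 25.5- lands in class 25 without any decision about class limits.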


Rudolf Dutter 2003-03-13