Centro de Cálculo, Facultad de Ingeniería (11), CC 30


Montevideo, URUGUAY


Removing outliers from records prior to their use is a major concern in any technical or scientific field. Meteorology is no exception, and considerable effort has gone into devising methods to locate them, even though the task has often been mistakenly regarded as purely technical. The methods currently applied are very crude because they are mostly computerized versions of traditional criteria and fail to exploit the capabilities of modern computer systems. Extensive comparisons among methods have not been carried out, and no reliable statistical comparison among different outlier detection strategies can be made without a tool for generating instances of a database contaminated with artificial errors. This paper describes a heuristic model suitable for simulating the usual errors observed in a 30-year, ten-station daily rainfall dataset that has been carefully checked for typing errors. We restrict ourselves to simulating only such errors. Three methods are discussed, namely: a) choosing at random another value from the same dataset; b) choosing at random another value from the same station; c) modelling, imperfectly, some driving mechanism for the errors. The results are compared with the observed errors, and they show that options a) and b) underpredict the difference between erroneous and true values, while option c), even though imperfect, gives satisfactory results.
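
As an illustration only, options a) and b) above might be sketched as follows. This is not the code used in the report: the function name `contaminate`, the list-of-lists data layout, the `fraction` parameter and the toy values are all assumptions made for the example. Option c) is not sketched because the abstract does not specify the error mechanism.

```python
import random


def contaminate(data, fraction, mode, rng):
    """Return a contaminated copy of `data` and the positions that were replaced.

    data:     list of per-station lists of daily rain values (assumed layout).
    fraction: share of all entries to replace (assumed parameter).
    mode:     "dataset" -> option a), replacement drawn from any station;
              "station" -> option b), replacement drawn from the same station.
    rng:      a random.Random instance, so runs can be reproduced.
    """
    if mode not in ("dataset", "station"):
        raise ValueError("mode must be 'dataset' or 'station'")

    # Work on a copy so the checked dataset is never modified.
    noisy = [list(series) for series in data]

    # Every (station, day) position is a candidate for an artificial error.
    positions = [(s, d) for s, series in enumerate(data) for d in range(len(series))]
    n_errors = round(fraction * len(positions))
    chosen = rng.sample(positions, n_errors)

    # Pool of candidate replacement values for option a).
    whole_dataset = [value for series in data for value in series]

    for s, d in chosen:
        pool = whole_dataset if mode == "dataset" else data[s]
        noisy[s][d] = rng.choice(pool)  # may coincide with the true value

    return noisy, chosen


# Toy example: 3 stations, 4 days each (values are made up).
toy = [
    [0.0, 0.0, 5.2, 0.0],
    [1.0, 0.0, 0.0, 12.5],
    [0.0, 3.3, 0.0, 0.0],
]
noisy_a, errors_a = contaminate(toy, 0.25, "dataset", random.Random(0))
noisy_b, errors_b = contaminate(toy, 0.25, "station", random.Random(0))
print(len(errors_a), len(errors_b))
```

Because daily rain records contain many zeros, a replacement drawn this way will often equal the true value or differ from it only slightly, which is consistent with the finding that options a) and b) underpredict the difference between errors and true values.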

Presented at:

X Congresso Brasileiro de Meteorologia, Brasilia, 26-30 October, 1998

The full technical report is available in PDF format (331 KB).