The accuracy of thematic maps derived by image classification analyses is often compared in remote sensing studies. This comparison is typically made through a basic subjective assessment of the observed difference in accuracy but should be undertaken in a statistically rigorous fashion. One approach for the evaluation of the statistical significance of a difference in map accuracy that has been widely used in remote sensing research is based on the comparison of the kappa coefficient of agreement derived for each map. The conventional approach to the comparison of kappa coefficients assumes that the samples used in their calculation are independent, an assumption that is commonly violated because the same sample of ground data sites is often used for each map. Alternative methods to evaluate the statistical significance of differences in accuracy are available for both related and independent samples. Approaches for map comparison based on the kappa coefficient and the proportion of correctly allocated cases, the two most widely used metrics of thematic map accuracy in remote sensing, are discussed. An example illustrates how classifications based on the same sample of ground data sites may be compared rigorously and highlights the importance of distinguishing between one- and two-sided statistical tests in the comparison of classification accuracy statements.
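The distinction between independent and related samples, and between one- and two-sided tests, can be illustrated with a minimal Python sketch. It is not taken from the abstract itself: it assumes the commonly used z statistic for two independent kappa coefficients (the difference divided by the square root of the summed variance estimates) and, for related samples, the McNemar test without continuity correction applied to the proportion of correctly allocated cases. All function names and the example counts are hypothetical.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_values(z):
    """Return (one_sided, two_sided) p-values for a z statistic.

    The one-sided value tests the directional claim that map 1 is more
    accurate than map 2; the two-sided value tests for any difference.
    """
    return normal_sf(z), 2.0 * normal_sf(abs(z))


def kappa_z_independent(kappa1, var1, kappa2, var2):
    """z statistic for two kappa coefficients from INDEPENDENT samples.

    var1 and var2 are the estimated variances of the two kappa estimates
    (assumed to be supplied by the caller).
    """
    return (kappa1 - kappa2) / math.sqrt(var1 + var2)


def mcnemar_z(f12, f21):
    """McNemar z statistic for two maps assessed on the SAME sample.

    f12: cases allocated correctly by map 1 but incorrectly by map 2.
    f21: cases allocated incorrectly by map 1 but correctly by map 2.
    Cases on which both maps agree (both right or both wrong) do not
    enter the statistic.
    """
    if f12 + f21 == 0:
        raise ValueError("no discordant cases; the statistic is undefined")
    return (f12 - f21) / math.sqrt(f12 + f21)


# Hypothetical discordant counts for two maps checked at the same sites.
z = mcnemar_z(59, 41)
one_sided, two_sided = p_values(z)
print(f"z = {z:.2f}, one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

With these illustrative counts the z statistic is 1.8, so the difference is significant at the 0.05 level under a one-sided test but not under a two-sided test, which is why the form of the hypothesis should be stated before the comparison is made.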