In reporting the results of your modeling efforts, you need to be explicit about the methods used, the assumptions made, the limitations on your model’s range of application, potential sources of bias, and the method of validation (see the following chapter). The section on “Limitations of the Logistic Regression” from Bent and Archfield [2002], a publication of the USGS, is ideal in this regard:

The logistic regression equation developed is applicable for stream sites with drainage areas between 0.02 and 7.00 mi² in the South Coastal Basin and between 0.14 and 8.94 mi² in the remainder of Massachusetts, because these were the smallest and largest drainage areas used in equation development for their respective areas. [The authors go on to subdivide the area.]

The equation may not be reliable for losing reaches of streams, such as for streams that flow off area underlain by till or bedrock onto an area underlain by stratified-drift deposits (these areas are likely more prevalent where hillsides meet river valleys in central and western Massachusetts). At this juncture of the different underlying surficial deposit types, the stream can lose stream flow through its streambed. Generally, a losing stream reach occurs where the water table does not intersect the streambed in the channel (water table is below the streambed) during low-flow periods. In these reaches, the equation would tend to overestimate the probability of a stream flowing perennially at a site. ...
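Stated limitations of this kind can also be carried into code, so that a prediction is never reported without its caveats. The following is a minimal sketch in Python, not the authors' method: the drainage-area limits are the ones quoted above, while the region labels, function names, and the `losing_reach` flag are hypothetical names introduced only for illustration.

```python
# Drainage-area limits (square miles) quoted from Bent and Archfield [2002].
# The region keys are hypothetical labels chosen for this sketch.
APPLICABLE_RANGE_MI2 = {
    "south_coastal": (0.02, 7.00),
    "remainder_of_massachusetts": (0.14, 8.94),
}


def in_range_of_application(region: str, drainage_area_mi2: float) -> bool:
    """Return True if the drainage area lies within the range used to fit the model."""
    low, high = APPLICABLE_RANGE_MI2[region]
    return low <= drainage_area_mi2 <= high


def check_site(region: str, drainage_area_mi2: float, losing_reach: bool = False) -> list:
    """Collect the caveats that should accompany any prediction for this site."""
    caveats = []
    if not in_range_of_application(region, drainage_area_mi2):
        caveats.append("drainage area outside the range used in equation development")
    if losing_reach:  # assumed to be supplied by the analyst from field knowledge
        caveats.append("losing reach: probability of perennial flow may be overestimated")
    return caveats
```

A call such as `check_site("south_coastal", 10.0, losing_reach=True)` returns both caveats, which can then be printed alongside the estimated probability rather than left to the reader to infer.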
