22 Statistical and Practical Elements of Model Building
This chapter revisits the systematic procedure for identification that was outlined in Chapter
1, with specific attention to model development. Input and experimental design for
identification are presented, followed by statistical principles and pragmatic guidelines for
choosing the model structure and the estimation algorithm, and for assessing model quality.
Information-theoretic methods for order determination are also discussed.
22.1 INTRODUCTION
Until this point in the text, we have studied the theory and principles of identification, in
particular the various deterministic-plus-probabilistic model structures, the important estimation
principles and methodologies, and the mathematics of computing predictions for a given model.
As we have observed, applying these principles to model development also requires inputs and
decision making from the user.
The objective of this chapter is to address the finer issues and aspects of model development that
constitute the gap between theory and practice. These issues are listed below:
i. Input design concerned with generating informative data, which in turn ensures identifiability.
ii. Data pre-processing including removal of means, trends and drifts; handling outliers and missing
data, and pre-filtering.
iii. Estimating input-output delay from data using non-parametric and parametric methods.
iv. Guidelines for shortlisting candidate models from the repertoire of models in Chapter 17.
v. Initial conditions for constructing regressors and executing the numerical non-linear PEM
algorithms for model estimation.
vi. Statistical measures for testing model adequacy, i.e., model accuracy, reliability and predictive
abilities.
vii. Methods and guidelines for model structure selection and order determination.
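As an illustration of items ii and iii, the following Python sketch removes the sample means and then estimates the delay non-parametrically as the lag at which the input-output cross-covariance is largest in magnitude. It is a minimal sketch and not the procedure developed later in this chapter: the function names, the simulated first-order process, the random binary input and the noise level are all assumptions made for this example. The peak-picking rule is adequate only when the input is white and the impulse response is largest at its first non-zero coefficient; otherwise one would look for the first statistically significant lag instead.

```python
import random

def remove_mean(x):
    """Subtract the sample mean (a basic pre-processing step)."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def estimate_delay(u, y, max_lag):
    """Return the lag in 0..max_lag that maximises the magnitude of the
    sample cross-covariance between y[k] and u[k - lag]."""
    u0, y0 = remove_mean(u), remove_mean(y)
    n = len(u0)
    best_lag, best_val = 0, -1.0
    for lag in range(max_lag + 1):
        c = sum(y0[k] * u0[k - lag] for k in range(lag, n)) / n
        if abs(c) > best_val:
            best_lag, best_val = lag, abs(c)
    return best_lag

def simulate(n=2000, delay=3, seed=0):
    """Hypothetical test process: x[k] = 0.5 x[k-1] + u[k-delay],
    measured as y[k] = x[k] + offset + noise."""
    rng = random.Random(seed)
    u = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # random binary (white) input
    y, x = [], 0.0
    for k in range(n):
        delayed_u = u[k - delay] if k >= delay else 0.0
        x = 0.5 * x + delayed_u
        y.append(x + 5.0 + 0.1 * rng.gauss(0.0, 1.0))  # constant offset plus noise
    return u, y

u, y = simulate(delay=3)
estimated = estimate_delay(u, y, max_lag=10)
print("estimated delay:", estimated)
```

The constant offset added to the output is there to show why mean removal matters: without it, the raw cross-products would be dominated by the product of the means rather than by the dynamics.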
Input design is a very important step in identification since it is responsible for generating
informative data. When this stage fails, the identification exercise results in poor models
regardless of the rigor in the remaining steps. Solutions to the remaining issues are based on a
combination of statistics, intuition and insight. Time-delay estimation can be formulated as a
rigorous optimization problem in the time or frequency domain, with the latter yielding efficient
estimates. On the other hand, there exists no automated procedure for model development, and
rightfully so, since the experience and domain knowledge of the user can hardly be replaced by a
set of rules or formulae. An approach that works well in most cases is to start with a simple model
and gradually refine it using the feedback from model diagnostic checks. Statistical analysis of
the model estimates, predictions and prediction errors plays a vital role in diagnosing models and
making any necessary improvements to them. Needless to say, the user’s intuition and experience
are also invaluable in this respect.
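The simple-to-complex strategy can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions, not the method of this text: it uses pure autoregressive (AR) models fitted by the Yule-Walker equations (Levinson-Durbin recursion), takes residual whiteness as the single diagnostic check, and declares residuals white when every sample autocorrelation at lags 1 to 10 lies inside a band of 2.58/sqrt(N). All function names, the band and the simulated AR(2) series are hypothetical choices made for the example, and the data are assumed to be zero-mean.

```python
import math
import random

def autocov(x, max_lag):
    """Biased sample autocovariances of x at lags 0..max_lag (mean removed)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[k] * d[k - lag] for k in range(lag, n)) / n
            for lag in range(max_lag + 1)]

def fit_ar(y, order):
    """Yule-Walker AR fit via the Levinson-Durbin recursion.
    Convention: y[k] = a[0]*y[k-1] + ... + a[order-1]*y[k-order] + e[k]."""
    r = autocov(y, order)
    a, var = [], r[0]
    for m in range(1, order + 1):
        k = (r[m] - sum(a[j] * r[m - 1 - j] for j in range(m - 1))) / var
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        var *= 1.0 - k * k  # innovation variance of the order-m model
    return a

def residuals(y, a):
    """One-step-ahead prediction errors of the AR model with coefficients a."""
    p = len(a)
    return [y[k] - sum(a[j] * y[k - 1 - j] for j in range(p))
            for k in range(p, len(y))]

def is_white(e, max_lag=10, z=2.58):
    """True if all residual autocorrelations at lags 1..max_lag are within z/sqrt(N)."""
    r = autocov(e, max_lag)
    band = z / math.sqrt(len(e))
    return all(abs(r[lag] / r[0]) < band for lag in range(1, max_lag + 1))

def select_order(y, max_order=8):
    """Start with the simplest model and raise the order until the check passes."""
    for p in range(max_order + 1):
        if is_white(residuals(y, fit_ar(y, p))):
            return p
    return max_order  # no order passed; return the largest one tried

def simulate_ar2(n=3000, seed=1):
    """Hypothetical zero-mean test series: y[k] = 1.2 y[k-1] - 0.6 y[k-2] + e[k]."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n):
        y.append(1.2 * y[-1] - 0.6 * y[-2] + rng.gauss(0.0, 1.0))
    return y[2:]

y = simulate_ar2()
print("selected order:", select_order(y))
print("AR(2) coefficients:", fit_ar(y, 2))
```

Because the check is applied lag by lag, it occasionally rejects an adequate model by chance and the loop then returns a slightly higher order than necessary. This is one reason why such rules supplement, rather than replace, the user’s judgement.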
The methods and guidelines presented in this chapter are demonstrated on case studies in
Chapter 24. It is also useful to place the various aspects addressed in the chapter in the context
of the systematic identification procedure outlined in Chapter 1, reproduced here for convenience: