Replace Missing Values
In this example, the variables SES and URBANICITY are class variables for which the
value ? denotes a missing value. Because SAS defines a missing value as a blank or a
period, it does not treat a question mark as missing, and SAS Enterprise Miner sees it
as an additional level of a class variable. However, the
knowledge that these values are missing will be useful later in the model-building
process.
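To illustrate why the ? value shows up as a level, the following Python sketch counts the levels of a class variable. It is not SAS code and does not reflect how SAS Enterprise Miner is implemented. The helper name class_levels and the sample SES values are hypothetical, and None stands in for an empty cell.

```python
from collections import Counter

# Values treated as missing in this sketch, mirroring the SAS convention
# described above (a blank or a period); None stands in for an empty cell.
MISSING_CODES = {"", ".", None}

def class_levels(values):
    """Count observations per level, skipping values coded as missing."""
    cleaned = (v.strip() if isinstance(v, str) else v for v in values)
    return Counter(v for v in cleaned if v not in MISSING_CODES)

# Hypothetical SES values; the real data set is not reproduced here.
ses = ["1", "2", "?", "3", "?", ".", "2", " "]
levels = class_levels(ses)

# The question mark is counted as an ordinary level, while the period
# and the blank are skipped as missing.
print(sorted(levels.items()))
```

Because the question mark is not one of the recognized missing codes, it is counted alongside the genuine levels until it is explicitly recoded.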
To use the Replacement node to interactively specify that such observations of these
variables are missing:
1. Select the Modify tab on the Toolbar.
2. Select the Replacement node icon. Drag the node into the Diagram Workspace.
3. Connect the Data Partition node to the Replacement node.
4. Select the Replacement node. In the Properties Panel, scroll down to view the Train
properties.
a. For interval variables, click on the value of Default Limits Method, and select
None from the drop-down menu that appears. This selection indicates that no
values of interval variables should be replaced. With the default selection, a
particular range for the values of each interval variable would have been
enforced. In this example, you do not want to enforce such a range.
Note: In this data set, all missing interval variable values are correctly coded as
SAS missing values (a blank or a period).
b. For class variables, click the ellipsis button that represents the value of Replacement
Editor. The Replacement Editor opens.
Notice that SES and URBANICITY both have a level that contains
observations with the value ?. In the case of these two variables, this level
represents observations with missing values. Enter _MISSING_ as the
Replacement Value for the two rows, as shown in the image below. This
action enables SAS Enterprise Miner to see that the question marks indicate
missing values for these two variables. Later, you will impute values for
observations with missing values.
Enter _UNKNOWN_ as the Replacement Value for the level of
DONOR_GENDER that has the value A. This value is the result of a data
entry error, and you do not know whether the intention was to code it as an
F or an M.
Click OK.
5. In the Diagram Workspace, right-click the Replacement node, and select Run from
the resulting menu. Click Yes in the Confirmation window that opens.
6. In the window that appears when processing completes, click OK.
In the data that is exported from the Replacement node, a new variable is created for
each variable that is replaced (in this example, SES, URBANICITY, and
DONOR_GENDER). The original variable is not overwritten. Instead, the new variable
has the same name as the original variable but is prefixed with REP_. The original
version of each variable also exists in the exported data and has the role Rejected.
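The effect of these replacements on the exported data can be sketched in Python. This is an illustration under stated assumptions, not the Replacement node's implementation. The function name apply_replacements and the sample rows are hypothetical. The sketch also assumes that _MISSING_ becomes a true missing value (None here) and that _UNKNOWN_ is kept as a literal label.

```python
# Replacement rules from the steps above: variable -> {original level: replacement}.
REPLACEMENTS = {
    "SES": {"?": "_MISSING_"},
    "URBANICITY": {"?": "_MISSING_"},
    "DONOR_GENDER": {"A": "_UNKNOWN_"},
}

def apply_replacements(row, rules=REPLACEMENTS):
    """Return a copy of row with a REP_ variable added for each replaced variable."""
    out = dict(row)  # the original variables are kept, not overwritten
    for var, mapping in rules.items():
        if var not in row:
            continue
        new_value = mapping.get(row[var], row[var])
        # Assumption: _MISSING_ becomes a true missing value (None here),
        # while _UNKNOWN_ is kept as a literal label.
        out["REP_" + var] = None if new_value == "_MISSING_" else new_value
    return out

# Hypothetical observations; the real data set is not reproduced here.
rows = [
    {"SES": "2", "URBANICITY": "?", "DONOR_GENDER": "F"},
    {"SES": "?", "URBANICITY": "C", "DONOR_GENDER": "A"},
]
replaced = [apply_replacements(r) for r in rows]
for r in replaced:
    print(r)
```

Each output row keeps SES, URBANICITY, and DONOR_GENDER unchanged and gains the corresponding REP_ variables. The sketch does not model variable roles, so the Rejected role of the original variables is not represented.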
To view the data that is exported by a node, click the ellipsis button that represents the
value of the General property Exported Data in the Properties Panel. To view the
exported variables, click Properties in the window that opens, and then view the
Variables tab. Similarly, you can view the data that is imported and used by a node by
clicking the ellipsis button that represents the value of the General property Imported
Data in the Properties Panel.
TIP “Predictive Modeling with SAS Enterprise Miner: Practical Solutions for
Business Applications” provides examples of and options for the StatExplore and
Replacement nodes. The book also discusses alternate configurations for the Data
Partition node.
