Tuesday, July 15, 2008

MBR Modeling Node

On page 462, under the Fundamental Contents to MBR Modeling section, I would like to make clear why principal component variables are used in nearest neighbor modeling in SAS Enterprise Miner. One reason is that the first principal component is entered into the model first, followed by the second principal component. In other words, the nearest neighbor estimates are calculated much like moving average estimates: the first k values are averaged according to the sorted values of the first variable in the model, within the subsequent values of the second variable.
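To illustrate the moving average analogy only, here is a minimal Python sketch for a single input. It is not Enterprise Miner code: the function name `moving_average_by_sorted_input` and the sample values in `pc1` and `y` are made up for this example, and `pc1` simply stands in for first principal component scores.

```python
def moving_average_by_sorted_input(scores, targets, k):
    """Average the target over windows of k consecutive rows, after sorting by score."""
    # Sort the rows by the single input (assumed here to be the first
    # principal component score), carrying the target values along.
    ordered = sorted(zip(scores, targets))
    sorted_targets = [t for _, t in ordered]
    # Slide a window of k consecutive rows over the sorted target values.
    return [
        sum(sorted_targets[i:i + k]) / k
        for i in range(len(sorted_targets) - k + 1)
    ]


# Hypothetical data: five rows with a score and an interval target.
pc1 = [0.9, -1.2, 0.1, 2.0, -0.4]
y = [10.0, 2.0, 6.0, 14.0, 4.0]
smoothed = moving_average_by_sorted_input(pc1, y, k=3)
```

Each value in `smoothed` is the mean target of k rows that are adjacent once the data is sorted by the score, which is the sense in which the estimates resemble a moving average.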

In Enterprise Miner, the probe x is defined by the sorted values of the input variables that are created in the SAS data set. Because principal component scores are recommended when the analysis has numerous input variables, the probe x is then determined by the sorted values of the principal component scores. Therefore, the values of the first principal component determine the sorted order of the fitted target values in the nearest neighbor model. The nearest neighbor estimates are calculated from the average target value, or from the number of observations in each target category, within a predetermined window of the k points that lie closest to the current data point in the multidimensional region.
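The estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the MBR node's actual implementation: it assumes plain Euclidean distance on two principal component scores, and the function names (`k_nearest`, `knn_mean`, `knn_class_proportions`) and the sample data are hypothetical.

```python
import math
from collections import Counter


def k_nearest(train_x, probe, k):
    """Return the indices of the k training rows closest to the probe."""
    # Rank every training row by its Euclidean distance to the probe.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], probe))
    return order[:k]


def knn_mean(train_x, train_y, probe, k):
    """Interval target: average the target over the k nearest rows."""
    idx = k_nearest(train_x, probe, k)
    return sum(train_y[i] for i in idx) / k


def knn_class_proportions(train_x, train_labels, probe, k):
    """Categorical target: proportion of each category among the k nearest rows."""
    idx = k_nearest(train_x, probe, k)
    counts = Counter(train_labels[i] for i in idx)
    return {label: n / k for label, n in counts.items()}


# Hypothetical principal component scores (PC1, PC2) with two kinds of target.
scores = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
y = [1.0, 2.0, 3.0, 10.0, 12.0]
labels = ["no", "no", "yes", "yes", "yes"]
probe = (0.2, 0.1)

estimate = knn_mean(scores, y, probe, k=3)
proportions = knn_class_proportions(scores, labels, probe, k=3)
```

With k = 3 the probe's neighbors are the first three rows, so the interval estimate is the mean of their target values and the categorical estimate is the share of each label among them.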
