Categorical Predictors in SPC for Excel

When categorical predictors are used in SPC for Excel regression routines, a dummy coding system is used to make dummy variables out the categorical predictors.    The dummy coding system used is 1, 0.  An example of this is given below.

Suppose you have two factors you think might impact a response.  Factor 1 is a categorical predictor.  Factor 2 is a numerical predictor.  The data are shown below in Table 1.

Table 1: Regression Data with Categorical Predictors

Factor 1Factor 2Response
A1762
A1868
A1366
B1760
B1566
B1557
C1962
C2054
C1959

Factor 1 has three possible levels:  A, B and C. With the 1,0 dummy coding system, the coefficients represent the distance between the factor levels and the reference level.   The reference level is the first level when the levels for the categorical factor are placed in alphabetical order.  For this example, the levels under Factor 1 in alphabetic order are:

  • A
  • B
  • C

A is the reference level in this example.  In creating the dummy variables, there is no column created for A.  There is one column for B and on column  for C.  The settings are shown in Table 2.

 Table 2: Level Settings for Categorical Factor 1

If the Factor 1 level is:BC
A00
B10
C01

The table with the dummy variables added is shown below.  The first column in Table 1 is converted to the two columns: Factor 1_B and Factor 1_C.  SPC for Excel uses the nomenclature of listing the categorical factor name first followed by the level that the column responds to.  The first column is for level B in Factor 1.

Table 3: Dummy Variables for Categorial Predictors

Factor 1_BFactor 1_CFactor 2Response
001762
001868
001366
101760
101566
101557
011962
012054
011959

The coefficients determined by the software are used to build the model.  The coefficients are given in the table below.  Note that there is the intercept then the three predictors in Table 3: Factor 1_B, Factor 1_C and Factor 2.

Table 4: Predictor Table

PredictorCoefficient
Intercept71.18
Factor 1_B-4.455
Factor 1_C-5.782
Factor 2-0.365

Not there is nothing for level A in Factor 1.  This is because the coefficient for Factor 1 when Level A is used is 0. 

A model exists for each level of a categorial predictor.  In this example, the model with the various levels of Factor 1 used are:

Level A (coefficient is zero):

Response = 71.18  + -0.365(Factor 2)

Level B:

Response = 71.18 + -4.455(Factor 1_B) + -0.365(Factor 2)

Response = 71.18 + -4.455(1) + -0.365(Factor 2)

Response = 66.725 – 0.365(Factor 2)

Level C:

Response = 71.18 + -7.582(Factor 1_C) + -0.365(Factor 2)

Response = 71.18 + 5.782( (1) + -0.365(Factor 2)

Response = 65.97 – 0.365(Factor 2)

Scroll to Top