Ultimately, we are experimenting with the goal of optimizing a system. Factorial or fractional designs are good for initial trials when we have limited information. After this we can proceed with a sequence of experiments to ensure that we slowly replace factorial experiments with designs that are closer to the optimal conditions. This procedure is called response surface methods (RSM).
 
RSM for a Single Variable

First let’s consider the effect of a single factor, \mathbf{x_1} as it relates to our response, \mathbf{y}. This is to illustrate the general response surface process.

Plot demonstrating the response surface methods with just one singular factor.
Figure 9.3.1.1 Plot demonstrating the response surface methods with a single factor.

We start at the point marked \mathbf{i=0} as our initial baseline (cp=center point). Then, we run a 2-level experiment, above and below this baseline at -1 and 1and obtain the corresponding response values of \mathbf{y_{0,-}} and \mathbf{y_{0,+}}. From this we can estimate a line of best fit and move in the direction that increases \mathbf{y}. Note that the sloping tangential line is also called the path of steepest ascent. Make a move of step-size = \mathbf{\gamma_1} units along \mathbf{x_1} and measure the response,  \mathbf{y_1}. Since the response variable increased, we keep going in this direction.

Make another step-size, this time of \mathbf{\gamma_2} units in the direction that increases \mathbf{y}. We measure the response, \mathbf{y_2}, and are still increasing. Encouraged by this, we take another step of size \mathbf{\gamma_3} . The step-sizes, \mathbf{\gamma_i} should be of a size that is big enough to cause a change in the response in a reasonable number of experiments, but not so big as to miss an optimum.

Our next value of \mathbf{y_3} is about the same size as \mathbf{y_2}, indicating that we have plateaued. At this point we can take some exploratory steps and refit the tangential line (which now has a slope in the opposite direction). Or we can use the accumulated datapoints to fit a non-linear curve. Either way, we can then estimate a different step-size \mathbf{\gamma_4} that will bring us closer to the optimum.

This approach works well when there is only a single factor that affects the response. However, in most systems there are multiple factors that affect the response, we need to adapt this method to find optimums for those systems.

9.3.2. Optimization of a 2-Variable System

Let’s say we are looking to optimize a bioreactor where two factors, temperature T, and substrate concentration S are known to affect the yield. However, our outcome of interest is actually total profit which takes into account energy costs, raw materials costs and other relevant factors. Figure 9.3.2.1 shows (hypothetical) contours of profit in light grey, but in practice these are often unknown. We currently operate at these baseline conditions:
  • T = 325 K
  • S = 0.75 g/L
  • Profit = $407 per day

We create a full factorial around this baseline by choosing \mathbf{\Delta_T = 10}K, and \mathbf{\Delta_S = 0.5}g/L based on our knowledge that these are sufficiently large changes to show an actual difference in the response value (see Table 9.3.2.1), but not too large so as to move to a totally different form of operation in the bioreactor.

Table 11.2.1 Bioreactor Experiment Design of Experiments
Experiment T (actual) S (actual) T (coded) S (coded) Profit
Baseline 325 K 0.75 g/L 0 0 407
1 320 K 0.50 g/L 193
2 330 K 0.50 g/L + 310
3 320 K 1.0 g/L + 468
4 330 K 1.0 g/L + + 571

It is evident that we can maximize profit by operating at higher temperatures and higher substrate concentrations. The only way, however, to know how much higher is to build a linear model of the system from the factorial data:

</span>\begin{split}\hat{y} &= b_0 + b_T x_T + b_S x_S + b_{TS} x_T x_S \\ \hat{y} &= 389.8 + 55 x_T + 134 x_S - 3.50 x_T x_S\end{split}</div> where <span class="math notranslate nohighlight">[latex]x_T = \dfrac{x_{T,\text{actual}} - \text{center}_T}{\Delta_T/2} = \dfrac{x_{T,\text{actual}} - 325}{5}

and similarly, x_S = \dfrac{x_{S,\text{actual}} - 0.75}{0.25}.

The model shows that we can expect an increase of $55/day of profit for a unit increase in T. In real-world units that would require increasing temperature by \Delta x_{T,\text{actual}} = 1\times \Delta_T /2 = 5K to achieve that goal. This scaling factor comes from the coding we used:

\[\begin{split}x_T &= \displaystyle \frac{x_{T,\text{actual}} - \text{center}_T}{\Delta_T / 2} \\
\Delta x_T &= \displaystyle \frac{\Delta x_{T,\text{actual}}}{\Delta_T / 2}\end{split}\]

Similarly, we can increase \(S\) by \Delta x_{S,\text{actual}} = 1\times \Delta_S /2 = 0.25g/L to achieve a $134 per day profit increase.

The interaction term is small, indicating that the response surface is mostly linear in this region. Figure 9.3.2.1 shows the model’s contours (straight, green lines). Notice that the model contours are a good approximation to the actual contours (dotted, light grey), which are unknown in practice.

Figure 11.2.1. First Factorial for Bioreactor Example.
Figure 9.3.2.1. First factorial experiment for bioreactor example.

To improve our profit in the optimal way we move along our estimated model’s surface, in the direction of steepest ascent. This direction is found by taking partial derivatives of the model function, but ignoring the interaction term since it is so small.

\[\dfrac{\partial \hat{y}}{\partial x_T} = b_T = 55 \qquad \qquad \dfrac{\partial \hat{y}}{\partial x_S} = b_S = 134\]

This means that for every b_T = 55 coded units that we move by in x_T we should also move x_S by b_S = 134 coded units. Mathematically:

\[\frac{\Delta x_S}{\Delta x_T} = \frac{134}{55}\]

The simplest way to do this is just to pick a movement size for one of the variables, then change the movement size of the other appropriately.

So we will choose to increase \Delta x_T = 1 coded unit, which means:

\[\begin{split} \Delta x_T &= 1 \\
\Delta x_{T,\text{actual}} &= 5 \,\text{K} \\
\Delta x_S &= \frac{b_S}{b_T} \Delta x_T = \frac{134}{55} \Delta x_T \\
\text{but we know that}\qquad\qquad \Delta x_S &= \frac{x_{S,\text{actual}}}{\Delta_S / 2} \\
\Delta x_{S,\text{actual}} &= \frac{134}{55} \times 1 \times \Delta_S / 2\,\, \text{by equating the previous 2 lines} \\
\Delta x_{S,\text{actual}} &= \frac{134}{55} \times 1 \times 0.5 / 2 = \bf{0.61}\,\,\text{g/L}\\\end{split}\]
This gives us the following conditions for run 5:
  • T_5 = T_\text{baseline} + \Delta x_{T,\text{actual}} = 325 + 5 = 330K
  • S_5 = S_\text{baseline} + \Delta x_{S,\text{actual}} = 0.75 + 0.6 = 1.36g/L

So, we run the next experiment at these conditions the daily profit is y_5 = $669. This is a substantial improvement from the baseline case.

We decide to make another move, in the same direction of steepest ascent, i.e. along the vector that points in the \dfrac{134}{55} direction. We move the temperature up 5K, although we could have used a larger or smaller step size if we wanted giving us the following conditions for run 6:

  • T_6 = T_5 + \Delta x_{T,\text{actual}} = 330 + 5 = 335K
  • (S_6 = S_5 + \Delta x_{S,\text{actual}} = 1.36 + 0.61 = 1.97 g/L

Again, we report a profit of y_6 = $688. It is still increasing, but not by nearly as much. Perhaps we are starting to level off. However, we still decide to move temperature up by another 5 K and increase the substrate concentration in the required ratio. We get the following conditions for run 7:

  • T_7 = T_6 + \Delta x_{T,\text{actual}} = 335 + 5 = 340K
  • S_7 = S_6 + \Delta x_{S,\text{actual}} = 1.97 + 0.61 = 2.58 g/L

The profit at this point is y_7 = $463. We have gone too far as profit has dropped off. So we return back to our last best point, because the surface has obviously changed, and we should refit our model with a new factorial in this neighbourhood:

Table 9.3.2.2 Sequential Runs of Bioreactor Experiment
Experiment T (actual) S (actual) T (coded) S (coded) Profit
6 335 K 1.97 g/L 0 0 $688
8 331 K 1.77 g/L - - $694
9 339 K 1.77 g/L + - $725
10 331 K 2.17 g/L - + $620
11 339 K 2.17 g/L + + $642

In order to move more slowly along the surface, this time we choose slightly smaller ranges in the factorial \text{range}_T = 8 = (339 - 331)K and \text{range}_S = 0.4 = (2.17-1.77)g/L.

A least squares model from the 4 factorial points (experiments 8, 9, 10, and 11 run in random order), seem to show that the promising direction now is to increase temperature but decrease the substrate concentration.

\[\begin{split}\hat{y} &= b_0 + b_T x_T + b_S x_S + b_{TS} x_T x_S \\
\hat{y} &= 673.8 + 13.25 x_T - 39.25 x_S - 2.25 x_T x_S\end{split}\]

As before we take a step in the direction of steepest ascent by b_T units along the x_T direction and b_S units along the x_S direction. Again we choose \Delta x_T = 1 unit, though we must emphasize that we could used a smaller or larger amount, if desired. Therefore:

\[\begin{split}\frac{\Delta x_S}{\Delta x_T} &= \frac{-39}{13} \\
\Delta x_S &= \frac{-39}{13} \times 1\\
\Delta x_{S, \text{actual}} &= \frac{-39}{13} \times 1 \times 0.4 /2 = -0.6\,\text{g/L}\\
\Delta x_{T, \text{actual}} &= 4\,\text{K}\\\end{split}\]
This gives us the following conditions for run 12:
  • T_{12} = T_6 + \Delta x_{T, \text{actual}} = 335 + 4 = 339K
  • S_{12} = S_6 + \Delta x_{S, \text{actual}} = 1.97 - 0.6 = 1.37g/L

We determine that at run 12 the profit is $ 716. But our previous factorial had a profit value of $725 on one of the corners. Now it could be that we have a noisy system; after all, the difference between $716 and $725 is not too much, but there is a relatively large difference in profit between the other points in the factorial.

Some considerations when you are approaching an optimum:

  • The response variable will start to plateau (remember that the first derivative is zero at an optimum)
  • If the response variable remains roughly constant for two consecutive jumps (you may have by-passed the optimum)
  • The response variable can decrease, sometimes very rapidly, if you overshoot the optimum
  • The presence of curvature can also be inferred when interaction terms are similar or larger in magnitude than the main effect terms

This means that an optimum will exhibit some form of curvature. Thus, a model that only has linear terms will be unable to find the direction of steepest ascent along the true response surface. We must add terms that account for this curvature.

9.3.3. Checking for Curvature

When the measured center point is quite different from the predicted center point in your linear model, that is a signal that there is curvature present. This can be accommodate for by adding polynomial terms to the model.

The factorial’s center point can be predicted from (x_T, x_S) = (0,0), and is just the intercept term. In the last factorial, the predicted center point was \hat{y}_\text{cp}= $670. Yet the actual center point from run 6 showed a profit of $688. This difference of $18 is substantial, especially when compared to the main effects’ coefficients.

9.3.4. Central Composite Designs

It is beyond the scope of this pressbook to go into detail about central composite designs. However, this section will show you what they look like for the case of 2 and 3 variables, taking an existing orthogonal factorial and augmenting it with axial points. Conveniently, these points can be added later as well to account for nonlinearity.

The axial points are placed 4^{0.25} = 1.4 coded units away from the center for a 2 factor system, and 8^{0.25} = 1.7 units away for a three factor system.

Figure 11.4.1. Illustration of central composite design for 2 and 3 factor systems. Axial points are placed at 1.4 and 1.7 units away from the center of 2 and 3 factor systems respectively.
Figure 9.3.4.1. Illustration of central composite design for 2 and 3 factor systems. Axial points are placed at 1.4 and 1.7 units away from the center of 2 and 3 factor systems respectively.

A central composite design layout was added to the factorial in the above example and the experiments run, randomly, at the 4 axial points.

The four response values were y_{13} = 720, y_{14} = 699, y_{15} = 610, and y_{16} = 663. This allows us to estimate a model with quadratic terms in it: y = b_0 + b_T x_T + b_S x_S + b{TS} x_T x_S + b_{TT} x_T^2 + b_{SS} x_S^2. The parameters in this model are found in the usual way, using a least-squares model:

y = 688 + 13x_T - 39x_S - 2.4 x_T x_S - 4.2 x_T^2 - 12.2 x_S^2

Notice how the linear terms estimated previously are the same! The quadratic effects are quite significant compared to the other effects, which was what prevented us from successfully using a linear model to project out to point 12 previously.

The final step in the response surface methodology is to plot this model’s contour plot and predict where to run the next few experiments. As the solid contour lines in the illustration show, we should run our next experiments roughly at T = 343K and S = 1.60 g/L where the expected profit is around $736. You can determine this approximately with your eyes or analytically. This is not exactly where the true process optimum is, but it is pretty close to it.

This example has demonstrated how powerful response surface methods are. A minimal number of experiments quickly converged towards the true, unknown process optimum. We achieved this by building successive least squares models that approximate the underlying surface. Those least squares models are built using the tools of fractional and full factorials, as well as basic optimization theory, to climb the hill of steepest ascent.