This lesson introduces the Principles of Experimental Design and blocking.
I. Objectives
II. Reading Assignment
IV. Explanation and Examples
In our last lesson, we designed an experiment which compared two types of paint. In general, an experiment is conducted to make comparisons among different treatments. This can become increasingly complicated as the number of treatments increases, or if the differences between experimental units become more apparent. The simplest case for a design is when
Given this simple situation, all available experimental units are randomly allocated among the treatments. This simple design, called a Completely Randomized Design, also works when we have more than 2 treatments.
Example. Suppose we have 4 different gasoline additives and we want to compare the increase in gas mileage when they are used. We have the resources to conduct 20 trials using the same vehicle repeatedly. Since the experimental units are the same car at different times, we can use the completely randomized design. We randomly divide the 20 experimental units into 4 groups of 5, one group for each additive, using the process from the previous lesson. It is extremely important that we use randomization. Just because we do not recognize differences among our experimental units before the experiment is conducted does not mean these differences do not exist. What if the trials are conducted repeatedly with little time in between? If the vehicle's gas mileage decreases as the oil gets dirtier, we will see mileage trend downward over the course of the trials. If we had assigned additive 1 to the first 5 trials, additive 2 to the next 5, etc., the later additives would look worse for reasons that have nothing to do with the additives themselves, and we would have introduced bias into our experiment. Randomization protects against this kind of bias.
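The random allocation in this example can be sketched in a few lines of Python. This is only an illustration, not the procedure from the previous lesson: the function name `completely_randomized_design`, the additive labels, and the `seed` argument (used so the plan can be reproduced) are all assumptions made for the sketch.

```python
import random

def completely_randomized_design(n_units, treatments, seed=None):
    """Randomly assign n_units trials to the treatments in equal-sized groups."""
    if n_units % len(treatments) != 0:
        raise ValueError("n_units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    per_group = n_units // len(treatments)
    # Build one label per trial, then shuffle so run order carries no pattern.
    labels = [t for t in treatments for _ in range(per_group)]
    rng.shuffle(labels)
    # Trial k (counted in run order, starting at 1) receives labels[k - 1].
    return {trial: label for trial, label in enumerate(labels, start=1)}

# Hypothetical labels for the 4 additives and the 20 trials in the example.
additives = ["additive 1", "additive 2", "additive 3", "additive 4"]
plan = completely_randomized_design(20, additives, seed=1)
for trial, additive in plan.items():
    print(f"trial {trial:2d}: {additive}")
```

Because the labels are shuffled over the run order, a gradual drift such as dirtier oil is spread across all four additives instead of falling on whichever additive happens to be tested last.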
In this last example, we wanted to test whether our gas mileage increased depending on which additive we used. Suppose the results show that all additives gave similar gas mileage. Have we shown that any one of them actually increases gas mileage? No. We have only compared them to each other and have not tested whether any of them actually works. To test this, we need to introduce a fifth treatment called a control. A control is a treatment that is actually "no treatment." In the pharmaceutical industry, it would be called a placebo. In our experiment, the control would be gasoline without any additive. If any of the additives outperforms the control, then we have evidence that the additive works.
A few problems that can arise in experimental design have been mentioned. To protect our experiments from these problems and others that are yet to be discussed, we introduce the Principles of Experimental Design.
The Principles of Experimental Design
The first principle of experimental design is to control the effects of lurking variables on the response. In our last experiment with gasoline additives, suppose we had 3 cars available to us. Also assume that they were very different in terms of gas mileage. When we randomly assign which car gets which additive, it is possible that a particular additive is always assigned to the car with the worst gas mileage. This would affect the results of our experiment, making us believe that this additive worked poorly when it could actually be the best. In this example, the type of car used would be called a lurking variable. Lurking variables are variables that have an important effect on the response in an experiment but are not included among the variables studied. The existence of lurking variables leads to confounding, which occurs when the effects of different variables on the response cannot be distinguished from each other.
Example. Two different types of resins can be used in a new plastic polymer. An experiment is performed to test which resin results in a more durable polymer. Suppose there are two methods used to introduce the resin to the polymer, method A and method B. Resin 1 is only produced at a lab in California where they only use method A, while resin 2 is produced in Texas where only method B is used. After the experiment has been performed, the polymer produced in California is found to be more durable, but we are unable to say that this is due to resin 1. The type of resin is confounded with the method used. The polymer in California may be more durable because method A results in more durable polymers, and it is even possible that resin 1 is actually less effective than resin 2.
How can we remove the effects of lurking variables? If we make the conditions of each trial of our experiment identical, then we have removed outside influences. This is more easily said than done. Quite often, lurking variables are not identified until after the experiment has been performed, when it is too late to remove them. We also may not have the resources to remove lurking variables. In the polymer example, if the labs in California and Texas do not have the equipment to perform a new method, then we may have to resort to transporting the resins to one lab and using only one method. This may limit the number of trials since we can only use one lab. In this way we are able to remove the lurking variable completely from the experiment, but sometimes the lurking variable, once identified, becomes of interest in its own right.
Instead of removing a lurking variable, we can include it as part of the treatment. In our example, we have two resins and two methods. From this we can consider four treatments, resin 1 with method A, resin 1 with method B, resin 2 with method A, and resin 2 with method B. Both method and resin are called factors of the treatment. An experiment can have many factors and to calculate the number of treatments resulting from many factors, we multiply the number of levels of each factor together. If there had been 3 methods and 4 resins to be tested, we would have 12 different treatments. In our example, to include method as a factor we could have each lab randomly select half of their resin and ship it to the other lab so that each resin would be introduced with each method.
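The counting rule above (multiply the number of levels of each factor) can be checked with a short Python sketch. The dictionary layout and the helper name `count_treatments` are assumptions made for illustration; the factor levels are the ones from the polymer example.

```python
from itertools import product
from math import prod

def count_treatments(factors):
    """Number of treatments = product of the number of levels of each factor."""
    return prod(len(levels) for levels in factors.values())

# The two factors from the polymer example, each with two levels.
factors = {
    "resin": ["resin 1", "resin 2"],
    "method": ["method A", "method B"],
}

# Every combination of one level per factor is one treatment.
treatments = list(product(*factors.values()))
for resin, method in treatments:
    print(f"{resin} with {method}")

print(count_treatments(factors))

# The larger case from the text: 3 methods and 4 resins.
bigger = {
    "method": ["method A", "method B", "method C"],
    "resin": ["resin 1", "resin 2", "resin 3", "resin 4"],
}
print(count_treatments(bigger))
```

Listing the combinations with `itertools.product` and counting them with the multiplication rule should always agree, which is a handy check when an experiment has more than two factors.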
So far, we've been able to deal with lurking variables either by removing them from the experiment altogether or by considering them as part of the treatment. A third way of dealing with lurking variables is called blocking. Blocking is used when the lurking variable is inherent in the experimental unit and cannot be randomly assigned as a treatment.
Example. Suppose we have iron ore mined from three different locations. We would like to test 5 new types of acid. Some measure of reactivity will be used when the acid is introduced to the ore. We would like to know which acid is most reactive. The iron ore from each location is found to have different impurities which may affect the outcome of our experiment. Since we have a limited amount of ore from each location, we cannot afford to throw out the ore from two locations and consider only identical ore samples. We are also unable to remove the impurities and reapply them in a random fashion, so we cannot consider the location as a factor of the treatment as we did in the polymer example. If all the ore had been identical, we could have used a Completely Randomized Design and randomly split the ore samples into 5 groups, one for each acid type. But since our ore has inherent differences, we will first keep our ore separated by these differences, keeping the samples in their three location groups. Next, within each group we randomly divide the samples into 5 subgroups, one for each acid type. When we measure the reactivity, we will look to see if the type of impurity changes the effect of the acid. The type of design used here is called a Randomized Complete Block Design. In this design, the experimental units are first separated into blocks. Within each block the experimental units are homogeneous, while experimental units chosen from two different blocks may have inherent differences.
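A minimal Python sketch of this blocked assignment follows. The text does not say how many ore samples each location provides, so the 10 samples per location used here are purely an assumption, as are the sample names, the function name `randomized_complete_block_design`, and the `seed` argument.

```python
import random

def randomized_complete_block_design(blocks, treatments, seed=None):
    """Within each block, randomly assign units to treatments in equal numbers."""
    rng = random.Random(seed)
    plan = {}
    for block_name, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"{block_name}: units must divide evenly among treatments")
        reps = len(units) // len(treatments)
        labels = [t for t in treatments for _ in range(reps)]
        # Randomization happens separately inside every block.
        rng.shuffle(labels)
        plan[block_name] = dict(zip(units, labels))
    return plan

acids = [f"acid {i}" for i in range(1, 6)]
# Assumed for illustration: 10 ore samples from each of the 3 locations.
blocks = {
    f"location {b}": [f"loc{b}-sample{s}" for s in range(1, 11)]
    for b in (1, 2, 3)
}
plan = randomized_complete_block_design(blocks, acids, seed=7)
for location, assignment in plan.items():
    print(location)
    for sample, acid in assignment.items():
        print(f"  {sample}: {acid}")
```

The difference from a completely randomized design is that the shuffle never mixes samples across locations, so every acid is tried the same number of times on ore from every location.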
The second principle of experimental design is to use randomization to assign subjects to treatments. The purpose of this is to remove bias. If the researcher has a preconceived notion of how the experiment should work out, it is only human nature to help this happen whether consciously or subconsciously. To keep the researcher's personal thoughts from affecting the outcome, we randomly assign experimental units to treatments. Randomization can be thought of as an insurance policy against bias. This is especially important when publishing results and conclusions of an experiment for the scientific community. We may believe that we are unbiased, but if our results show what we had hoped for, other researchers may be more dubious unless we can prove that we used randomization to protect the experiment from ourselves.
The third principle of experimental design deals with weird occurrences. Every once in a while something really strange will happen. How do we know that this is not the case in any experiment we conduct? Say we have 3 treatments and only 3 experimental units. We randomly match treatment with experimental unit and then perform our experiment. It is always possible that this one time treatment 1 will outperform the other treatments when in fact it is almost always the worst. How could this happen? We could have picked a very strange experimental unit that could affect our experiment in a way that normally would not happen. The researcher could have measured something incorrectly or some kind of freak accident could have happened. To protect us from this random possibility, we replicate our experiment. We repeat our experiment as often as we can afford. A freak accident may still occur, but the effects on our experiment will not be as severe. We study the average outcomes instead of a single trial.
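The effect of replication can be illustrated with a small simulation. Everything in this sketch is hypothetical: the true mean responses (treatment 1 is set up to be the worst), the noise level, and the function name `chance_worst_looks_best` are assumptions chosen only to show the idea that averaging over more trials makes a misleading result less likely.

```python
import random
from statistics import mean

def chance_worst_looks_best(n_reps, n_sims=2000, seed=0):
    """Estimate how often the truly worst treatment has the best average."""
    rng = random.Random(seed)
    true_means = [9.0, 10.0, 10.0]  # hypothetical: treatment 1 is truly the worst
    noise_sd = 1.0                  # hypothetical trial-to-trial variation
    wins = 0
    for _ in range(n_sims):
        # Average n_reps noisy trials for each treatment.
        averages = [
            mean([rng.gauss(mu, noise_sd) for _ in range(n_reps)])
            for mu in true_means
        ]
        if averages[0] > max(averages[1:]):
            wins += 1
    return wins / n_sims

single = chance_worst_looks_best(n_reps=1)
replicated = chance_worst_looks_best(n_reps=10)
print(f"1 trial per treatment:   worst looks best in {single:.1%} of simulations")
print(f"10 trials per treatment: worst looks best in {replicated:.1%} of simulations")
```

Under these assumed numbers, the misleading outcome should show up noticeably less often with 10 trials per treatment than with 1, which is the protection replication is meant to provide.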
Notice that the last two principles both work with random chance: the second principle introduces random chance to remove bias, while the third reduces the impact of random freak occurrences by replicating. Also note that replication does not remove bias. In fact, it may compound bias. If a researcher has a bias for a particular treatment and then repeats the same mistake over and over, the bias will just have a larger effect.
So far in this lesson, two different types of designs have been introduced: the Completely Randomized Design and the Randomized Complete Block Design. A third design, which is very similar to the Randomized Complete Block Design, is called a matched pairs design. We found that when we block our experimental units by their inherent differences in a randomized complete block design, we can protect our experiment from confounding. Sometimes we will have inherent differences in our experimental units but we are unable to block because each unit is so different from the others. We would not be able to form blocks large enough to apply each treatment. The easiest way to see this is by considering people. Suppose you have 2 treatments and 10 people. You would want to apply each treatment 5 times, but you know that all your people are different. They may differ in size, fitness level, gender, ethnicity, etc. Now consider your 10 people to be 5 sets of identical twins. Within each set of twins, we can randomly pick one twin to receive the first treatment and give the second treatment to the other twin. In this way, the lurking variables among the people have been neutralized. But this is an overidealized case. It is not often you have 5 sets of twins.
Example. Suppose you have 10 very different people. You want to conduct an experiment to see which of two insect repellents works best. You could separate your ten people randomly into 2 groups of 5 and then apply one repellent to each group. But this does not alleviate the problem that the people in one group may react to the repellents differently from the people in the other group. Instead, consider your set of experimental units to be 10 pairs of arms. For each person, randomly select one arm to receive repellent A and then apply repellent B to the other arm. In this fashion we have applied each treatment to each person. We can see how each person reacts to both treatments. How many experimental units do we have under this second organization? Since the experimental units are arms and we have 20 arms, we have 20 experimental units.
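The arm-by-arm randomization in this example could be sketched as follows. The person labels, the function name `matched_pairs_assignment`, and the `seed` argument are assumptions made for illustration.

```python
import random

def matched_pairs_assignment(people, treatments=("repellent A", "repellent B"), seed=None):
    """For each person, randomly decide which arm receives which treatment."""
    rng = random.Random(seed)
    plan = {}
    for person in people:
        order = list(treatments)
        # A separate coin flip for every person: shuffle the two treatments.
        rng.shuffle(order)
        plan[person] = {"left arm": order[0], "right arm": order[1]}
    return plan

# Hypothetical labels for the 10 people in the example.
people = [f"person {i}" for i in range(1, 11)]
plan = matched_pairs_assignment(people, seed=3)
for person, arms in plan.items():
    print(person, arms)

# Each arm is one experimental unit.
n_units = sum(len(arms) for arms in plan.values())
print(n_units)
```

Each person serves as their own block of size two, so differences between people affect both repellents equally, and only the choice of arm is left to chance.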
Reading Review Questions