Homework 1, St 711

Due date listed on web. (Just place it on the front desk when you come to class.)

A proper randomization scheme would make the probability of the first two units getting the same treatment the same as the probability that the last two get the same treatment. In assigning 10 cows to 2 treatments, five per treatment, the probability that the second cow gets the same treatment as the first is 4/9 in a proper randomization scheme, since there are 9 equally likely possibilities remaining and 4 of them match the first choice.

Suppose my treatment units are 10 cows on which I will run my experiment, getting one observation per cow. I have 2 treatments, perhaps 2 ways of doing the morning milking of a cow, and I will assign them as follows. For cow 1, I toss a coin. If it comes up heads, I use treatment 1, and if tails, treatment 2. I do the same on all subsequent cows until I have used one of the treatments 5 times. Then I will do the rest of the trials with the other treatment.

1. What is the probability of tossing exactly 4 heads (and 4 tails) in 8 tosses of a fair coin? (You will see later why I asked this.)

2. Run this program to see how the treatments are assigned in 100 repeats of this "randomization" scheme. Note that the numeric value for cowj will be the treatment assigned to cow j.

    data random;
      array cow(10);
      keep cow1-cow10 match910;
      do trial = 1 to 100;
        sum1 = 0; sum2 = 0;   /* times each treatment has been used so far */
        do unit = 1 to 10;
          /* coin toss: ranuni returns a uniform(0,1) draw */
          if ranuni(1635409) > 0.5 then do; cow(unit) = 1; sum1 = sum1 + 1; end;
          else do; cow(unit) = 2; sum2 = sum2 + 1; end;
          /* once one treatment has been used 5 times, give the other  */
          /* treatment to all remaining cows and end the unit loop     */
          if sum1 = 5 then do i = unit to 9; cow(i+1) = 2; unit = 10; end;
          if sum2 = 5 then do i = unit to 9; cow(i+1) = 1; unit = 10; end;
        end;
        match910 = (cow9 = cow10);   /* 1 if cows 9 and 10 match, else 0 */
        output;
      end;
    proc print data=random(obs=20); run;
    proc means sum data=random; var match910; run;

(A) How many times (out of 100) did cows 9 and 10 get the same treatment?
Note: You can print everything out and count by hand, but since X=(A=B); is a logical expression that sets X=1 when A and B are equal and X=0 otherwise, the program uses PROC MEANS with this idea to count how many times the condition holds.

(B) How many times (out of 100) did cows 1 and 2 get the same treatment? Do the answers to (A) and (B) seem close to each other? Why is that of interest?

3. What is the probability that I will have to toss the coin when I get to cow 9? Use the fact that the only way this could happen is if I had tossed exactly 4 heads and 4 tails up to that point. If I have stopped tossing when I get to cow 9, what is the implication for the cow 9 and 10 treatments?

4. Using the above, compute the probability that cows 9 and 10 will get the same treatment. How many, then, out of 100 do you expect to get the same treatment? Comment on how that compares to what you observed. How does it compare to the 100(4/9)=44.4 that we would expect under a proper randomization scheme (see the introduction)?

5. In a binomial situation with 100 trials, where p is the probability of some event (like cows 9 and 10 getting the same treatment) and f is the observed proportion of times that event occurred, p(1-p)/100 is the variance of f, and Z=(f-p)/s is approximately N(0,1), where s is the square root of p(1-p)/100. Using the program results, find the Z that goes with f = proportion of the 100 trials in which cows 9 and 10 got the same treatment. For p, use the probability that cows 9 and 10 get the same treatment when coin tossing is used. Is |Z|>1.96?

6. In the program, 1635409 is referred to as a "seed" for the random number generator. Maybe our results just occurred by accident. Change the seed to 49153 and comment on how the 2 counts (first 2 same, last 2 same) change. You might also run a few more seeds just to see how volatile these numbers are (just write up the 49153 result).

7. Assigning treatments as described here is not uncommon.
Briefly summarize in a few sentences what you have learned about the validity of this randomization scheme.

Note: There may be nothing particularly important about the order in which the cows enter the milking barn, but then again there may. The last cows may be in some ways weaker or more lethargic than those that enter first (which may or may not affect the milking). The point of randomization is that it guards against any such gradient, whereas if we do not randomize, we must guarantee that we've thought of all possible such effects and that none of them matters.

Optional (not collected)

You might redo question 5 letting p be the expected probability under a proper randomization. In that way you are doing a computer experiment to test the hypothesis that the scheme is a valid randomization scheme. You might also try, analytically or using the computer, to figure out how many cows it would take to make the probability of the last two getting the same treatment some prechosen value like 0.99.
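
For anyone who wants to try the optional computer experiments outside SAS, here is a minimal Python sketch of the same idea. It is not part of the assignment. The function names (coin_toss_assign, proper_assign, count_matches, z_statistic) are made up for this illustration, and an even number of cows is assumed. Python's random number generator is not SAS's ranuni, so the same seed will not reproduce the SAS counts.

```python
import math
import random


def coin_toss_assign(rng, n_cows=10):
    """Coin-toss scheme from the handout: toss for each cow until one
    treatment has been used n_cows // 2 times, then give every remaining
    cow the other treatment. Assumes n_cows is even."""
    half = n_cows // 2
    used = {1: 0, 2: 0}
    cows = []
    for _ in range(n_cows):
        if used[1] == half:
            treatment = 2          # treatment 1 is used up, no toss
        elif used[2] == half:
            treatment = 1          # treatment 2 is used up, no toss
        else:
            treatment = 1 if rng.random() > 0.5 else 2
        used[treatment] += 1
        cows.append(treatment)
    return cows


def proper_assign(rng, n_cows=10):
    """A proper randomization for comparison: shuffle a list holding
    n_cows // 2 copies of each treatment. Assumes n_cows is even."""
    half = n_cows // 2
    cows = [1] * half + [2] * (n_cows - half)
    rng.shuffle(cows)
    return cows


def count_matches(assign, n_trials=100, seed=49153, n_cows=10):
    """Return (trials where the first two cows match,
               trials where the last two cows match)."""
    rng = random.Random(seed)
    first_same = last_same = 0
    for _ in range(n_trials):
        cows = assign(rng, n_cows)
        first_same += cows[0] == cows[1]
        last_same += cows[-2] == cows[-1]
    return first_same, last_same


def z_statistic(count, n_trials, p):
    """Z = (f - p) / s with f = count / n_trials and
    s = sqrt(p * (1 - p) / n_trials), as in question 5."""
    f = count / n_trials
    s = math.sqrt(p * (1 - p) / n_trials)
    return (f - p) / s


if __name__ == "__main__":
    for name, scheme in [("coin toss", coin_toss_assign),
                         ("shuffle", proper_assign)]:
        first_same, last_same = count_matches(scheme, n_trials=100)
        # p = 4/9 is the proper-randomization value from the introduction
        z = z_statistic(last_same, 100, 4 / 9)
        print(f"{name}: first two same {first_same}/100, "
              f"last two same {last_same}/100, Z vs 4/9 = {z:.2f}")
```

Raising n_cows in count_matches (with a larger n_trials) is one way to explore the last optional question by computer. The result is a simulation estimate, so it will vary from seed to seed.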