mathepi.com
under construction; last updated October 13, 2003
Computational Epidemiology Course, Lecture 7 (in progress)

Sequential Repetition, Continued

We have seen that we can accomplish sequential repetition by having a function continue to call copies of itself, progressively simplifying the problem at hand, until all the needed repetition is done (recursion). In principle, this mechanism is sufficient to solve any programming problem, and moreover, it is especially effective when the data structures themselves are hierarchical (we won't really treat hierarchical data structures this semester).

But R does not always perform recursion efficiently. And for many algorithms, it can be more work to express them in terms of recursive function calls. In this lecture we will learn the classical alternative to recursion: the loop. In R, there are two main looping constructions: the counted loop and the conditional loop.

The Counted Loop

The counted loop is essentially another way to implement the map concept. Basically we start with a collection of objects, and we have a series of commands we wish to do for every member of the collection. In a counted loop, the system first sets a loop variable to the first member of the collection, and then does the desired commands. Then, the loop variable is assigned to the second member of the collection, and the desired commands are done again. This is repeated until the loop variable has been assigned the value of the last member of the collection and the commands carried out.

Imagine that we wish for instance to square the numbers 3, 9, 4, and 7. Of course, the best way to do this in R is c(3,9,4,7)^2, using the vectorized operator ^ to square the whole list. But I'm going to show you how you do this with the counted loop. To start with, we might just square them separately:
> 3^2
[1] 9
> 9^2
[1] 81
> 4^2
[1] 16
> 7^2
[1] 49
>
Here, we have just used duplication instead of repetition. We did the repeating ourselves, by typing the squaring operation over and over. We want the computer to do the repeating. To get an idea of how the counted loop works, let's do the computation a different way:
> ii <- 3
> ii^2
[1] 9
> ii <- 9
> ii^2
[1] 81
> ii <- 4
> ii^2
[1] 16
> ii <- 7
> ii^2
[1] 49
>
This is actually more work, but it illustrates how the counted loop works. In a counted loop, the R system automates the above process of sequentially assigning some variable a predetermined sequence of values, and doing a command for each.

We could be fancier and ask the computer to print a message to the user each time:
> ii <- 3
> cat("The square of ",ii," is ",ii^2,".\n")
The square of  3  is  9 .
> ii <- 9
> cat("The square of ",ii," is ",ii^2,".\n")
The square of  9  is  81 .
> ii <- 4
> cat("The square of ",ii," is ",ii^2,".\n")
The square of  4  is  16 .
> ii <- 7
> cat("The square of ",ii," is ",ii^2,".\n")
The square of  7  is  49 .
>
This is not so bad. We used the cat command to build up the user output. Notice that cat puts a single space between its arguments by default, so the spaces we typed inside the strings are not strictly required; they only widen the gaps. We did have to put the backslash-n at the end; this produces a newline. Because the number we're squaring appears twice in the output (once as itself, and once to be squared), it is nice to have assigned it to the variable ii at first. If we were typing the entire cat command over and over and changing the number, we might forget to change it in one of the two places and make an error.
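If the extra spaces are unwanted, one option is to switch off the separator that cat inserts. The lines below are a minimal sketch using the sep argument of cat and the base function paste0; the variable name msg is made up for this illustration.

```r
ii <- 3

# sep = "" stops cat from inserting a space between its arguments,
# so the only spaces printed are the ones typed inside the strings
cat("The square of ", ii, " is ", ii^2, ".\n", sep = "")
# prints: The square of 3 is 9.

# paste0 glues the same pieces into one string that can be stored or reused
msg <- paste0("The square of ", ii, " is ", ii^2, ".")
cat(msg, "\n", sep = "")
```

With sep = "" the message reads naturally, at the cost of having to type every space yourself.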

Let's see what the counted loop looks like. It is called for in R. In the example below, we have the for command itself, the loop variable, the sequence of values to loop through, and the commands to do for each value.
> for (ii in c(3,9,4,7)) {
+ cat("The square of ",ii," is ", ii^2,".\n")
+ }
The square of  3  is  9 .
The square of  9  is  81 .
The square of  4  is  16 .
The square of  7  is  49 .
>
Notice the presence of the in keyword to separate the loop variable from the sequence of values to loop through.

In many cases, you use the structure to perform an action a number of times:
> for (ii in 1:10) {
+ cat("I'm counting--I'm at ",ii,"!\n")
+ }
I'm counting--I'm at  1 !
I'm counting--I'm at  2 !
I'm counting--I'm at  3 !
I'm counting--I'm at  4 !
I'm counting--I'm at  5 !
I'm counting--I'm at  6 !
I'm counting--I'm at  7 !
I'm counting--I'm at  8 !
I'm counting--I'm at  9 !
I'm counting--I'm at  10 !
>

Let's do the factorial example using a counted loop. Remember we can always do it using prod; for example, the factorial of 8 can be computed by prod(1:8). So let's do it using a counted loop:
> ans <- 1
> for (ii in 1:8) {
+ ans <- ans * ii
+ }
> print(ans)
[1] 40320
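The same loop can be wrapped in a function so that it works for any non-negative whole number. This is only a sketch: the name fact_loop is hypothetical, and it uses seq_len(n) instead of 1:n so that the case n = 0 is handled.

```r
# Hypothetical helper name; the lecture computes the factorial at the prompt.
fact_loop <- function(n) {
  ans <- 1
  # seq_len(n) is 1, 2, ..., n, and is empty when n is 0,
  # so the loop body is skipped and fact_loop(0) returns 1
  for (ii in seq_len(n)) {
    ans <- ans * ii
  }
  ans
}

fact_loop(8)   # 40320, the same as prod(1:8)
fact_loop(0)   # 1
```

Had we written 1:n, the call fact_loop(0) would loop over c(1, 0) and return 0, because 1:0 counts downward rather than being empty.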

Let's use a counted loop to add up the elements of a list. For definiteness, let's take a look at data from the California recall election of 2003. According to the State of California, as of Oct. 13, 2003, with 100% of precincts reporting, the numbers of votes received by the top ten candidates were as follows:
> topten <- c(3850982,2504640,1053968,218852,44201,22979,15875,13015,11257,10316)
> tot <- 0
> for (ii in topten) {
+ tot <- tot + ii
+ }
> print(tot)
[1] 7746085

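Here is another way to do the same thing. This time, the loop variable steps through the positions 1, 2, ..., length(topten) rather than the vote counts themselves, and we use it as an index to pick out each element in turn: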
> topten <- c(3850982,2504640,1053968,218852,44201,22979,15875,13015,11257,10316)
> tot <- 0
> for (ii in 1:length(topten)) {
+ tot <- tot + topten[ii]
+ }
> print(tot)
[1] 7746085
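As a check on the index-based loop, the sketch below compares its result with the built-in sum. It also swaps in seq_along for 1:length(...); this is a common R idiom rather than something the lecture uses, and the variable name empty is made up for the illustration.

```r
topten <- c(3850982, 2504640, 1053968, 218852, 44201,
            22979, 15875, 13015, 11257, 10316)

tot <- 0
# seq_along(topten) gives 1, 2, ..., length(topten)
for (ii in seq_along(topten)) {
  tot <- tot + topten[ii]
}

# The built-in sum should agree with the loop
tot == sum(topten)   # TRUE

# Why seq_along can be safer: for an empty vector,
# 1:length(x) is c(1, 0), which is not empty
empty <- numeric(0)
length(1:length(empty))    # 2
length(seq_along(empty))   # 0
```

For a vector that is known to be non-empty, as here, the two forms behave the same.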

We can do any map operation using the counted loop. So for instance, let's square the numbers from 1 to 10, in reverse:
> final <- 10
> vals <- final:1
> ans <- rep(0,length(vals))
> for (ii in 1:length(vals)) {
+ ans[ii] <- vals[ii]^2
+ }
> print(ans)
 [1] 100  81  64  49  36  25  16   9   4   1
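To see that the loop really is a map, the sketch below computes the same squares three ways: with the counted loop, with the base function sapply, and with the vectorized ^ operator. The names ans_loop, ans_sapply, and ans_vec are chosen just for this comparison.

```r
vals <- 10:1

# Counted loop, as in the lecture
ans_loop <- rep(0, length(vals))
for (ii in seq_along(vals)) {
  ans_loop[ii] <- vals[ii]^2
}

# sapply applies the function to each element and collects the results
ans_sapply <- sapply(vals, function(v) v^2)

# The vectorized operator does the whole map in one step
ans_vec <- vals^2

all(ans_loop == ans_vec)     # TRUE
all(ans_sapply == ans_vec)   # TRUE
```

In practice the vectorized form is usually the shortest and fastest; the loop earns its keep when each step is more involved than a single arithmetic operation.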

More examples of the counted loop

Let's do another example. You've heard it said that the probability of heads on a fair coin toss is 0.5. What does this mean? It means that the relative frequency of heads should approach 0.5 if you do enough tosses. So here's what I want to do. Let's try to toss a coin (a simulated, computer coin, that is) a certain number of times, and keep track of the results. Let's try to toss a coin 10, 100, 200, 500, 1000, 2000, 5000, 10000, 20000, 50000, and 100000 times, and keep track of the number of heads. So we've got 11 simulations to do, and we'll have 11 different results; we'll need a vector of length 11 to hold the results.
> ntrials <- c(10,100,200,500,1000,2000,5000,10000,20000,50000,100000)
> nheads <- rep(NA,length(ntrials))
> for (ii in 1:length(nheads)) {
+ nheads[ii] <- sum(sample(c("H","T"),ntrials[ii],replace=TRUE)=="H")
+ }
> nheads
This is an example of the map pattern. In goes a collection of numbers of times to toss the coin; out comes the number of heads on that many tosses. So we are doing a parallel operation sequentially by means of the loop.
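To connect the counts back to the claim about relative frequency, one possible follow-up is to divide each count by the number of tosses. The sketch below assumes an arbitrary seed (2003), chosen only so that a run can be repeated, and the name relfreq is made up here. The exact numbers depend on the random draws, so none are shown.

```r
set.seed(2003)  # arbitrary seed, an assumption made only for repeatability

ntrials <- c(10, 100, 200, 500, 1000, 2000, 5000, 10000, 20000, 50000, 100000)
nheads <- rep(NA, length(ntrials))
for (ii in seq_along(ntrials)) {
  # simulate ntrials[ii] tosses and count the heads
  tosses <- sample(c("H", "T"), ntrials[ii], replace = TRUE)
  nheads[ii] <- sum(tosses == "H")
}

# Relative frequency of heads in each simulation
relfreq <- nheads / ntrials
print(cbind(ntrials, nheads, relfreq))
```

The relative frequencies for the small runs can wander noticeably from 0.5, while those for the largest runs should sit close to it.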

For a final example of sequential repetition, let's look at a so-called birth process. In the science fiction movie Island of Terror, medical research accidentally creates silicate monsters whose numbers double every 6 hours. Imagine that we start with one of them and record how many there are at each doubling time, for the first 100 doubling times.
> ntimes <- 100
> nn <- rep(1,ntimes)
> for (ii in 2:ntimes) {
+ nn[ii] <- 2*nn[ii-1]
+ }
> nn
Notice that we start the loop at the second position, and calculate each element of the list nn in terms of the previous value. This is how you may use the counted loop to undertake sequential repetitions.
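Because each entry is twice the one before it and the first entry is 1, position ii should hold 2^(ii-1). The sketch below checks the loop against that closed form; the name closed is introduced only for this comparison.

```r
ntimes <- 100
nn <- rep(1, ntimes)
for (ii in 2:ntimes) {
  # each count is double the previous one
  nn[ii] <- 2 * nn[ii - 1]
}

# Closed form: after k doublings there are 2^k monsters, k = 0, 1, ..., ntimes - 1
closed <- 2^(0:(ntimes - 1))

isTRUE(all.equal(nn, closed))   # TRUE
nn[ntimes]                      # 2^99, roughly 6.3e29
```

A check like this is a cheap way to catch an off-by-one error in where the loop starts or stops.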

As an exercise, see if you can convert the needle reuse computations from the previous lecture to use a for loop instead of recursion.

In the next lecture, we will learn the other main loop construct in R: the conditional loop while.