Let's think about an extremely useful and realistic case, a simple "threshold" function: if x < thresh, it returns a, otherwise it returns b. A schematic of a possible set of data, the function, and its parameters might look like this:

discont0.gif

Here's an R function to generate the expected values of such a threshold function given a vector of independent-variable values x and the parameters a, b, and thresh:
thrfun <- function(x,a,b,thresh) {
  ifelse(x<thresh,a,b)
}
Technical note: the reason we use ifelse is that if in R only operates on one item at a time. If x is a vector and you write if (x<10), older versions of R just look at x[1] (with a warning), and R 4.2.0 and later stop with an error. ifelse looks at every item in the vector, which is what we want in this case.
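
To make the difference concrete, here is a minimal sketch. The vector x and the cutoff 3 are made up for illustration and are not part of the example data.

```r
x <- c(1, 2, 3, 4, 5)   # hypothetical values, for illustration only

# ifelse() tests every element and returns a vector of the same length
stepped <- ifelse(x < 3, 2, 5)
print(stepped)

# if() wants a single TRUE/FALSE; to use it with a vector, first reduce
# the condition to one value, e.g. with any() or all()
if (any(x < 3)) {
  message("at least one value lies below the cutoff")
}
```

The first two elements are below the cutoff and get the value 2; the rest get 5.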

Now let's generate some data with a step function in them.

data.x <- seq(0,5,length=20)
det.y <- thrfun(data.x,2,5,3)
data.y <- det.y+rnorm(length(data.x),0,0.5)
plot(data.x,data.y)
This produces (for example) the following data:

discont1.gif

Now we need to construct a sum-of-squares/likelihood function if we're going to fit the parameters to data. I'm writing the function to take the parameters as a single vector rather than as separate arguments (e.g. function(a,b,thresh,x,y)), since this is the form that nlm and the other minimizers expect (we won't go that far in this example, but it's useful to get in the habit).
likfun <- function(pvec,x=data.x,y=data.y) {
  # pvec = c(a, b, thresh)
  exp.y <- thrfun(x,pvec[1],pvec[2],pvec[3])
  # use the y argument (not the global data.y) so other data can be passed in
  sum((exp.y-y)^2)
}
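
As a quick sanity check, the sketch below evaluates the sum of squares at the parameters used to simulate the data and at a deliberately misplaced threshold. It restates thrfun and likfun so that it runs on its own, with the residuals computed from the y argument. The seed, the names ss.true, ss.wrong and ss.toy, and the two-point toy data set are assumptions added for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

thrfun <- function(x, a, b, thresh) ifelse(x < thresh, a, b)

# Same recipe as above: step from 2 to 5 at x = 3, noise s.d. 0.5
data.x <- seq(0, 5, length.out = 20)
data.y <- thrfun(data.x, 2, 5, 3) + rnorm(length(data.x), 0, 0.5)

likfun <- function(pvec, x = data.x, y = data.y) {
  exp.y <- thrfun(x, pvec[1], pvec[2], pvec[3])
  sum((exp.y - y)^2)
}

ss.true  <- likfun(c(2, 5, 3))   # parameters used to simulate the data
ss.wrong <- likfun(c(2, 5, 1))   # threshold placed too far to the left
print(c(true = ss.true, wrong = ss.wrong))

# Passing other data through the x and y arguments: a noise-free toy
# set that the step function fits exactly, so the sum of squares is 0
ss.toy <- likfun(c(2, 5, 3), x = c(0, 4), y = c(2, 5))
print(ss.toy)
```

The misplaced threshold should give a much larger sum of squares, because every point between x = 1 and x = 3 is then predicted as 5 instead of 2.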

Now let's assume, for simplicity, that we know a and b and we're just trying to fit thresh to the data. We'll define thrvec as a set of values that we want to calculate the sum-of-squares for, and then use sapply to calculate likfun for each value.

thrvec <- seq(0,5,length=100)
tmpfun <- function(z) likfun(c(2,5,z))
ssvec <- sapply(thrvec,tmpfun)
plot(thrvec,ssvec,type="l",xlab="threshold parameter",ylab="S.S.")
This gives us the following S.S. plot:

discont2.gif

Notice the "staircase" effect: each of the treads (flat parts) of the staircase is an x range between two adjacent data points. Between these data points, it doesn't matter how we change the threshold value. For example, if we don't have any data between x = 1 and x = 2, it doesn't matter whether the threshold is 1.5 or 1.6. These sharp changes in the sum-of-squares function will give derivative-based minimizers hiccups.
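
The flat treads can be checked numerically. The sketch below is self-contained and makes a few assumptions of its own: the seed, the helper ss, the step size h, and the choice of the sixth and seventh data points as an example tread. The grid search at the end is just one simple way around the problem when there is a single parameter, not a general recommendation.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

thrfun <- function(x, a, b, thresh) ifelse(x < thresh, a, b)
data.x <- seq(0, 5, length.out = 20)
data.y <- thrfun(data.x, 2, 5, 3) + rnorm(length(data.x), 0, 0.5)

# Sum of squares as a function of the threshold only (a = 2, b = 5 known)
ss <- function(thresh) sum((thrfun(data.x, 2, 5, thresh) - data.y)^2)

# Two thresholds strictly between neighbouring x values (data.x[6] and
# data.x[7]) split the data identically, so they sit on the same tread
gap <- data.x[7] - data.x[6]
t1 <- data.x[6] + 0.25 * gap
t2 <- data.x[6] + 0.75 * gap
print(ss(t1) == ss(t2))

# A finite-difference slope taken on a tread is exactly zero, so the
# gradient gives no hint about which direction leads downhill
h <- 1e-4
slope <- (ss(t1 + h) - ss(t1)) / h
print(slope)

# With one parameter, evaluating on a grid and taking the smallest
# value avoids derivatives altogether
thrvec <- seq(0, 5, length.out = 100)
ssvec <- sapply(thrvec, ss)
best <- thrvec[which.min(ssvec)]
print(best)
```

Any threshold on the lowest tread fits equally well, so best is only one representative of a whole interval of equally good values; which.min simply returns the first grid point on that tread.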

