A story of the great cicada census
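According to Frank, the conversion of cicada mass into wasp mass should be carried out via a power model of the form $m_w = a_t\, m_c^{\,b_t}$, where $m_c$ is the cicada's mass, $m_w$ is the mass of the wasp it produces, and $a_t$ and $b_t$ are parameters to be estimated.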
The calculations and estimates
There are many parameters to estimate in the process described above (unknown a priori, to varying degrees). Here's a list of (most of) them (and some preliminary estimates):
- 2 for the transformation of cicada mass into wasp mass (assuming a power model);
- 8 (as many as) for the conversion of cicada RWL (Right Wing Length) into mass (assuming power models including species);
- Tentative model (assuming only two -- treating all cicadas the same):
- 2 for the conversion of wasp RWL into mass (assuming a power model -- we actually need this conversion in the opposite direction, according to the story above);
- Tentative model:
- 8 for the means and standard deviations of the cicada RWL (assuming normal distributions and species dependence -- estimates in the table below);
- 3 for the relative proportions of the cicada species in the population of cicadas (assuming we want a density, since the fourth is determined once three are obtained); and
- some parameters related to the distributions of male and female wasps in the two distinct locations of Newberry and St. Johns. Both populations are important, as the females are the census takers (but don't sample opportunistically), and the male distributions should be a direct result of the census taken.
Table containing the estimates for means and standard deviations (parameters) for the (presumed) normal distributions of cicada right-wing lengths, by species.
We do not estimate all the parameters in the same way, however. Ideally we might estimate all of these parameters simultaneously, and find the best combination of parameters to make the story fit together best; however, the chance of being able to find a global minimum of a system with 23 or more parameters starting from scratch is pretty small -- so we take a different approach. We estimate parameters along the way, where it's easy, and then leave a few that we can't estimate "locally" to find using the procedure we're about to describe.
What do we mean by estimating "locally?" An example is the parameters linking the wasp mass and the wasp RWL. We have good data on the wasps, including their masses and their RWLs. We plot the scatterplot, we see what appears to be a power relation, and we estimate a model. "Case closed" (or almost): because this model feeds into the rest of our estimation procedures, any errors that we make in estimating these parameters will feed into errors in the estimates of other parameters.
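As a minimal sketch of what such a "local" estimate might look like, the following fits a power model by least squares on the log-log scale. This is one common way to do it and not necessarily the procedure used for the wasp data; the function name `fit_power_model` and the numbers are made up for illustration.

```python
import math

def fit_power_model(lengths, masses):
    """Fit mass = a * length**b by least squares on the log-log scale.

    Taking logs turns the power model into a straight line,
    log(mass) = log(a) + b * log(length), which has a closed-form fit.
    """
    xs = [math.log(l) for l in lengths]
    ys = [math.log(m) for m in masses]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = exponent of the power model
    a = math.exp(y_bar - b * x_bar)    # intercept, back-transformed
    return a, b

# Made-up, noise-free data from mass = 0.002 * RWL**3 (NOT the real wasp data).
rwl = [18.0, 20.0, 22.0, 24.0, 26.0]
mass = [0.002 * l ** 3 for l in rwl]
a_hat, b_hat = fit_power_model(rwl, mass)
print(a_hat, b_hat)
```

With real, noisy data the fitted `a_hat` and `b_hat` carry error, and that error is what propagates into every later step that uses this model.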
As the saying goes, "All models are wrong; some models are useful." We know that we will make errors: do we want to estimate certain parameters and treat them as perfect and then estimate other parameters based on them, or treat all parameters as error prone and estimate them all at once (supposing the errors to be distributed up and down the line)? We have essentially estimated 18 parameters above (with several of them estimated as zeros). That leaves a few parameters that haven't yet been estimated, but which we can estimate by the methods we describe below.
The parameters we estimate non-locally in the following are the parameters related to the relative abundance of the four (well, three, really) types of cicadas -- small, medium, and large -- and the parameters related to the conversion of cicada mass into wasp mass. The mass-to-mass conversion parameters can be estimated in the laboratory (and have been -- Jon? Is that right?).
I propose two ways to estimate the relative sizes of the cicada populations (and that's all we can get: not real numbers, but relative numbers). Common to both will be several assumptions.
- Male wasps are produced from a single cicada.
- If the wasps practice sex allocation by size of cicada (e.g. small ones are turned into males, larger ones are reserved for females), then there needs to be an adjustment: the females wouldn't be sampling randomly from the cicada populations to turn them into males. According to Grant, that is not a problem.
- Another question for sex allocation is this: do the wasps use times of prey scarcity to turn to male production, and produce females when prey are abundant?
Let $l$ be the length (RWL) of an animal and let $m$ be the mass of an animal (a male animal, in the case of the wasps). The subscript "w" will indicate wasps, and "c" subscripts indicate cicadas; "t" subscripts refer to a "transfer" (between wasps and cicadas). In this model, we're going to assume that
- wasps sample the cicadas according to their wing length, according to the relationship indicated in the kernel-smoothed graph obtained by Katie;
- there are only three types of cicadas (small, medium, and large -- we combine the NH and DO cicadas);
- the cicada distribution in the two locations -- Newberry and St. Johns -- is the same;
- the cicada distribution is stable in structure, as are the wasp populations;
- there is a single, species-independent equation that transforms cicada RWL into mass.
Several (if not all) of those assumptions are suspect: for example, backwards-stepwise regression showed that there is a species effect in the relationship between mass and RWL for cicadas.
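We model the distribution of cicada wing lengths in these Florida locations with the following density (where there are three kinds of cicadas -- small, medium, and large):
$$f_c(l_c) = p_1\,\phi(l_c;\mu_1,\sigma_1) + p_2\,\phi(l_c;\mu_2,\sigma_2) + p_3\,\phi(l_c;\mu_3,\sigma_3),$$
where $\phi(\,\cdot\,;\mu,\sigma)$ represents a normal density with mean $\mu$ and standard deviation $\sigma$, and $p_1 + p_2 + p_3 = 1$. The proportions $p_i$ are to be determined (see below for the transfer parameters $a_t$ and $b_t$). All other parameters have been estimated locally.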
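Here is a small sketch of such a three-component density, mostly to make the roles of the parameters concrete. The proportions, means and standard deviations in `COMPONENTS` are placeholders, not the estimates from the table, and the function names are ours.

```python
from statistics import NormalDist

# Hypothetical (proportion, mean RWL, sd) triples for small, medium and large
# cicadas. These numbers are placeholders, not the estimates from the table.
COMPONENTS = [(0.5, 25.0, 2.0), (0.3, 35.0, 2.5), (0.2, 45.0, 3.0)]

def cicada_rwl_density(l, components=COMPONENTS):
    """Mixture density: proportion-weighted sum of normal densities."""
    return sum(p * NormalDist(mu, sd).pdf(l) for p, mu, sd in components)

def total_mass(components=COMPONENTS, lo=0.0, hi=80.0, step=0.01):
    """Midpoint-rule check that the density integrates to (about) one."""
    n = int(round((hi - lo) / step))
    return sum(cicada_rwl_density(lo + (i + 0.5) * step, components) * step
               for i in range(n))

print(total_mass())
```

Because the proportions must sum to one, only two of the three are free, which is why the third never needs to be estimated separately.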
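Now, for a particular cicada of given right wing length $l_c$, we have that its mass is estimated as
$$m_c = a_c\, l_c^{\,b_c}.$$
Then the cicada's mass is converted to wasp mass (according to Frank) via a power model, so that the wasp formed by eating this cicada would have mass
$$m_w = a_t\, m_c^{\,b_t}.$$
Lab estimates suggest that this relationship is linear (i.e. $b_t = 1$), with a constant of proportionality $a_t$.
Converting from the mass of the wasp to the wasp's length $l_w$ by inverting the wasp power model $m_w = a_w\, l_w^{\,b_w}$,
$$l_w = \left(\frac{m_w}{a_w}\right)^{1/b_w},$$
then finally we've got that
$$l_w = \left(\frac{a_t \left(a_c\, l_c^{\,b_c}\right)^{b_t}}{a_w}\right)^{1/b_w}.$$
That is, there is a simple power model relationship between the two lengths:
$$l_w = \alpha\, l_c^{\,\beta}$$
(where we have hidden a little of the mess by defining a couple of new parameters, $\alpha = \left(a_t\, a_c^{\,b_t} / a_w\right)^{1/b_w}$ and $\beta = b_c\, b_t / b_w$). If we can estimate $\alpha$ and $\beta$, then we'll have estimated $a_t$ and $b_t$ as
$$b_t = \frac{\beta\, b_w}{b_c}, \qquad a_t = \frac{a_w\, \alpha^{\,b_w}}{a_c^{\,b_t}}.$$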
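The chain of conversions can be sketched in code, which also makes it easy to check that nothing is lost when the mess is hidden in two new parameters. The names below (`a_c`, `b_c` for the cicada RWL-to-mass model, `a_t`, `b_t` for the mass-to-mass transfer, `a_w`, `b_w` for the wasp RWL-to-mass model, and `alpha`, `beta` for the combined length-to-length model) are our own labels, and the numbers are invented.

```python
def compose_power_models(a_c, b_c, a_t, b_t, a_w, b_w):
    """Return (alpha, beta) such that wasp RWL = alpha * cicada_RWL**beta.

    Chain being composed:
      cicada mass = a_c * cicada_RWL**b_c
      wasp mass   = a_t * cicada_mass**b_t
      wasp mass   = a_w * wasp_RWL**b_w   (inverted to get wasp RWL)
    """
    beta = b_c * b_t / b_w
    alpha = (a_t * a_c ** b_t / a_w) ** (1.0 / b_w)
    return alpha, beta

def recover_transfer_parameters(alpha, beta, a_c, b_c, a_w, b_w):
    """Invert the composition: get (a_t, b_t) back from (alpha, beta)."""
    b_t = beta * b_w / b_c
    a_t = a_w * alpha ** b_w / a_c ** b_t
    return a_t, b_t

# Invented parameter values, for illustration only.
alpha, beta = compose_power_models(0.0005, 2.8, 0.4, 1.0, 0.002, 3.0)
a_t, b_t = recover_transfer_parameters(alpha, beta, 0.0005, 2.8, 0.002, 3.0)
print(alpha, beta, a_t, b_t)
```

The round trip only works because the cicada and wasp RWL-to-mass parameters are treated as known, so any error in those local estimates lands directly in the recovered transfer parameters.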
From this, we'd predict that the distribution of the wasp lengths would satisfy , and that the predicted mean wing length of the wasps would be
(and our results indicate that this is almost perfectly so: in St. Johns, 24.86 versus 24.25; in Newberry, 21.85 versus 21.75).
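A predicted mean of this kind can be computed numerically. The sketch below assumes that wasp RWL is a power function of cicada RWL (with hypothetical parameters `alpha` and `beta`), that the cicada RWL density is a mixture of normals, and that wasps take cicadas in proportion to their abundance. All numbers are placeholders.

```python
from statistics import NormalDist

def predicted_mean_wasp_rwl(alpha, beta, components, step=0.01):
    """Midpoint-rule integral of alpha * l**beta against the cicada density.

    components: list of (proportion, mean RWL, sd) for each cicada type.
    """
    dists = [(p, NormalDist(mu, sd)) for p, mu, sd in components]
    # Integrate over +/- 8 standard deviations, staying at positive lengths.
    lo = max(step, min(mu - 8 * sd for _, mu, sd in components))
    hi = max(mu + 8 * sd for _, mu, sd in components)
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n):
        l_c = lo + (i + 0.5) * step
        density = sum(p * d.pdf(l_c) for p, d in dists)
        total += alpha * l_c ** beta * density * step
    return total

# Placeholder mixture and conversion parameters (not the fitted values).
components = [(0.5, 25.0, 2.0), (0.3, 35.0, 2.5), (0.2, 45.0, 3.0)]
mean_rwl = predicted_mean_wasp_rwl(0.6, 1.0, components)
print(mean_rwl)
```

Comparing a number like `mean_rwl` with the observed mean in each location is a cheap sanity check before fitting the whole distribution.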
Our objective now is to find the best set of parameters $\theta$ (the relative abundances of the cicada types and the conversion parameters); that is, the set that provides the best fit to the distribution(s) of male wasp wing lengths in the samples obtained in Newberry and St. Johns. Notice that we thus end up with two separate estimation problems. This is important. The wasps in St. Johns are larger, so they census a part of the cicada population that the wasps in Newberry can't touch.
So let's think about one of those populations of wasp wing lengths, given by density $f_w$. Here is one strategy for choosing $\theta$, based on minimizing the difference between two functions over a range of wing length values.
We have an empirical cumulative distribution (let's call it $F_e$) of male wasp wing lengths, which is a (step) function. It's not critical that this empirical distribution be differentiable.
Now, we need the cumulative distribution of the modeled male wing lengths. Let's call the modeled cumulative $F_m$: then
$$F_m(l;\theta) = \int_0^{l} f_w(x;\theta)\, dx.$$
We then find the parameters that minimize the integral
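One way to carry out a fit of this kind is sketched below. It makes several assumptions of our own: the integrand is the squared difference between the empirical and modeled cumulative distributions, the length-to-length conversion (`alpha`, `beta`) is held fixed, only the three proportions are searched, and the search is a coarse grid over the simplex and not a proper optimizer. The sample is synthetic, built from known proportions so we can see whether they are recovered.

```python
from bisect import bisect_right
from statistics import NormalDist

def model_cdf(l_w, alpha, beta, components):
    """CDF of modeled wasp RWL when l_w = alpha * l_c**beta (alpha, beta > 0)."""
    if l_w <= 0:
        return 0.0  # negligible normal mass below zero is ignored
    l_c = (l_w / alpha) ** (1.0 / beta)  # invert the length-to-length model
    return sum(p * NormalDist(mu, sd).cdf(l_c) for p, mu, sd in components)

def cdf_mismatch(sample, alpha, beta, components, lo, hi, step=0.1):
    """Midpoint-rule integral of (F_e - F_m)**2 over [lo, hi]."""
    data = sorted(sample)
    n_steps = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n_steps):
        x = lo + (i + 0.5) * step
        f_e = bisect_right(data, x) / len(data)  # empirical step CDF
        f_m = model_cdf(x, alpha, beta, components)
        total += (f_e - f_m) ** 2 * step
    return total

def fit_proportions(sample, alpha, beta, means_sds, lo, hi, grid=10):
    """Grid search over (p1, p2, p3) on the simplex, in steps of 1/grid."""
    best_score, best_p = float("inf"), None
    for i in range(grid + 1):
        for j in range(grid + 1 - i):
            p = (i / grid, j / grid, (grid - i - j) / grid)
            comps = [(pk, mu, sd) for pk, (mu, sd) in zip(p, means_sds)]
            score = cdf_mismatch(sample, alpha, beta, comps, lo, hi)
            if score < best_score:
                best_score, best_p = score, p
    return best_p, best_score

# Synthetic check with invented numbers: build a wasp sample from known
# proportions (evenly spaced quantiles of each component), then refit.
MEANS_SDS = [(25.0, 2.0), (35.0, 2.5), (45.0, 3.0)]
ALPHA, BETA = 0.6, 1.0
TRUE_P = (0.5, 0.3, 0.2)
sample = []
for p, (mu, sd) in zip(TRUE_P, MEANS_SDS):
    n_k = int(round(1000 * p))
    dist = NormalDist(mu, sd)
    sample += [ALPHA * dist.inv_cdf((k + 0.5) / n_k) ** BETA for k in range(n_k)]

p_hat, score = fit_proportions(sample, ALPHA, BETA, MEANS_SDS, lo=5.0, hi=40.0)
print(p_hat, score)
```

Working with cumulative distributions is convenient here because the empirical one is only a step function, and the integral is still well defined without any smoothing of the data.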
Results suggest that the neocicadas are more prevalent than we would have expected, and that the tib cicadas less so:
Example Application: St. Johns
In the following two applications, we relax the requirement that the wasps sample the cicada population randomly; instead, we assume that they sample according to their wing length, as indicated by the kernel-smoothed graph obtained by Katie. Katie used smoothing techniques to obtain her graph. Because we need a function, which we can consult to determine how many wasps are taking small versus medium versus large cicadas, we modeled this using cumulative distribution functions (cdfs) from a normal distribution. The results are as follows: