A New View of Statistics
ON THE FLY FOR DIFFERENCES BETWEEN FREQUENCIES
You can use sample size on the fly to get the minimum number of subjects, but you don't get quite the same saving as for correlations or means. I've used simulation to see how many subjects you need to give acceptable confidence limits for a wide range of frequency differences. I've found that it's at least 100 subjects, even for very large effects, so that will have to be our starting number.
The other thing we need for sample size on the fly is an acceptably narrow confidence interval for the outcome statistic. It's straightforward if we use the difference in frequencies as the outcome, but it gets really complicated if we use relative risk or the odds ratio. Let me explain with the example of injury in runners and cyclists.
The difference in rates of injury can be expressed either as a difference in the percentage rates (47 - 15 = 32%), or as a relative risk of injury (runners have 47/15 = 3.1 times the risk of cyclists). The acceptable width of the interval for a difference in the percentage rates is a fixed 20%, as I explained earlier. In our example the difference is 32%, so the required publishable confidence limits are 22% to 42%. Expressed as a relative risk, these frequencies correspond to 3.1, with confidence limits 2.1 to 5.1. But suppose the original frequencies were 67% and 35%. The difference in frequencies is still 32%, and the acceptable confidence limits on this difference are still 22% to 42%. But now the corresponding relative risk is 1.9, with confidence limits 1.5 to 2.5. What a mess! The odds ratio misbehaves in the same way for case-control data.
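To see the arithmetic side by side, here is a minimal Python sketch. The function name `difference_and_relative_risk` is my own invention for illustration. It takes two pairs of percent frequencies that both differ by 32 percentage points (47% vs 15%, and 67% vs 35%) and shows that the relative risk is quite different in the two cases.

```python
def difference_and_relative_risk(p_high, p_low):
    """Return (difference in percentage points, relative risk)
    for two frequencies expressed as percents."""
    difference = p_high - p_low
    relative_risk = p_high / p_low
    return difference, relative_risk


# Two pairs of frequencies with the same difference but different relative risks.
for p_high, p_low in [(47, 15), (67, 35)]:
    diff, rr = difference_and_relative_risk(p_high, p_low)
    print(f"{p_high}% vs {p_low}%: difference = {diff}%, relative risk = {rr:.1f}")
```

The same difference of 32% gives a relative risk of about 3.1 in the first case and about 1.9 in the second, which is why a fixed acceptable width works for the difference but not for the ratio.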
So here's the method, based on the confidence interval for the differences in frequencies between the groups, expressed as percents.
All computations in the above procedure are available on the spreadsheet, which includes the case of unequal numbers of subjects in the groups.
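For a rough idea of the numbers involved, here is a minimal Python sketch. It is not the spreadsheet's own calculation. It assumes equal numbers in the two groups, a 95% confidence level (z = 1.96), and the usual normal approximation for the confidence interval of a difference between two independent proportions. The function name `subjects_per_group` and its parameters are hypothetical.

```python
import math


def subjects_per_group(p1, p2, width=20.0, z=1.96):
    """Approximate number of subjects needed in EACH of two equal groups
    so that the confidence interval for the difference p1 - p2 has the
    given full width. Frequencies and width are in percent units.

    Assumption: normal approximation, so
        width = 2 * z * sqrt((p1*q1 + p2*q2) / n),  with q = 100 - p.
    """
    q1 = 100.0 - p1
    q2 = 100.0 - p2
    n = (2.0 * z / width) ** 2 * (p1 * q1 + p2 * q2)
    return math.ceil(n)


# Observed frequencies from the runners and cyclists example.
print(subjects_per_group(47, 15))
# Worst case: both frequencies at 50%.
print(subjects_per_group(50, 50))
```

Under these assumptions the required number depends on the frequencies themselves, which is why you have to recalculate it as the data come in.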
How do you present the final outcome? Obviously you need to show the frequency of the injury or whatever as a percent in the two groups. You should also show the confidence limits for the difference in frequencies (confidence limits = the difference in frequencies ± half the width of the confidence interval, which you will have calculated in the last iteration of the sampling process). That's it, as far as I am concerned, but for a clinical journal you may have to show a relative risk or an odds ratio. If the editor of the journal insists on one or the other of these effect statistics, put it in, and get your stats program to calculate its confidence limits.
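Here is a minimal Python sketch of the "difference ± half the interval" calculation. It assumes a 95% confidence level (z = 1.96) and the normal approximation for the difference between two independent proportions, which may not be exactly what your stats program or the spreadsheet uses. The function name and the counts in the example (47 of 100 runners, 15 of 100 cyclists) are hypothetical.

```python
import math


def difference_confidence_limits(events1, n1, events2, n2, z=1.96):
    """Return (difference, lower limit, upper limit), all in percent,
    for the difference in frequencies between two groups.

    Assumption: normal approximation, half-width = z * standard error.
    """
    p1 = events1 / n1
    p2 = events2 / n2
    diff = p1 - p2
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z * standard_error
    return 100 * diff, 100 * (diff - half_width), 100 * (diff + half_width)


# Hypothetical counts: 47 of 100 runners injured, 15 of 100 cyclists injured.
diff, lower, upper = difference_confidence_limits(47, 100, 15, 100)
print(f"difference = {diff:.0f}%, confidence limits {lower:.0f}% to {upper:.0f}%")
```

With these made-up counts the interval comes out a little wider than the acceptable 20%, so a few more subjects would be needed.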
To describe the outcome of your research in qualitative terms, check where the confidence limits of the frequency difference fall on the scale of magnitudes. Here's a version of it for frequency differences:
For example, if the limits are 22% and 42%, the effect is small-moderate; if they are -5% and 15%, the effect is trivial-small, and so on.
resources=AT=sportsci.org · webmaster=AT=sportsci.org · Sportsci Homepage · Copyright ©1997
Last updated 16 June 97