How to Investigate When the Effective Sample Size Is Greater Than 100%

Comparing groups where a group has been over-recruited

Consider a simple example. Suppose a survey is designed to compare the attitudes of indigenous and non-indigenous Australians, who make up 5% and 95% of the Australian population, respectively. Such a study would generally employ non-proportional stratification, over-recruiting indigenous Australians. For example, the study may be designed so that indigenous Australians make up 50% of a sample of 1,000 (i.e., 500 respondents).

Such a non-proportional sample design is used because we are more likely to find a significant difference when comparing a sample of 500 indigenous Australians with a sample of 500 non-indigenous Australians than when comparing a sample of 50 indigenous Australians with a sample of 950 non-indigenous Australians.

In such an analysis, the effective sample size will be greater than 100% because, due to the non-proportional sampling, the sampling error is smaller than it would have been under simple random sampling (which would have yielded a sample of about 50 indigenous Australians). Note that this is the intuitively sensible result: it is consistent with the motivation for over-recruiting indigenous Australians in the sample.
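The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, not the calculation Displayr performs: it assumes both groups have the same per-respondent variance, so the variance of the difference between the groups is proportional to 1/n_a + 1/n_b, and it defines the effective sample size as the actual sample size divided by the design effect. The function and variable names are hypothetical.

```python
def variance_factor(n_a: int, n_b: int) -> float:
    """Variance of a difference between two group estimates, expressed in
    units of the (assumed common) per-respondent variance: 1/n_a + 1/n_b."""
    return 1.0 / n_a + 1.0 / n_b


n_total = 1000

# Simple random sampling would yield roughly 50 vs 950 respondents.
srs_variance = variance_factor(50, 950)

# The non-proportional design recruits 500 vs 500 respondents.
design_variance = variance_factor(500, 500)

# Design effect: variance under the actual design relative to simple random sampling.
design_effect = design_variance / srs_variance

# Effective sample size: actual sample size divided by the design effect.
effective_n = n_total / design_effect
effective_pct = 100.0 * effective_n / n_total

print(f"Design effect: {design_effect:.2f}")
print(f"Effective sample size: {effective_n:.0f} ({effective_pct:.0f}% of actual)")
```

Under these assumptions the design effect for the comparison is well below 1, so the effective sample size is several times the actual sample size. The exact percentage reported by any given program will depend on the variance estimator it uses.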

Where strata have different variances

Where different strata of a sample have different variances for a statistic that is being estimated, it is optimal to over-recruit respondents in the strata with the higher variances (this is referred to as Neyman allocation in the statistics literature). Thus, where a sample over-recruits the groups with higher variances, this can lead to an effective sample size of more than 100%.
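As a rough illustration, the following Python sketch allocates a sample across two strata in proportion to each stratum's population share multiplied by its standard deviation, which is the usual statement of Neyman allocation, and compares the resulting variance of the estimated mean with that of a proportional allocation. The population shares, standard deviations, and function names are made up for the example, and finite population corrections are ignored.

```python
def neyman_allocation(shares, std_devs, n):
    """Allocate n respondents to strata in proportion to share * std_dev."""
    products = [w * s for w, s in zip(shares, std_devs)]
    total = sum(products)
    return [n * p / total for p in products]


def stratified_variance(shares, std_devs, sizes):
    """Variance of the stratified mean: sum of (W_h * S_h)^2 / n_h
    (finite population corrections ignored)."""
    return sum((w * s) ** 2 / n_h for w, s, n_h in zip(shares, std_devs, sizes))


# Hypothetical population: two equally sized strata, one far more variable.
shares = [0.5, 0.5]
std_devs = [10.0, 30.0]
n = 1000

proportional_sizes = [n * w for w in shares]           # 500 and 500
neyman_sizes = neyman_allocation(shares, std_devs, n)  # 250 and 750

var_proportional = stratified_variance(shares, std_devs, proportional_sizes)
var_neyman = stratified_variance(shares, std_devs, neyman_sizes)

# Effective sample size relative to the proportional design.
effective_pct = 100.0 * var_proportional / var_neyman

print(f"Neyman allocation: {neyman_sizes}")
print(f"Effective sample size: {effective_pct:.0f}% of actual")
```

In this made-up case, over-recruiting the more variable stratum reduces the variance of the estimate by a fifth, which corresponds to an effective sample size of about 125% of the actual sample size relative to the proportional design.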

Comparison to other programs

Other software designed for taking sampling designs into account will also produce effective sample sizes that exceed 100% of the actual sample size (e.g., IBM's SPSS Complex Samples and the survey package for R).

Many of the programs used within the market research industry for analyzing surveys, such as IBM's Survey Reporter, instead use Weight Calibration with Kish's Effective Sample Size Formula. Kish's formula is also used in many analyses in Displayr (however, all crosstabs involving means and proportions in Displayr use Taylor Series Linearization). You can use Kish's Effective Sample Size Formula in Displayr by changing the Statistical Assumptions setting of Weights and significance to Kish's approximation.
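For comparison, Kish's formula computes the effective sample size from the weights alone, as the squared sum of the weights divided by the sum of the squared weights. Because it ignores what is being estimated, it can never exceed the actual sample size. The Python sketch below applies it to the indigenous and non-indigenous example above, assuming weights of 0.1 (5% / 50%) and 1.9 (95% / 50%) for the two groups; the function name is hypothetical and this is not Displayr's implementation.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_of_squares = sum(w * w for w in weights)
    return total * total / total_of_squares


# Assumed weights that restore the population proportions:
# 500 indigenous respondents at 0.05 / 0.50 = 0.1,
# 500 non-indigenous respondents at 0.95 / 0.50 = 1.9.
weights = [0.1] * 500 + [1.9] * 500

kish_n = kish_effective_sample_size(weights)
kish_pct = 100.0 * kish_n / len(weights)

print(f"Kish effective sample size: {kish_n:.0f} ({kish_pct:.0f}% of actual)")
```

With these weights, Kish's approximation gives an effective sample size of a little over half the actual sample, whereas a method that takes the sampling design and the comparison being made into account can report a figure above 100% for the same data.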

How to Add Rows to a Table to Display Effective Column Sample Size

How To Set The Weighted Sample Size
