BACK TO BASICS
Settling on a suitable sample size for your project is half the battle
by Kim Niles
Selecting the correct sample size is often the most difficult aspect of any project. Rules of thumb are important because they promote discussion that facilitates the selection of an optimal sample size.
The three key components of sample-size selection are:
- How accurate or confident you need to be: This is based on the alpha/beta error you select. If we need to be only 50% confident (flip a coin: α = 0.5), then the next two components don’t matter. A sample of one would suffice.
- How precise you need to be: Precision relates to the ability to understand the variation in the data. If the data can’t vary at all, then the other two components don’t matter. Again, a sample of one would suffice.
- The differences you are trying to measure: Larger differences allow for easier decisions. If the differences you are trying to measure are enormous—for example, red versus blue or 0.1 versus 100,000—then the other two components don’t matter. A sample of one would suffice.
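To make the interplay of the three components concrete, here is a minimal sketch using the common normal-approximation formula for comparing two means, n = 2((z_alpha/2 + z_beta) * sigma / delta)^2 per group. This formula is not from the article; it is a textbook approximation chosen for illustration, and the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(alpha, beta, sigma, delta):
    """Approximate samples per group for a two-sided, two-sample
    comparison of means (normal approximation, equal variances).

    alpha, beta -- accepted type I and type II error rates (confidence)
    sigma       -- assumed standard deviation of the data (precision)
    delta       -- smallest difference worth detecting (difference)
    """
    if delta == 0:
        raise ValueError("delta must be nonzero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    # At least one observation is always needed.
    return max(1, math.ceil(n))


# Difference equal to one standard deviation, alpha = 0.05, beta = 0.20.
print(sample_size_per_group(0.05, 0.20, sigma=1.0, delta=1.0))
# Data that cannot vary at all: one sample is enough.
print(sample_size_per_group(0.05, 0.20, sigma=0.0, delta=1.0))
# Enormous difference relative to the variation: one sample is enough.
print(sample_size_per_group(0.05, 0.20, sigma=1.0, delta=100000.0))
```

Under these assumptions, shrinking the variation or enlarging the difference drives the required sample toward one, which matches the limiting cases described in the list above.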
Of course, assumptions always apply, and many other considerations affect sample-size selection.
These considerations include: sampling cost, purpose, approach, method, capturing a reasonable amount of data variation, the type of model being developed, the underlying data distribution—such as normal or exponential—and the type of statistical tools being used.
Rules of thumb
I developed and named all but the last rule of thumb in the following list:
- Trial-and-error sampling (≥ three samples): Pick three pieces from each sample when comparing new and old data to be approximately 80% confident in the results.
- Design of experiments sampling (≥ eight samples): In most manufacturing situations (for example, turning knobs on machines), the differences to be tested are typically large (reasonable extremes), test costs are relatively high and the desired statistical confidence is low. A Taguchi L8 or 2^3 full factorial design will likely produce high-confidence results using only eight or more samples.
- Central limit theorem (CLT) sampling (≥ 30 samples): Picking samples in groups of 30 or more takes advantage of the CLT, so the distribution of the means of those groups will be approximately normal. Note that a single sample of 30 doesn’t use the CLT.
- Reliability sampling (60 samples): Per Beta tables, 60 samples without any failures equate to 95% confidence in 95% reliability.
- Shewhart sampling (≥ 100 samples): When developing statistical process control, Shewhart recommended that 25 sets of four samples be taken as a rule of thumb to assess process stability.
- Human survey sampling (≥ 500 samples): To capture a reasonable amount of human variation—such as race, religion, location, sex and age—rules of thumb vary between 500 and 2,000 samples.
- STRUT sampling (various): Calculated using a formula outlined in "STRUTS: Statistical Rules of Thumb." [1]
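The reliability sampling rule in the list above can be checked with the zero-failure (success-run) relationship 1 - R^n ≥ C, where R is the reliability and C the confidence. The sketch below assumes that binomial model rather than the Beta tables the rule cites, and the function names are hypothetical. Under this assumption the minimum works out to 59 samples, so 60 is a slightly conservative round number.

```python
import math


def zero_failure_sample_size(confidence, reliability):
    """Smallest n such that n tests with zero failures demonstrate
    `reliability` at `confidence`, i.e. 1 - reliability**n >= confidence."""
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be in (0, 1)")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


def demonstrated_confidence(n, reliability):
    """Confidence demonstrated by n tests with zero failures."""
    return 1 - reliability ** n


print(zero_failure_sample_size(0.95, 0.95))          # minimum n for 95%/95%
print(round(demonstrated_confidence(60, 0.95), 3))   # confidence reached at n = 60
```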
Again, use these rules for planning and discussion purposes only. They might not apply to your situation. There are many ways to calculate sample sizes accurately. Statistics books have parametric formulas and tables for just about any distribution type, or you can do an internet search for "sample size calculator." Additionally, you can use the rule of threes outlined in Tony Gojanovic’s QP article. [2]
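As an illustration, and assuming the rule mentioned above is the widely used statistical "rule of three" (the cited article's exact wording is not reproduced here), zero defects in n samples give an approximate 95% upper bound of 3/n on the true defect rate. The sketch compares that shortcut with the exact zero-defect binomial bound; both function names are hypothetical.

```python
def rule_of_three_upper_bound(n):
    """Approximate 95% upper bound on the defect rate after observing
    zero defects in n samples (assumed rule-of-three shortcut)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 3.0 / n


def exact_upper_bound(n, confidence=0.95):
    """Exact binomial upper bound for zero defects in n samples:
    the rate p that solves (1 - p)**n = 1 - confidence."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


for n in (30, 60, 300):
    print(n, round(rule_of_three_upper_bound(n), 4), round(exact_upper_bound(n), 4))
```

The shortcut sits slightly above the exact bound, so it errs on the conservative side for planning purposes.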
1. Gerald van Belle, "STRUTS: Statistical Rules of Thumb," Departments of Environmental Health and Biostatistics, University of Washington, 1998.
2. Tony Gojanovic, "Zero Defect Sampling," Quality Progress, November 2007.
Kim Niles is an adjunct instructor through San Diego State University and the University of California in San Diego (UCSD), as well as a quality and statistical consultant in San Diego. He earned his master’s degree in quality science from California State University in Dominguez Hills. Niles is an ASQ-certified quality engineer and Six Sigma Black Belt, as well as a UCSD-certified Master Black Belt. He is a fellow of ASQ.