Care and Feeding of Checkweighers
Statistics to help guard against release of underweight packages
by Lynne B. Hare and Keith Eberhardt
Many years ago, during a tour of a production facility where packages of consumer goods were being filled, a statistician colleague and I (Lynne Hare) stood transfixed by an end-of-line checkweigher. There was a red light on above a label that read, "Needs rezero." That didn’t seem to bother anyone.
We watched as package after package progressed over the checkweigher’s scale, some of them being kicked off into a bin by a pneumatic device. Then we weighed a large number of packages from the bin and found almost all of them were above the label-declared weight. Only a very few were actually below the label declaration by more than the law would allow.1
When no one else was around, my colleague announced he could do an excellent imitation of the checkweigher. Moving his head rapidly from side to side, he blinked his eyes repeatedly and said, "What was that? What was that? What was that?"
We laughed. It’s easy to poke fun but more difficult to come up with something that is constructive—yet sufficiently simple—to use on a busy production floor.
If you start with questions that come immediately to mind, you might want to know what the checkweigher is targeted to reject, what the size of its zone of uncertainty is, and, given those things, what the chances are that a good package would be rejected and a bad package would escape detection.
Quantifying the uncertainty
Let’s start with assessing the size of the zone of uncertainty, or the gray zone, as it is often called. It is common practice for operators or quality control specialists to run a package of known weight at machine speed several times over the checkweigher scale and record the resulting readings. A zone of target plus or minus 3 standard deviations of the resulting data might be used to provide an estimate of the size of the gray zone.
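The common-practice estimate just described can be sketched in a few lines of Python. The readings below are hypothetical, invented only to show the arithmetic; the function name and the 85.0 g known weight are likewise assumptions, not values from the study.

```python
from statistics import mean, stdev

def gray_zone(readings, target, k=3.0):
    """Return (lower, upper) limits of target +/- k sample standard deviations."""
    s = stdev(readings)  # sample standard deviation of the repeat readings
    return target - k * s, target + k * s

# Hypothetical repeat readings (g) of one package of known weight 85.0 g,
# run several times over the checkweigher at machine speed.
readings = [85.1, 84.7, 85.3, 84.9, 85.4, 84.6, 85.0, 85.2]

low, high = gray_zone(readings, target=85.0)
print(f"mean reading = {mean(readings):.3f} g, s = {stdev(readings):.3f} g")
print(f"estimated gray zone: {low:.2f} g to {high:.2f} g")
```

Because it rests on a single known weight, this estimate says nothing about whether the spread or the accuracy holds at other weights.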
That strategy gets you only part way there. We should be a bit more ambitious to ensure that the checkweigher’s acuity is uniform over the weight range of expected performance and that the checkweigher reads accurately over that same range.
Here’s a strategy that might help provide such assurances. Choose a range of weights likely to be experienced by the checkweigher. For example, one checkweigher study was designed to evaluate performance over weights ranging from 83.2 to 86.5 grams (g). The weight range in this case would be 3.3 g.
Next, divide the range into approximately 10 or more equal increments. Here, we use steps of 0.3 g, which gives 11 increments and therefore 12 target weights, starting at 83.2 g and ending at 86.5 g. Then, artificially create 12 packages that weigh as close to these exact amounts as possible, checking each one on a static, calibrated analytical balance. Run these packages over the checkweigher scale in random order, five times each, recording the stated weights and capturing the packages so you don’t send tidy packages of birdshot, or whatever you have used to create the packages, out to your customers. Table 1 lists the data resulting from this exercise.
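One way to lay out the study is sketched below in Python. It builds the 12 target weights and a randomized run order of 60 passes. Full randomization across all 60 passes and the fixed seed are assumptions made for illustration; the variable names are hypothetical.

```python
import random

start, step, n_levels, repeats = 83.2, 0.3, 12, 5

# Target weights: 83.2, 83.5, ..., 86.5 g (rounded to suppress float noise).
weights = [round(start + i * step, 1) for i in range(n_levels)]

# Each package crosses the scale five times; shuffle all 60 passes.
run_order = [w for w in weights for _ in range(repeats)]
random.Random(2024).shuffle(run_order)  # seed chosen only for repeatability

print(weights)
print(run_order[:10])  # first ten passes of the randomized sequence
```

Writing the run order down before the study starts makes it easier to match each checkweigher reading back to the package that produced it.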
Next, of course, follow the first rule of data analysis: "Always, always, always, without exception, plot the data and look at the plot." Figure 1 is a scatter diagram of the raw data. At this point, you would make a rough visual assessment of the plot to determine whether the variation among checkweigher-reported weights is the same at all the actual weights. That’s the "acuity being the same over the weight range" part mentioned earlier. Then you would want to look at the plot to get some assurance the checkweigher-reported weights are roughly equal to the actual weights on average. That’s the accuracy part mentioned earlier.
Of course, there are formal tests for these phenomena. To test for uniformity of variation among the five checkweigher readings of each of the 12 packages, we could use Bartlett’s test. For these data, the test statistic is 9.92 and its probability is 0.537, clearly not small enough to persuade us that the standard deviations are different.
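For readers who want to see what sits behind that test statistic, here is a minimal Python sketch of Bartlett’s statistic using only the standard library. The three groups of readings are hypothetical and are not the Table 1 data. The statistic is referred to a chi-square distribution with k − 1 degrees of freedom (11 for the 12 packages here); the standard library has no chi-square function, so the p-value is left to a table or a statistics package.

```python
from math import log
from statistics import variance

def bartlett_statistic(groups):
    """Bartlett's test statistic for equality of variances across groups."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]          # n_i - 1 for each group
    variances = [variance(g) for g in groups]   # sample variances s_i^2
    df_total = sum(dfs)                         # N - k
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * log(pooled) - sum(
        d * log(v) for d, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction

# Hypothetical checkweigher readings (g) for three of the packages.
groups = [
    [83.1, 83.6, 82.9, 83.4, 83.0],
    [84.8, 84.2, 84.9, 84.1, 84.6],
    [86.3, 86.9, 86.4, 86.1, 86.8],
]
print(f"Bartlett statistic = {bartlett_statistic(groups):.3f}")
```

If SciPy is available, `scipy.stats.bartlett` returns both the statistic and its p-value in one call.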
To ensure the checkweigher-reported values are the same within chance variation as the actual weights, we could look at the slope of the regression of those weights on the actual weights. For these data, the calculated slope is 0.978 and its standard error is 0.057. Because the calculated slope is less than one standard error away from the theoretical value of 1, we would say the slope does not differ significantly from that value. More formally, we could test the difference between the calculated slope and the theoretical value in light of the variation as a t-statistic:
t = (b1 − 1.0) / sb1 = (0.978 − 1.0) / 0.057 = −0.386
where b1 is the estimated slope and sb1 is its standard error. The conclusion is the same: We cannot say the slope is different from 1.
Ideally, the regression line should have a zero intercept, and the formal test for that looks like the one for the slope. It turns out the calculated intercept for these data (1.914 with a standard error of 4.821) does not differ significantly from zero, but it should also be noted that our data are very far, in terms of multiples of variation, from weights of zero. So this test is a wild extrapolation (which doth taste of wormwood to statisticians).
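The two tests share one piece of arithmetic, sketched here in Python with the estimates and standard errors quoted above. The function name is hypothetical, and the intercept’s t-value is computed from the quoted figures rather than taken from the text.

```python
def t_statistic(estimate, hypothesized, std_error):
    """Distance of an estimate from its hypothesized value, in standard errors."""
    return (estimate - hypothesized) / std_error

t_slope = t_statistic(0.978, 1.0, 0.057)      # slope tested against 1
t_intercept = t_statistic(1.914, 0.0, 4.821)  # intercept tested against 0

print(f"slope t     = {t_slope:.3f}")      # -0.386
print(f"intercept t = {t_intercept:.3f}")  # 0.397
```

Both values are well under one standard error from their targets, which is why neither test gives any reason to doubt the checkweigher’s accuracy.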
Given that our checkweigher appears healthy from the perspectives of uniform acuity and a high degree of accuracy, we come back to the original objective of assessing the size of its zone of uncertainty, or gray zone. Pool the within-actual-weight standard deviations. A quick way to do that is to carry out an analysis of variance on the data using the actual weight as the source of variation. Take the square root of the residual mean square, and call it sg. In this example, you should get sg = 0.442. It has 48 degrees of freedom, more than enough to keep you out of too much hot water.
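The pooling step can also be done directly, without an analysis of variance routine, because the residual mean square is just the degrees-of-freedom-weighted average of the within-package variances. The Python sketch below uses hypothetical readings built from a fixed set of offsets; it is meant to show the calculation and the 48 degrees of freedom, not to reproduce sg = 0.442.

```python
from math import sqrt
from statistics import stdev, variance

def pooled_sd(groups):
    """Return (pooled standard deviation, residual degrees of freedom)."""
    dfs = [len(g) - 1 for g in groups]
    df_resid = sum(dfs)  # N - k
    ss_resid = sum(d * variance(g) for d, g in zip(dfs, groups))
    return sqrt(ss_resid / df_resid), df_resid

# Hypothetical data: 12 actual weights, 5 readings each, same offsets throughout.
actual = [round(83.2 + 0.3 * i, 1) for i in range(12)]
offsets = [-0.4, -0.1, 0.0, 0.2, 0.3]
groups = [[w + d for d in offsets] for w in actual]

s_g, df = pooled_sd(groups)
print(f"s_g = {s_g:.3f} g with {df} degrees of freedom")
```

With 12 packages and five readings each, the residual degrees of freedom are 60 − 12 = 48, matching the figure in the text.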
Protecting against shipment of short-weight packages
What do you do with that? Well, suppose you wanted your checkweigher to prevent erratic weights from getting to the customer and getting your company in trouble. Further, suppose your label declaration is L grams. Look up the maximum allowable variation (MAV) corresponding to that label weight in NIST Handbook 133.2 If you are willing to take only an α% chance that a package weighing the label weight minus the MAV would get to the customer, you would set the checkweigher reject point at Xc = L – MAV + tα(df) × sg. Here, tα(df) is the value of Student’s t distribution with df degrees of freedom that marks off α% of the distribution in the upper tail, and sg is the pooled gray-zone standard deviation.
In our example, suppose the label declaration is L = 82 g. NIST Handbook 133 shows a MAV of 7.2 g.3 If we want only a 1% chance that a package weighing L – MAV = 74.8 grams will get out to the customer, we should set the checkweigher at:
Xc = 82.0 – 7.2 + (2.407)(0.442)
= 75.864 grams
Note that lighter packages will get to the customer even less often.
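The reject-point calculation is easy to script so that it can be rerun whenever the gray zone is re-estimated. This Python sketch takes the t-value as an input, since the standard library has no t-distribution quantile; 2.407 is the one-sided 1% value for 48 degrees of freedom used above. The function and parameter names are hypothetical.

```python
def reject_point(label, mav, t_value, s_g):
    """Checkweigher reject point: label - MAV + t * s_g (all in grams)."""
    return label - mav + t_value * s_g

x_c = reject_point(label=82.0, mav=7.2, t_value=2.407, s_g=0.442)
print(f"reject point = {x_c:.3f} g")  # 75.864
```

A larger sg pushes the reject point further above the label weight minus the MAV, which is one reason a poorly maintained checkweigher costs production volume.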
Protecting against rejecting good packages
Your local, friendly plant manager is concerned about getting the product out the door. Production volume is one measure for which plant managers are rewarded. The manager’s question will be, "How many good packages are being zapped by the checkweighers into the reject bin? They have to be reweighed manually, and that slows us down."
That question is not answered as easily as the question concerning the release of short-weight packages. To find the solution, you need to know the size of the checkweigher’s gray zone, the checkweigher’s setting and the MAV. But you also need to know the mean and the standard deviation of the package weights moving over the checkweighers. Given that information, you can use multivariate statistics to learn the probability that a package will be rejected, given that its weight is acceptable, that is, at or above the label weight minus the MAV.
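As a rough illustration of that calculation, the Python sketch below makes three assumptions that are not stated in the article: true package weights are normally distributed, the checkweigher’s reading equals the true weight plus independent normal error with standard deviation sg, and a package is rejected when its reading falls below the reject point. The process mean of 82.5 g and standard deviation of 3.0 g are hypothetical. Under those assumptions, the conditional probability is a one-dimensional integral, evaluated here with the trapezoid rule.

```python
from statistics import NormalDist

def prob_reject_given_good(mu, sigma, s_g, x_c, lower_limit, steps=4000):
    """P(reading < x_c | true weight >= lower_limit) under a normal model."""
    true_wt = NormalDist(mu, sigma)   # assumed distribution of true weights
    std_normal = NormalDist()
    upper = mu + 8 * sigma            # far enough out that the tail is negligible
    h = (upper - lower_limit) / steps
    total = 0.0
    for i in range(steps + 1):
        w = lower_limit + i * h
        # density of true weight w times chance its reading falls below x_c
        f = true_wt.pdf(w) * std_normal.cdf((x_c - w) / s_g)
        total += f / 2 if i in (0, steps) else f
    joint = total * h                 # P(good and rejected)
    return joint / (1 - true_wt.cdf(lower_limit))

label, mav, t_value = 82.0, 7.2, 2.407
mu, sigma = 82.5, 3.0                 # hypothetical filling process

for s_g in (0.442, 1.0):
    x_c = label - mav + t_value * s_g
    p = prob_reject_given_good(mu, sigma, s_g, x_c, label - mav)
    print(f"s_g = {s_g:.3f} g, reject point = {x_c:.3f} g, P(reject | good) = {p:.5f}")
```

Running the loop with a larger sg shows the false-reject rate climbing, both because the readings are noisier and because the reject point itself moves up.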
If you are not current in your multivariate statistics, you can rest assured that the larger the checkweigher’s gray zone, the higher the percentage of good packages that will be incorrectly zapped. What do you do about that? Remember the red light above the label that says "Needs rezero"? Take it seriously and follow the manufacturer’s instructions to rezero the checkweigher.
Or you can just stand there and say, "What was that?"
- Tom Coleman, Linda Crown and Kathryn M. Dresser, eds., NIST Handbook 133, fourth edition, National Institute of Standards and Technology, 2005, http://ts.nist.gov/weightsandmeasures/h1334-05.cfm.
- Ibid, tables 2-5.
Lynne B. Hare is a statistical consultant. He holds a doctorate in statistics from Rutgers University, is past chairman of the ASQ Statistics Division and is a fellow of ASQ and the American Statistical Association.
Keith R. Eberhardt is a principal scientist at Kraft Foods Research in East Hanover, NJ. He holds a doctorate in statistics from Johns Hopkins University and is a fellow of the American Statistical Association.