Findings on the Statistical Relevance of X-Clubs Results

They're pretty damn'd good!

The NZ-Wide Pairs held on Friday Nov-8 were a fantastic opportunity to analyse the statistical relevance of X-Clubs scoring. Eighteen clubs sent in results, ten of them using NZ Scorer. We got to cross-add 301 pairs and provide a sneak preview of likely placings and percentages very promptly after the files were received. The eagerly-awaited full field results for 992 pairs were published the following Thursday.

In order to do some comparisons we attempted to fish out our 301 pairs from the full field of 992. In doing so we embarked on a scaled-up version of what is often a puzzling situation at club level: X-Clubs was now in much the same relation to the NZ-Wide Pairs as any club is to X-Clubs. We were not entirely successful in getting all 301 matches, due to a most puzzling behaviour of Excel (or one's lack of understanding thereof). Our final file is attached below should anyone wish to check the following observations. But with 257 of them we had sufficient for our purposes. If you download the file, please take note of our predicted placings shown in the orange columns. The NZW placings were normalised by sorting them into descending order, assigning the numbers 1 to 257 to each line, then re-sorting into club order. These normalised rankings of our 257 pairs matched their final positions in the overall result very closely.
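That normalisation step can be sketched in Python. This is a minimal sketch, not the actual spreadsheet procedure; the function name and the example placings are invented for illustration:

```python
def normalise_placings(nzw_placings):
    """Convert the NZW placings of the matched pairs (which have gaps
    left by the unmatched pairs) into a dense ranking 1..n, keeping
    each pair in its original position in the list."""
    # Indices of the list, ordered by NZW placing
    order = sorted(range(len(nzw_placings)), key=lambda i: nzw_placings[i])
    ranks = [0] * len(nzw_placings)
    for new_rank, i in enumerate(order, start=1):
        ranks[i] = new_rank
    return ranks

# e.g. four matched pairs that finished 3rd, 17th, 5th and 41st overall
print(normalise_placings([3, 17, 5, 41]))  # → [1, 3, 2, 4]
```

The result is the same 1-to-n ranking you get by sorting, numbering, and re-sorting in Excel, so the matched pairs can be compared position-for-position against their X-Clubs placings.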

We have been keen to find out how X-Clubs scores rate when stacked up against a truly huge field - here 992 pairs. Statisticians talk of three "levels of confidence" delineated by the mystical "standard deviation". The first level is where 68% of the data can be expected to fall within one standard deviation of the mean, the second concerns the limits within which 95% of the data will sit (two standard deviations), and we can be confident that almost all the data will fall within three standard deviations of the mean. Rather than calculate these figures, we have merely observed the limits within which 68% and 95% of the X-Clubs variances from the NZW-defined result - or "true" result, as we refer to it - actually fell.
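The observed-limits approach can be sketched as follows. This assumes the variances have already been extracted as a list of signed percentage differences; the sample data below is made up purely for illustration:

```python
import math

def observed_limit(variances, coverage):
    """Return the smallest symmetric band +/-x about zero that contains
    the given fraction (e.g. 0.68 or 0.95) of the variances."""
    abs_sorted = sorted(abs(v) for v in variances)
    k = math.ceil(coverage * len(abs_sorted))  # how many scores must fit inside
    return abs_sorted[k - 1]

# Illustrative variances (X-Clubs score minus NZW score, in %)
sample = [-0.10, 0.25, -0.30, 0.45, 0.60, -0.15, 0.05, -0.50, 0.20, -0.40]
print(observed_limit(sample, 0.68))  # → 0.4
print(observed_limit(sample, 0.95))  # → 0.6
```

Run over the 257 real variances, the 0.68 and 0.95 limits are the figures quoted in the paragraphs that follow.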

We wish to point out that even the NZW results would not be entirely spot-on when compared to those calculated should 10,000 pairs have played the boards. We simply accept that they are closer to the mark than X-Clubs results. Our purpose is to measure how closely our X-Clubs results approximate the NZW results.

For a sample size of 257 pairs, 68% of X-Club scores (175) fell between +0.33 and -0.33 either side of bang-on. 95% or almost all (244) scores fell between +/- 0.66, and there were just 13 scores outside that range. That is: when there is a huge turnout of 250-ish pairs, the X-Club scores for 70% of pairs should be within only one third of one percent either side of the scores players would get if there were 1000 pairs playing. All but a dozen scores should be within two thirds of one percent either side of their "true" score.

Now we're not quite up to 257 pairs in our daily X-Club sessions yet. We have topped 180 pairs on some Friday mornings. So here's an analysis of the first 156 pairs to come in.

For a sample size of 156 pairs, 68% of X-Club scores (106) fell between +0.47 and -0.47 either side of bang-on. 95% or almost all (148) scores fell between +/- 0.82, and there were just 8 scores outside that range. That is: when you play Wednesday evenings or Friday daytimes with 160 other pairs, the X-Club scores for 70% of the field should be within only one half of one percent either side of the scores you would get if there were 1000 pairs playing. All but half a dozen scores should be within 0.8% either side of their "true" score.

We then reprocessed the scores of 49 pairs chosen from 4 random clubs. This is equivalent to 24 tables - maybe a good-sized tournament. The results were pretty bizarre by our standards. Only one pair scored very close to what it earned in the NZ-Wide Pairs; on either side of that pair the variances started at +/-1% and fanned outwards to +/-5%. 70% of the scores were within 3.8% of what they got in the NZWP, and 95% were within 4.5% either side of their NZWP scores. There were 4 scores between 4.5% and 5% away from what they earned when scored across 992 pairs.

The results of our analysis so far were bearing out our theory - the more scores you get in, the more realistic the scoring across the field becomes. So far we have shown that, in round figures, 250 pairs would have a standard deviation of 0.3%, 150 pairs would produce a standard deviation of 0.5%, and 50 pairs would give something like 3.8%. We thought we should try to see what sort of spread you could expect for 100 pairs. We found:

For a sample size of 94 pairs, 68% of X-Club scores (64) fell between +0.86 and -0.86 either side of bang-on. 95% or almost all (89) scores fell between +/- 1.4, and there were just 8 scores outside that range. That is: when you play in a slot with 100 other pairs, the X-Club scores for 70% of the field should be within 0.86% either side of the scores you would get if there were 1000 pairs playing. All but eight of the 94 scores should be within 1.4% either side of their "true" score.

Our conclusion is that somewhere around the 120-pair mark the scores you're getting in X-Clubs would not change appreciably if you were playing those boards in any field larger than those 120 pairs.

We think it's pretty good that our numbers are getting up over the 120 pair mark in half the X-Club time slots played. Those scores are stable within an acceptably small margin of error. With more than 120 pairs playing, the X-Club scores converge even closer to their "true scores".

Attachment: NZW Pairs Comp 257.xls (110.5 KB)