Inter-Egg Correlation

An alternative to the prediction-based analysis of the EGG data is based on the reasonable assumption that if an effect results in deviations at a specified time, there should be some non-zero level of correlation between the eggs at that time. We examine first the inter-correlations of the actual meanshifts, and then, the inter-correlations of the Chisquare values used in the primary event-based analyses.

A general procedure for determining whether there is any effect on the Eggs from global events, either unknown or insufficiently newsworthy to provoke a GCP prediction, is to examine the intercorrelations across the eggs. Given that they are independent sources of data, and widely dispersed so that no ordinary local perturbative forces could similiarly affect them, there should be no tendency for the correlation matrix to show anything but chance fluctuations. If instead, we see a tendency for correlation among the meanshifts or the corresponding Chisquare values generated by the individual eggs, this indicates a global source of anomalous effect in accordance with the GCP's general hypothesis. An early attempt was made to explore this possibility looking at a single day's data.

Although this effort appeared promising, the correlational approach needs the power of large amounts of data to mitigate the extremely small hypothesized effect. Doug Mast created a set of scripts and analysis functions to examine all the data over long periods. A "first draft" assessment of all data from 1999 was completed in March, 2000. Here is his description:

Date: Sun, 19 Mar 2000 12:54:14 -0500 (EST)
From: mast@sabine.acs.psu.edu
To: rdnelson 
Subject: a preliminary GCP correlation study

I've completed an initial study of inter-egg correlations, and
I thought I should share with you what I've learned so far.

Basically, the object of the study was to see whether the "signals"
produced by the eggs correlate with one another more or less than
one would expect by chance.  Somewhat arbitrarily, I defined signals 
to be composed of one-second egg results and to have a length of 60 
samples (1 minute).  The method is then to compute Pearson correlation 
coefficients between the signals from all possible egg pairs, over a 
large number of one-minute intervals.  I throw out pairs for which 
either egg had missing data during the current minute.  I keep a tally 
of how many signal pairs were significantly correlated at the 0.1 
level, the 0.01 level, and so on down to the 10^-8 level.  

By the definition of the Pearson correlation coefficient, 1/10 of
the signal pairs should (on average) be correlated above the 0.1 
level, 1/100 should be correlated above the 0.01 level, etc.  One 
can also compute p_values for the number of events observed at each 
significance level (for example, the odds of >= 15 events of 
probability 0.1 occurring in 100 samples is 0.0726). 

The hypothesis tested was whether anomalous numbers of high 
inter-egg correlations occur within the GCP data.  I didn't worry 
much about causes of any such correlations (global consciousness, 
experimenter effect, equipment bugs, sunspots, or whatever), since 
any anomalous number of high correlations would be highly 
interesting to me, regardless of the cause.

Anyway, using some scripts (wget, sed, matlab, mathematica), I did 
this analysis for every minute of every day in 1999.  As a control, 
I did the same analysis for signals offset by pseudorandom intervals 
(for each day, Matlab's pseudorandom number generator was seeded by 
an integer determined from the date).  The control data helps to 
safeguard against wrong conclusions based on math or computer errors, 
and also serves as a calibration check.)  For each day, I archived 
summary data showing the number of significant cross-correlations 
at each level investigated, both for all the egg pairs and for each 
individual egg pair.

Below is a table showing the results for the synchronous
and control runs, for all of 1999.  "N_tries" is the number of 
cross-correlations performed, "N_sig" is the number of observed
correlation events above each significance level, "%_sig" is the
proportion of significant events normalized so that 100 corresponds 
to the expected value, and "p_val" is the (one tailed) probability 
of observing >= "N_sig" events in "N_tries" trials (or, for runs 
with fewer hits than expected, the probability of observing 
<= "N_sig" events).  Any significant occurances of correlations would 
correspond to %_sig > 100 and p_val < 0.05 (or some other small 
number) at a particular significance level.

Synchronized Eggs (Number of Tries = 75828996)

Significance Level 0.10 0.01 0.001 10^-4 10^-5 10^-6 10^-7 10^-8

Number 7586446 759343 75838 7664 768 79 8 0

Percent 100.05 100.14 100.01 101.07 101.28 104.18 105.50 0

p-value 0.087 0.11 0.49 0.18 0.37 0.37 0.49 0.47

Effective Z 1.360 1.227 0.025 0.915 0.332 0.332 0.025 -0.075

Control Batch 1, Randomized Eggs (Number of Tries = 75817363)

Significance Level 0.10 0.01 0.001 10^-4 10^-5 10^-6 10^-7 10^-8

Number 7581793 757050 75889 7590 701 71 2 0

Percent 100.00 99.85 100.09 100.11 92.46 93.65 26.38 0

p-value 0.49 0.097 0.40 0.46 0.019 0.32 0.19 0.47

Effective Z 0.025 -1.299 0.253 0.100 -2.075 -0.468 -2.075 -0.075


These results indicate that any significant inter-egg correlations 
(for the one-minute signals I defined) are about of the frequency 
expected by chance.  However, several features in the data are 
intriguing.  For the synchronous correlations, the number of high 
correlations is more than expected at each significance level between 
10^-7 and 10^-1.  None of these effects have a striking p-value, 
although those for 10^-1 and 10^-2 are close to a liberal definition 
of "low."  

However, the control runs (with signals offset by pseudorandom 
intervals) seem in several cases to show *fewer* high correlations 
than expected by chance.  When the synchronous and control runs are 
examined together, the possibility of some real effect seems greater.  
To gather a little more data, I tried a second control set with a 
different (but still deterministic) random seed.  The results, with 
the same format as the above table, are below.

Control Batch 2, Random Eggs (Number of Tries = 75816521)

Significance Level 0.10 0.01 0.001 10^-4 10^-5 10^-6 10^-7 10^-8

Number 7583367 757208 75336 7515 745 81 7 1

Percent 100.02 99.87 99.37 99.1 98.26 106.84 92.33 131.90

p-value 0.26 0.13 0.040 0.22 0.32 0.25 0.51 0.18

Effective Z 0.643 -1.126 -1.751 -0.772 -0.468 0.675 0.025 0.915


Here again, there is a "significant" (p=0.04) absence of high 
correlations at one level, and no significantly large numbers of 
high correlations.  So this seems to confirm the results of the 
other control run.

Graphing the Meanshift Correlations

A graph of the Effective Z-Score for the three datasets as a function of the decreasing probability levels visualizes the differences. The Z-Scores for the Synchronized set are shown in red, compared with the Stouffer Averaged Z-Scores for the Control data in Black. I have done a preliminary calculation of the combined "bottom line" that needs to be checked for its appropriateness and logic. The algorithm is (sum((ysynch-yctrl)/2^.5))/8^.5 for each of the controls, which yields Z = 2.4387 and 1.50 for the comparison of the synchronized data with Control 1 and Control 2, respectively. The Stouffer combination of these results in Z = 2.785. This does not account for the non-independence of the counts at the different probability levels. There is something like a 10% overlap of .1 with the necessarily included .01, plus 1% with .001, etc. I don't know how to adjust for this properly, but my guess is that the end-effect of the non-independence would be a relatively small reduction in the apparent combined Z. If the reduction were as large as 30%, the difference would remain significant at the 0.05 level. A rough compensation for the overlap can be made by reducing the Z-scores by the amount which will be counted at the next level. The 0.10 count includes 10% from the 0.01 level, etc. This will give approximately Znew = 0.1111111 x Zorig. Using this correction, the composite Z is reduced to 2.4757, which has a 1-tailed probability of 0.0066.

Assuming the logic for this exploration is correct, and the non-independence penalty is modest, as suggested in the rough calculation of an overlap compensation, the result indicates a clear difference between the Synchronized data and the randomly paired Control data, which is readily seen in the following figure.

We should have some concern, however, that the difference calculation above is driven as much by the low counts for controls as by the high counts for the synchronized eggs. If we look at the effective Z-scores for synchronized eggs alone, compared with expectation 0, the result is much weaker. For all eight levels, the composite Z is 1.301 (1.4641 without the 11% correction). If one ignores the last two levels, which have too few data to give a reasonable estimate, the Z is 1.521 (1.711 without the 11% correction). Given the nature of the data, these more modest Z-scores probably are a better indication of the possible inter-egg correlation effect. [RDN]

Doug continues:

It seems like there is some effect here, but any such effect is very 
small, barely visible even after 75 million trials.  Further 
replication and checking are certainly necessary--an obvious second 
test will be to do the same analysis for this year's data.  If there 
is a real effect, the cause is a tough question--artifacts in the 
RNGs may be hard to rule out.  For instance, could the RNG behavior 
be affected by any electrical signals associated with the host 
computer downloading data from the egg once a second?

[ regarding equipment artifacts, Roger Nelson replied: 
Re artifacts, I don't think there is anything likely from
host computer electricals.  The design precludes this, and 
we have been doing both calibrations and acute "influence"
testing of em fields, temperature, vibration, sound, etc., 
on these devices for years without finding any artifactual
effects.  Of course we can't rule out anything we haven't
thought of testing.  There are a couple of arguments against
artifact -- one, the design includes a logical XOR that
guarantees a statistical mean of p=.5.  Secondly, the data
themselves from your analysis show both high and low counts.
I don't have a final answer, but it does not look like
artifact can be responsible for the correlation counts. ]

Finally, here is another view of the data, which is completely
post hoc and poorly justified, but possibly still interesting.  We
could take the number of significant correlations at each level to 
be a binary random variable, with a value of 1 for a high number
of correlations (score > 100 in the above tables) and value of 0 for 
a low number of correlations (score < 100).  We can further make the
dubious assumption that each significance level can be treated as an
independent random variable.  By this criterion, the synchronous runs
give 7 1's out of 8 trials (p=0.03125) and the control runs give 
10 0's out of 16 trials (p=0.22725).  The apparent significance for 
the synchronous runs could very well be a fluke--it will be 
interesting to see if something similar happens for this year's data.

I'd love to hear your thoughts about this little study--you can also
feel free to share this message with any colleagues who might be
interested.  I'll be more than happy to answer any questions about
what I did.  If you're interested in checking or replicating my 
results, I'd be happy to share my codes with you (and others) as well.

Correlations of the Chisquare measure

The predictions for the GCP are for non-directional deviations of the means from expectation, and most of the individual event analyses use a Chisquare test, where Chisquare is composed of the squared, normalized meanshifts. The following tables show the inter-correlations of Chiquare measures in a format similar to that used above. Here is Doug Mast's description:

Date: Fri, 14 Apr 2000 18:30:44 -0400 (EDT)
From: mast@sabine.acs.psu.edu
To: rdnelson 
Subject: more correlation results

Hi Roger,

I finished the first correlation tests for the 1999 "chi-square" egg 
data.  These are analogous to the results of my other little study, 
except for two things: (1) the signals correlated are squared Z-scores 
instead of raw (mean 100) data, and (2) I counted both positive and 
negative correlations.  Here, positive and negative were counted 
separately, while before, only the positive correlations were counted.

Synchronized Eggs (Number of Tries = 75829066)

Significance Level 0.10 0.01 0.001 10^-4 10^-5 10^-6 10^-7 10^-8

Count of r > 0 7289007 622522 86594 21515 7435 2911 1174 528

Count of r < 0 7629967 870816 146608 36388 11435 3858 1397 536

Control Egg Pairing (Number of Tries = 75817454)

Significance Level 0.10 0.01 0.001 10^-4 10^-5 10^-6 10^-7 10^-8

Count of r > 0 7287510 621524 86710 21539 7341 2733 1133 506

Count of r < 0 7635552 872107 147193 36336 11417 4010 1496 561

There doesn't seem to be a clear effect jumping out at me here.  
At the "10^-6" through the "10^-8" levels, it's starting to look 
like a trend, with the synchronous data having more "significant" 
correlations with r > 0 and fewer with r < 0.  

But obviously, as I mentioned before, the "significance levels" no 
longer correspond to the true PDFs of the correlation coefficents.  
If we can determine the PDF of the "chi-square" correlation 
coefficient, then we could determine whether there really is an 
apparent trend or not.

Graphing the Chisquare Correlations

The following graph shows the difference between the counts of significant correlations for synchronized vs control pairs of eggs. There are, as noted, more positive and less negative correlations in the synchronized data than in controls. I am uncertain that the estimate for the standard deviation (used to scale the graph) is statistically correct, and hence do not feel a parametric test of differences is appropriate. A non-parametric Wilcoxon Signed-Rank test gives a two-tailed probability of 0.078 for the excess of positive correlations, 0.039 for the deficit of negative correlations, and p = 0.019 for the difference between the positive and negative correlation counts, providing further evidence that there may be a generalized effect of correlation among the synchronized eggs.
[RDN, 12 Apr 2000: this is a descriptive result that should be crosschecked and independently confirmed.]

This work is new, and there are issues such as the effects of possible non-independence and the computation of appropriate error estimates which need deeper consideration. Several people are involved, and part of the exchange is by email, allowing us to provide access to the discussion.

GCP Home