Event Selection Examples
To set up a formal test, we identify an event that provides a focus for collective attention of large numbers of people around the world. We then fix the start and end times and specify a statistical analysis to be performed on the data.
As explained in Event Selection Procedures, we choose a variety of kinds of events in order to learn what matters, and we have only gradually learned how to set adequate parameters. In particular, we had to guess what kinds of events might produce the data deviations we hoped to study, and we also had to learn how much data should be included in the event specification. Experience and analysis have helped answer these questions, and despite an effect size so small that we need dozens of events for reliable statistics, it is possible to standardize many of the event selection parameters.
The descriptions below explain how we set the time period for most events. Some are firmly fixed, and others generally so, while a few categories still demand flexibility. A standard analysis statistic has been used for almost all events since late 1999. It is a measure of network variance, calculated as the squared Stouffer’s Z score for each second, accumulated over the whole event period. It is important to add that in all cases, the specification is done a priori. All formal events are completely defined and entered into the hypothesis registry before the corresponding data are extracted from the archive.
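The per-second squared Stouffer Z described above can be sketched in a few lines of code. This is an illustrative reconstruction, not the project's actual analysis software: the number of eggs, the 200-bit trial size, and the simulated data are assumptions for the example, and the chi-square normal approximation is one common way to convert the accumulated statistic into a test score.

```python
import numpy as np

# Sketch of the "network variance" statistic: each second, every REG
# ("egg") reports a trial sum of 200 bits, with expected mean 100 and
# standard deviation sqrt(200 * 0.25) ~= 7.071 under the null hypothesis.
MEAN = 100.0
SD = np.sqrt(200 * 0.25)

def netvar(trials):
    """trials: 2-D array of shape (seconds, n_eggs) holding per-second
    trial sums. Returns the accumulated squared Stouffer Z (a chi-square
    with `seconds` degrees of freedom under the null) and an approximate
    standard normal score for it."""
    z = (trials - MEAN) / SD                         # standardize each egg-second
    stouffer = z.sum(axis=1) / np.sqrt(z.shape[1])   # Stouffer Z per second
    chisq = np.sum(stouffer ** 2)                    # accumulate over the event
    df = trials.shape[0]
    # normal approximation to the chi-square distribution, fine for large df
    z_score = (chisq - df) / np.sqrt(2 * df)
    return chisq, z_score

# Example: 6 hours of simulated null data from a hypothetical 60 eggs
rng = np.random.default_rng(0)
sim = rng.binomial(200, 0.5, size=(6 * 3600, 60)).astype(float)
chisq, z = netvar(sim)
```

Because the simulated data satisfy the null hypothesis, the accumulated chi-square stays close to its degrees of freedom and the normal score stays near zero; a real event analysis asks whether the registered period produces an excess.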
Event Specification Examples
The following list represents our selection procedure as of 2010. The specifications are guidelines rather than strict rules, but they cover most kinds of events. Much of the guidance comes from a decade of experience, during which we have developed useful rules of thumb. Analysis shows, for example, that event periods have become longer over the years, and these guidelines reflect that. Similarly, we have learned that the average effect period is a few hours, but also that the exact length of our event period is not critical — in other words, that great flexibility is not required.
- Sudden or surprising events such as terror attacks. 6 hours. These are impulse events with a precise time of occurrence in most cases. The normal specification is a period beginning an hour before the event and continuing for 5 hours post-event to allow news to spread and emotional reactions to develop. If there is no likely buildup or premeditation time (e.g., a temple stampede), the period begins at the time of occurrence. In the case of a complex attack (e.g., the terrorist attacks in Mumbai), an adequate period may need to be 24 hours.
- Large natural disasters, engendering fear and compassion. 12 or 24 hours. Earthquakes have a point in time, and their extent is often known within a few hours, so a period of 12 hours is adequate for global reactions to develop. Typhoons, hurricanes, tsunamis, and volcanic eruptions are less sharply focused and may need 24 hours.
- Political and social events like elections, protests, demonstrations. Fixed by event or 24 hours. Many powerful events are attended and have impact in real time, so a period of, e.g., 2 hours will contain the event and the immediate reactions. Protests and demonstrations are often difficult to time and continue for long periods, so 24 hours is typically chosen to represent them.
- Organized meetings and meditations with deliberate focus. Fixed by event or 24 hours. Earth Day, World Peace Day, and similar events are organized to allow participation around the world, so we typically specify a 24-hour period. Some focus events are set for a certain time (e.g., 18:00-20:00 PST), and that time is used for the event specification.
- Celebrations and ceremonies like New Years, religious gatherings. Variable, defined by the event. Repeated instances always use the same specifications. For example, the 10-minute period around midnight, averaged across time zones, is used for New Years. For each Kumbh Mela in India, 13 daylight hours have been used.
About half the events in the formal series are identifiable before the fact; the accidents, disasters, and other surprises must, of course, be identified after they occur. We do not look for spikes in the data and then try to find what caused them. Such a procedure is obviously inappropriate, though many people imagine it is what we do or should do. After specification (and after the data are in), analysis for an event proceeds according to the registry specifications, yielding a test statistic relative to the null hypothesis. These individual results become the series of replications that address the general hypothesis and ultimately are combined to estimate its likelihood.
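The pooling of replications at the end of this process can be sketched with Stouffer's method, a standard way to combine independent Z scores; the exact combination procedure used for the formal database is not detailed here, so treat the numbers as illustrative assumptions, not real event results.

```python
import math

def combine_z(z_scores):
    """Combine independent per-event Z scores by Stouffer's method:
    the composite Z is the sum divided by sqrt(N). The composite grows
    as sqrt(N) times the mean per-event effect, which is why dozens of
    replications are needed to resolve a tiny effect."""
    return sum(z_scores) / math.sqrt(len(z_scores))

# Illustrative numbers only: a per-event effect of Z = 0.3 is far from
# significant on its own, but across 50 replications the composite is
# 0.3 * sqrt(50), a bit above 2.
composite = combine_z([0.3] * 50)

# one-tailed p-value from the standard normal upper tail
p = 0.5 * math.erfc(composite / math.sqrt(2))
```

This also shows why any single result is uninformative: the hypothetical per-event score of 0.3 is well within chance, and only the accumulation across many registered events yields a composite worth interpreting.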
It is important to keep in mind that we have only a tiny statistical effect, so it is always hard to distinguish signal from noise. This means that every success might be largely driven by chance, and every null might include a real signal overwhelmed by noise. In the long run, a real effect can be identified only by patiently accumulating replications of similar analyses.