Architecture, REGG-Net version 1.5
Part 1: Description and overview of issues

PART 2: SUGGESTED SPECIFIC VALUES

PROPOSED PACKETS AND THROUGHPUT ANALYSIS

Given these constraints, we propose the following data packet. Each packet would include:
This gives us a maximum packet size of less than 1000 bytes, which should work well with most network MTUs (and the number of records can be reduced to create a packet as small as 38 bytes, which still contains over 25% data). Limiting the number of records to 60 and using seconds as a time stamp means that we will need to transfer a maximum of one packet per minute. The first data record sent would be the one indicated by the last-data-received value from the Basket (this allows some cross-verification that's probably not necessary), and the final record sent would correspond to the most-recently-acquired data. An optional connect-time limit could be set to help hold down local costs (but it would have to be set high enough that all the data would eventually get through).

We next consider the latencies and bandwidths of various connection types, and their implications for throughput of data. Here are some initial estimates:
A full communication scenario might look like the following. The Egg would dial the server and tell it that it was online. The server (with some application-level latency) would provide the Egg with a packet that included both configuration information and last-record-received (timestamp) information. The Egg would then begin sending data packets.

We can now get a fair estimate of the actual cost in telephone time using this scenario. We model an Egg taking data at 10 trials per second, and dialing in hourly via 14.4kbps modem. Counting in the various expected latency terms, and ignoring data errors, retries, and compression, we get the following rough estimates:
(Making the same estimate based purely on bandwidth would be off by over a factor of four.) Generating the same numbers for other dial-in frequencies gives:
By dropping the sampling rate to one trial/sec, the hourly connection time goes down to less than one third:

once/hour (1 trial/sec): 34 sec/call, 14 min/day

These numbers should help give some idea of the costs likely to be experienced by the remote dial-and-drop sites. Clearly we need to be sensitive to any costs being imposed on the Egg-site hosts by "administrative changes"!

BROADER NETWORKING AND PROTOCOL ISSUES

As part of the networking process, all the Egg-sites must share a common clock. This is exactly the purpose of the standard NTP protocol, and its implementation on the expected platform (Linux) appears to compensate for both clock value and clock *drift*, so that the resulting uniformity of clock time is better than the typical one second resolution transferred in the protocol. We recommend using NTP, with the Baskets serving as second-tier servers from some other canonical source and with all of the Eggs periodically synchronizing themselves to the Basket time. However, there may be a need to designate one Basket as primary in case of a discrepancy.

I recommend the following relationship between the permanent and dial-and-drop scenarios. In the case of dial-and-drop, the Egg will send a packet to the Basket which simply serves to communicate, "I'm online now." At this point, the Basket will reply with a packet describing the options and indicating the last successfully received index value. Once this packet arrives at the Egg, the Egg begins sending data from this point until all its data has been sent. In the case of a permanent connection, the Egg can elect to send an "I'm online now" message at whatever interval it desires, and the protocol continues as above. The server need never know the difference between the two types of Eggs. (However, if we prefer, the Basket can send the "last-received" packet whenever it wishes to collect data. It must then know not to ask dial-and-drop Eggs, or to expect no response from them.)
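The exchange just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the dictionary-based reply, and the in-memory storage are hypothetical, and real packets would use the binary layout proposed earlier rather than Python objects. The sketch resends the last-received record first (the optional cross-verification mentioned above) and caps each packet at 60 records.

```python
from dataclasses import dataclass, field

MAX_RECORDS_PER_PACKET = 60  # record limit from the packet proposal


@dataclass
class Basket:
    """Hypothetical in-memory Basket; stores records per Egg id."""
    options: dict = field(default_factory=dict)
    received: dict = field(default_factory=dict)  # egg_id -> {timestamp: value}

    def on_online(self, egg_id):
        """Answer "I'm online now" with options and the last index received."""
        stored = self.received.setdefault(egg_id, {})
        last = max(stored) if stored else None
        return {"options": dict(self.options), "last_received": last}

    def on_data(self, egg_id, records):
        """Store one data packet of (timestamp, value) records."""
        self.received.setdefault(egg_id, {}).update(records)


@dataclass
class Egg:
    """Hypothetical Egg holding its locally acquired data."""
    egg_id: str
    log: dict = field(default_factory=dict)  # timestamp -> value

    def report(self, basket):
        """Run one session and return the number of data packets sent."""
        reply = basket.on_online(self.egg_id)
        last = reply["last_received"]
        # Start at the last-received record itself, then send everything newer.
        pending = sorted(t for t in self.log if last is None or t >= last)
        packets = 0
        for i in range(0, len(pending), MAX_RECORDS_PER_PACKET):
            chunk = pending[i:i + MAX_RECORDS_PER_PACKET]
            basket.on_data(self.egg_id, [(t, self.log[t]) for t in chunk])
            packets += 1
        return packets
```

Because `report` is the same call whether the Egg has just dialed in or is permanently connected, the Basket side of this sketch needs no knowledge of the Egg type; only the moment at which the session starts differs.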
In either case, the body of the protocol is identical at both ends, and only the initiation changes. The Basket should still have responsibility for monitoring the "aliveness" (fertility?) of particular Eggs, and notifying human administrators if a particular Egg seems to be down or partitioned off from the network.

It is probably desirable to be able to set at least some of the options from the Basket, so that the administration of the Eggs does not require extensive involvement of the personnel at each Egg-site. However, we need some method of ensuring that the settings are authenticated. As a simple security mechanism, we can assume that the Eggs will only accept updates from the Basket they have contacted in the "I'm online now" phase, using known (fixed?) IP addresses, and assuming the security of the routing tables against corruption. This is essentially an IP "dial-back" approach. If this is inadequate, it is possible to implement something like a shared-secret DES encryption scheme (à la CHAP) for authentication. This requires substantially greater sophistication, and may not be necessary.

To help offset the impact of ever-more-frequent network partitions, I think it is important for each Egg to know about all the Baskets. Each Egg may even be configured to prefer a different Basket, on the assumption that communications within a continent are cheaper or at least more reliable than transcontinental ones. Thus, a Scandinavian Egg might report to a Dutch Basket, and a Californian Egg to a New Jerseyan Basket, and in the end only the Baskets would need to exchange information (presumably over higher bandwidth links) to get the whole picture. In the event of a transatlantic partition, each Basket would still receive data, and each Egg would still be able to report to its first choice of Basket.
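One way to read the preference scheme is as an ordered list of Baskets held by each Egg, tried in turn. The following is a minimal sketch assuming that reading; the function name, the Basket labels, and the `is_reachable` callback are hypothetical, and how an Egg actually decides that a Basket is unreachable is left open here.

```python
def choose_basket(preferred, is_reachable):
    """Return the first reachable Basket in preference order, or None."""
    for name in preferred:
        if is_reachable(name):
            return name
    return None


# Illustrative use: a Scandinavian Egg that prefers the Dutch Basket.
preferences = ["dutch-basket", "nj-basket"]
currently_up = {"nj-basket"}  # pretend the Dutch Basket is down
chosen = choose_basket(preferences, lambda name: name in currently_up)
```

With the Dutch Basket marked as down, `chosen` is `"nj-basket"`; when every Basket is unreachable the function returns `None`, and the Egg would simply keep its data until the next attempt.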
In the event of Dutch Basket down-time (if, say, it was being borrowed by the Easter Bunny), the Scandinavian Egg would then contact the NJ Basket directly, after noticing the missing Dutch Basket.

LOCAL EGG ISSUES

The Eggs generally should not need any display as Eggs. However, many of the Egg hosts (people) may want to know what is going on, and indeed it may be worthwhile to have at least a status display. It could be just a text report, with indicators of time on, amount of data reported, grand deviation (as a check whether all is well), etc.

In general we should avoid any aspect that requires maintenance at the Egg-site. However, some Egg-site maintainers may be comfortable with extra features that are not appropriate for everyone. These features can be set locally at the Egg-site, with an appropriate interface that warns the maintainer of the extra burden being taken on. These features should also be made robust in the event of inattention.

For example, if the local sites wish to have a data backup, one possibility is a floppy disk. However, the amount of data generated at the maximum speed of ten trials/second would roughly fill one floppy disk per day. This puts a high maintenance burden on the Egg-site maintainer. In contrast, running at one trial/second would extend the life of a floppy to over ten days, which is probably a reasonable maintenance burden for most sites. At this rate, this sort of backup adds less than $20 per site per year of media costs. If the local site maintainer forgets to change disks, the system should recognize this and either (1) discard the data on the disk and start from scratch when the disk fills up, (2) stop writing data and discard data until a new disk is available, or (3) stop writing data and queue further writes until a new disk is available.
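The three full-disk options can be compared with a toy model. This is a sketch only: the class and policy names are hypothetical, the disk is modelled as a list holding a fixed number of records, and option (3) uses an unbounded in-memory queue, which a real Egg would need to bound or keep on its hard disk.

```python
from collections import deque
from enum import Enum


class FullDiskPolicy(Enum):
    RESTART = 1  # (1) discard the disk's contents and start from scratch
    DROP = 2     # (2) stop writing; discard new data until the disk is changed
    QUEUE = 3    # (3) stop writing; queue new data until the disk is changed


class BackupDisk:
    """Toy model of a removable backup disk holding `capacity` records."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.records = []
        self.queue = deque()

    def write(self, record):
        if len(self.records) < self.capacity:
            self.records.append(record)
        elif self.policy is FullDiskPolicy.RESTART:
            self.records = [record]
        elif self.policy is FullDiskPolicy.QUEUE:
            self.queue.append(record)
        # DROP: the record is silently discarded

    def change_disk(self):
        """The maintainer inserts a fresh disk; flush anything queued."""
        self.records = []
        pending, self.queue = self.queue, deque()
        for record in pending:
            self.write(record)
```

Under this model, option (1) never needs attention but silently loses the older backup, option (2) preserves the oldest data, and option (3) loses nothing as long as the queue fits somewhere until the maintainer returns.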
We would discourage non-data-acquisition uses of the Egg machine, such as the installation of a web browser, because it potentially increases the hardware requirements, competes for bandwidth with the required data transfers, and interacts in somewhat unpredictable ways with the dialup scheme. Although most of these are not concerns for permanently-connected Egg-sites, these sites are the most likely to already have browsing capabilities. Furthermore, keeping the hardware and software platforms uniform allows for easier "hot spare" replacement.

LOCAL BASKET ISSUES

Some sort of utility (probably software on the Baskets, or perhaps even a private web area) needs to be built to help view the performance of the network rather than just the results. Things like a global view of connectivity, down-time ratios, and Egg type information would be useful. Using SNMP to some extent is certainly possible, but although I would like to encourage the usage of IETF standards as much as possible, it may be quicker to roll our own details for this capability.

SOCIAL IMPLICATIONS

If and when our Eggs hatch, we may need to open certain cans of worms to feed the hatchlings. One might divide the issues into those related to the project "not working" and those related to it "working," but of course, those terms themselves need to be defined. For now we take "working" to mean that the system detects some sort of global consciousness structure.

If it doesn't work, there will be a need to explain why its results are different from the preliminary studies like the Diana and Theresa work.

What if it does work? It seems that discovering and being able to measure something like global cohesion is a huge breakthrough, and we should consider how to communicate the discovery properly. Is there also a moral significance to demonstrating the power of group-think? What do we do if we discover that the mechanism measures other things of significance?
Jiri has alluded to the fact that it could equally well pick up on the consciousness of animals other than humans. If it notices solar eclipses, it certainly has the potential to notice other significant astronomical or geological events, or our reactions to them. One thought in particular, given that animals are often sensitive to things that people miss, is that the system might detect phenomena such as earthquakes before they actually occur. This possibility alone, if it came true, would make the project extremely significant to humankind.

ACKNOWLEDGEMENTS

This document evolved in response to input from many individuals, some of whom must have psychically known what input was needed since they hadn't yet seen the document. In particular, Jiri Wackermann's comments on the layered protocol suggested a much better organization for the yet-unseen document, and his comments on mass-storage backups helped convince Roger of the importance of this issue. Dick Bierman's comments on synchronizing the processing using timestamps reinforced our own belief in the necessity of this process, and his discussions of Z-score versus bit-sum representations forced Greg to review his thought processes on this matter. Charles Overby reminded us that we need to keep connection costs firmly in mind. Further feedback not specifically mentioned or visible in the final document was still greatly appreciated, in many cases forcing us to clarify our own reasoning about the issues involved.