Timing

Big iron satellites are made with lots of discretes, hand wiring, metal boxes with lots of screws, and physical connectors you attach with a wrench, using military production techniques 20 to 40 years behind those used for consumer products and precision industrial gear. That construction method is compatible with low volume production, field replacements and upgrades, corrosive environments, and reuse of production lines in secure facilities. It is not compatible with low cost, low weight, high reliability(!), precision, or ultra large scale integration. Old-style military production may be strategically and economically unsound, but it frees civilian fabs to compete relentlessly, reduce costs ruthlessly, and improve performance dramatically. If civilian fabs become essential to military production, they will become strategic targets.

My background is semiconductor design and test. I worked 20+ years for Tektronix designing chips for measurement instruments. I helped write the IEEE 1149.4 Analog Boundary Scan standard. I spend a lot of time writing little computer programs, bypassing the weaknesses and poor configurability and poor automatability of most computer-aided design tools.

Fifteen years ago, as a consultant, I helped design timing generators for semiconductor testers, which must deliver a wave of thousands of digital signals through meters of wiring to a test fixture, with relative timing accuracies and jitter of about a picosecond, one sigma. I did the error budgeting and signal conditioning design that helped one subcomponent deliver less than 8 femtoseconds of jitter, one sigma. These were single edges, and hundreds of thousands of edges on thousands of pins must all happen within very narrow time constraints. These experiences taught me how to deliver time signals with "impossible" specifications. The main tricks are to eliminate sources of variance, filter out nonessentials, and use matched and twinned or even "quadded" components wherever possible.

For example, each timing signal was delivered off the timing generator chip with 6 wires - a differential pair, two isolated power supplies connected only to that driver, and two dedicated grounds shielding the other 4 wires. The substrate of the chip was analyzed for total noise injection, with each internal subcircuit balanced for nominally zero injection during switching events. The column-grid-array packages added more grounds as spacing. The idea was to eliminate all known sources of common mode noise, then add paths to soak up what the analysis missed, then place the sensitive subcircuits of the chip well away from the noisy subcircuits. We ran huge amounts of computer simulation, at the transistor and cell and subcircuit level, so that we could do "what ifs" at every stage: If X goes wrong, how can Y still be made immune? If Y is not immune, what can X do to compensate? This can be done systematically, minimizing human effort, though keeping the computers busy. Lots of Perl scripts and C programs massaged the simulation outputs into design-actionable outputs, all within the context of big chips with millions of cheap, reliable, and well matched components.

Drivers were tunable in time and amplitude. We could adjust the two halves of each differential pair separately, to tune out common mode noise and differential time skew. We also added scan and sampling testability to all pins, a small extension of IEEE 1149.4 called "early capture", which allowed us to reconstruct pin waveforms for evaluation, and to perform limit tests and calibrations in the deployed system. Rather than depend on noise margins, we could look at actual signal extremes and their results. We could look at calibration results for drift. If measurements varied over time, we could predict failure early, and schedule replacements when convenient, not when a near-failure caused the tester to make mistakes, rejecting good parts and passing flaky ones.
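The drift-based failure prediction described above can be illustrated with a small sketch. It assumes a straight-line drift model fitted by least squares, and the measurement values, the limit, and the function names (`fit_line`, `days_until_limit`) are all hypothetical; the original testers may have used a quite different method.

```python
def fit_line(times, values):
    """Least-squares straight line through (time, value) points.

    Returns (slope, intercept).
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def days_until_limit(times, values, limit):
    """Extrapolate the fitted drift to the time it reaches an upper limit.

    Returns the remaining time after the last measurement, or None if the
    fitted trend is flat or moving away from the limit. Only upward drift
    toward an upper limit is handled in this sketch.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    t_cross = (limit - intercept) / slope
    return max(0.0, t_cross - times[-1])


# Hypothetical daily calibration offsets for one pin, in picoseconds.
days = [0.0, 1.0, 2.0, 3.0, 4.0]
offsets_ps = [1.0, 1.1, 1.2, 1.3, 1.4]
LIMIT_PS = 2.0  # assumed acceptance limit, not from the original text

remaining = days_until_limit(days, offsets_ps, LIMIT_PS)
print(f"estimated days until limit: {remaining:.1f}")
```

With a trend like this in hand, a replacement can be scheduled for a convenient time well before the limit is reached.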

The timing chips were mounted on multilayer circuit boards with surface mount connections, and shielded balanced differential lines on special circuit board materials (either without fiberglass yarn, or cut at 45 degrees so that a wire over a yarn thread would not have different timing than one over a gap). The tester could be calibrated at multiple temperatures. Using the measurements on both send and receive drivers, we could "TDR" (Time Domain Reflectometry) any critical signal path and collect waveforms showing discontinuities or mismatched termination (which we could also adjust). Timing delays were adjusted during operation to compensate for interference and crosstalk and small temperature changes. Initial calibration of a system might take hours, but once it was stored and the system stabilized, the testers could operate very reliably. They could be taken out of service for a few minutes and re-calibrated daily, and imminent failures noted, so that the failing part could either be bypassed by reconfiguration or replaced during scheduled downtime.
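As an illustration of how calibrations taken at multiple temperatures might be applied during operation, here is a minimal sketch of a temperature-indexed trim table. The text does not say how values between calibration temperatures were obtained; linear interpolation with clamping at the table ends is an assumption, and the table contents and the name `interpolate_trim` are hypothetical.

```python
from bisect import bisect_right


def interpolate_trim(table, temp_c):
    """Look up a delay trim for the current temperature.

    table is a list of (temperature_c, delay_trim_ps) pairs sorted by
    temperature. Values between calibration points are linearly
    interpolated; values outside the table are clamped to the nearest end.
    """
    temps = [t for t, _ in table]
    if temp_c <= temps[0]:
        return table[0][1]
    if temp_c >= temps[-1]:
        return table[-1][1]
    hi = bisect_right(temps, temp_c)  # first entry strictly above temp_c
    t0, d0 = table[hi - 1]
    t1, d1 = table[hi]
    frac = (temp_c - t0) / (t1 - t0)
    return d0 + frac * (d1 - d0)


# Hypothetical calibration points: (temperature in C, delay trim in ps).
calibration_table = [(20.0, 0.0), (30.0, 4.0), (40.0, 10.0)]

trim_25 = interpolate_trim(calibration_table, 25.0)
trim_35 = interpolate_trim(calibration_table, 35.0)
print(trim_25, trim_35)
```

In hardware the same lookup would live in registers feeding a trim converter rather than in software; the Python is only meant to show the arithmetic.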

This is the sort of thing you can do when you can put a billion transistors, consuming nanowatts each, on a chip. Moore's law doubles capability every two years, so it is possible to do 100 times better now. I've seen some presentations on military radar hardware, and it is where consumer/commercial capabilities were 30 years ago. Sherman, set the wayback machine for 1982 ...

Timing for server sky is relaxed, compared to what I was doing more than a decade ago, because we do not have to rely on single edge measurements. Instead, we can signal average over many seconds, change temperatures and measure responses, and build calibration tables that will allow us to trim delays as operating conditions change. The calibration tables will be built with software and CPUs, but the hardware that delivers the timing changes will use digital to analog converters fed by calibration registers, in turn fed by specialized DSP lookup engines. The general purpose CPUs will be involved at a higher level, looking at measurements, making complex decisions, and collecting anomalies for engineers to analyze.

If our timing signal buses are resonant, with 100 ohm differential impedance, a noise bandwidth of 1 GHz, and 350 K terminators, the thermal noise is v² = 4kTRB, or about 44 microvolts RMS. On a 200 mV peak-to-peak sine wave, that is an edge timing jitter of about 70 ppm (as a fraction of one cycle) added to the outgoing signals from one thinsat. Averaged over an array of 30,000 thinsats, each with perhaps 50 transmitters, the average system timing jitter will be less than 100 parts per billion. With a transmit path of 10,000 km (M288 radius 12,789 km to 50 degrees North), that is a beam wander of less than a meter. Other effects (and noise figure!) will be much larger.
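The thermal noise estimate above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the noise is taken to act at the zero crossing of the sine wave, the transmitters are assumed to have independent noise so that averaging improves as the square root of their number, and the beam wander is taken as fractional jitter times path length, which is how the text's numbers appear to scale. The function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def thermal_noise_vrms(temp_k, resistance_ohm, bandwidth_hz):
    """RMS Johnson noise voltage from v^2 = 4kTRB."""
    return math.sqrt(4.0 * K_BOLTZMANN * temp_k * resistance_ohm * bandwidth_hz)


def edge_jitter_fraction(noise_vrms, vpp):
    """Zero-crossing jitter as a fraction of one period of a sine wave.

    The slope at the zero crossing is 2*pi*f*(vpp/2), so the time jitter is
    noise / (pi*f*vpp), and as a fraction of the period it is noise / (pi*vpp).
    """
    return noise_vrms / (math.pi * vpp)


# Values taken from the text.
v_noise = thermal_noise_vrms(temp_k=350.0, resistance_ohm=100.0, bandwidth_hz=1e9)
jitter_one = edge_jitter_fraction(v_noise, vpp=0.200)

# Assumption: independent noise sources average down as 1/sqrt(N).
n_transmitters = 30_000 * 50
jitter_array = jitter_one / math.sqrt(n_transmitters)

# Assumption: wander scales as fractional jitter times path length.
path_m = 10_000e3
wander_m = jitter_array * path_m

print(f"noise:         {v_noise * 1e6:.1f} microvolts RMS")
print(f"one thinsat:   {jitter_one * 1e6:.1f} ppm")
print(f"whole array:   {jitter_array * 1e9:.1f} ppb")
print(f"beam wander:   {wander_m:.2f} m")
```

Under these assumptions the results agree with the figures in the text: about 44 microvolts, about 70 ppm, under 100 ppb, and under a meter.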

The calibration will look at many cycles, of course; we will be mixing our I and Q primary reference timing clocks with the timing clocks of neighboring thinsats, measuring phase differences and adjusting frequencies. We can also measure the temperature of the resonators, the physical spacing drift, etc., and add adjustments (in analog trims and hardware DSP, not software) for this drift. Error correction circuits should correct error only, not known offsets and drift.
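The I and Q mixing measurement can be sketched numerically. The example below multiplies a noisy neighbor clock by local in-phase and quadrature references and averages over many cycles to recover the phase difference. The frequency, sample rate, noise level, and the name `measure_phase` are illustrative assumptions, and in a thinsat this would be done in analog mixers and hardware DSP, not in software.

```python
import math
import random


def measure_phase(samples, freq_hz, sample_rate_hz):
    """Estimate the phase of a sampled clock relative to local I/Q references.

    For samples of cos(theta + phi), the average of sample*cos(theta) is
    cos(phi)/2 and the average of sample*sin(theta) is -sin(phi)/2, so
    phi = atan2(-Q, I). Averaging over many full cycles suppresses noise.
    """
    i_sum = 0.0
    q_sum = 0.0
    for n, s in enumerate(samples):
        theta = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        i_sum += s * math.cos(theta)  # in-phase mixer product
        q_sum += s * math.sin(theta)  # quadrature mixer product
    return math.atan2(-q_sum, i_sum)


# Illustrative numbers only; none of these come from the original text.
FREQ_HZ = 1_000.0
SAMPLE_RATE_HZ = 64_000.0
N_SAMPLES = 6_400        # exactly 100 full cycles
TRUE_PHASE_RAD = 0.3
NOISE_SIGMA = 0.5        # noise is half the clock amplitude

rng = random.Random(1)   # fixed seed so the run is repeatable
samples = [
    math.cos(2.0 * math.pi * FREQ_HZ * n / SAMPLE_RATE_HZ + TRUE_PHASE_RAD)
    + rng.gauss(0.0, NOISE_SIGMA)
    for n in range(N_SAMPLES)
]

estimate = measure_phase(samples, FREQ_HZ, SAMPLE_RATE_HZ)
print(f"true phase {TRUE_PHASE_RAD:.3f} rad, estimated {estimate:.3f} rad")
```

Even with noise at half the signal amplitude, averaging 100 cycles recovers the phase to within a small fraction of a radian, which is the point of measuring over many cycles and not single edges.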

Besides relieving the CPU of computational burden and reducing the system power, doing the calculations with hardware DSP increases radiation resistance; a soft error can flip a calculation bit or tickle the phase of a resonator, but it can't rewire a DSP engine. The phase synthesizers do not need the flexibility (or the vulnerability) of software. It is possible that we will make errors in programming our DSP, and need to change the wiring masks on our chips, but that is much cheaper than launching programmable flexibility that we do not need.

We will do calibration in millisecond timeouts in the transmit function during normal operation; that is enough time to make thousands of measurements and execute millions of instructions. If a thinsat is not being accessed for calculation or data retrieval, it can be taken offline for more extensive calibration - spacing calibration or component heating/cooling.

Server sky cannot be done with 30 year old discrete military microwave technology. Much better technology has been developed for huge consumer markets. It is time for the consumer manufacturers to take space products away from the military manufacturers. Care should be taken to avoid military entanglements, which create paperwork, lethargy, and encumbering secrecy while attracting violent strategic responses.

Timing (last edited 2012-04-17 05:15:17 by KeithLofstrom)