Distributed Artificial Intelligence

Many amateur "AI professionals" are "philosophers" who ignore as much experience and data as necessary in order to make lofty statements. This impresses the rubes but does not translate into hardware. Real AI is optimized for physics, not philosophy. Sine data, cogito ergo sum insipiens.

Superfast large core AI will not happen because that approach is costly and suboptimal. Distributed AI, many nodes including humans and other lifeforms, is happening now.

Human brains are distributed low power multiprocessors relying heavily on pattern matching. Brains move molecules across synapses, not electrons through wires; action potentials propagate along neurons at up to roughly 100 meters per second, far below electronic signal speeds. The brain does all its work, including physiological maintenance, with 20 watts of power, and does so at a results-generation rate that matches the problems encountered in the natural environment. Brains could work faster, but energy is costly to gather, and waste heat is difficult to dissipate.

There is no reason to think that engineers (human or artificial) will design AI in excess of productive need or available resources. The productive need is vast, perhaps unbounded, but it is situational and specific. AI will permit us to address many situations that are too expensive to tackle with brains.

Problems like rendering display frames are calculation, not artificial intelligence, and they are done today with hundreds of thousands of parallel threads. But the customers (humans and their visual systems) need no more than 200 frames per second, because the human visual system doesn't register display artifacts shorter than 5 milliseconds. The computational task of preparing a high resolution display can be efficiently divided into hundreds of thousands of threads with current gigahertz-capable but 100-megahertz-optimal parallel hardware. As Moore's law provides more hardware, the thread count will increase and the thread clock rate will drop, someday devolving to a 200 hertz thread per pixel. Distribution and efficient information flow will dominate. While those threads will probably still be instantiated on a substrate smaller than a thumbnail, a millisecond of information scatter and gather at half the speed of light permits a substrate 75 kilometers across.
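A back-of-envelope check of the numbers above (the c/2 signal speed, the one-millisecond scatter-and-gather budget, and the 4K display size are assumptions taken from or added to the text, not measurements):

```python
# Back-of-envelope check of the display-rendering numbers above.
# Assumption from the text: signals propagate at half the speed of light,
# and one millisecond is budgeted for scatter-and-gather.

C = 299_792_458.0        # speed of light, m/s
signal_speed = C / 2     # propagation speed assumed in the text
budget = 1e-3            # seconds allowed for scatter and gather

# A signal must cross the substrate and return within the budget,
# so the maximum width is half the total distance traveled.
max_width_km = signal_speed * budget / 2 / 1000
print(f"max substrate width: {max_width_km:.0f} km")   # ≈ 75 km

# Thread count at one 200-hertz thread per pixel of a 4K display
pixels = 3840 * 2160
print(f"threads at one per pixel: {pixels:,}")
```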

The biggest extant AI, the worldwide network of Google computers, handles a huge parallel task, serving millions of simultaneous queries. Google uses a crapton of parallel computers at multiple sites to do that. Using fewer, faster computers is more expensive and less energy efficient.

Switching delay is proportional to CV/I (the charge CV moved per transition, divided by drive current), and dynamic power is proportional to CV²F, so reducing V and recovering throughput with more parallelism (more total C at lower F) is a win: more results per watt-hour. Nature knows this, Intel knows this, nVidia knows this, Google knows this, I know this. The millennialist AI community does not.
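A toy calculation of the CV²F argument (all numbers are illustrative, not measured): two cores at half the voltage and half the clock deliver the same total throughput as one fast core, at a quarter of the switching power.

```python
# Toy model of the CV^2 F argument above.  All values are illustrative.
def dynamic_power(units, C, V, F):
    """Total switching power for `units` cores: units * C * V^2 * F."""
    return units * C * V**2 * F

C = 1e-9      # effective switched capacitance per core (illustrative)
F = 1e9       # clock rate, Hz
V = 1.0       # supply voltage, volts

serial = dynamic_power(1, C, V, F)            # one fast core
# Two cores at half voltage and half clock match the total
# throughput (2 * F/2 = F) in this simplified model...
parallel = dynamic_power(2, C, V / 2, F / 2)

print(f"serial:   {serial:.2f} W")
print(f"parallel: {parallel:.2f} W")   # one quarter the power
```

Real silicon adds leakage, threshold-voltage limits, and coordination overhead, but the quadratic voltage term is why the trade favors parallelism.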

Google response time is a matter of two things: network delay, and how many canned responses they've stored up for generic search queries. Typically, each customer-facing compute node pulls the search words deemed "important" out of a search query, ignores the rest, assembles a pointer, then does one disk lookup on the machine the pointer points to, returning the results of a generic search that was performed hours or weeks or years ago, along with the revenue-generating ads that attach to the search words. That approach is fast and cheap, and works like a human brain. Also like interacting with the typical brain, it is annoying to those of us trying to find exceptional results. Still, Google's customers (the advertisers) get access to the product they want (you and me), and this is a very efficient way to cost-effectively harvest high quality product for the customers.
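A minimal sketch of that lookup scheme as described above. The stopword list, the hash-to-pointer function, and the precomputed result store are all invented for illustration; Google's actual pipeline is far more elaborate.

```python
# Sketch of a canned-response lookup: strip unimportant words, hash the
# rest into a pointer, fetch a result computed hours or weeks ago.
# Stopwords, hashing, and storage here are hypothetical placeholders.
import hashlib

STOPWORDS = {"the", "a", "of", "in", "for", "how", "to"}

precomputed = {}   # stands in for results stored on disk by pointer

def pointer(query: str) -> str:
    """Keep only the 'important' words, ignore the rest, hash to a pointer."""
    important = sorted(w for w in query.lower().split() if w not in STOPWORDS)
    return hashlib.sha256(" ".join(important).encode()).hexdigest()[:16]

def index(query: str, results: list) -> None:
    precomputed[pointer(query)] = results

def serve(query: str) -> list:
    """One lookup on the node the pointer points to; no fresh search."""
    return precomputed.get(pointer(query), [])

index("launch loop", ["server sky", "launchloop.com"])
# Stopwords and word order don't change the pointer:
print(serve("how to launch the loop"))   # same canned result
```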

Given speed-of-light network delays, there is no reason to spend gigabucks to further reduce the response time of the computers. It is better to create more and smaller Google data centers, closer to the product AKA search users, so search users can be harvested faster and more efficiently. There is no need to build a Google data center bigger than the product field it harvests. Google's data center in The Dalles, Oregon, is an exception; electricity is so cheap in Oregon that many of the backroom tasks, like efficiently sorting the internet into bins and assigning search pointers to them, are best done where energy is cheap. Then those bins are replicated to production data centers around the world. Of course, those distributed data centers can also assist with bin creation during times of low product demand.

Google, like nature, puts all its eggs in MANY baskets. Take away three Google data centers, and with a little bit of routing magic, the rest will shoulder the load, working as a group, perhaps a little slower because the speed-of-light delays to some customers are larger. Four smaller nodes in a densely interconnected region will have half the latency of one big node, use half the total routing resources, and provide faster response if one of the nodes is lost. If all four are lost, there is probably infrastructure damage throughout the region, so the reason to respond is probably lost, too.
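The latency claim can be illustrated with a toy geometry (a unit-square service region, straight-line distance as a proxy for delay; both are simplifications of real network topology):

```python
# Toy check of the "four smaller nodes, half the latency" claim.
# Model: users uniform in a unit square; delay proportional to
# straight-line distance to the nearest data center.
import random

random.seed(1)
users = [(random.random(), random.random()) for _ in range(100_000)]

def mean_nearest(centers):
    """Average distance from each user to the closest center."""
    total = 0.0
    for x, y in users:
        total += min(((x - cx)**2 + (y - cy)**2) ** 0.5 for cx, cy in centers)
    return total / len(users)

one  = mean_nearest([(0.5, 0.5)])                                # one big node
four = mean_nearest([(0.25, 0.25), (0.25, 0.75),
                     (0.75, 0.25), (0.75, 0.75)])                # four small nodes

print(f"one node:   {one:.3f}")
print(f"four nodes: {four:.3f}  (ratio {four/one:.2f})")   # ratio ≈ 0.5
```

Each quadrant is a half-scale copy of the whole square, so quadrupling the node count halves the mean distance, matching the text's claim.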

The unit of genetic reproduction is the species, not the individual; a too-small subset of a species is nonviable (estimated to be greater than 500 individual humans for sufficient genetic diversity and accident tolerance). Individuals are optimized for selection; trash bags for bad genes. Individual speed is optimized for the speed of the threats and opportunities in the environment. Large and fast threats are few, because they are costly, and limited by physics. The best way to respond to rare and expensive threats is more individuals, not a few almost-invulnerable individuals, especially if the individuals can collaborate to remove the source of the threats.

Individuals present more "vulnerable surface" to the environment, but they also can collect more resources (food and information) through that larger surface. If the individuals can differentiate (like learning human beings), then each individual has different vulnerabilities; the chance that one threat can take out all of the individuals is much smaller than one threat taking out one large individual. An elephant can menace one small individual, but falls prey to a coordinated band of individuals smaller in total mass. Coordination beats concentration.

AI, like the human brain, is a tool to solve a problem with the time and resources available. The best AI computes plausible solutions in advance of need, as power-efficiently as possible, then refines them for specific situations as they occur. Manufacturing lots of solutions efficiently in parallel, distributed in time to match the rate of problem manifestation, is the optimum way to proceed. For problems manifesting at 100 kilosecond (1.16 day) rates, sense-and-respond solution systems 100 AU in diameter are adequate, and emergent physical threats (or opportunities) at sublight velocities will need much longer to affect those large and networked systems.
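A quick check of those closing numbers (100 AU, 100 kiloseconds):

```python
# Check the closing numbers: light-travel time across a 100 AU system.
AU = 1.495978707e11        # meters per astronomical unit
C  = 299_792_458.0         # speed of light, m/s

diameter = 100 * AU
one_way = diameter / C
print(f"one-way light time across 100 AU: {one_way:.0f} s "
      f"({one_way / 86400:.2f} days)")
# A sense-and-respond round trip is about twice that, roughly the
# 100 kilosecond (1.16 day) problem rate cited above.
```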

DistributedAI (last edited 2015-05-15 16:20:32 by KeithLofstrom)