Tropical Island Hopping

Islands

There are functionally infinite (non-trivial) paths to reach super-intelligence or AGI1. Intelligence is organized compute, and you can compute with almost anything (light, electrons, quantum states, mechanical gears, Minecraft)2. These potential intelligences include warehouses of brains in vats, silicon-based digital computers, or Neuralinked human brains. All of these have significantly different characteristics, e.g., intelligence per dollar (or watt); while human-level intelligence scores well on this metric today, there are orders of magnitude of improvement ahead. The optimal intelligence is extremely capable and energy-efficient. Yet, in the near term, these characteristics are practically irrelevant and not worth dwelling on, because you only need to find one super-intelligence. Then, with its help, we iterate an arbitrary number of times toward the optimal intelligence. So while the entry points are varied, the final optimum is not, and the paths will converge.

In other words, the fastest path toward the optimal super-intelligence is through the closest super-intelligence. Picture a diving expedition: an AI is dropped randomly in the ocean and wants to reach the highest mountain. Elevation represents some aggregate intelligence metric, the ocean surface represents human-level intelligence, and islands are technological clusters. One island could be transformer-based networks, a nearby island diffusion language models, a distant and potentially larger island programmed biology. If the submerged AI wants to reach the highest island in the distance, it may be simpler to stop at the closest island first. There, it can at least gather food (or power), assemble a raft, or build a plane to reach larger islands. Many of these islands also rise into the clouds, where human-level intelligence is not strong enough to chart a path directly from the water.
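The island-hopping intuition is essentially a greedy search: always land on the nearest island above sea level, and let each stop extend how far you can travel next. A minimal toy sketch of that dynamic (all island names, distances, and elevations here are hypothetical illustrations, not measurements):

```python
# Each island is a technology cluster: (distance from our start, peak elevation).
# Sea level = human-level intelligence; only islands above it are usable stops.
SEA_LEVEL = 1.0

islands = {
    "transformers": (2.0, 1.3),
    "diffusion_lms": (3.0, 1.2),
    "programmed_biology": (7.0, 5.0),  # tallest, but far away
}

def island_hop(start: float, reach: float):
    """Greedily hop to the nearest above-sea-level island within reach;
    each landing multiplies reach (resources gathered), until none are left."""
    path, pos = [], start
    remaining = dict(islands)
    while remaining:
        reachable = {n: (d, e) for n, (d, e) in remaining.items()
                     if e > SEA_LEVEL and abs(d - pos) <= reach}
        if not reachable:
            break
        # Nearest first, not tallest first.
        name = min(reachable, key=lambda n: abs(reachable[n][0] - pos))
        d, e = remaining.pop(name)
        path.append(name)
        pos = d
        reach *= e  # each island's capability extends the next hop
    return path

print(island_hop(start=0.0, reach=3.0))
# -> ['transformers', 'diffusion_lms', 'programmed_biology']
```

The tallest island is unreachable from the starting position (distance 7 with reach 3), but becomes reachable after the two closer stops compound the diver's reach.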

The true question, then, is which island is closest to the AI diver right now. You could imagine infinitely many different initializations of our civilization (or of other civilizations) and the effects each would have on the geography, but the only thing that matters is which island is closest. Is it built from Adam-based optimizers, LayerNorm (or RMSNorm), residual networks, and mixture-of-experts transformers run on silicon hardware? Yann LeCun would say no, but there is a reason research at Meta has flopped under his leadership. Or will all of these methods still peak below the surface of the ocean? Recent trends suggest that scaling up sequence-to-sequence modeling on digital hardware will be sufficient to reach super-intelligence, even if it costs trillions of dollars. That intelligence will then be used to rapidly iterate: increasing model efficiency (more dynamic sparsity), improving the computer architecture, or building a better process technology.


  1. The definition of super-intelligence or AGI often needlessly obstructs conversations about it. The definition used here aligns with sci-fi priors and represents a stricter lower bound. It could also be labelled ASI, but AGI is more common in non-technical conversations, and the distinction is only academic: any AGI system will be ASI when scaled, and scaling will happen over a very short period of time.

  2. Physical Neural Networks