How many FLOPS is the human brain?

The experts I spoke with about this, including Dr. Riedel, seemed to expect something like 1 and 2 to be true, and they seem fairly plausible to me as well. A minimal, computationally useful operation in the brain probably dissipates at least something on the order of kT·ln(2) (the minimum, per Landauer, for erasing one bit of information). One possibly instructive comparison is with the field of reversible computing, which aspires to build computers that dissipate arbitrarily small amounts of energy per operation.

Useful, scalable hardware of this kind would need to be really fancy. As Dr. Frank described it, the biggest current challenge centers on the trade-off between energy dissipation and processing speed. Christiano also mentioned challenges imposed by an inability to expend energy in order to actively set relevant physical variables into particular states: the computation needs to work for whatever state different physical variables happen to end up in.

Of course, biological systems have strong incentives to reduce energy costs. But various experts also mentioned more general features of the brain that make it poorly suited to this sort of computing.

I think that this is probably true, but absent this middle step, the energy limit on its own does not settle how many FLOP/s the brain performs. Indeed, lower numbers seem possible as well. Even equipped with an application of the relevant limit to the brain (various aspects of this still confuse me — see the relevant endnote), further argument is required. Still, we should take whatever evidence we can get.
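To make the flavor of this sort of limit concrete, here is a purely illustrative calculation combining the brain's commonly cited ~20 W power budget with the Landauer bound at body temperature. The power figure and temperature are rough assumptions, and the output is a bound on irreversible bit erasures, not a FLOP/s estimate: the further argument mentioned above is still needed.

```python
import math

# Illustrative upper-bound calculation in the spirit of the "limit method".
# All inputs are rough, commonly cited figures, not claims from this report.
k_B = 1.380649e-23        # Boltzmann constant, J/K
T = 310.0                 # approximate body temperature, K
brain_power_watts = 20.0  # commonly cited rough power budget of the brain

landauer_joules_per_bit = k_B * T * math.log(2)   # minimum dissipation per bit erased
max_bit_erasures_per_sec = brain_power_watts / landauer_joules_per_bit

print(f"Landauer limit at 310 K: {landauer_joules_per_bit:.2e} J per bit erasure")
print(f"Upper bound on irreversible bit erasures/sec: {max_bit_erasures_per_sec:.1e}")
# ~7e21 erasures/sec -- an upper bound on irreversible operations, not a FLOP/s
# estimate, since further argument is needed to connect bit erasures to FLOPs.
```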

Communication bandwidth, here, refers to the speed with which a computational system can send different amounts of information different distances.

Estimating the communication bandwidth in the brain is a worthy project in its own right. But it also might help with computation estimates. This is partly because the marginal values of additional computation and additional communication are related: each can bottleneck the usefulness of the other.

One approach to estimating communication in the brain would be to identify all of the mechanisms involved in it, together with the rates at which they can send different amounts of information different distances. Another approach would be to draw analogies with metrics used to assess the communication capabilities of human computers. AI Impacts, for example, recommends the traversed edges per second (TEPS) metric, which measures the rate at which a system can perform a certain kind of search through a large random graph.
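As a toy instance of the first approach, the sketch below tallies one communication mechanism (spikes travelling through synapses) under placeholder assumptions for neuron count, synapse count, firing rate, and bits per spike. None of these numbers come from the report; they are order-of-magnitude stand-ins, and the sketch ignores the distance-weighting and the other mechanisms that a serious estimate would need to include.

```python
# Toy sketch of the first approach: tally one communication mechanism (spikes
# through synapses) at assumed rates. All numbers are rough placeholders.
num_neurons = 1e11            # commonly cited order of magnitude
synapses_per_neuron = 1e3     # rough placeholder
avg_firing_rate_hz = 1.0      # rough placeholder; estimates vary widely
bits_per_spike = 2.0          # placeholder for information carried per spike event

spikes_through_synapses_per_sec = num_neurons * synapses_per_neuron * avg_firing_rate_hz
bits_per_sec = spikes_through_synapses_per_sec * bits_per_spike

print(f"Spike-synapse events/sec: {spikes_through_synapses_per_sec:.1e}")
print(f"Rough spike-signaling traffic: {bits_per_sec:.1e} bits/sec")
# A real estimate would also weight this traffic by the distances involved and
# include other mechanisms (e.g., chemical signaling), per the approach above.
```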

There are a number of possibilities. One simple argument runs as follows: if you have two computers comparable on one dimension important to performance (e.g., communication), that is at least some evidence that they are comparable on others (e.g., computation). Of course, we know much about brains and computers unrelated to how their communication compares. But for those drawn to simple a priori arguments, perhaps this sort of approach can be useful.

Using estimates of the brain's communication bandwidth, the brain looks at least comparable to some existing human-engineered computers on this dimension. Naively, then, perhaps its computation is comparable (indeed, superior) as well. For example, if we assume that synapse weights are the central means of storing memory in the brain, we might get a memory capacity of a quite different magnitude (a crude version of this calculation is sketched below). So the overall comparison here becomes more complicated. A related approach involves attempting to identify a systematic relationship between communication and computation in human computers — a relationship that might reflect trade-offs and constraints applicable to the brain as well. A more sophisticated version of this approach would involve specifying a production function governing the returns on investment in marginal communication vs. marginal computation.
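Here is the kind of crude memory calculation gestured at above, with the synapse count and bits-per-synapse chosen purely as illustrative assumptions rather than figures endorsed in the report.

```python
# Crude memory-capacity sketch from synapse weights (illustrative assumptions only).
num_synapses = 1e14        # commonly cited order of magnitude; estimates range higher
bits_per_synapse = 5.0     # placeholder precision per stored weight

total_bits = num_synapses * bits_per_synapse
total_terabytes = total_bits / 8 / 1e12

print(f"Implied storage: {total_bits:.1e} bits (~{total_terabytes:.0f} TB)")
# With these placeholders, the implied capacity is far larger than the on-chip
# memory of typical accelerators, which is one way the comparison gets complicated.
```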

These are all just initial gestures at possible approaches, and efforts in this vein face a number of issues and objections. But I think approaches in this vicinity may well be helpful.

(Figure 1, repeated.)

Rather, they are different attempts to use the brain — the only physical system we know of that performs these tasks, but far from the only possible such system — to generate some kind of adequately (but not arbitrarily) large budget. Can we do anything to estimate the minimum directly, perhaps by including some sort of adjustment to one or more of these numbers?

Paul Christiano expected the brain to be performing at least some tasks in close to maximally efficient ways, using a substantial portion of its resources (see the relevant endnote). That said, as emphasized above, considerable uncertainty remains. Here are a few projects that others interested in this topic might pursue (this list also doubles as a catalogue of some of my central ongoing uncertainties). People frequently offer estimates of the brain's computational capacity; it is much less common for them to say what they mean.

I now think this much less likely. Rather, I think that there are a variety of importantly different concepts in this vicinity, each implying different types of conceptual ambiguity, empirical uncertainty, and relevant evidence. These concepts are sufficiently inter-related that it can be easy to slip back and forth between them, or to treat them as equivalent.

But if you are offering estimates, or making arguments that use such estimates (e.g., about AI timelines), it matters which concept you have in mind. This appendix briefly discusses some of the pros and cons of these concepts in light of such questions, and it offers some probabilities keyed to one in particular.

I chose this point of focus centrally for reasons like the following. Such constraints restrict the set of task-functional models under consideration, and hence, to some extent, the relevance of questions about the theoretical limits of algorithmic efficiency. The brain, after all, is the product of evolution — a search and selection process whose power may be amenable to informative comparison with what we should expect the human research community to achieve.

At a minimum, it depends on what you want the simulation to do. Is replicating the division of work between hemispheres, but doing everything within the hemispheres in a maximally efficient but completely non-brain-like way, sufficient?

Are birds reasonably plane-like? Are the units of a DNN reasonably neuron-like? Some vagueness is inevitable, but this is, perhaps, too much. One way to avoid this would be to just pick a precisely-specified type of brain-likeness to require. For example, we might require that the simulation feature neuron-like units (defined with suitable precision), a brain-like connectome, communication via binary spikes, and brain-like average firing rates, but not (for example) every lower-level biophysical detail.

But why these and not others? Absent a principled answer, the choice seems arbitrary. We can imagine appealing, here, to influential work by David Marr, who distinguished between three different levels of understanding applicable to an information-processing system: the computational level (what problem the system solves, and why), the algorithmic level (what representations and processes it uses to solve it), and the implementation level (how those representations and processes are physically realized).

The report focused on level 1. I have yet to hear a criterion that seems to me an adequate answer. Note that this problem arises even if we assume clean separations between implementation and algorithmic levels in the brain — a substantive assumption, and one that may be more applicable in the context of human-engineered computers than biological systems. Do we need to emulate individual transistors, or are logic gates enough? Can we implement the adders, or the ALU, or the high-level architecture, in a different way?

A full description of how the system performs the task involves all these levels of abstraction simultaneously.

Given a description of an algorithm at one of these levels, which of the levels below it would we also need to replicate?

Figure: Levels of abstraction in a microprocessor. From Jonas and Kording. (A) The instruction fetcher obtains the next instruction from memory. This then gets converted into electrical signals by the instruction decoder, and these signals enable and disable various internal parts of the processor, such as registers and the arithmetic logic unit (ALU).

The ALU performs mathematical operations such as addition and subtraction. The results of these computations can then be written back to the registers or memory.

(B) Within the ALU there are well-known circuits, such as this one-bit adder, which sums two one-bit signals and computes the result and a carry signal. (C) Each logic gate in (B) has a known truth table and is implemented by a small number of transistors. We also know (F) the precise silicon layout of each transistor.

Perhaps we could focus on the lowest algorithmic level (assuming this is well-defined), or, put another way, on replicating all the algorithmic levels, assuming that the lowest structures all the rest?
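To make the point about levels concrete, the sketch below implements the caption's one-bit adder twice: once out of explicit logic gates (the level of panels B and C), and once as ordinary arithmetic one level up. Both versions compute the same input-output relationship, which is exactly what makes the question of which level must be replicated non-trivial.

```python
# Gate-level vs. higher-level descriptions of the one-bit adder from panel (B).
def xor_gate(a: int, b: int) -> int:
    return a ^ b

def and_gate(a: int, b: int) -> int:
    return a & b

def half_adder_gate_level(a: int, b: int) -> tuple:
    """Sum two one-bit signals using explicit logic gates (panel B/C level)."""
    return xor_gate(a, b), and_gate(a, b)   # (sum, carry)

def half_adder_arithmetic_level(a: int, b: int) -> tuple:
    """Same input-output behavior, described one level up as ordinary addition."""
    total = a + b
    return total % 2, total // 2            # (sum, carry)

# The two descriptions agree on every input: the input-output relationship is the
# same even though the lower-level structure differs, mirroring the question above.
for a in (0, 1):
    for b in (0, 1):
        assert half_adder_gate_level(a, b) == half_adder_arithmetic_level(a, b)
```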

Focusing on the lowest-possible algorithmic level, though, brings to the fore abstract questions about where this level lies. Are ion channels above or below it? What about the highest algorithmic level? This constraint also requires specifying the necessary accuracy of the mapping from algorithmic states to brain states (though note that defining task-performance at all requires something like this). That said, I think something in this vicinity might turn out to work.

More generally, though, brain-like-ness seems only indirectly relevant to what we ultimately care about, which is task-performance itself. Can findability constraints do better?

Examples include restricting attention to task-functional systems that could actually be found or created via some specified process. The central benefit of all such constraints is that they are keyed directly to what it takes to actually create a task-functional system, rather than to what systems could exist in principle. This makes them more informative for the purposes of thinking about when such systems might in fact be created by humans. The central drawback is that such constraints are hard to specify precisely. This makes them difficult to solicit expert opinion about, and harder to evaluate using evidence of the type surveyed in the report.

There are a few other options as well, which appeal to various other analogies with human-engineered computers. For example, we can imagine asking: how many operations per second does the brain perform?

An operation is just an input-output relationship, implemented as part of a larger computation, and treated as basic for the purpose of a certain kind of analysis. This amounts to something closely akin to the mechanistic method, and the same questions about the required degree of brain-like-ness apply.

Again, we need to know what is meant. Consider the analogous question for a human-engineered computer: how would you answer it? An arbitrarily skillful programmer, after all, would presumably employ maximally efficient algorithms to use this computer to its fullest capacity.

Can we apply this approach to the brain? All these options have pros and cons. That said, it may be useful to offer some specific (though loose) probabilities for at least one of these. Some of the experts I spoke with, Paul Christiano among them, seemed to think it possible to compute firing decisions less than once per time-step (see Section 2). But it requires that the FLOP costs of everything be on the low side (a simple version of the underlying arithmetic is sketched below).
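For reference, here is the simple spikes-through-synapses arithmetic that sits behind mechanistic-method ranges of this kind. The inputs are illustrative placeholders spanning commonly discussed values, not the report's official figures.

```python
# Simple mechanistic-method sketch: FLOP/s ~ synapses x spike rate x FLOPs per event.
# All inputs are illustrative placeholders, not figures endorsed by the report.
num_synapses = 1e14                 # order-of-magnitude placeholder
firing_rate_hz_range = (0.1, 1.0)   # low and high placeholder average firing rates
flops_per_event_range = (1, 100)    # "low side" vs. more generous per-event budgets

low = num_synapses * firing_rate_hz_range[0] * flops_per_event_range[0]
high = num_synapses * firing_rate_hz_range[1] * flops_per_event_range[1]

print(f"Illustrative range: {low:.0e} to {high:.0e} FLOP/s")
# ~1e13 to ~1e16 FLOP/s with these placeholders; explicit budgets for firing
# decisions, learning, and other processes would shift the range upward.
```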

And my very vague impression is that many experts (even those sympathetic to the adequacy of comparatively simple models) would think this range too low. And it seems generally reasonable, in contexts with this level of uncertainty, to keep error bars in both directions wide. As I discuss in Section 2, there are also processes in the brain that estimates like these do not explicitly budget for.

If we would end up on the low end of this range or below absent those processes, this would leave at least one or two orders of magnitude for them to add, which seems like a reasonable amount of cushion to me, given the considerations surveyed in Section 2.

Overall, this range represents a simple default model that seems fairly plausible to me, despite not budgeting explicitly for these other complexities; and various experts appear to find this type of simple default persuasive. The next range is similar to the last, but with an extra factor budgeted to cover various possible complexities that came up in my research.

Overall, this range seems very plausibly adequate to me, and various experts I engaged with seemed to agree. But as discussed above, lower ranges seem plausible as well. And in general, long tails seem appropriate in contexts with this level of uncertainty.

Referenced neuron and synapse model notes (fragments of a table):
- For somatic voltage prediction: Maheswaranathan et al.; Ujfalussy et al.; Naud et al.
- Two compartments, each modeled with a pair of non-linear differential equations and a small number of parameters that approximate the Hodgkin-Huxley equations.
- The scaled coincidence rate, obtained by dividing by the intrinsic reliability: Jolivet et al.; Nirenberg and Pandarinath; Naud and Gerstner.
- Simulating realistic in vitro conditions by injecting a fluctuating current into the soma.
- Poirazi et al.; Keat et al.; Beniaguev et al.; Hay et al. Classic set of models.
- Anthony Zador expected the general outlines to be correct. Chris Eliasmith uses a variant in his models.
- Benna and Fusi: models synapses as a dynamical system of variables interacting on multiple timescales; may also help with online learning. Some experts argue that shifting to synaptic models of this kind, involving dynamical interactions, is both theoretically necessary and biologically plausible.
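The multi-timescale idea in the last fragment can be illustrated with a toy model: a chain of coupled leaky variables with increasingly slow time constants. This is a deliberately simplified sketch of the general idea, not Benna and Fusi's actual equations.

```python
import numpy as np

# Toy multi-timescale synapse: a chain of coupled leaky variables, each slower than
# the last. A simplified illustration of the general idea, not any published model.
def simulate_synapse(updates, n_vars=4, dt=0.1, steps=2000):
    tau = np.array([10.0 ** k for k in range(n_vars)])   # 1, 10, 100, 1000 (a.u.)
    u = np.zeros(n_vars)
    trace = []
    for t in range(steps):
        drive = updates.get(t, 0.0)            # external weight update at this step
        du = np.zeros(n_vars)
        du[0] = (drive - u[0] + (u[1] - u[0])) / tau[0]
        for k in range(1, n_vars - 1):
            du[k] = ((u[k - 1] - u[k]) + (u[k + 1] - u[k])) / tau[k]
        du[-1] = (u[-2] - u[-1]) / tau[-1]
        u = u + dt * du
        trace.append(u[0])                     # u[0] plays the role of the visible weight
    return np.array(trace)

# A brief potentiation event decays quickly in the fast variable but leaves a small,
# slow residue in the deeper variables -- memory spread across multiple timescales.
trace = simulate_synapse(updates={10: 1.0})
print(trace[11], trace[500], trace[1999])
```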

Gradient-descent-style learning rules use the slope of the loss function to minimize the loss. There is contentious debate about the biological plausibility of backpropagation in particular. The learning step is basically a backwards pass through the network, and the forward and backward passes come at roughly the same cost.
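A rough way to see the "roughly the same cost" point is to count multiply-accumulates for a single dense layer's forward pass and for the two matrix products its backward pass requires. This is standard back-of-the-envelope accounting, offered as an illustration rather than a claim about any particular brain model.

```python
# Rough FLOP accounting for one dense layer y = W @ x, with W of shape (m, n).
def dense_layer_flops(m: int, n: int) -> dict:
    forward = 2 * m * n            # multiply-accumulates for W @ x
    grad_wrt_weights = 2 * m * n   # outer product g @ x.T for dL/dW
    grad_wrt_inputs = 2 * m * n    # W.T @ g for dL/dx (passed to the previous layer)
    return {
        "forward": forward,
        "backward": grad_wrt_weights + grad_wrt_inputs,
    }

costs = dense_layer_flops(m=1000, n=1000)
print(costs, "backward/forward ratio:", costs["backward"] / costs["forward"])
# The backward pass is roughly 2x the forward pass here -- the same order of
# magnitude, which is the sense in which the two come at comparable cost.
```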

One objection sometimes raised: as the brain does not use fetching, decoding, and execution engines, FLOPS is a meaningless measurement for it.

A different application of the functional method treats deep neural networks trained on vision tasks as automating some portion of the information-processing in the visual cortex — the region of the brain that receives and begins to process visual signals sent from the retina via the lateral geniculate nucleus.

Such networks can classify full-color images into different categories with something like human-level accuracy. Using these networks for functional method estimates, though, introduces at least two types of uncertainty.

First, the visual cortex does much more than classify images. For example, it is also involved in motor processing, prediction, and learning. Indeed, the idea that different cortical regions are highly specialized for particular tasks seems to have lost favor in neuroscience. Second, even on the particular task of image classification, available DNN models do not yet clearly match human-level performance. For example:

Figure: Examples of generalization failures. From Geirhos et al. Left: image pairs that belong to the same category for humans, but not for DNNs.

Right: image pairs assigned to the same category by a variety of DNNs, but not by humans.

Suppose we try to forge ahead with a functional method estimate, despite these uncertainties. What results? We also need to estimate two other parameters, representing the two categories of uncertainty discussed above. My estimates for these are very made-up; a toy version of the resulting calculation is sketched below.
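In the sketch, every input is a made-up placeholder, including the forward-pass cost assumed for an EfficientNet-B2-scale model; the point is only to show how the two uncertain parameters enter the arithmetic.

```python
# Toy functional-method scaling. The two scaling parameters correspond to the two
# categories of uncertainty discussed above; every number here is a placeholder.
dnn_flops_per_image = 1e9          # rough forward-pass cost assumed for an EfficientNet-B2-scale model
frames_per_second = 10             # treating the model as running at 10 Hz
dnn_flops_per_sec = dnn_flops_per_image * frames_per_second

fraction_of_visual_cortex_automated = 0.01   # made-up: how much of the region's work this covers
visual_cortex_fraction_of_brain = 0.1        # made-up: the region's share of the whole brain

implied_brain_flops_per_sec = dnn_flops_per_sec / (
    fraction_of_visual_cortex_automated * visual_cortex_fraction_of_brain
)
print(f"DNN budget: {dnn_flops_per_sec:.0e} FLOP/s")
print(f"Implied whole-brain budget: {implied_brain_flops_per_sec:.0e} FLOP/s")
# With these placeholders: 1e10 / (0.01 * 0.1) = 1e13 FLOP/s. The result is extremely
# sensitive to both made-up fractions, which is why these estimates are held lightly.
```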

See Section 3 for details. Overall, I hold these estimates very lightly. The question of how high the second of these parameters could go, for example, seems very salient. And the conceptual ambiguities involved in the first caution against relying on what might appear to be conservative numbers.

For example, it is at least interesting to me what sort of overall budget you need to attribute to the brain in order to treat a 10 Hz EfficientNet-B2 as automating only a small part of it. This weakly suggests to me that such a range is not way too low. Does the limit method settle the matter? Not on its own, because the brain need not be performing operations that resemble standard FLOPs. Various experts I spoke to about the limit method (though not all) were quite confident that the latter far exceed the former. Both, though, appeal to general considerations that apply even if more specific assumptions from other methods are mistaken.

The communication method attempts to use the communication bandwidth in the brain (the speed with which it can send different amounts of information different distances) as evidence about its computational capacity.

This is distinct from the operations per second it can perform (computation). But estimating communication bandwidth might help with computation estimates, because the marginal values of additional computation and communication are related.

ASCI Purple will have 50 terabytes (50 trillion bytes) of memory; Moravec estimates a brain to have a terabyte capacity. Brains are portable; ASCI Purple will be made up of refrigerator-size boxes covering an area about the size of two basketball courts, and will weigh many tons.

The average brain is 56 cubic inches and weighs about 3 pounds. The human brain is distinguished by its ability to think and create in addition to simply processing information quickly, said Wise Young, director of the Keck Neuroscience lab at Rutgers University in New Jersey.


