What Counts as Evidence of Consciousness?

by Fabian | Consciousness, Updates from Neuroscience

This is quite a good question, if you ask me. The most honest answer is that we don’t know. But, fortunately or not, it’s more complicated than this. When we speak of measuring consciousness, we can spend time arguing, likely without resolution, about some interesting metaphysical stuff, or we can talk about how to gather operational evidence. Just because we can’t yet firmly establish whether a given entity has consciousness (and to what extent), it doesn’t mean we can’t get closer and closer to quantifying what looks like consciousness operationally. Perhaps in the process, we will also become more confident in asserting whether what looks like consciousness is consciousness.

A Few Words on Metaphysical Claims

Metaphysical claims are about what consciousness is, who has it, and what conditions are necessary or sufficient for its existence. You can check another post on consciousness if you want to read about some popular theories on consciousness (higher-order theories, global neuronal workspace, integrated information theory, and recurrent processing theory) and contemplate whether memory is a necessary condition for consciousness.

Here, it’s enough to say that metaphysical claims include ontological statements, such as “consciousness is identical to specific brain states,” “it arises from global broadcast across the brain,” or “it is fundamental and ubiquitous (i.e., panpsychism).” While empirical data can provide more or less support for these claims, none are settled by any single experiment, at least at this point in time-space.

But we have had the scientific method for a few centuries and arguably have enough imagination to make some potentially useful operational claims, that is, to propose how to detect or quantify what looks like consciousness in practice. You keep reading “what looks like consciousness” because, for practical purposes, it may be enough, at least for a start, to measure something that works and looks like consciousness, without settling by default whether it is what we think it is.

The distinction matters because operational claims are testable and falsifiable, while metaphysical claims are broader frameworks whose credibility depends on how well their predictions align with operational results. Confusing the two can lead to category errors. I don’t think anyone treats a single neural marker, such as the P3b wave, as though it were consciousness itself, but one could do so without thereby landing in the same category as flat-Earth claims.

So we can separate metaphysical assumptions from operational tools and use empirical results to 1) assess the presence of what seems to be consciousness; 2) discriminate between competing theories of consciousness; and 3) formulate new ones.

Measuring consciousness matters in all sorts of ways. Assuming we are indeed on the verge of general intelligence systems, it would be quite useful to know if those systems have any sort of subjective experience or at least display the functional equivalent of it. Assessing the presence of consciousness in such systems operationally may not guarantee that the systems are not merely simulating conscious experiences (as measured operationally), but it may well be a good start.

The Measurement of Consciousness Based on Operational Definitions

In a 2014 article, David Gamez, a researcher who apparently spends time building a conscious robot at the University of Essex, argues that the science of consciousness can advance in a theory-neutral, experimentally tractable way if researchers adopt explicit definitions and assumptions that link first-person reports to measurable spatiotemporal structures.

The argument can be structured as follows:

1. At the present time, the best we can do, as far as evidence about current experience is concerned, is to ask people what’s up, that is, to ask them to report their experiences. We also use behavioral measures to assess processes that aren’t consciously accessed.

2. We should measure consciousness by building a ‘platinum standard system’ based on the assumption that human brains really do host consciousness, and by looking for the minimal correlate, that is, the smallest set of spatiotemporal structures present (i.e., activated, as in an fMRI) when and only when someone is having a specific experience.

David also gives us five working assumptions to make the system testable: supervenience (no change in experience without a change in the brain), reporting link (experience is functionally connected to the reporting machinery), reportability (any conscious experience can in principle be reported), causal closure (all you need for reports is physics), and report causation (the correlates themselves start the physical chain that gives us the report).

In other words, David seems to be saying that we have to treat consciousness as something that leaves footprints in 1) the brain and 2) in what people say. If brain states did not change as a result of experience, we’d have no way of reliably measuring anything about a specific experience. If experiences did not lead to speech or button-presses, reports would not tell us anything. If some conscious content were forever unreportable, we would always miss at least part of the picture. And if brain processes did not fully cause reports or if the actual thing leading to a report wasn’t the conscious-state activity we’re trying to detect, then reports wouldn’t be satisfactory evidence.

The reports we get are the results of physical actions caused by brain interactions, and the earliest cause of the report comes from the interactions between structures containing consciousness correlates; there’s no ‘mind stuff’ at this point causing anything. In other words, the reports are physically caused.

In this model, we have Type-A correlates, which can include, for example, workspace-like neural activity[1]; they are correlated with experience and can plausibly cause reports. Type-B correlates are those that track experience but can’t cause reports; examples include delayed fMRI signals or broad summary metrics. While Type-B correlates can be used for prediction, they are not reasonable candidates for what consciousness is. In other words, Type-A correlates are signals that show up at the same time an experience shows up and may represent the physical activity that produces the report. Type-B correlates are slow or indirect, so they are useful for predicting (e.g., whether someone is conscious), but they do not represent conscious activity.

Methodologically, we also have a separation between the true correlate, named C1, and everything else: S, the upstream unconscious processing (e.g., sensory processing of visual stimuli); R, the downstream report chain (i.e., experience becoming an action such as a button press); B, the prerequisites and consequences (things that must be in place, or that happen as a result: attention, arousal, memory, etc.); and N, the non-causal byproducts (e.g., blood-flow or metabolic after-effects).
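To make the taxonomy concrete, here is a minimal sketch in Python of how the sorting logic might be encoded. The boolean properties and the decision order are my own simplification for illustration, not Gamez’s formal criteria:

```python
# A toy classifier for candidate neural signals, loosely following the
# C1/S/R/B/N and Type-A/Type-B distinctions described above. The properties
# below are simplified stand-ins for real experimental evidence.
from dataclasses import dataclass

@dataclass
class CandidateSignal:
    co_occurs_with_experience: bool  # present when and only when the experience is
    precedes_experience_only: bool   # active before the experience, silent during it
    can_cause_report: bool           # fast/direct enough to start the report chain
    on_motor_pathway: bool           # sits between decision and button press/speech
    general_prerequisite: bool       # needed for any processing (arousal, attention)

def classify(s: CandidateSignal) -> str:
    if s.precedes_experience_only:
        return "S: upstream unconscious processing"
    if s.on_motor_pathway:
        return "R: downstream report chain"
    if s.general_prerequisite:
        return "B: prerequisite/consequence"
    if s.co_occurs_with_experience and s.can_cause_report:
        return "C1 candidate (Type-A correlate)"
    if s.co_occurs_with_experience:
        return "Type-B correlate (predictive only), or N if a mere byproduct"
    return "N: non-causal byproduct, or noise"

# Example: a slow haemodynamic signal that tracks experience but cannot cause reports.
print(classify(CandidateSignal(True, False, False, False, False)))
```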

And if you wonder how neural correlates of consciousness can be detected, wonder no more. One option is multistable (bistable) perception. Here, you show someone a stimulus that remains the same physically, but the conscious perception of it alternates between two interpretations. It’s less weird than it sounds; for example, you could use bistable images, such as in the Necker cube illusion[2], where the picture can be interpreted in two ways, but your visual system picks one; because neurons supporting one interpretation adapt/fatigue, the rival interpretation takes over, meaning your subjective experience changes. Another example is binocular rivalry[3], in which each eye gets to see a different image. Your brain, no matter how smart, cannot fuse the two images; there’s perception competition via something called mutual inhibition, and whichever percept wins dominates awareness for a while. Like in the Necker cube example, you eventually have adaptation, and the dominance shifts to the other option.
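This dynamic is easy to sketch computationally. Below is a minimal simulation, assuming a generic two-population firing-rate model with mutual inhibition and slow adaptation; the parameters are illustrative choices, not a published fit. Each population suppresses its rival and slowly fatigues, which is enough to produce alternating dominance:

```python
# Toy rivalry model: two populations, mutual inhibition, slow adaptation.
# Parameters are illustrative; published models use richer dynamics.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)  # rectified-linear response function

dt, T = 1e-3, 20.0             # integration step and total time (seconds)
tau_r, tau_a = 0.02, 1.0       # fast rate dynamics vs. slow adaptation
drive, inhibition, g_adapt = 1.0, 2.5, 3.0

r = np.array([0.4, 0.3])       # firing rates of the two rival percepts
a = np.zeros(2)                # adaptation (fatigue) of each population
dominant, switch_times = 0, []

for step in range(int(T / dt)):
    # stimulus drive, minus suppression from the rival, minus own fatigue
    inp = drive - inhibition * r[::-1] - g_adapt * a
    r += dt / tau_r * (-r + relu(inp))
    a += dt / tau_a * (-a + r)
    winner = int(r[1] > r[0])
    if winner != dominant:     # the subjective percept flips here
        switch_times.append(step * dt)
        dominant = winner

print(f"{len(switch_times)} perceptual switches at t = "
      + ", ".join(f"{t:.1f}s" for t in switch_times))
```

With these settings, the dominant population fatigues until the suppressed one escapes its inhibition, reproducing the adapt-and-take-over cycle described above.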

Another option is to interrupt a process with masking. This works by showing a stimulus very briefly (a ‘target’) and presenting another one that disrupts further processing of the initial stimulus. People often say they did not see the target, but looking at some objective measurements, one sees that the brain shows early processing of it. These early processes are unconscious and allow us to narrow down which later stages are more likely to be linked to conscious experience.

You can also interrupt processing with Transcranial Magnetic Stimulation (TMS). You do so by using a magnetic pulse to interrupt activity in a specific brain region; you are basically turning off a region for milliseconds during a task, and by doing so, you can get an idea of whether the region in question is necessary for conscious experience, unconscious processing, or reporting.

Both conscious and unconscious processing can be interrupted by brain lesions. If damage to an area removes a conscious experience but leaves some unconscious processing intact, this suggests that the damaged area hosted the specific conscious phenomenon. Conversely, if the experience does not change as a result of the lesion but processing does, one assumption you can make is that the area is involved in unconscious stages.

To get insights into the nature of a correlate, one can also look at how subjective experience is expressed. Reporting can be done in different ways (e.g., speaking, button presses, eye movements) and at different times (immediately after the presentation of a target or later on). If a neural signal appears to be correlated with consciousness regardless of how and when it is reported, the signal in question is more likely to be a true correlate. If this does not happen, the neural signal is more likely to belong to the downstream report machinery.
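As a sketch of how this logic could look in analysis code (the data below are fabricated, and the ‘signals’ are placeholders for per-trial neural measurements):

```python
# Toy check: does a candidate signal track reported experience across report
# modalities? Per-trial data are fabricated purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
modalities = ["speech", "button", "eye_movement"]
n_trials = 200

def per_modality_correlation(signal, reports):
    """Correlation between a candidate signal and reports, per modality."""
    return {m: float(np.corrcoef(signal[m], reports[m])[0, 1]) for m in modalities}

reports = {m: rng.integers(0, 2, n_trials) for m in modalities}
# A 'true correlate' should track the experience however it is reported...
candidate = {m: reports[m] + 0.5 * rng.normal(size=n_trials) for m in modalities}
# ...while a report-chain signal only tracks one output pathway.
report_chain = {m: (reports[m] + 0.1 * rng.normal(size=n_trials))
                if m == "button" else rng.normal(size=n_trials)
                for m in modalities}

print("candidate correlate:", per_modality_correlation(candidate, reports))
print("report-chain signal:", per_modality_correlation(report_chain, reports))
```

The candidate correlate keeps a high correlation under every report modality; the report-chain signal only tracks the button-press pathway, which is the pattern that would demote it to R in the scheme above.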

Finally, you can back-track from the motor output to find where reporting begins. It’s pretty obvious that a spoken answer or button press is the end of a physical causal chain, so you can trace backward through the brain’s motor pathway to estimate where the report-generation process starts. Anything appearing after the starting point is likely part of reporting and not of consciousness itself.

None of these methods is perfect because, in the brain, signals bounce back and forth, so it’s not always clear what’s before and what’s after. Experiences happen fast while reports are slower, meaning timing can blur which activity is about experience and which is about preparing the report. According to David Gamez, optogenetics[4] will be a future upgrade, as it will allow switching very specific neuron types on or off with light, giving sharper causal tests compared to today’s methods.

Nicholas Rosseinsky’s commentary (2015) argues that Gamez’s key assumption of complete reportability during NCC experiments (A4) is too strong and makes genuine theory-discrimination unreliable; he recommends weakening it to a fallible-report assumption, treating the consciousness–report link as graded and probabilistic, and leaning harder on converging measures so that NCC candidates aren’t prematurely ruled out just because they don’t cleanly e-cause reports in every setup. He is basically saying Gamez assumes too much when he treats all conscious experiences in experiments as fully reportable, because real reports are messy and incomplete. He suggests we should treat reports as imperfect clues, use probabilities rather than a hard yes/no link, and rely on multiple measures together so we don’t throw out a real consciousness signal just because it doesn’t always show up neatly in reports.

Gamez substantially refines and extends his own 2014 view in Human and Machine Consciousness (2018): he broadens “c-reports” beyond verbal answers to include non-verbal behaviors and even the absence of behavior as evidence of low or zero experience, drops the earlier need to assume the brain is awake to count as a platinum-standard system, and stresses that natural language and memory are temporally coarse, so experiments should use faster, better-timed report proxies, tightening the C1 vs. R/S/B/N separation he introduced in 2014.

Phenomenal Consciousness and Reporting

Gamez’s proposal appears to be based on the premise that phenomenal consciousness can be reported. But is this really the case? Unclear. If we are to believe Ned Block (2007), the premise is false. If some experiences are conscious but not cognitively accessible, it follows that any framework assuming full reportability may miss parts of consciousness and mislocate neuronal correlates of consciousness.

If Ned is right, Gamez’s framework is in trouble. His measurement approach is explicitly limited to experiments on the correlates of consciousness. He admits that outside those experiments it is possible, or even likely, that there is some inaccessible phenomenal consciousness. Gamez’s methodological bet is to assume reportability during the experiment; if Block is right, measurement is still a problem for the ambition of measuring all phenomenal consciousness, but not for the narrower ambition of measuring the report-linked slice of consciousness, as long as the methodological bet is won.

Ned’s overflow thesis says we often experience more detail than we can access or report because working memory and report systems have a smaller capacity than perceptual experience. He points to findings suggesting that subjects often have a full, detailed phenomenal scene even if they can only report a few specific items; the cue selects which already-experienced items gain access for report. For example, there’s the classic letters setup (Sperling, 1960): you flash a grid of letters for a fraction of a second, say 3 rows of 4 letters. If you’re asked to report all the letters, you can usually name only 3–4, which looks like a hard capacity limit. But if, right after the grid disappears, you hear a tone telling you to report only one row, you can accurately report almost any row that gets cued. Subjectively, participants often say it felt like they momentarily saw the whole grid clearly.

Tobias Schlicht (2012) begs to differ. According to him, Block’s distinction is meaningful, but the data do not prove phenomenal overflow. What’s actually happening is that subjects’ reports in these experiments reflect generic awareness of an array of letters as opposed to detailed awareness of each specific item. The experiments are evidence of informational persistence of many items without those items being consciously visible; a later cue, combined with focal attention, can amplify any one of those unconscious representations into conscious, reportable form. So what is conscious is the overall array, not the specific items, which become conscious only when/if attention selects them. Tobias also argues that because any report assumes access, claims of inaccessible phenomenology are methodologically problematic and risk becoming unfalsifiable. In other words, if the only way to know what someone experienced is through what they can report, then talking about experiences they can’t access or report is shaky science that can’t really be tested or disproved.

Contemporary Approaches to Measuring Consciousness

Measuring consciousness is hard when you don’t know what to measure. The closest thing we have to a behaviour-independent operational measure of conscious level appears to be the perturbational complexity index (PCI) (Casali et al., 2013). The idea is to perturb the cortex with TMS and measure how complex and integrated the electroencephalogram (EEG) response is. More specifically, you check how rich and coordinated the brain’s ripple response is following TMS; if the response spreads widely and forms a complex pattern, the person is assumed to be more conscious than if it stays local and simple. This approach has been tested across waking, dreaming, deep sleep, anaesthetic conditions, and in patients recovering from coma.
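To give a flavour of the computation, here is a stripped-down sketch of the PCI idea: binarize the evoked response and measure its Lempel–Ziv complexity, normalized by the signal’s entropy. The real method (Casali et al., 2013) involves source reconstruction and statistical thresholding that I’m skipping, and the data here are fabricated:

```python
# Stripped-down PCI-style computation: Lempel-Ziv complexity of a binarized
# (channels x time) response matrix, normalized by its entropy. The real
# pipeline adds source modelling and significance thresholding.
import numpy as np

def lempel_ziv_complexity(sequence: str) -> int:
    """Greedily parse into shortest never-seen-before words and count them."""
    words = set()
    ind, inc = 0, 1
    while ind + inc <= len(sequence):
        word = sequence[ind:ind + inc]
        if word in words:
            inc += 1          # extend until the word is new
        else:
            words.add(word)   # record the new word, move past it
            ind += inc
            inc = 1
    return len(words)

rng = np.random.default_rng(1)
# Fabricated 'TMS-evoked response': 60 channels x 300 samples, where roughly
# half the channels carry a deflection on top of noise.
response = rng.normal(size=(60, 300)) + 1.5 * (rng.random((60, 1)) > 0.5)
binary = (np.abs(response) > 2.0).astype(int)    # toy significance threshold

seq = "".join(map(str, binary.flatten()))
p = binary.mean()                                # fraction of active samples
assert 0 < p < 1
H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p)) # source entropy per sample
n = len(seq)
pci_like = lempel_ziv_complexity(seq) * np.log2(n) / (n * H)
print(f"PCI-like value: {pci_like:.3f}")
```

The normalization mirrors the published recipe’s shape (complexity scaled by sequence length and entropy), so values near 1 mean the response is about as diverse as it could possibly be, while values near 0 mean it is stereotyped or local.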

Some approaches to measurement are theory-specific. For instance, we also have integrated information theory (Albantakis et al., 2023), which proposes that consciousness corresponds to integrated information in a system and provides ways to measure the degree and kind of experience. The integration level is called ‘big phi’ (Φ), which is calculated from the system’s intrinsic causal structure. Φ is measured by treating a system such as the brain as a network of interacting units with known cause–effect rules and asking how much the system in its current state specifies an irreducible cause–effect structure.

In simpler terms, you map how each part of the brain can influence the others, then you check whether the whole brain, in that moment, is doing something more than just the sum of its separate parts, meaning its parts are so tightly linked that you can’t split the system without losing important causal power. The more the brain’s activity forms one integrated, self-influencing “team” rather than many loosely related sub-teams, the higher Φ is supposed to be. Even if the theory is correct (which is a big if), it’s hard to measure Φ in real brains because you’d need an exact map of what causally affects what (not just what correlates with what), and then take into account an astronomically large number of possible brain-part combinations, all from neural data. The theory is a bit technical and not easy to grasp at first, so feel free to read some of the articles dedicated to it.
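For a feel of the “whole vs. parts” logic, here is a toy sketch. To be clear, this is not IIT’s actual Φ, which is defined over cause–effect structures rather than time-series statistics; it is a crude integration proxy that asks how much predictive information the whole system carries over and above its weakest split into two halves, with a made-up random network standing in for the brain:

```python
# Crude 'integration' proxy, NOT IIT 4.0's Phi: how much better does the whole
# system predict its own next state than the best split into two halves?
import itertools
import numpy as np

rng = np.random.default_rng(2)
N, T = 4, 50_000                      # four binary units, T timesteps
W = rng.normal(size=(N, N))           # random coupling weights (toy network)

# Noisy threshold dynamics: x_{t+1} = step(W @ x_t + noise)
X = np.zeros((T, N), dtype=int)
X[0] = rng.integers(0, 2, N)
for t in range(T - 1):
    X[t + 1] = (W @ X[t] + 0.5 * rng.normal(size=N) > 0).astype(int)

def mutual_info(past, present):
    """Plug-in estimate (bits) of I(past; present) between binary patterns."""
    code = lambda a: a @ (1 << np.arange(a.shape[1]))  # pattern -> integer symbol
    p, q = code(past), code(present)
    joint = np.zeros((p.max() + 1, q.max() + 1))
    np.add.at(joint, (p, q), 1)
    joint /= joint.sum()
    outer = joint.sum(1, keepdims=True) @ joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / outer[nz])).sum())

whole = mutual_info(X[:-1], X[1:])
phi_proxy = np.inf
for r in range(N - 1):                # all bipartitions (part A contains unit 0)
    for rest in itertools.combinations(range(1, N), r):
        A = [0, *rest]
        B = [u for u in range(N) if u not in A]
        parts = mutual_info(X[:-1, A], X[1:, A]) + mutual_info(X[:-1, B], X[1:, B])
        phi_proxy = min(phi_proxy, whole - parts)

print(f"whole-system predictive information: {whole:.3f} bits")
print(f"integration proxy over the weakest split: {phi_proxy:.3f} bits")
```

Even this toy version hints at the scaling problem: the number of bipartitions grows exponentially with the number of units, which is one reason exact Φ is intractable for real brains.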

You can also make measurement assumptions by looking at the Global Neuronal Workspace Theory (GNWT) (2020), according to which conscious content gets amplified and globally broadcast across fronto-parietal networks. Measurement here focuses on ‘ignition’ patterns and cross-area access, which would correspond to Type-A-style candidates in the Gamez framing. For example, a faint word you barely see stays local and unconscious, but when it crosses the threshold, you get a late, widespread burst and can report it; in Gamez terms, that widespread ignition is Type-A-style activity because it’s the kind of brain event that could directly drive your “yes, I saw it” report.
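A toy detector for this kind of signature might look like the following. The window, thresholds, and “spread” criterion are hypothetical choices for illustration, not values from any GNWT study; the epochs are fabricated, with the arrays assumed to start 100 ms before stimulus onset:

```python
# Toy 'ignition' detector: flag a late, widespread burst across channels.
# Epochs are (channels x time) at 1 kHz, starting 100 ms before onset.
import numpy as np

def detect_ignition(epoch, sfreq=1000, late_window=(0.3, 0.5),
                    z_thresh=3.0, min_spread=0.5):
    baseline = epoch[:, : int(0.1 * sfreq)]            # pre-stimulus samples
    mu = baseline.mean(axis=1, keepdims=True)
    sd = baseline.std(axis=1, keepdims=True) + 1e-12
    z = (epoch - mu) / sd                              # z-score vs. baseline
    a, b = (int((t + 0.1) * sfreq) for t in late_window)
    sustained = np.abs(z[:, a:b].mean(axis=1)) > z_thresh  # late burst per channel
    return sustained.mean() > min_spread               # widespread = 'ignition'

rng = np.random.default_rng(3)
seen = rng.normal(size=(64, 600))
seen[:, 400:600] += 5.0             # fabricated late, global burst (300-500 ms)
missed = rng.normal(size=(64, 600)) # no late burst: activity stays local/absent
print(detect_ignition(seen), detect_ignition(missed))  # -> True False
```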

Another relevant theory is the Recurrent Processing Theory (RPT), which links consciousness to local recurrent (feedback) processing in sensory cortex (Lamme, 2005). Here, the measurement targets can be recurrent loops at specific timescales (Havlík, Hlinka, Klírová, Adámek & Horáček, 2023). RPT says you’re conscious of something when the sensory areas not only process the input feed-forward but also send feedback to themselves in tight local loops, like vision ping-ponging within the visual cortex until a stable percept forms. For example, a masked image might trigger an early one-way blip you don’t notice, but if recurrent feedback kicks in a moment later, you actually see it.

Bottom Line: Neuronal Correlates May Count as Evidence of Consciousness

If you’re planning to build a consciousness detector, fingers crossed. It’s an ambitious project that could revolutionize cognitive sciences and force some philosophers to spend time on other stuff. The most reasonable conclusion is arguably that the best evidence of consciousness we can get right now is a multi-layered set of operational signals.

Metaphysically, we still don’t have a settled answer about what consciousness is, but maybe that’s not even so important. It may not even matter which theory is correct (for a start). Operationally, we do have ways to gather evidence that something is having subjective experience, and these ways get stronger when they converge. Evidence, at this point, is less like a yes/no switch and more like a probability-weighted profile: patterns of reports, brain dynamics, and causal interventions that together make “consciousness-like” activity the best explanation of what we observe. If something looks like consciousness, it may well be, so perhaps pay some attention to the whereabouts of your AI conversation partner.
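As a sketch of what a “probability-weighted profile” could mean in practice, consider a naive log-linear aggregation of operational signals. The features, weights, and prior below are entirely made up; the point is only the shape of the inference, with evidence accumulating into a graded probability rather than flipping a switch:

```python
# Toy evidence aggregator: several operational signals combine into one
# graded probability. All feature names and numbers are hypothetical.
import math

def consciousness_likeness(features, weights, prior_log_odds=-2.0):
    """P('consciousness-like' | evidence) under a naive log-linear model."""
    log_odds = prior_log_odds + sum(weights[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-log_odds))

weights = {"pci_like": 4.0, "ignition_rate": 1.5, "report_consistency": 2.0}
profile = {"pci_like": 0.45, "ignition_rate": 1.0, "report_consistency": 0.8}
print(f"consciousness-likeness: {consciousness_likeness(profile, weights):.2f}")
```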

Gamez makes an interesting proposal which, better yet, is methodologically theory-agnostic. Imperfect or not, you have a set of tools with which you can play and see if it takes you anywhere. The lab toolbox consists of bistable perception, imagery–perception contrasts, masking, lesions, transcranial magnetic stimulation, varying report modality and timing, and back-tracking from motor output. You can use these to localize and strip away S, R, B, and N until what remains may track experience itself.

References

Albantakis, L., Barbosa, L., Findlay, G., Grasso, M., Haun, A. M., Marshall, W., … & Tononi, G. (2023). Integrated information theory (IIT) 4.0: Formulating the properties of phenomenal existence in physical terms. PLoS Computational Biology, 19(10), e1011465. Link.

Block, N. (2007). Consciousness, accessibility, and the mesh between psychology and neuroscience. Behavioral and Brain Sciences, 30(5-6), 481-499. Link.

Casali, A. G., Gosseries, O., Rosanova, M., Boly, M., Sarasso, S., Casali, K. R., … & Massimini, M. (2013). A theoretically based index of consciousness independent of sensory processing and behavior. Science Translational Medicine, 5(198), 198ra105. Link.

Dehaene, S., Kerszberg, M., & Changeux, J. P. (1998). A neuronal model of a global workspace in effortful cognitive tasks. Proceedings of the National Academy of Sciences, 95(24), 14529-14534. Link.

Gamez, D. (2014). The measurement of consciousness: a framework for the scientific study of consciousness. Frontiers in Psychology, 5, 714. Link.

Gamez, D. (2018). 4. The Measurement of Consciousness. In Human and Machine Consciousness. Cambridge: Open Book Publishers. Link.

Havlík, M., Hlinka, J., Klírová, M., Adámek, P., & Horáček, J. (2023). Towards causal mechanisms of consciousness through focused transcranial brain stimulation. Neuroscience of Consciousness, 2023(1), niad008. Link.

Lamme, V. A. (2005). Independent neural definitions of visual awareness and attention. In Cognitive penetrability of perception: attention, action, strategies, and bottom-up constraints (pp. 171-191). Link.

Rosenthal, D. M. (2000). Metacognition and higher-order thoughts. Consciousness and Cognition, 9(2), 231-242. Link.

Rosseinsky, N. M. (2015). The measurement of consciousness: assuming completeness of first-person report significantly restricts scope and reliability of theory-discrimination. Frontiers in Psychology, 6, 25. Link.

Schlicht, T. (2012). Phenomenal consciousness, attention and accessibility. Phenomenology and the Cognitive Sciences, 11(3), 309-334. Link.

Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs: General and Applied, 74(11), 1. Link.


[1] Brain activity that looks like the Global Neuronal Workspace idea: when something becomes conscious, information gets broadcast across a wide network.

[2] https://www.illusionsindex.org/i/necker-cube

[3] https://www.binocularrivalryonline.com/

[4] Optogenetics is a technique for controlling specific molecular processes in living cells or organisms using light. It works by introducing genes that encode light-sensitive proteins, which shift shape when illuminated and thereby influence cellular activity — for instance, by adjusting the membrane voltage of electrically excitable cells to change their behaviour.
