
expert reaction to study describing a language decoder reconstructing meaning from brain scans

A study published in Nature Neuroscience looks at semantic reconstruction of continuous language from non-invasive brain recordings.

 

Prof. Dr. Rainer Goebel, Head of the Department of Cognitive Neuroscience, Maastricht University, said:

“This speech decoder works non-invasively, i.e. no electrodes need to be inserted into the brain (no invasive neurosurgery is required, as with ‘Neuralink’, for example). Invasive speech decoders (with electrodes in the brain) are in principle superior to non-invasive methods (fMRI, EEG/MEG, fNIRS), because electrode recordings can capture the best spatio-temporal information, namely the spikes of single neurons directly, or very local electric field potentials in the neighbourhood of an electrode. Such invasive electrical recordings have both an extremely high temporal resolution (sub-millisecond) and an extremely high spatial resolution (micrometres to millimetres). However, electrodes cannot (yet?) be implanted across the whole brain or across very large parts of it (such as the three language-relevant regions in the article). Today, implanting 1,000 to 10,000 electrodes is state of the art, and for large brain areas hundreds of thousands of electrodes would certainly be necessary.”

“Of the non-invasive methods, fMRI provides the best data for decoding BCIs – we use it ourselves in my lab as our main method for ‘scientific’ BCIs. fMRI BCIs are very important for research, but unfortunately they are not suited to everyday use, because patients or subjects have to be put into a scanner for every single BCI application (training or use). So such BCIs cannot be used at the bedside or at home.”

“The decoder presented in the study must first learn the relationship between speech semantics and brain activity so that it can later match the measured brain activity to the most likely phrases in new texts. Since brains are roughly similar in structure but quite different at a resolution of a few millimetres, the relationship between speech segments and brain activity patterns has to be established individually for each subject, and that takes many hours of training.”
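
To picture the per-subject training step described above, here is a minimal sketch in Python, assuming a simple ridge-regression encoding model that maps semantic features of the heard stories onto one subject's voxel responses. The data shapes, the ridge penalty, and the scoring function are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch of a per-subject encoding model (illustrative only):
# semantic features of heard speech are regressed onto one subject's
# voxel responses. The shapes and ridge penalty are assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

n_volumes, n_features, n_voxels = 4000, 768, 2000  # hypothetical sizes
X_train = rng.standard_normal((n_volumes, n_features))  # semantic features per fMRI volume
Y_train = rng.standard_normal((n_volumes, n_voxels))    # measured BOLD responses

# One linear map per subject: because brains differ at the millimetre
# scale, this model cannot simply be reused for another person.
encoding_model = Ridge(alpha=10.0).fit(X_train, Y_train)

def score_candidate(candidate_features, measured_bold):
    """Score a candidate phrase by how well its predicted brain
    activity matches the newly measured activity (higher is better)."""
    predicted = encoding_model.predict(candidate_features)
    return -float(np.mean((predicted - measured_bold) ** 2))
```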

“The decoder does not work with acoustic information (heard speech) as input, but with semantic speech information that has already been recoded (processed), for which a language model (GPT-1) was used (invasive decoders can work directly with heard speech information). Thus, no words are fed into the decoder, but sequences of semantically encoded word representations (‘vectors’).”
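
As a concrete picture of such ‘vector’ inputs, the following sketch extracts per-token hidden-state embeddings from a GPT-style model via the Hugging Face transformers library. GPT-2 is used here purely as a stand-in for the GPT-1 features in the study; this is an illustrative assumption, not the authors' code.

```python
# Sketch: turning words into semantic vectors of the kind the decoder
# consumes. GPT-2 (via Hugging Face) stands in for the paper's GPT-1
# features; an illustrative assumption, not the authors' code.
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

inputs = tokenizer("the old house stood at the end of the road",
                   return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # shape: (1, n_tokens, 768)

# One vector per token: the decoder matches these semantic
# representations to brain activity, not raw audio or plain words.
word_vectors = hidden.squeeze(0)
print(word_vectors.shape)
```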

“A central idea of the work was to use an AI language model to greatly reduce the number of possible phrases consistent with a brain activity pattern. It’s a bit like ‘ChatGPT’: based on past words and phrases, it suggests the word/phrase that best matches the semantic context. One could perhaps also say that the use of an AI language model can ‘mask’ the weakness of fMRI – the low temporal resolution – quite well. It’s a bit like having misspelled words corrected in Word based on a word database.”
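
The narrowing-down idea described above can be pictured as a beam search in which a language model proposes continuations and only the candidates whose predicted brain activity best matches the measurement survive. A toy sketch, in which propose_continuations, predicted_activity, and similarity are hypothetical placeholders rather than the paper's implementation:

```python
# Toy beam-search sketch of the narrowing-down idea: a language model
# proposes plausible next words, and candidates are kept according to
# how well their predicted brain activity matches the measured scan.
# propose_continuations, predicted_activity and similarity are
# hypothetical placeholders, not the paper's implementation.
import heapq

BEAM_WIDTH = 10

def decode_step(beams, measured, propose_continuations,
                predicted_activity, similarity):
    """Extend each candidate phrase by one word, keep the best matches."""
    candidates = []
    for phrase in beams:
        for word in propose_continuations(phrase):  # LM shrinks the search space
            extended = phrase + [word]
            score = similarity(predicted_activity(extended), measured)
            candidates.append((score, extended))
    # keep only the BEAM_WIDTH phrases most consistent with the scan
    return [p for _, p in heapq.nlargest(BEAM_WIDTH, candidates,
                                         key=lambda c: c[0])]
```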

“The decoder was successful in that many selected phrases in new (untrained) stories contained words from the original text, or at least had a similar meaning. However, there were also quite a lot of errors, which is a serious problem for a full BCI, since for critical applications (for example, communication with locked-in patients) it is most important not to generate false statements.”

When asked to what extent the speech decoder was able to predict the meaning of imagined stories and silent movies, and what significance this has for the future application of this technology:

“This part was rather exploratory, with little data, and is also not very convincing (many more errors). Only future full studies can show whether an fMRI BCI is really fit for such applications (besides the practical limitations of fMRI).”

“As previously pointed out, brains do not map well onto each other at the level of voxels (3D pixels) in the millimetre range, so a decoder trained on subject A will not work very well on subject B (as also shown in the article). This has the advantage that one cannot ‘unintentionally’ read out thoughts, because a new subject must first agree to actively listen to stories in an fMRI scanner for many hours (over several sessions). After that, he or she would have to be willing – again in an fMRI scanner – to actively participate and speak silently. Only if all this is given could outsiders ‘read’ more or less correct phrases. If the person appears to be doing all this but is actually performing another task in his or her mind, the ‘unintentional eavesdropping’ would still not work.”
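
The within- versus across-subject point can be made concrete with a small evaluation sketch. The correlation metric, function names, and the reuse of a ridge encoding model here are illustrative assumptions, not the paper's analysis code.

```python
# Sketch of the within- vs across-subject check referred to above: a
# model fit on subject A is evaluated on A's held-out data and on
# subject B's data, where voxel-level correspondence breaks down.
# The correlation metric and function names are illustrative.
import numpy as np
from sklearn.linear_model import Ridge

def fit_subject_model(features, bold):
    """Fit one subject's map from semantic features to voxel responses."""
    return Ridge(alpha=10.0).fit(features, bold)

def mean_voxel_correlation(model, features, bold):
    """Average correlation between predicted and measured voxel time courses."""
    pred = model.predict(features)
    corrs = [np.corrcoef(pred[:, v], bold[:, v])[0, 1]
             for v in range(bold.shape[1])]
    return float(np.nanmean(corrs))

# Per the article, this number is expected to be far higher for subject
# A's own test data than when A's model is applied to subject B, which
# is why covert 'eavesdropping' on an unwilling stranger fails.
```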

“Applications of the non-invasive speech decoder therefore require full willingness to participate in a long series of uncomfortable experiments, including the willingness to cooperate actively in one’s mind. So it is virtually impossible for someone to secretly gain access to a person’s mind with the non-invasive speech decoder.”

“Certainly, rules should be set in general about how we want to use BCIs in the future and how we can protect our privacy in the process, but potential dangers will certainly come from invasive procedures (such as those Neuralink is pushing) and certainly not from non-invasive, non-portable fMRI BCIs.”

“Overall, I think the article is interesting, as it shows that temporally rapid sequential stimuli (words/phrases) can be captured to some extent with fMRI. A major point of criticism for me, however, is that there is a lot of ‘hand-waving’ here and too much is promised by the exaggerated presentation of possible BCI applications: as a layperson, one certainly gets the impression that powerful non-invasive speech-decoding BCIs will soon find their way into our everyday lives. However, the relevant studies are lacking, and the exemplary results for decoding inner speech or watched movies, while ‘statistically significant’, are too poor for a trustworthy BCI. I venture the prediction that fMRI-based BCIs will (unfortunately) remain limited to research with small numbers of subjects – as in this study.”

 

Prof Tara Spires-Jones, Deputy Director, Centre for Discovery Brain Sciences at the University of Edinburgh, Group Leader at the UK Dementia Research Institute, and BNA President, said:

“This study by Dr Huth and team shows that non-invasive brain scans from three people could be used to reconstruct some aspects of speech that those people were listening to or imagining. While it is very interesting from a neuroscience perspective, the study used only three participants, each of whom had to listen to 16 hours of stories while in a brain scanner to achieve the reported decoding. Even with 16 hours of recordings to train the computational models used, the decoded stimuli were not always accurate. Thus, while I enjoyed reading the study and it is a step closer to non-invasive speech prediction from brain scans, there is likely a very long way to go before we need to worry about scanners detecting our thoughts.”

 

 

‘Semantic reconstruction of continuous language from non-invasive brain recordings’ by Jerry Tang et al. was published in Nature Neuroscience at 16:00 UK time on Monday 1 May 2023.

DOI: 10.1038/s41593-023-01304-9

 

 

Declared interests

Prof Tara Spires-Jones: “I have no conflicts with this study.”

For all other experts, no reply to our request for DOIs was received.

 
