AI Learns to 'See' Through Human Eyes: Japan Decodes Visual Imagery from Brain Activity
Arkadiy Andrienko
Researchers from Japan have presented a method that lets an algorithm generate text descriptions of what a person sees or remembers, based on their brain activity. The technology, detailed in a scientific paper, doesn't read thoughts in the literal sense, but it demonstrates a novel approach to decoding visual imagery.
In the experiment, volunteers viewed hundreds of short, silent video clips depicting simple scenes, such as someone walking down a street or a bird landing on a branch. Their brain activity was recorded with functional MRI (fMRI), and a specially trained model then matched patterns in that activity not to specific video frames but to abstract semantic features extracted from text descriptions of the videos.
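To make that mapping concrete, here is a minimal sketch of one common way such decoders are built: a regularized linear regression from fMRI voxel patterns to the semantic embeddings of each clip's caption. All names, shapes, and data below are hypothetical placeholders, not the study's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical dimensions: rows are viewed clips, columns are fMRI voxels
# or embedding features. Real recordings are far larger.
n_clips, n_voxels, n_features = 200, 2000, 64
rng = np.random.default_rng(0)

# Stand-ins for the real data: one voxel-activation pattern per clip, and
# the semantic embedding of that clip's text description (e.g. produced
# by a pretrained language model).
brain_patterns = rng.standard_normal((n_clips, n_voxels))
caption_embeddings = rng.standard_normal((n_clips, n_features))

# Fit a regularized linear map from brain activity to semantic features.
decoder = Ridge(alpha=10.0)
decoder.fit(brain_patterns, caption_embeddings)

# At test time, a new scan is projected into the same semantic space; a
# language model can then turn the predicted features into a sentence.
new_scan = rng.standard_normal((1, n_voxels))
predicted_features = decoder.predict(new_scan)
print(predicted_features.shape)  # (1, 64)
```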
Once trained, the system could use new brain data to gradually 'assemble' words into coherent sentences, producing descriptions such as "a man in a blue shirt opens a window" or "a dog is lying on the grass." In tests where the generated description was used to identify which of one hundred clips a person had watched, accuracy reached roughly 50%, far above the 1% expected by chance. When participants weren't watching a video but recalling it from memory, accuracy dropped but remained well above chance.
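The identification test lends itself to a simple sketch: embed the generated description, compare it with the embeddings of the hundred candidate captions, and pick the nearest one by cosine similarity. Everything here is illustrative, assuming some external text encoder produces the embeddings; it is not the paper's evaluation code.

```python
import numpy as np

def identify_clip(generated: np.ndarray, candidates: np.ndarray) -> int:
    """Return the index of the candidate caption whose embedding is most
    similar (cosine similarity) to the decoded description."""
    g = generated / np.linalg.norm(generated)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return int(np.argmax(c @ g))

# Hypothetical usage: 100 candidate captions with 64-dim embeddings. The
# "generated" vector is the true caption plus noise, mimicking an
# imperfect decoder.
rng = np.random.default_rng(1)
candidates = rng.standard_normal((100, 64))
true_index = 42
generated = candidates[true_index] + 0.7 * rng.standard_normal(64)
print(identify_clip(generated, candidates) == true_index)  # usually True
```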
Interestingly, the method still worked even when data from the brain's classical language areas was excluded from the analysis. This suggests that the meaning of a visual scene is encoded in a distributed manner across the brain, and that the algorithm can detect these "semantic traces" in other regions.
The technology is still far from practical application. It requires lengthy individual scanning sessions in expensive MRI equipment and a strictly defined set of visual stimuli, and it cannot yet decode arbitrary thoughts or internal monologue.
Still, the approach opens prospects for brain-computer interfaces, especially ones that could assist people with severe speech and motor impairments. At the same time, the work reignites questions about the future boundaries of neurotechnology and the protection of mental privacy. For now, the results are best read as a proof of concept, demonstrating how modern language models can serve as a bridge between neural activity and language, translating complex brain signals into readable text.