This Japanese AI Can Instantly Describe What You’re Seeing or Imagining

by Tech Wavo
November 16, 2025
in Gadgets
What if your brain could write its own captions, quietly, automatically, without a single muscle moving?

That is the provocative promise behind “mind-captioning,” a new technique from Tomoyasu Horikawa at NTT Communication Science Laboratories in Japan (published paper). It is not telepathy, not science fiction, and definitely not ready to decode your inner monologue, but the underlying idea is so bold that it instantly reframes what non-invasive neurotech might become.

At the heart of the system is a surprisingly elegant recipe. Participants lie in an fMRI scanner while watching thousands of short, silent video clips: a person opening a door, a bike leaning against a wall, a dog stretching in a sunlit room.

As the brain responds, each tiny pulse of activity is matched to abstract semantic features extracted from the videos’ captions using a frozen deep-language model. In other words, instead of guessing the meaning of neural patterns from scratch, the decoder aligns them with a rich linguistic space the AI already understands. It is like teaching the computer to speak the brain’s language by using the brain to speak the computer’s.
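The mapping described above can be sketched as a simple linear decoder. The toy below uses ridge regression to learn a map from simulated fMRI voxel patterns to semantic feature vectors; all shapes, data, and the choice of plain ridge regression are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Invented dimensions: 2000 training clips, 100 voxels, 32 semantic features.
rng = np.random.default_rng(0)
n_clips, n_voxels, n_feats = 2000, 100, 32

W_true = rng.normal(size=(n_voxels, n_feats))   # unknown voxel-to-semantics map
brain = rng.normal(size=(n_clips, n_voxels))    # simulated fMRI responses
semantics = brain @ W_true + 0.1 * rng.normal(size=(n_clips, n_feats))

# Ridge regression in closed form: (X'X + lam*I)^-1 X'Y.
lam = 1.0
W_hat = np.linalg.solve(brain.T @ brain + lam * np.eye(n_voxels),
                        brain.T @ semantics)

# Decode a held-out brain response into the semantic feature space and
# check how well it correlates with the true feature vector.
new_response = rng.normal(size=(1, n_voxels))
decoded = new_response @ W_hat
target = new_response @ W_true
r = np.corrcoef(decoded.ravel(), target.ravel())[0, 1]
```

The key point the sketch illustrates: nothing about the decoder itself understands language; it only learns a projection into a feature space that the frozen language model has already structured.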

Once that mapping exists, the magic begins. The system starts with a blank sentence and lets a masked-language model repeatedly refine it—nudging each word so the emerging sentence’s semantic signature lines up with what the participant’s brain seems to be “saying.” After enough iterations, the jumble settles into something coherent and surprisingly specific.
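The refinement loop can be caricatured in a few lines. In place of a real masked-language model, the sketch below greedily swaps one word at a time from a tiny made-up vocabulary, keeping any swap that moves the sentence's (toy, mean-of-word-vectors) embedding closer to a target vector standing in for the decoded brain signal. Everything here, including the vocabulary and the embedding scheme, is a hypothetical stand-in for the actual method.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["a", "man", "dog", "runs", "sits", "on", "by", "beach", "table", "the"]
vecs = {w: rng.normal(size=16) for w in vocab}   # stand-in word embeddings

def embed(words):
    # Toy sentence embedding: mean of word vectors.
    return np.mean([vecs[w] for w in words], axis=0)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend the decoder produced this semantic vector from brain activity.
target = embed(["a", "man", "runs", "by", "the", "beach"])

# Start from an arbitrary sentence and iteratively nudge each word,
# keeping any substitution that increases similarity to the target.
start = ["a", "dog", "sits", "on", "the", "table"]
sentence = list(start)
for _ in range(20):
    improved = False
    for i in range(len(sentence)):
        for w in vocab:
            cand = sentence[:i] + [w] + sentence[i + 1:]
            if cos(embed(cand), target) > cos(embed(sentence), target):
                sentence, improved = cand, True
    if not improved:
        break

score0 = cos(embed(start), target)
score = cos(embed(sentence), target)
```

The real system replaces each of these toy pieces with a deep model, but the optimization shape is the same: no word is ever read out of the brain directly; the sentence is sculpted until its semantic signature matches the decoded one.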

A clip of a man running down a beach becomes a sentence about someone jogging by the sea. A memory of watching a cat climb onto a table turns into a textual description with actions, objects, and context woven together, not just scattered keywords.

What makes the study especially intriguing is that the method works even when researchers exclude the brain's traditional language regions. Remove Broca's and Wernicke's areas from the analysis, and the model still produces fluent descriptions.

It suggests that meaning—the conceptual cloud around what we see and remember—is distributed far more widely than the classic textbooks imply. Our brains seem to store the semantics of a scene in a form the AI can latch onto, even without tapping the neural machinery used for speaking or writing.

The numbers are eyebrow-raising for a technique this early. When the system generated sentences from new videos not used in training, the descriptions were good enough to identify the correct clip from a lineup of 100 options about half the time. In recall tests, where participants simply imagined a previously seen video, some reached nearly 40 percent accuracy, a plausible result given that the imagined content closely matches the material the decoder was trained on.
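The 100-way identification test described above is worth making concrete, since chance level is just 1 percent. A minimal sketch, with random vectors standing in for real caption embeddings: compare the decoded sentence embedding against the embeddings of 100 candidate clips and pick the nearest by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(2)
n_candidates, dim = 100, 64

# Hypothetical caption embeddings for 100 candidate clips; the decoded
# sentence is a noisy version of clip 0's embedding.
candidates = rng.normal(size=(n_candidates, dim))
decoded = candidates[0] + 0.5 * rng.normal(size=dim)

# Cosine-similarity scoring: normalize and take the best match.
unit = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
scores = unit @ (decoded / np.linalg.norm(decoded))
picked = int(np.argmax(scores))   # index of the identified clip
```

Under this scoring scheme, "half the time" against 100 candidates means the decoded semantics carry far more clip-specific information than a 1-in-100 guess would.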

For a field where “above chance” often means 2 or 3 percent, these results are startling—not because they promise immediate practical use, but because they show that deeply layered visual meaning can be reconstructed from noisy, indirect fMRI (functional MRI) data.

Yet the moment you hear "brain-to-text," your mind goes straight to the implications. For people who cannot speak or write due to paralysis, ALS, or severe aphasia, a future version of this could represent something close to digital telepathy: the ability to express thoughts without moving.

At the same time, it triggers questions society is not yet prepared to answer. If mental images can be decoded, even imperfectly, who gets access? Who sets the boundaries? The study’s own limitations offer some immediate reassurance—it requires hours of personalized brain data, costly scanners, and controlled stimuli. It cannot decode stray thoughts, private memories, or unstructured daydreams. But it points down a road where mental privacy laws may one day be needed.

For now, mind-captioning is best seen as a glimpse into the next chapter of human-machine communication. It shows how modern AI models can bridge the gap between biology and language, translating the blurry geometry of neural activity into something readable. And it hints at a future in which our devices might eventually understand not just what we type, tap or say but what we picture.

