MIT Student Develops AI System to Visualize Music in Real Time

May 19, 2026

A graduate student at MIT has developed an artificial intelligence system that translates audio streams into dynamic visuals using neural cellular automata (NCA). Mariano Salcedo, a student in MIT’s new Music Technology and Computation Graduate Program, built a web-based interface that lets users generate music-driven visuals from any audio source, with potential applications in live performance, audio engineering, and multimedia production.

Technical Framework

The system combines classical cellular automata with machine learning to produce images that regenerate and evolve in response to audio input. In a neural cellular automaton, the hand-written update rule of a classical cellular automaton is replaced by a small neural network that each cell applies to its local neighborhood. Unlike traditional image generation methods, the result is a self-organizing visual system that reacts to stimuli with inherent unpredictability. When paired with music, these patterns visualize sonic characteristics such as energy, frequency, and amplitude in real time.
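
The cell-by-cell update described above can be sketched in a few lines of Python. This is a toy illustration, not Salcedo’s implementation: the function name `nca_step`, the three-number `weights` tuple standing in for a trained network, the `fire_rate` parameter, and the choice to scale each update by an `energy` value are all assumptions made for the example.

```python
import math
import random

def nca_step(grid, weights, energy, rng, fire_rate=0.5):
    """One update of a toy single-channel neural cellular automaton.

    grid    : list of rows of floats in [0, 1] (cell states)
    weights : (w_self, w_neigh, bias), a hypothetical stand-in for a trained network
    energy  : audio energy in [0, 1]; scales how strongly cells change (assumed coupling)
    rng     : random.Random instance used for the stochastic update mask
    """
    h, w = len(grid), len(grid[0])
    w_self, w_neigh, bias = weights
    new = [row[:] for row in grid]
    for y in range(h):
        for x in range(w):
            # Stochastic update: each cell fires only some of the time,
            # which is one source of the pattern's unpredictability.
            if rng.random() > fire_rate:
                continue
            # Perception: mean of the 8 neighbours, wrapping at the edges.
            neigh = sum(
                grid[(y + dy) % h][(x + dx) % w]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            ) / 8.0
            # Tiny "network": a single tanh unit producing a residual update.
            delta = math.tanh(w_self * grid[y][x] + w_neigh * neigh + bias)
            # Apply the update, scaled by audio energy, and clamp to [0, 1].
            new[y][x] = min(1.0, max(0.0, grid[y][x] + energy * delta))
    return new
```

Calling `nca_step` once per audio frame with that frame’s energy makes the pattern change more when the music is louder; with `energy` set to 0 the grid holds still. A real NCA would use several state channels per cell and a trained network in place of the single unit shown here.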

Salcedo’s web interface enables users to adjust parameters controlling how the NCA system responds to musical energy, creating customizable visual performances. The platform works with any audio stream, making it adaptable for various professional applications from concert visualization to sound design workflows.
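
One way to picture such adjustable parameters is a small control stage that turns raw audio into a single drive value for the visuals. The sketch below is an assumption-laden example rather than the actual interface: the article does not name the parameters, so `sensitivity` and `smoothing`, along with the `EnergyControl` class and `rms_energy` function, are hypothetical.

```python
import math

def rms_energy(frame):
    """Root-mean-square level of one frame of audio samples in [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

class EnergyControl:
    """Maps raw audio energy to a drive value in [0, 1] for the visuals.

    sensitivity and smoothing are hypothetical user-facing knobs:
    sensitivity scales the raw level, smoothing (0 to 1) slows the response.
    """

    def __init__(self, sensitivity=2.0, smoothing=0.8):
        self.sensitivity = sensitivity
        self.smoothing = smoothing
        self.level = 0.0

    def update(self, frame):
        # Scale the frame's RMS level and cap it at 1.0.
        target = min(1.0, self.sensitivity * rms_energy(frame))
        # Exponential moving average so the visuals do not flicker.
        self.level = self.smoothing * self.level + (1.0 - self.smoothing) * target
        return self.level
```

Raising `sensitivity` makes quiet passages drive the visuals harder, while raising `smoothing` trades responsiveness for steadier motion.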

Industry Implications

The technology addresses growing demand for real-time audio visualization in entertainment and production environments. Unlike static visual effects, self-organizing NCA systems produce emergent patterns that cannot be fully predetermined, giving visual artists and sound engineers new creative tools. The approach also gives researchers a machine learning framework for studying audio-visual relationships.

Key Takeaway

Engineers and developers working in multimedia, live entertainment, or audio production should monitor neural cellular automata applications as they mature. The technology’s ability to generate responsive, regenerative visuals from audio streams presents opportunities for integration into existing production pipelines, particularly where real-time visualization enhances user experience or analytical capabilities.

Article Source: Seeing sounds | Image: Photo by Google DeepMind via Pexels