Rishit Dagli
I am broadly interested in learning algorithms, computer vision, graphics, learning theory, and math. My research interests are in building models that can produce physically realistic dynamic worlds as well as understand this complex (4D) world.
I am an undergraduate student in CS and Math at the University of Toronto. Previously, I took a 1-year break from my undergrad to work at
as a research intern at the intersection of AI, vision, and graphics. Before that, I was a research intern at
AI Research in 2024 with Roland Memisevic and Guillaume Berger on video-audio-language models (VLMs). Even before that, I was a research engineering intern at
with Josh Mesout where I focused on improving inference performance of multimodal models.
In a past life, I worked on software engineering and built robot hardware. I also contributed to some popular open-source software.
selected publications
- FreeForm: Reduced-Order Deformable Simulation from Particle-Based Skinning Eigenmodes. CVPR 2026 (* joint first authors)
- Can Vision-Language Models Answer Face to Face Questions in the Real-World? ICLR 2026 (* joint first authors)
- RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies. arXiv 2026
- Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor. arXiv 2025
- SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound. SIGGRAPH Posters 2025