Rishit Dagli
CS, Math Undergrad at UofT
 
 I am very interested in learning algorithms, computer vision, graphics, learning theory, and math (number theory and topology).
I am currently on a break from my undergrad, working at NVIDIA at the intersection of AI, vision, and graphics research. After switching over to research, I interned at Qualcomm AI Research in 2024 with Roland Memisevic and Guillaume Berger, and at Civo in 2023 with Josh Mesout.
In a past life, I worked on software engineering and robotics. I also contributed extensively to, and maintained, some popular open-source projects, which can be found on my GitHub and software pages.
I am looking for a PhD position starting Fall 2026. Please reach out if you can help in any way.
news
| Date | Update |
|---|---|
| Oct 3, 2024 | We released a new 7B VLM and large-scale dataset for video understanding. arXiv. Dataset. (code release soon, in the hands of corporate overlords) |
| Jun 18, 2023 | We released the first vision (images and video) to spatial audio model as a step towards complete generation. arXiv. Code and Web Demo. |
selected publications
- Can Vision-Language Models Answer Face to Face Questions in the Real-World? arXiv 2025 (* joint first authors)
- SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound. SIGGRAPH Posters 2025