publications
2025
- Can Vision-Language Models Answer Face to Face Questions in the Real-World? arXiv 2025 (* joint first authors)
- SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound. SIGGRAPH Posters 2025
2024
- AirLetters: An Open Video Dataset of Characters Drawn in the Air. ECCV HANDS Workshop 2024
2023
2021
technical reports/mini-projects/school course projects
- Orchestrating Machine Learning on Edge Devices with PyTorch and WebAssembly. PyTorch Conference 2023 (Oral)
- CPPE-5: Medical Personal Protective Equipment Dataset. Springer Nature Computer Science 2021