publications
2025
- Can Vision-Language Models Answer Face to Face Questions in the Real-World? arXiv 2025 (* joint first authors)
- SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound, SIGGRAPH Posters 2025
2024
- AirLetters: An Open Video Dataset of Characters Drawn in the Air, ECCV HANDS Workshop 2024
2023
2021
- CPPE-5: Medical Personal Protective Equipment Dataset, Springer Nature Computer Science 2021
technical reports/mini-projects
- Orchestrating Machine Learning on Edge Devices with PyTorch and WebAssembly, PyTorch Conference 2023 (Oral)