Two papers have been accepted to AAAI 2025
The AAAI Conference on Artificial Intelligence (AAAI) is one of the major conferences in the field of artificial intelligence. (Acceptance rate: 23.4%)
Title: Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior
Authors: Lee Hyoseok (POSTECH), Kyeong Seon Kim (POSTECH), Kwon Byung-Ki (POSTECH), Tae-Hyun Oh (POSTECH)
To predict dense depth that aligns with the given sparse depth, we propose a test-time alignment method that incorporates optimization loops to enforce the measurements as hard constraints. We also propose prior-based outlier filtering to ensure reliable guidance within the optimization loop. Our zero-shot depth completion method generalizes across datasets from various domains, achieving an average performance improvement of 10.8% over previous state-of-the-art methods and improving spatial understanding by sharpening details.
Title: SoundBrush: Sound as a Brush for Visual Scene Editing
Authors: Kim Sung-Bin (POSTECH), Kim Jun-Seong (POSTECH), Junseok Ko, Yewon Kim (POSTECH), Tae-Hyun Oh (POSTECH)
We propose SoundBrush, a model that uses sound as a brush to edit and manipulate visual scenes. Inspired by existing image-editing works, we frame this task as a supervised learning problem and leverage various off-the-shelf models to construct a sound-paired visual scene dataset for training. SoundBrush can accurately manipulate the overall scenery or even insert sounding objects to best match the audio inputs while preserving the original content. Furthermore, by integrating with novel view synthesis techniques, our framework can be extended to edit 3D scenes, facilitating sound-driven 3D scene manipulation.