Research
My research lies at the intersection of Computer Vision, Geometry, and Machine Learning, with a long-term aim of developing intelligent models that emulate the human visual system. Integrating vision (comprehending the visual world) with geometry (representing the structure of objects in space) is essential for developing machine learning models that can interpret and reason about 3D environments from 2D visual data.
In the short term, my research mainly focuses on data- and computation-efficient deep learning. I approach this through three key directions.
Synthetic Training
Training models on synthetic samples can enhance performance without costly data acquisition and annotation. I am particularly interested in data synthesis, especially geometry-based data synthesis, which generates large numbers of diverse samples with precise annotations. As humans, we perceive, understand, and distill knowledge into well-defined, closed-form rules. Using these rules, we can build simulators that generate diverse samples with controlled randomness. This follows a teacher-student learning paradigm: the teacher (the simulator) acquires and distills knowledge, then transfers it to the student (the model) by providing large-scale generated training samples. In this sense, synthetic data serves as a bridge between human understanding and the model.
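To make this concrete, here is a minimal, hypothetical sketch of geometry-based synthesis in Python (the toy task and function names are illustrative, not drawn from any specific pipeline): a closed-form geometric rule, a rectangle rendered at a random location, plays the role of the simulator, so every generated image comes with an exact annotation for free.

```python
import numpy as np

def synthesize_sample(rng, image_size=64):
    """Render a random rectangle and return the image with its exact bounding box.

    A toy geometry-based synthesizer: the closed-form rule (the rectangle's
    extent) doubles as a perfect, zero-cost annotation.
    """
    img = np.zeros((image_size, image_size), dtype=np.float32)
    # Controlled randomness: sample the rectangle's position and size.
    x0, y0 = rng.integers(0, image_size - 8, size=2)
    w, h = rng.integers(4, 8, size=2)
    img[y0:y0 + h, x0:x0 + w] = 1.0
    bbox = (x0, y0, x0 + w, y0 + h)  # precise annotation, by construction
    return img, bbox

rng = np.random.default_rng(0)
# Large-scale training data at negligible cost; the "teacher" is the rule itself.
dataset = [synthesize_sample(rng) for _ in range(10_000)]
```

Real geometry-based pipelines replace the rectangle rule with richer simulators (e.g., rendered 3D scenes), but the principle is the same: whatever the rule draws, it can also label exactly.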