Session Information

Cluster 2

Interacting AI with Reality

Co-Chairs

Hyunjin Park, Yu-Seop Kim, Jaesik Park

Description

This session highlights how AI transforms fields like healthcare, robotics, and 3D spatial understanding. Key topics include neural scene reconstruction, cross-modal learning for embodied perception, and 3D shape assembly through equivariant learning. Researchers will also explore precision medicine and continuous learning systems, emphasizing human feedback and multimodal adaptability for real-world innovation.

# Neural Scene Reconstruction and Cross-Modal Learning # AI in Precision Medicine and Continuous Learning

Program

Day 1 (December 5)
10:00-10:40 Chair: Hyunjin Park
Neural Scene Reconstruction from Videos of Propagating Light, David Lindell (U of Toronto)
14:40-15:45 Chair: Yu-Seop Kim
Recent Advances in Medical AI for Cancer and Chronic Disease Screening and Precision Diagnosis and Treatment, Le Lu (Alibaba DAMO Academy)
AI Methods in Medicine, Hyunjin Park (Sungkyunkwan U)
Day 2 (December 6)
09:40-11:10 Chair: Jaesik Park
Finetuning Strategies for Improved Generalization, Rei Kawakami (Tokyo Institute of Technology)
Cross-Modal Representation Learning and Knowledge Transfer for Embodied Robotic Perception and Interaction, Byoung-Tak Zhang (Seoul Nat’l U)
3D Geometric Shape Assembly via Equivariant Learning, Minsu Cho (POSTECH)

Talk Title

Neural Scene Reconstruction from Videos of Propagating Light

Abstract

Modern ultrafast cameras can capture scenes with effective frame rates exceeding hundreds of billions of frames per second, fast enough to resolve the movement of light itself. Videos captured with these cameras reveal the normally invisible "dance" of propagating light as it scatters, refracts, and diffracts through a scene. In this talk, I describe techniques that harness information contained in videos of propagating light to perform scene reconstruction. By combining physics-based rendering and neural networks, we can render light propagation from novel viewpoints, observe viewpoint-dependent changes in light transport predicted by Einstein, recover material properties, and reconstruct accurate 3D geometry. Finally, I discuss future directions, such as how generative models can be used with ultrafast cameras to image in near-complete darkness.

Short bio

David Lindell is an Assistant Professor in the Department of Computer Science at the University of Toronto. His research combines artificial intelligence, applied optics, emerging sensor platforms, and physics-based algorithms to enable new capabilities in visual computing. Prior to joining the University of Toronto, he received his Ph.D. from Stanford University. He is a recipient of the 2021 ACM SIGGRAPH Outstanding Dissertation Honorable Mention Award, the 2023 Marr Prize, two 2023 Sony Research Awards, and a 2024 Google Research Scholar Award.


Talk Title

Recent Advances in Medical AI for Cancer and Chronic Disease Screening and Precision Diagnosis and Treatment

Abstract

In this talk, I will give a comprehensive overview of the field of multi-cancer early detection and of our recent efforts and achievements in using AI and non-contrast CT scans to screen for and detect several types of (early) cancers (pancreatic, liver, esophageal, gastric, colon/rectum, lung and breast) and major chronic diseases (quantitative and precision management of cardiovascular disease, osteoporosis, liver steatosis/fibrosis, fat and muscle anomalies, etc.). In the first two-thirds of the talk, I will present our extensive quantitative results, which show promising clinical indications for opportunistic screening. After that, the remaining (equally or even more) important question is how to identify the proper cancer diagnosis and corresponding treatment options for screening-positive patients with high accuracy, manageable cost, and high accessibility, so as to deliver the ultimate clinical health benefits. I will cover our automated TNM staging work for several cancers (laryngeal, esophageal, lung, pancreatic). All content is based on peer-reviewed publications.

Short bio

Dr. Le Lu has led the global Medical AI R&D efforts at Alibaba DAMO Academy since August 2021. From 2013 to 2021, he worked at PAII Inc., the NVIDIA AI-Infra division, and the National Institutes of Health (NIH) Clinical Center. Dr. Lu founded NVIDIA’s medical image deep learning group in 2017. He was a senior staff scientist at Siemens Corporate Research from 2006 until 2013. Dr. Lu is an IEEE Fellow (class of 2021) in medical imaging, AI, and oncology imaging, cited for his contributions to machine learning methods for cancer detection and diagnosis; a MICCAI Society board member (2021-2024, Chair for industrial affairs); an IEEE Signal Processing Society Distinguished Industrial Speaker (2021-2023); and an Associate Editor for IEEE Trans. Pattern Analysis and Machine Intelligence and Intelligent Oncology. Dr. Lu received the NIH Clinical Center Director Award in 2017 and the NIH Mentor of the Year award in 2015. Dr. Lu will be a General Chair of the MIDL 2026 conference. Dr. Lu received his Ph.D. in Computer Science from Johns Hopkins University in 2007. He has published 300+ peer-reviewed journal articles and leading conference papers, in venues including Nature Medicine, Nature Communications, Annals of Surgery, Radiology, Clinical Cancer Research, IEEE Trans. Medical Imaging, and IEEE CVPR/ICCV/ECCV/AAAI/NeurIPS/MICCAI/IPMI/ICML/ICLR, with overall citations of ~29800. He has won many best paper or best paper finalist awards at RSNA, MICCAI, and other leading clinical conferences with his postdoc trainees and colleagues (RSNA 2016/2018 Informatics Category Research Trainee Award, MICCAI 2017 Young Scientist Award Runner-up, MICCAI 2018 Young Scientist Publication Impact Award, MICCAI 2019/2020 MedIA Best Paper Special Issue, AFSUMB 2021 Young Investigator Award). He also published the most cited IEEE TMI paper since 2016 and the most cited medical imaging paper in IEEE CVPR in the last 10 years. He is the co-inventor of more than 150 US/PCT/CN patents (granted & pending).
He has also published 85 peer-reviewed clinical abstracts at leading clinical annual meetings, such as RSNA, ASTRO, AHNS, AFSUMB, AASLD, ACR Convergence, and EULAR. Dr. Lu was the key technical leader for the NIH-ChestXray-14 (2017) and NIH-DeepLesion (2018) public datasets (clinical leader: Dr. Ronald Summers).


Talk Title

AI Methods in Medicine

Abstract

AI methods are making great strides in many domains, and medicine is no exception. As in other domains, the application of AI requires careful domain-specific considerations. In medicine, issues of explainability, expert intervention, and data imbalance are particularly important. Here, I will introduce the research topics of our projects. First, I will discuss a recent paper on synthesizing breast images (mammograms) that reflect various clinical conditions using radiomics features. Second, I will discuss a paper on early detection of Alzheimer’s disease that leverages the precise timing of the conversion event and representation learning using functional MRI. Other relevant projects will also be introduced. Finally, I will present future plans for the projects.

Short bio

Hyunjin Park received a B.E. in electrical engineering from Seoul National University, Seoul, Korea in 1997, an M.S. in electrical engineering from the University of Michigan in 2000, and a Ph.D. in biomedical engineering from the University of Michigan, Ann Arbor, USA in 2003. He was a research faculty member at the University of Michigan Hospital in the Department of Radiology. He is currently a Professor of Electrical Engineering and Artificial Intelligence at Sungkyunkwan University, Suwon, Korea. His research interests include image processing methods for medical imaging, medical image analysis for cancer management, and computer vision applications for medical imaging. He is on the editorial boards of several journals and has 200+ papers, 6000+ citations, and an h-index of 40 (Google Scholar) as of October 2024.


Talk Title

Finetuning Strategies for Improved Generalization

Abstract

Finetuning is an essential method for adapting pretrained networks to downstream tasks in practice. We developed two training methods for this post-training process. The first is based on the idea that minima in flat loss landscapes lead to better generalization than those in sharp landscapes. Thus, we propose PoF, a post-training method that splits a pretrained network into a feature extractor and a classifier, perturbing the latter and updating the former to reach a flatter minimum. The second is based on the statistics of the bias-variance ratio of mini-batch gradients for each layer. When the bias (the average) of the gradients is dominant, the parameters require updates, while the variance will be larger at a stationary point such as a minimum. Thus, we present BVG-LS, which adaptively selects layers to update based on these statistics. Experiments show the effectiveness of both methods.
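The layer-selection idea in the second method can be illustrated with a minimal Python sketch. This is not the BVG-LS implementation: the abstract does not give the exact statistic, so the ratio used here (squared norm of the mean gradient divided by the mean squared deviation from it), the threshold, and the names `bias_variance_ratio` and `select_layers` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def bias_variance_ratio(grads: Sequence[Sequence[float]], eps: float = 1e-12) -> float:
    """Ratio of gradient bias to gradient variance for one layer.

    grads holds one flattened gradient vector per mini-batch.
    The exact formula is an assumption, not taken from the talk.
    """
    n = len(grads)
    dim = len(grads[0])
    # Bias: the average gradient over mini-batches.
    mean = [sum(g[i] for g in grads) / n for i in range(dim)]
    bias_sq = sum(m * m for m in mean)
    # Variance: mean squared deviation of each mini-batch gradient from the average.
    variance = sum(
        sum((g[i] - mean[i]) ** 2 for i in range(dim)) for g in grads
    ) / n
    return bias_sq / (variance + eps)


def select_layers(
    layer_grads: Dict[str, Sequence[Sequence[float]]],
    threshold: float = 1.0,  # hypothetical cutoff, chosen only for this example
) -> List[str]:
    """Return the layers whose average gradient dominates its variance."""
    return [
        name
        for name, grads in layer_grads.items()
        if bias_variance_ratio(grads) > threshold
    ]


# Toy gradients: "backbone" has consistent gradients across mini-batches,
# while "head" has gradients that mostly cancel out, as near a stationary point.
layer_grads = {
    "backbone": [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9]],
    "head": [[0.5, -0.5], [-0.5, 0.5], [0.1, -0.1]],
}
selected = select_layers(layer_grads)
print(selected)
```

In this toy example only the layer with consistent gradients is selected for updating; the layer whose gradients are dominated by variance is left frozen.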

Short bio

Rei Kawakami received her Ph.D. in Information Science and Technology from the University of Tokyo in 2008. She joined the Tokyo Institute of Technology, Tokyo, Japan, as an Associate Professor in 2022. Prior to this, she worked on computer vision problems such as object recognition, motion prediction, and basic learning algorithms as a researcher at various institutions, including Denso IT Laboratory Inc., the University of Tokyo, and Osaka University. She served as a program chair for ACM MM Asia 2022 and MVA 2021, and as an area chair for many conferences, including CVPR 2024, ICCV 2023, and CVPR 2022. She is a member of IEEE, IEICE, and IPSJ.


Talk Title

Cross-Modal Representation Learning and Knowledge Transfer for Embodied Robotic Perception and Interaction

Abstract

As AI evolves toward open-world environments, the need for systems that can continuously learn, adapt, and transfer knowledge with minimal intervention is becoming increasingly important. This year, our research focused on developing cross-modal representation learning by building multimodal frameworks for a Universal Learning Machine, a system that autonomously learns through interaction with its environment. Through real-world applications in areas such as object manipulation and human-robot collaboration, we demonstrate how these systems continuously integrate and expand their understanding across complex, dynamic scenarios. This work represents a significant step toward creating a Universal Learning Machine that learns continuously, generalizes across domains, and interacts intelligently with its surroundings.

Short bio

Byoung-Tak Zhang is the POSCO Chair Professor at Seoul National University's (SNU) Computer Science and Engineering Department and Director of the AI Institute, SNU. He served as President of the Korean Society for Artificial Intelligence (2010-2013) and the Korean Society for Cognitive Science (2016-2017). He earned his PhD in computer science from the University of Bonn, Germany, and his BS and MS from SNU. Prior to joining SNU in 1997, he was with the German National Research Center for Information Technology (GMD). He has been a Visiting Professor at MIT CSAIL, Samsung SAIT, German BMBF Excellence Centers in CoTeSys and CITEC, and Princeton Neuroscience Institute. Presently, he serves as an Associate Editor for various journals in AI. Among his accolades are the Red Stripes Order of Service Merit, INAK Award, and the IEEE Distinguished Service Award.


Talk Title

3D Geometric Shape Assembly via Equivariant Learning

Abstract

Understanding 3D shapes and manipulating them is a fundamental problem in many scientific domains, including computer vision and robotics. In our daily lives, we humans easily utilize objects around us and even create new objects by manipulating and assembling them, which requires understanding the attributes of individual objects and establishing relations between different objects. However, despite recent advances in artificial intelligence, such a feat remains out of reach for machines. In this talk, I will introduce our recent work on learning to analyze 3D shapes, infer relations between them, and assemble parts into a whole. In particular, I will focus on incorporating equivariance as an inductive bias into a learner to take advantage of data symmetry in effectively combining 3D shapes while enhancing generalization.

Short bio

Minsu Cho is an Associate Professor at POSTECH, South Korea, where he leads the POSTECH Computer Vision Lab. Before joining POSTECH in the fall of 2016, he worked as a postdoc and a starting researcher at the Inria WILLOW team and École Normale Supérieure, Paris, France. He completed his Ph.D. in 2012 at Seoul National University, Korea. His research lies in computer vision and machine learning, especially in the problems of visual semantic correspondence, symmetry analysis, object discovery, action recognition, and minimally-supervised learning. He is interested in the relationship between correspondence, symmetry, and supervision in visual learning. He is an editorial board member of the International Journal of Computer Vision (IJCV) and IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) and has served as an area chair for conferences including CVPR, ICCV, and NeurIPS. In 2020, he was inducted into the Young Korean Academy of Science and Technology (Y-KAST).