Chair | Bin Xiao (Chongqing University of Posts and Telecommunications, China) |
Speakers |
Zhaoxiang Zhang (Institute of Automation, Chinese Academy of Sciences, China); Yi Yang (Zhejiang University, China); Shuqiang Jiang (Institute of Computing Technology, Chinese Academy of Sciences, China); Cewu Lu (Shanghai Jiao Tong University, China) |
Speaker 1 |
Speaker: Zhaoxiang Zhang (Institute of Automation, Chinese Academy of Sciences, China)
Title: World Simulator: Exploration and Fusion of Multipath
Abstract: Artificial intelligence is evolving rapidly. On one hand, new technologies represented by multimodal large models and generative large models are emerging constantly; on the other hand, new applications represented by embodied intelligence and agents continue to deepen. In integrating these technologies and applications, world simulators are the most crucial enabling technology. This report examines the value and feasibility of world simulators, exploring key technical approaches and our preliminary findings. Practical applications in areas such as intelligent perception, autonomous driving, robotics, internet control, and smart cities are discussed to illustrate their effectiveness. Finally, the report outlines the prospects for the multi-path integration of world simulators.
Biography: Zhaoxiang Zhang, Ph.D., is a researcher, doctoral supervisor, Changjiang Distinguished Professor, and executive deputy director of the New Laboratory of Pattern Recognition at the Institute of Automation, Chinese Academy of Sciences, and a professor at the University of Chinese Academy of Sciences. His research interests include pattern recognition, embodied intelligence, and agent learning. He has published over 200 papers in top journals such as IEEE T-PAMI, IJCV, JMLR, and National Science Review, and at top conferences including CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, and IJCAI. He has been granted over 30 patents. He has led or is leading multiple national research projects, including the National Natural Science Foundation of China Key Project, the Key International (Regional) Cooperation Research Project, the Joint Key Support Project with CETC, and the National Key R&D Project. He has served multiple times as Area Chair for top conferences such as CVPR, ICCV, NeurIPS, and ICLR. He won the First Prize for Scientific and Technological Progress of the Beijing Science and Technology Award as the first accomplisher. |
Speaker 2 |
Speaker: Yi Yang (Zhejiang University, China)
Title: Content Generation and Engineering Simulation based on Knowledge-Driven Artificial Intelligence
Abstract: Knowledge-driven artificial intelligence for domain-specific applications focuses on efficiently integrating pre-trained large models, prior knowledge, and domain-specific models. This talk will cover multi-knowledge-driven content generation technologies for applications such as digital human reconstruction, animation, and cross-media content generation. First, the talk will explore methods that incorporate geometric and other prior information for digital human reconstruction and animation. Next, it will discuss techniques for controllable content generation through audio, text, layout structure, and other data. Then, practical case studies will be presented to highlight hybrid model collaboration mechanisms, such as specialized knowledge embedding and structured expression. Finally, the talk will introduce the use of generative AI in engineering simulation.
Biography: Yi Yang is a Qiushi Chair Professor at Zhejiang University. He currently serves as the Vice Dean of the College of Computer Science at Zhejiang University and as Director of the Microsoft-Ministry of Education Key Laboratory of Visual Perception. His research focuses on artificial intelligence and its applications. His published papers have received over 70,000 citations on Google Scholar, with an H-index of 131. He has been named a Clarivate Analytics Highly Cited Researcher for the past 6 consecutive years. He has received numerous awards in the AI field, including the National Outstanding Doctoral Dissertation Award from the Ministry of Education (2010), the Australian Research Council DECRA Fellowship (2013), the Australian Computer Society Gold Disruptor Award (2016), the Google Faculty Research Award (2016), the Australian Career Achievement Award (2019), the Amazon Machine Learning Research Award (2020), the AAAI Most Influential Paper Award (2021), and the Best Paper Award at ACM MM (2023). He has also led teams to win over 20 world championships in international research competitions. |
Speaker 3 |
Speaker: Shuqiang Jiang (Institute of Computing Technology, Chinese Academy of Sciences, China)
Title: Embodied Navigation Combining Exploration and Imagination
Abstract: Embodied AI is a significant manifestation of artificial intelligence in the real physical world and has shown great application potential in dynamic open-world environments. Embodied navigation refers to the ability of an agent to perceive and understand the environment based on task objectives (such as language instructions), then predict and execute movement actions, thereby progressively completing tasks. It is a key technology for embodied intelligent systems to interact with the real world. Existing methods for embodied navigation largely rely on current and past visual observations for short-term, single-step action prediction, and lack the capability to evaluate unobserved environments and conduct long-term action planning. Physiological studies have indicated that humans not only depend on current observations but can also imagine unobserved environments from prior memories, constantly refining and enhancing their understanding of the environment by combining exploration and imagination. Thus, endowing agents with the ability to “imagine”, thereby helping them predict the layout of unobserved environments, assess the long-term value of navigation actions, and make more efficient and accurate navigation decisions, emerges as a significant research challenge. This report will first introduce the research background of embodied AI and embodied navigation, then report on research progress in embodied navigation combining exploration and imagination, including self-supervised generative maps and lookahead exploration with neural radiance representation, and finally introduce the adaptation of embodied navigation from simulator to the real world, with demonstrations.
Biography: Shuqiang Jiang (SM’08) is a professor at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), and a professor at the University of CAS. He is also affiliated with the Key Laboratory of Intelligent Information Processing, CAS. His research interests include multimedia analysis and multimodal intelligence. He has authored or coauthored more than 200 papers on related research topics. He was supported by the National Science Fund for Distinguished Young Scholars in 2021. He won the CAS International Cooperation Award for Young Scientists, the CCF Award of Science and Technology, the Wu Wenjun Natural Science Award for Artificial Intelligence, the CSIG Natural Science Award, and the Beijing Science and Technology Progress Award. He is an Associate Editor of IEEE TMM and ACM ToMM, Vice Chair of the IEEE CASS Beijing Chapter, and Vice Chair of the ACM SIGMM China Chapter. He has served as an organization member of more than 20 academic conferences, including as general chair of ICIMCS 2015 and program chair of ICIMCS 2010, PCM 2017, and ACM Multimedia Asia 2019. He has also served as an area chair or TPC member for many conferences, including ACM Multimedia, CVPR, ICCV, IJCAI, ICME, and ICIP. |
Speaker 4 |
Speaker: Cewu Lu (Shanghai Jiao Tong University, China)
Title: Exploring the Embodied Intelligence PIE Framework: Perception (P), Imagination (I), Execution (E), and the Role of Large Models
Abstract: This lecture introduces the speaker’s Embodied Intelligence PIE solution. P (Perception) covers the full perception and interactive perception of robots. I (Imagination) covers the speaker’s conceptually driven simulation reasoning framework for the physical world. E (Execution) covers the ideas behind, and execution of, general meta-operation skills. Building on these three modules, the lecture presents the exploration and preliminary results of the embodied PIE large model. Finally, it introduces work on embodied cognitive intelligence, focusing on verifying the stable implicit relationship between brain neural behavior and physical behavior.
Biography: Cewu Lu is a professor and Changjiang Distinguished Professor at Shanghai Jiao Tong University. He is a winner of the Xplorer Award and was selected for the Overseas High-level Youth Talent Introduction program in 2016. In 2018, he was selected as one of the 35 Innovators Under 35 (MIT TR35) by MIT Technology Review. In 2019, he was named a Qiu Shi Outstanding Young Scholar. In 2020, he was awarded the Special Prize of the Shanghai Science and Technology Progress Award (ranked third). In 2022, he won the Outstanding Scientific Research Achievement Award of Colleges and Universities from the Ministry of Education and received one of the Best Paper Awards at IROS (6/3579). In 2023, he was nominated for the Best Paper award at RSS (four nominees in total). He has published more than 100 papers in high-level journals and conferences such as Nature, Nature Machine Intelligence, and TPAMI as corresponding author or first author. He is a reviewer for Science, Nature, Cell, and other journals, and an area chair of NeurIPS, CVPR, ICCV, ECCV, IROS, and ICRA. His research interests include embodied intelligence and computer vision. |