Forum on Computer Vision Research and Application Innovation

Featured Speakers



KEYNOTE SPEAKERS

                                                                                      
Lihi Zelnik-Manor, Technion, Israel
Ramin Zabih, Google Research & Cornell Tech, USA
Long Quan, HKUST, HK




INVITED SPEAKERS

                                                                                      
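Yihong Wu (吴毅红), Institute of Automation, Chinese Academy of Sciences
Jiaya Jia (贾佳亚), The Chinese University of Hong Kong
Xi Li (李玺), Zhejiang University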


                                                                                      
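Haibin Ling (凌海滨), HiScene (亮风台)
Lin Mei (梅林), The Third Research Institute of the Ministry of Public Security
Shiguang Shan (山世光), Institute of Computing Technology, Chinese Academy of Sciences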


                                                                                      
Dacheng Tao (陶大程), University of Technology Sydney
Hai Tao (陶海), Beijing Vion Technology (北京文安)
Ming Yang (杨铭), Horizon Robotics (地平线机器人技术)


TALK INFORMATION

Lihi Zelnik-Manor

Technion
Israel.

Title: Separating the Wheat from the Chaff in Visual Data

Abstract: By far, most of the bits in the world are image and video data. YouTube alone gets 300 hours of video uploaded every minute. Add to that personal pictures, videos, TV channels, and the gazillions of security cameras shooting 24/7, and one quickly sees that the amount of visual data being recorded is colossal. In this talk I will discuss the problem of “saliency prediction”: separating the important parts of images and videos (the “wheat”) from the less important ones (the “chaff”). Predicting what people find important could be useful for many applications. In advertising, it may be important for the producer to know whether the key concept catches the viewer’s eye. Alternatively, if one knows where people are likely to look, relevant content can be placed there. In video editing, knowing where viewers look could help create smoother shot transitions. Reliable gaze prediction could drive gaze-aware compression or key-frame selection.

In this talk I will discuss approaches to saliency prediction in images and videos and how the quality of these algorithms can be assessed. I will further explore the meaning of saliency in the context of different tasks, some of which call for a specifically tailored definition of “importance”. Finally, recognizing that there could be different definitions of saliency, I will discuss approaches to predicting task-oriented saliency.

Bio: Lihi Zelnik-Manor is an Associate Professor in the Faculty of Electrical Engineering at the Technion, Israel. Between 2014 and 2016 she was a visiting Associate Professor at Cornell Tech. Prior to the Technion, she worked as a post-doctoral fellow in the Department of Engineering and Applied Science at the California Institute of Technology (Caltech). She holds a PhD and MSc (with honors) in Computer Science from the Weizmann Institute of Science and a BSc (summa cum laude) in Mechanical Engineering from the Technion.

Prof. Zelnik-Manor’s awards and honors include the Israeli high-education planning and budgeting committee (Vatat) scholarship for outstanding Ph.D. students, the Sloan-Swartz postdoctoral fellowship, the Best Student Paper Award at IEEE SMI'05, the AIM@SHAPE Best Paper Award 2005, and the Outstanding Reviewer Award at CVPR'08. She is also a recipient of the Gutwirth prize for the promotion of research and of several grants from ISF, MOST, the 7th European R&D Program, and others. Prof. Zelnik-Manor has served as Area Chair for ECCV and CVPR multiple times, as Program Chair of CVPR’16, and as Associate Editor at TPAMI. She has also had industrial collaborations with Intel, Adobe, and Microsoft Research.

Ramin Zabih

Google Research
Cornell Tech
USA.

Title: Challenges and Opportunities in Higher-Order Inference

Abstract: Many problems in computer vision involve making inferences about a pixel in the presence of locally ambiguous evidence. Markov Random Fields (MRFs) provide a natural way to formulate such problems, but the MRF inference problem is computationally extremely difficult. Graph cut techniques have been quite successful for first-order MRFs, and commonly produce results that are within a few percent of the global minimum. However, there is considerable evidence that a wide range of vision problems require higher-order priors. In this talk I will describe my research group's recent work on higher-order MRF inference.

This is joint work with many co-authors, but primarily with my PhD students Alex Fix and Chen Wang.

Bio: Ramin Zabih's research interests lie in computer vision and in medical imaging. He has worked on a variety of problems in early vision, including motion and stereo; many of these problems can be solved very accurately using algorithms based on graph cuts, work that was given the Test of Time award at ICCV 2011 and the Koenderink prize at ECCV 2012. He served as a Program Chair for CVPR 2007 and was a General Chair for CVPR 2013. He was the Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 through 2012, and from 2013 through mid-2015 he chaired the PAMI-TC, which runs the main vision conferences. He is also the president and founder of the Computer Vision Foundation. In 2018 he will be a general chair for ECCV.

Since the fall of 2013 he has been at Cornell NYC Tech, with a joint appointment in Weill Cornell Radiology. He is now on leave from Cornell, running a group at Google.

Long Quan (权龙)

HKUST
HK.

Title: Mapping the World with Drones

Abstract: In the first part of the talk, I will review the state of the art in three-dimensional reconstruction from images or photographs, as developed over the past three decades in computer vision. In the second part of the talk, I will focus on the most recent exciting work on large-scale 3D reconstruction from drone photographs, and showcase the performance of our approach over a large sample of case studies covering hundreds of square kilometres in both high-rise metropolitan areas and low-rise rural areas in different cities of different countries. I will also demonstrate the online cloud platform and portal www.altizure.com, developed and funded by the HKUST team.

Bio: Long QUAN is a Professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology (HKUST). He received his Ph.D. in Computer Science from INPL, France, in 1989. He joined the Centre National de la Recherche Scientifique (CNRS) as a permanent researcher in 1990 and was appointed at the Institut National de Recherche en Informatique et Automatique (INRIA) in Grenoble, France. He joined HKUST in 2001 and was the founding Director of the HKUST Center for Visual Computing and Image Science. He is a Fellow of the IEEE Computer Society.

He works on vision geometry, 3D reconstruction, and image-based modeling. Work he supervised received the first French Best Ph.D. Dissertation in Computer Science of the Year 1998 (le prix de thèse SPECIF 1998, now le prix de thèse Gilles Kahn), the Piero Zamperoni Best Student Paper Award of ICPR 2000, and the Best Student Poster Paper of IEEE CVPR 2008. He co-authored one of the six highlight papers of SIGGRAPH 2007. He was also elected one of the HKUST Best Ten Lecturers in 2004 and 2009. He has served as an Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) and a Regional Editor of Image and Vision Computing Journal (IVC). He is on the editorial boards of the International Journal of Computer Vision (IJCV), the Electronic Letters on Computer Vision and Image Analysis (ELCVIA), Machine Vision and Applications (MVA), and Foundations and Trends in Computer Graphics and Vision. He was a Program Chair of IAPR International Conference on Pattern Recognition (ICPR) 2006 Computer Vision and Image Analysis, a Program Chair of ICPR 2012 Computer and Robot Vision, and a General Chair of the IEEE International Conference on Computer Vision (ICCV) 2011.

Yihong Wu (吴毅红)

Institute of Automation, Chinese Academy of Sciences

Title: Image-Based Localization: Development and Applications

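Abstract: Image-based camera localization is a fundamental problem in 3D computer vision: the task is to estimate the pose of a camera from the images it captures. It is a core technology for many applications, including virtual reality, augmented reality, autonomous driving, and robot navigation. This talk first introduces what image-based localization is and where it is most actively applied, then reviews its history and development. It then presents our research on fast localization in large-scale urban scene data, rigid-body SLAM, and real-time localization on mobile phones for augmented reality, and closes with an outlook on localization technology and an analysis of development trends.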

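Bio: Yihong Wu is a research professor and doctoral supervisor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. Research interests include multi-view geometry, camera calibration and localization, SLAM, and mobile vision. Wu received a Ph.D. from the Institute of Systems Science, Chinese Academy of Sciences, in 2001, and has published more than 70 papers in major journals and conferences, including PAMI, IJCV, and ICCV. Wu currently serves on the editorial boards of the Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), the Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), and The Open Computer Science Journal.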

Jiaya Jia (贾佳亚)

The Chinese University of Hong Kong

Title: Computer Vision that Mimics and Surpasses Human Ability

Abstract: This talk gives a general review of computer vision research in recent years from two perspectives. The first is exemplified by computer vision goals that cannot easily be achieved by humans. These tasks involve solving a series of low-level problems such as filtering, stereo matching, depth estimation, deconvolution, and motion estimation. Then a few hot topics that simulate human intelligence in image understanding will be introduced, including semantic segmentation, object classification, and object detection. Several techniques developed in our team will be demonstrated.

Bio: Jiaya Jia is currently a professor in the Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK). He heads a research group focusing on computational photography, machine learning, practical optimization, and low- and high-level computer vision. He currently serves as an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) and has served as an area chair for ICCV and CVPR. He was also on the technical paper program committees of SIGGRAPH, SIGGRAPH Asia, ICCP, and 3DV several times, and co-chaired the Workshop on Interactive Computer Vision, held in conjunction with ICCV 2007. He received the Young Researcher Award 2008 and the Research Excellence Award 2009 from CUHK.

Xi Li (李玺)

Zhejiang University

Title: AI-Driven Visual Feature Computing, Learning, and Applications

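Abstract: The era of the Internet and the Internet of Things has produced massive amounts of video data, and effectively extracting knowledge from this data urgently calls for a range of artificial intelligence techniques. How to carry out AI-driven visual computing has therefore become a core technical problem that today's knowledge economy needs to solve. This talk centers on data-driven AI learning methods for visual feature learning on large-scale image and video data. It offers an in-depth analysis from several perspectives, including the visual perception characteristics of objects, visual feature representation, mechanisms for building deep learners, and high-level semantic understanding, and introduces the main research problems and technical methods involved in large-scale visual feature learning. It then systematically reviews the stages of development in visual feature representation and learning, and presents a series of our representative recent work on visual semantic analysis and understanding based on visual feature learning, together with its practical applications. The talk closes with a discussion of some open problems and challenges in visual feature learning.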

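Bio: Xi Li is a professor and doctoral supervisor at Zhejiang University, currently with the Institute of Artificial Intelligence in the College of Computer Science. Li was selected for the fifth cohort of China's national “Young Thousand Talents Program” and for the second tier of Zhejiang Province's 151 Talent Program. Li works mainly on research and development in computer vision, pattern recognition, and machine learning, with in-depth and systematic results in object tracking, object behavior recognition, image annotation, video retrieval, hash function learning, and deep feature learning. Research on motion tracking, understanding, and retrieval in video is a particular strength and has produced several innovative results with international impact. Li has published more than 80 papers in leading international journals and top international conferences, serves as an Associate Editor of Neurocomputing and Neural Processing Letters, two well-known international journals in neural computation, and is a reviewer and program committee member for a number of international journals and conferences in computer vision and pattern recognition. Awards include two best paper awards at international conferences (ACCV 2010 and DICTA 2012), the ICIP 2015 Top 10% paper award, two Beijing Municipal Science and Technology Awards in natural science (one first prize and one second prize), and one China Patent Excellence Award.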

Haibin Ling (凌海滨)

HiScene (亮风台)

Title: Augmented Reality and Games

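Abstract: With the rise of Pokémon Go, the application and potential of augmented reality in games have been embraced by gamers at large and have drawn attention from businesses and investors. In this talk, we first introduce the historical ties between augmented reality and games and how they have developed, and then explain why combining the two is inevitable and what advantages it brings. Next, drawing on practical experience, we discuss the relevant augmented reality technologies, challenges, and ideas. Finally, we summarize and showcase the efforts of HiScene (亮风台) in this area and look ahead to the future.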

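Bio: Dr. Haibin Ling received B.S. and M.S. degrees from Peking University in 1997 and 2000, respectively, and a Ph.D. from the University of Maryland, USA, in 2006, followed by one year of postdoctoral research at the University of California, Los Angeles. Dr. Ling was an assistant researcher at Microsoft Research Asia in 2001 and a research scientist at Siemens Corporate Research from 2007 to 2008. Since 2008, Dr. Ling has been with Temple University, USA, and is currently an associate professor in its computer science department. In addition, Dr. Ling is a co-founder and Chief Scientist of HiScene (亮风台). Dr. Ling's main research areas include computer vision, augmented reality, human-computer interaction, and medical imaging. Honors include the Best Student Paper Award at ACM UIST 2003 and the US National Science Foundation CAREER Award in 2014. Dr. Ling served as an Area Chair for CVPR 2014 and CVPR 2016, and serves on the editorial boards of IEEE Trans. on Pattern Analysis and Machine Intelligence and Pattern Recognition.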

Lin Mei (梅林)

The Third Research Institute of the Ministry of Public Security

Title: Applications of Artificial Intelligence in Public Security

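Abstract: New artificial intelligence technologies, exemplified by deep learning, have become a point of breakthrough for development across industries. Focusing on the prominent problems currently facing public security in China, this talk analyzes the current state and prospects of deep learning applications in video and images, audio, and big data, proposes building a standardized evaluation platform for assessing related research, and discusses the bottlenecks in connecting the technology with real-world applications.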

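Bio: Lin Mei received a doctorate in engineering from Xi'an Jiaotong University in 2000. From 2000 to 2006, Mei carried out research as a postdoctoral researcher and senior visiting scholar at the Department of Computer Science and Engineering of Fudan University, the Department of Computer Science of the University of Freiburg, Germany, and the German Research Center for Artificial Intelligence. In 2007, Mei joined the Third Research Institute of the Ministry of Public Security as academic lead for intelligent image processing at its Police Equipment Technology R&D Center, became deputy director of the Internet of Things Technology R&D Center in 2008, and became its director in February 2012. In December 2012, Mei was appointed research professor at the institute, and in 2015 was named a Shanghai Outstanding Technical Leader by the Science and Technology Commission of Shanghai Municipality. Main research interests include computer vision, artificial intelligence, Internet of Things applications, and big data processing.

Mei led the planning of a new-generation video surveillance network architecture based on video structured description technology, a system of video policing application products for all police branches of public security organs at every level, and the related standards system, laying the foundation for large-scale, in-depth application of public security video surveillance during the 13th Five-Year Plan period. Mei is currently a council member of the Shanghai Society of Image and Graphics, a member of the Rich Media Technical Committee of the Chinese Institute of Command and Control, a member of the Ministry of Public Security's technical committee for the standardization of Internet of Things applications in social public security, a member of the academic committee of the ACM Shanghai Chapter, and executive deputy director of the Shanghai Engineering Research Center for Intelligent Video Surveillance. Mei previously served as BDSC 2014 Workshop chair and as an ICSSC 2013 program committee member. In recent years, Mei has published more than 60 academic papers in leading domestic and international journals and conferences (more than 20 of them indexed by SCI/EI), filed nearly 50 national invention patent applications (9 granted), and obtained 6 software copyright registrations.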

Shiguang Shan (山世光)

Institute of Computing Technology, Chinese Academy of Sciences

Title: Deep Learning-Based Face Detection and Recognition: Progress and Open Problems

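Abstract: The talk first presents performance evaluations and application progress of deep learning on key problems such as face detection, facial landmark localization, and recognition under complex conditions, and on that basis discusses open problems and development trends in these directions. It then presents our progress in combining traditional methods with deep learning on face recognition and other computer vision problems, such as a method for facial landmark localization under occlusion and an image retrieval method that uses a self-error-correcting deep network to apply deep learning to large-scale unsupervised training datasets. Finally, it introduces the principles and technical level of the face recognition engine we have released.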

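Bio: Shiguang Shan, Ph.D., is a research professor and doctoral supervisor at the Institute of Computing Technology, Chinese Academy of Sciences, and executive deputy director of the CAS Key Laboratory of Intelligent Information Processing. He works mainly on computer vision, pattern recognition, and machine learning. He has published more than 50 CCF rank-A papers, and his papers have been cited more than 9,000 times in total on Google Scholar. He has served as Area Chair for international conferences including ICCV, ACCV, ICPR, and FG, and is currently an Associate Editor (AE) of international journals including IEEE Trans. on Image Processing, Neurocomputing, and Pattern Recognition Letters. His research received the 2005 State Science and Technology Progress Award (Second Class) and the 2015 State Natural Science Award (Second Class). He received the NSFC Excellent Young Scientists Fund in 2012 and the CCF Young Scientist Award in 2015.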

Dacheng Tao (陶大程)

University of Technology Sydney

Title: Multiview Learning

Abstract: In recent years, many algorithms have been proposed for learning from multi-view data by considering the diversity of different views. These views may be obtained from multiple sources or from different feature subsets. For example, a person can be identified by face, fingerprint, signature, or iris, with information obtained from multiple sources, while an image can be represented by its color or texture features, which can be seen as different feature subsets of the image. In this talk, we will organize the similarities and differences between a wide variety of multi-view learning approaches, highlight their limitations, and then demonstrate the fundamentals behind the success of multi-view learning. A thorough investigation of the view insufficiency problem and an in-depth analysis of the influence of view properties (consistency and complementarity) will be beneficial for the continued development of multi-view learning.

Bio: Dacheng Tao is Professor of Computer Science with the Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology at the University of Technology, Sydney. He mainly applies statistics and mathematics to data analytics problems, and his research interests span computer vision, data science, image processing, machine learning, and video surveillance. His research results have been expounded in one monograph and 100+ publications in prestigious journals and at prominent conferences, such as IEEE T-PAMI, T-NNLS, T-IP, JMLR, IJCV, NIPS, ICML, CVPR, ICCV, ECCV, AISTATS, ICDM, and ACM SIGKDD, with several best paper awards, such as the best theory/algorithm paper runner-up award at IEEE ICDM’07, the best student paper award at IEEE ICDM’13, and the 2014 ICDM 10 Year Highest-Impact Paper Award. He is a Fellow of the IEEE, IAPR, OSA, and SPIE.

Hai Tao (陶海)

Beijing Vion Technology (北京文安)

Title: Recent Advances in Embedded Computer Vision Applications

Abstract: This talk will demonstrate our recent progress in developing embedded systems in several key computer vision sub-fields including video-based face recognition, vehicle attribute analysis, urban management event detection, and high density crowd counting. The developed algorithms combine the traditional feature-plus-classifier approach with the recent advances in deep learning to make high performance computer vision systems practical and enable products in several vertical markets including intelligent transportation systems (ITS), business intelligence (BI), and smart video surveillance.

We will demonstrate a single-GPU video analytic box that can process up to 8 channels of analog or 2 channels of 1080p HD video inputs and a prototype 40-GPU server system capable of processing up to 80 channels of 1080p video inputs.

Bio: Dr. Tao received BS and MS degrees in Automation from Tsinghua University in 1991 and 1993, respectively. He received his PhD degree in Electrical Engineering from the University of Illinois at Urbana-Champaign in 1999. From 1999 to 2001, he was a member of technical staff in the Vision Technology Laboratory at Sarnoff Corporation, NJ. From July 2001 to June 2010, he served as an assistant and then an associate professor in the Department of Computer Engineering at the University of California at Santa Cruz. Dr. Tao holds more than 10 US patents and has published more than 130 papers in the field of image processing and computer vision. Dr. Tao has served as an associate editor of Computer Vision and Applications and Pattern Recognition and has been a reviewer for CVPR, ICCV, ECCV, and other computer vision conferences. Dr. Tao is a founder and the CEO of Beijing Vion Technology, Inc., a company focused on developing world-leading computer vision and artificial intelligence algorithms and products, with applications in intelligent transportation systems (ITS), public safety, and business intelligence.

Ming Yang (杨铭)

Horizon Robotics (地平线机器人技术)

Title: From Computer Vision Research to Product: Challenges and Opportunities

Abstract: This talk will cover both a brief introduction to Horizon Robotics and personal lessons learned in developing products that rely primarily on computer vision techniques. Horizon Robotics, an artificial intelligence startup founded in June 2015, strives to create turn-key solutions that integrate software, hardware, and cloud systems to make human life more convenient, safe, and fun. In productionizing image recognition techniques, especially those using deep convolutional neural networks, the major technical challenges include, but are not limited to, the balance between computational efficiency and recognition accuracy (i.e., cost vs. performance), the trade-off between development time and functionality, and issues of product consistency, reliability, and deliverables. Nevertheless, the rapid advance of computer vision technology opens up more business opportunities, such as smart homes and autonomous driving.

Bio: Dr. Ming Yang is the Co-founder & Vice President of Horizon Robotics Inc. He is one of the founding members of Facebook Artificial Intelligence Research (FAIR) and a former senior researcher at NEC Labs America. Dr. Yang is a well-recognized researcher in computer vision and machine learning. His research interests include object tracking, face recognition, massive image retrieval, and multimedia content analysis. Dr. Yang holds 14 US patents and has over 50 publications in top international conferences and journals, with more than 3400 citations and an h-index of 28. During his tenure at Facebook, Dr. Yang led the deep learning research project “DeepFace”, which had a significant impact on the deep learning research community and was widely reported by various media, including Science Magazine, MIT Tech Review, and Forbes. Dr. Ming Yang received his B.Eng. and M.Eng. degrees from the Department of Electrical Engineering at Tsinghua University and his Ph.D. degree from the Department of Electrical Engineering and Computer Science at Northwestern University.
