Talk Title: High Resolution Spectral Video Capture and Computational Photography [Slides]
Abstract: Spectral capture techniques collect information in more color channels than traditional trichromatic sensing, and therefore reveal more detailed properties of the light source and the scene. Potential applications span many fields, including remote sensing, materials science, bio-photonics, and environmental monitoring. However, spectral capture must record massive amounts of data across the spatial, temporal, and spectral domains; traditional spectral capture systems rely on temporal or spatial scanning and are therefore unsuitable for video. With rapid advances in sampling theory and electronic techniques, spectral video acquisition is now becoming tractable. In this talk, we present recent progress on high resolution spectral video acquisition. We propose the Prism-Mask Imaging Spectrometer (PMIS), which achieves high quality video capture in all three domains: spectral (1 nm), spatial (one megapixel), and temporal (15 fps) resolution. Both the optical principle and the prototype setup of the PMIS are introduced, and a range of computer vision applications based on PMIS (object tracking, skin detection, automatic white balance, etc.) are demonstrated.
Besides the PMIS camera, this talk also introduces the emerging field of Computational Photography, which aims to overcome the limitations of the traditional camera by using computational techniques "to produce a richer, more vivid, perhaps more perceptually meaningful representation of our visual world" (quoted from the CMU computational photography course introduction). Using novel computational imaging techniques, we show image/video galleries including gigapixel images and video, ultra-fast video capture (at the speed of light), and light field modeling for the digital movie industry.
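To illustrate why spectral capture carries more information than trichromatic sensing, the minimal sketch below projects a hyperspectral cube with 31 narrow bands down to three broad R/G/B channels. The dimensions, Gaussian response curves, and random data are illustrative assumptions only, not the PMIS optical design or any real sensor's response:

```python
import numpy as np

# A hyperspectral video frame is a 3-D cube (height x width x wavelength).
# A trichromatic camera collapses the spectral axis through three broad
# sensitivity curves, discarding detail a spectral camera like PMIS retains.

H, W = 4, 4                                    # tiny spatial grid for illustration
wavelengths = np.arange(400, 701, 10)          # 400-700 nm in 10 nm steps (31 bands)
cube = np.random.rand(H, W, wavelengths.size)  # toy spectral frame, random radiance

def toy_response(mu, sigma=30.0):
    """Hypothetical camera sensitivity curve centred at `mu` nm
    (an assumption for illustration, not a measured sensor response)."""
    return np.exp(-0.5 * ((wavelengths - mu) / sigma) ** 2)

# Stack three toy R/G/B response curves into a (num_wavelengths, 3) matrix.
rgb_response = np.stack(
    [toy_response(600), toy_response(550), toy_response(450)], axis=1
)

# Project the cube onto the three channels: (H, W, 31) -> (H, W, 3).
rgb = cube @ rgb_response
print(cube.shape, "->", rgb.shape)
```

The projection is many-to-one: distinct spectra can collapse to the same RGB triple (metamerism), which is why tasks such as skin detection or white balance benefit from the full spectral signal.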
 Ma C, Cao X*, Tong X, Dai Q, and Lin S. "Acquisition of High Spatial and Spectral Resolution Video with a Hybrid Camera System." International Journal of Computer Vision (IJCV), 110(2): 141-155, 2014.
 Ma C, Cao X*, Wu R, and Dai Q. "Content-Adaptive High-Resolution Hyperspectral Video Acquisition with a Hybrid Camera System." Optics Letters, 39(4): 937-940, 2014.
 Feng J, Fang X, Cao X*, et al. "Advanced Hyperspectral Video Imaging System Using Amici Prism." Optics Express, 22(16): 19348-19356, 2014.
 Yue T, Suo J, Wang J, Cao X, and Dai Q. "Blind Optical Aberration Correction by Exploring Geometric and Visual Priors." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1684-1692, 2015.
Speaker Bio: Xun Cao received his Ph.D. degree from Tsinghua University, Beijing, China. He was a visiting researcher at Philips Research, Aachen, Germany in 2008 and at Microsoft Research Asia in 2009 and 2010, and a visiting scholar at The University of Texas at Austin from 2010 to 2011. He joined the faculty of the School of Electronic Science and Engineering, Nanjing University in 2012, where he is now an Associate Professor.
Dr. Cao's research interests include image-based modeling and rendering, 2D-to-3D conversion, and computational photography. His multi-view stereo modeling algorithm ranked No. 1 in accuracy on the Middlebury vision benchmark. His 2D-to-3D conversion algorithm has been applied in the commercial software "Roxio Creator", which holds a major market share in North America. Dr. Cao also developed one of the first spectral video cameras, the PMIS (Prism-Mask Imaging Spectrometer), which captures spectral information dynamically. He has published 27 papers in leading journals and conferences including IEEE T-PAMI, IJCV, CVPR, and ICCV, and holds over 30 U.S. and Chinese patents. He was awarded the First Prize of the National Technology Invention Award in 2012.
Talk Title: Joint Embeddings of Shapes and Images via CNN Image Purification [Slides]
Abstract: Both 3D models and 2D images contain a wealth of information about everyday objects in our environment. However, it is difficult to semantically link together these two media forms, even when they feature identical or very similar objects. Real-world images are naturally variable in a number of characteristics such as viewpoint, lighting, background elements, and occlusions. This variability makes it challenging to match images with each other, or with 3D shapes. We propose a joint embedding space populated by both 3D shapes and 2D images, where the distance between embedded entities reflects the similarity between the underlying objects represented by the image or 3D model, unaffected by all the aforementioned nuisance factors. This joint embedding space facilitates comparison between entities of either form, and allows for cross-modality retrieval. We construct the embedding space using an all-pairs 3D shape similarity measure, as 3D shapes are more pure and complete than their appearances in images, leading to more robust distance metrics. We then employ a Convolutional Neural Network (CNN) to "purify" images by muting the distracting factors. The CNN is trained to map an image to a point within the embedding space, such that it is close to a point attributed to a 3D model of an object similar to the one depicted in the image. This purifying capability of the CNN is accomplished with the help of a large amount of training data consisting of images synthesized from 3D shapes. Our deep embedding brings 3D shapes and 2D images into a joint embedding space, where cross-view image retrieval, image-based shape retrieval, and shape-based image retrieval tasks are all naturally supported. We evaluate our method on these retrieval tasks and show that it consistently outperforms state-of-the-art methods. Additionally, we demonstrate the usability of the joint embedding in a number of computer graphics applications.
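The two-stage pipeline in the abstract — first embed shapes from an all-pairs similarity measure, then map images into the same space for nearest-neighbour retrieval — can be sketched with a toy example. The sketch below uses classical multidimensional scaling on a random distance matrix and fakes the CNN's output as a perturbed embedding point; the data, dimensions, and MDS choice are illustrative assumptions, not the paper's actual construction:

```python
import numpy as np

# Toy all-pairs shape distance matrix (assumed symmetric, zero diagonal).
rng = np.random.default_rng(0)
n = 6
A = rng.random((n, n))
D = (A + A.T) / 2
np.fill_diagonal(D, 0.0)

def classical_mds(D, dim=2):
    """Embed points so Euclidean distances approximate D (classical MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centred Gram matrix
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]       # keep the largest `dim` eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

shape_embedding = classical_mds(D, dim=2)  # one point per 3-D shape

# A trained CNN would map a "purified" query image into this same space;
# here we fake its output as a point very near shape 3, then retrieve the
# closest shape (image-based shape retrieval by nearest neighbour).
query = shape_embedding[3] + 1e-6 * rng.standard_normal(2)
nearest = int(np.argmin(np.linalg.norm(shape_embedding - query, axis=1)))
print(nearest)  # -> 3
```

Fixing the shape embedding first and regressing images onto it mirrors the abstract's argument: the shape-to-shape distances are the clean signal, and the image branch only has to suppress nuisance factors, not define the metric.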
 Yangyan Li*, Hao Su*, Charles R. Qi, Noa Fish, Daniel Cohen-Or, and Leonidas Guibas, “Joint Embeddings of Shapes and Images via CNN Image Purification”, ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), 2015. (*Joint first authors).
 Hao Su*, Charles R. Qi*, Yangyan Li, and Leonidas Guibas. “Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views”, IEEE International Conference on Computer Vision (ICCV), 2015. (*Joint first authors).
Speaker Bio: Yangyan Li received his Ph.D. from the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, in 2013, and is currently a Postdoctoral Scholar in the Geometric Computation Group at Stanford University. His research interests span computer graphics and computer vision, with a focus on 3D data processing.