Talk title: DeepFont: Large-Scale Real World Font Recognition From Images [Slides]
Abstract: Because font is one of the core design concepts, automatic font identification and similar-font suggestion from an image or photo have long been on the wish list of many designers. We study the Visual Font Recognition (VFR) problem and advance the state of the art remarkably by developing the DeepFont system. First, we build the first available large-scale VFR dataset, consisting of both labeled synthetic data and partially labeled real-world data. Next, to combat the domain mismatch between the available training and testing data, we introduce a Convolutional Neural Network (CNN) decomposition approach, using a domain adaptation technique based on a Stacked Convolutional Auto-Encoder (SCAE) that exploits a large corpus of unlabeled real-world text images combined with synthetic data preprocessed in a specific way. Moreover, we study a novel learning-based model compression approach in order to significantly reduce the DeepFont model size without notably sacrificing its performance. The DeepFont system achieves a top-5 accuracy higher than 80% on our collected dataset, and also produces a good font similarity measure for font selection and suggestion. DeepFont has been deployed in several recent Adobe products (Photoshop, TypeFace, etc.).
Speaker bio: Zhangyang (Atlas) Wang is currently a third-year Ph.D. student in ECE at UIUC, working with Prof. Thomas Huang. He obtained his B.E. degree from USTC in 2012. Atlas’s research interests encompass a variety of computer vision and data mining problems, relying in particular on solid machine learning and optimization tools. One of his current focuses is to understand and interpret the intriguing behaviors of deep networks, from both cognitive-science and numerical perspectives. Atlas has completed internships with MSR (2015), Adobe Research (2014), and US Army Research (2013), during which he worked on various projects and solved practical problems in image enhancement, image classification/recognition, and distributed machine learning systems.
Speaker 2: Tianqi Chen (University of Washington)
Talk title: Replicable Parts for Large-scale Deep Learning [Slides]
Abstract: In this talk, I will introduce the lessons we learned in DMLC while developing large-scale (deep) machine learning toolkits. Specifically, I will discuss how the problem of building such systems can be decomposed into small essential parts, and how these parts can be combined to make the entire code base more concise, flexible, and fast. The talk will cover tensor expressions, parameter server synchronization, and operation scheduling. I will also briefly talk about the distributed data loading API if time permits.
Speaker bio: Tianqi is a Ph.D. student at the University of Washington, working on large-scale machine learning. He has expertise in both machine learning theory and engineering, and has published papers in top machine learning conferences. He is a main contributor to several popular open-source machine learning packages, including xgboost, cxxnet, and mshadow. He was also a winner of two KDDCup challenges.