Machine Learning for Bioimage Analysis Reading Group 2016
-
Jan. 07, 2016 -- Yu Zhang
Paper: Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS, 2012.
-
Jan. 21, 2016 -- He Zhao
Paper: A. Mishra, K. Alahari and C. V. Jawahar. Image Retrieval using Textual Cues. ICCV, 2013.
-
Jan. 28, 2016 -- Xiaowei Zhang
Paper: B. Romera-Paredes and Philip H. S. Torr. An embarrassingly simple approach to zero-shot learning. ICML, 2015.
-
Feb. 4, 2016 -- Lin Gu
Paper: Shaoqing Ren, Xudong Cao, Yichen Wei and Jian Sun. Global Refinement of Random Forest. CVPR, 2015.
-
Feb. 11, 2016 -- Chi Xu
Paper: Xiaodong Yang and YingLi Tian. Super Normal Vector for Activity Recognition Using Depth Sequences. CVPR, 2014.
-
Feb. 18, 2016 -- Yongzhong Yang
Paper: Gunhee Kim, Seungwhan Moon and Leonid Sigal. Ranking and Retrieval of Image Sequences from Multiple Paragraph Queries. CVPR, 2015.
-
Feb. 25, 2016 -- Lakshmi Govindaraja
Paper: Brenden M. Lake, Ruslan Salakhutdinov and Joshua B. Tenenbaum. Human-level concept learning through probabilistic program induction. Science, 2015.
-
Mar. 3, 2016 -- Li Cheng
Title: Fisher Information and Applications (slides)
Related Papers: Z. Wang, A. Stocker and D. Lee. Optimal Neural Population Codes for High-dimensional Stimulus Variables. NIPS, 2013.
S. J. Maybank. A Fisher-Rao Metric for Curves Using the Information in Edges. Journal of Mathematical Imaging and Vision, 2015.
S. J. Maybank, S. Ieng and R. Benosman. A Fisher-Rao Metric for Paracatadioptric Images of Lines. International Journal of Computer Vision, 2012.
-
Mar. 10, 2016 -- Yu Zhang
Paper: Andrej Karpathy and Li Fei-Fei. Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR, 2015.
-
Mar. 24, 2016 -- Dr. Boxin Shi (guest speaker)
Title: Conventional Computer Vision Problems Meet Unconventional Cameras
Abstract: How well machines understand the real world depends on innovations in both visual computing algorithms and imaging sensor design. Conventional computer vision algorithms are mainly designed for a 2D array of pixels from an ordinary imaging sensor. With recent progress in imaging devices, what can be captured and encoded in an image has been greatly extended, and how to utilize such information in computational photography has become an emerging topic. This talk will introduce several solutions that integrate novel visual computing algorithms and imaging sensors in a complementary way to overcome the bottlenecks of classic computer vision problems such as super-resolution, high dynamic range imaging, and 3D reconstruction.
-
Mar. 31, 2016 -- Li Cheng
Title: Recent Developments in Weiqi Machines
-
Apr. 07, 2016 -- Amir Shahroudy (guest speaker)
Title: Human Activity Recognition in Depth Videos
Abstract: The introduction of depth sensors has had a big impact on research in visual recognition. By providing 3D information, these cameras help us obtain a view-invariant and robust representation of the observed scenes and human bodies. Detection and 3D localization of human body parts can be done more accurately and efficiently in depth maps than in their RGB counterparts. In this talk I will introduce three of our recent works in this domain. Even with the 3D structure of the body parts available, the articulated and complex nature of human actions makes action recognition difficult. One approach to handling this complexity is to divide it into the kinetics of body parts and analyze the actions based on these partial descriptors. In our first work, we propose a joint sparse regression based learning method which utilizes structured sparsity to model each action as a combination of multimodal features from a sparse set of body parts. In addition to a depth-based representation of human actions, commonly used 3D sensors also provide RGB videos. It is generally accepted that each of these two modalities has different strengths and limitations for the task of action recognition. Therefore, analysis of RGB+D videos can help us better study the complementary properties of the two modalities and achieve higher levels of performance. Our solution for this is a new deep autoencoder-based factorization network that separates the input multimodal signals into a hierarchy of shared and modality-specific components. Currently available depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the small number of training samples, distinct class labels, camera views, and subjects. We collected a large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects. Our dataset contains 60 different action classes covering daily actions, mutual actions, and medical conditions. In addition, we propose a new recurrent neural network structure to model the long-term temporal correlation of the features of each body part, and utilize them for better action classification. Experimental results show the advantages of applying deep learning methods over state-of-the-art hand-crafted features on the suggested cross-subject and cross-view evaluation criteria for our dataset. The introduction of this large-scale dataset will enable the community to apply, develop and adapt various data-hungry learning techniques for the task of depth-based and RGB+D human activity analysis.
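Below is a minimal, hypothetical sketch (in PyTorch) of the part-based recurrent idea mentioned above: one LSTM per group of skeleton joints, with the per-part outputs concatenated for action classification. The part grouping, layer sizes, and the 60-class output are illustrative assumptions, not the speaker's actual architecture.

    # Hypothetical part-based recurrent model for skeleton action recognition.
    import torch
    import torch.nn as nn

    class PartBasedRNN(nn.Module):
        def __init__(self, joints_per_part=(6, 6, 4, 4, 5), hidden=64, num_classes=60):
            super().__init__()
            # one recurrent branch per body part (e.g. arms, legs, torso)
            self.branches = nn.ModuleList(
                nn.LSTM(input_size=3 * j, hidden_size=hidden, batch_first=True)
                for j in joints_per_part
            )
            self.classifier = nn.Linear(hidden * len(joints_per_part), num_classes)

        def forward(self, parts):
            # parts: list of tensors, each of shape (batch, time, 3 * joints_in_part)
            feats = []
            for branch, x in zip(self.branches, parts):
                out, _ = branch(x)           # (batch, time, hidden)
                feats.append(out[:, -1, :])  # last time step as the part descriptor
            return self.classifier(torch.cat(feats, dim=1))

    # usage on random data shaped like 30-frame skeleton sequences
    model = PartBasedRNN()
    parts = [torch.randn(2, 30, 3 * j) for j in (6, 6, 4, 4, 5)]
    logits = model(parts)  # shape (2, 60)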
-
Apr. 14, 2016 -- Xiaowei Zhang
Paper: Y. Ioannou, D. Robertson, D. Zikic, P. Kontschieder, J. Shotton, M. Brown, A. Criminisi. Decision Forests, Convolutional Networks and the Models in-Between. arXiv:1603.01250, 2016.
-
Apr. 21, 2016 -- Chi Xu
Paper: Yong Du, Wei Wang, Liang Wang. Hierarchical Recurrent Neural Network for Skeleton Based Action Recognition. CVPR, 2015.
-
Jun. 09, 2016 -- Zhuwen Li (guest speaker)
Title: Simultaneous Clustering and Model Selection for Tensor Affinities
Abstract: While clustering has been well studied in the past decade, model selection has drawn less attention. In this talk, we consider model selection in the domain where the affinity relations involve groups of more than two nodes. More specifically, we perform simultaneous clustering and model selection, given some higher-order affinity tensor A with non-negative entries as input. By solving both the clustering and the model selection at once, we are able to exploit the rich structural properties of the problem and coerce the affinity tensor towards an ideal affinity tensor that is globally consistent with the revealed cluster structures. However, solving this problem is non-trivial: 1) the original constraints in our problem are either intractable or even undefined in general in the higher-order case; 2) there is also the practical issue of scalability. To solve the first problem, we transform it into an equivalent form amenable to numerical implementation. To scale to large problem sizes, we propose an alternative formulation, so that it can be efficiently solved via stochastic optimization in an online fashion. We evaluate our algorithm on different applications to demonstrate its superiority, and show it can adapt to a large variety of settings.
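As a small illustration of the "ideal affinity tensor" mentioned above, the NumPy sketch below builds, for a hypothetical third-order case, the tensor that equals 1 exactly on index triples drawn from the same cluster; it only shows the target structure, not the optimization described in the talk.

    # Ideal third-order affinity tensor induced by a (hypothetical) clustering.
    import numpy as np

    labels = np.array([0, 0, 0, 1, 1])            # hypothetical cluster assignment
    n = labels.size
    ideal = np.zeros((n, n, n))
    for c in np.unique(labels):
        y = (labels == c).astype(float)           # indicator vector of cluster c
        ideal += np.einsum('i,j,k->ijk', y, y, y) # rank-one block for cluster c

    # an entry is 1 exactly when all three indices share a cluster
    assert ideal[0, 1, 2] == 1.0
    assert ideal[0, 1, 3] == 0.0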
-
Jun. 16, 2016 -- Satyam
Title: Very deep networks and RNNs - the deepest of them all
Abstract: This talk gives a brief introduction to the new set of very deep architectures that have been proposed recently (Highway nets, FitNets, grad nets, residual nets, etc.). We briefly introduce RNNs and their many variations. We also look at the chronic issues with training RNNs and how they have motivated new architectures over the past 30 years.
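For reference, here is a minimal PyTorch sketch of the skip-connection idea shared by the architectures listed above: a residual block computes x + F(x), while a highway layer gates between a transform and the identity. Layer sizes are illustrative only.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

        def forward(self, x):
            return x + self.f(x)          # identity shortcut eases gradient flow

    class HighwayLayer(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.h = nn.Linear(dim, dim)  # transform
            self.t = nn.Linear(dim, dim)  # gate

        def forward(self, x):
            t = torch.sigmoid(self.t(x))
            return t * torch.relu(self.h(x)) + (1 - t) * x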
-
Jun. 23, 2016 -- Huazhu Fu (guest speaker)
Title: Automatic AS-OCT Segmentation and Measurement
Abstract: The anterior chamber angle (ACA) is a useful adjunct in the diagnosis and treatment of glaucoma. In this paper, we propose an automatic ACA segmentation and measurement method for Anterior Segment Optical Coherence Tomography (AS-OCT) imagery. It is a challenging problem because the location and shape of the iris can vary significantly among different AS-OCT images and patients. To address this problem, we propose to generate initial markers on the cornea and iris boundaries through marker transfer from hand-labeled boundary markers in a set of reference images. The reference images provide guidance for locating approximate boundary points over a diverse set of cases. The initial markers are then refined based on general AS-OCT structural information. These markers facilitate segmentation of major clinical structures which are used to recover standard clinical ACA measurements. These measurements can not only support clinicians in making anatomical assessments, but also be utilized as features for detecting anterior angle closure in automatic glaucoma diagnosis. Experiments on clinical datasets demonstrate the effectiveness of this data-driven approach.
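A highly simplified sketch of the marker-transfer idea described above (in NumPy, using a hypothetical whole-image distance as the similarity measure): pick the most similar hand-labelled reference image and reuse its boundary markers as the initialization, which would then be refined using AS-OCT structural information.

    import numpy as np

    def transfer_markers(scan, ref_images, ref_markers):
        # scan: (H, W); ref_images: (n_refs, H, W); ref_markers: list of (n_points, 2) arrays
        dists = [np.linalg.norm(scan - ref) for ref in ref_images]
        best = int(np.argmin(dists))     # most similar hand-labelled reference
        return ref_markers[best].copy()  # initial markers, to be refined later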
-
Jun. 30, 2016 -- Lakshmi Govindaraja
Paper: A. B. Wiltschko, M. J. Johnson, G. Iurilli, R. E. Peterson, J. M. Katon, S. L. Pashkovski, V. E. Abraira, R. P. Adams, and S. R. Datta. Mapping Sub-Second Structure in Mouse Behavior. Neuron, 2015.
-
Jul. 21, 2016 -- Hoang Quang Minh (guest speaker)
Title: A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models. ICML, 2016.
Abstract: This paper presents a novel distributed variational inference framework that unifies many parallel sparse Gaussian process regression (SGPR) models for scalable hyperparameter learning with big data. To achieve this, our framework exploits a structure of correlated noise process model that represents the observation noises as a finite realization of a high-order Gaussian Markov random process. By varying the Markov order and covariance function for the noise process model, different variational SGPR models result. This consequently allows the correlation structure of the noise process model to be characterized for which a particular variational SGPR model is optimal. We empirically evaluate the predictive performance and scalability of the distributed variational SGPR models unified by our framework on two real-world datasets.
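As a toy illustration of the correlated-noise modelling assumption mentioned in the abstract (not the paper's distributed variational framework), the NumPy sketch below runs exact GP regression with a banded, Markov-style noise covariance in place of the usual i.i.d. noise term; the kernel, band width, and data are arbitrary.

    import numpy as np

    def rbf(X, Y, ls=1.0):
        # squared-exponential kernel on 1-D inputs
        return np.exp(-0.5 * (X[:, None] - Y[None, :]) ** 2 / ls ** 2)

    rng = np.random.default_rng(0)
    X = np.linspace(0, 5, 50)
    y = np.sin(X) + 0.1 * rng.standard_normal(50)

    K = rbf(X, X)
    # banded noise covariance: neighbouring observations share correlated noise
    noise = 0.01 * (np.eye(50) + 0.5 * np.eye(50, k=1) + 0.5 * np.eye(50, k=-1))

    Xs = np.linspace(0, 5, 100)
    mean = rbf(Xs, X) @ np.linalg.solve(K + noise, y)  # posterior mean at test inputs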
-
Aug. 04, 2016 -- Yu Zhang
Paper: Markus Oberweger, Gernot Riegler, Paul Wohlhart, and Vincent Lepetit. Efficiently Creating 3D Training Data for Fine Hand Pose Estimation. CVPR, 2016.
-
Aug. 11, 2016 -- Xiaowei Zhang
Paper: Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. Learning to learn by gradient descent by gradient descent. arXiv:1606.04474, 2016.
-
Aug. 18, 2016 -- Sergey Kushnarev (guest speaker)
Paper: Jia Du, Alvina Goh, Sergey Kushnarev, and Anqi Qiu. Geodesic Regression on Orientation Distribution Functions with its Application to an Aging Study. NeuroImage, 2014.
-
Aug. 25, 2016 -- Liu Jun (guest speaker)
Paper: Jun Liu, Amir Shahroudy, Dong Xu, and Gang Wang. Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition. ECCV, 2016.
-
Sep. 01, 2016 -- Li Cheng
Title: Information Bottleneck and Applications
-
Sep. 08, 2016 -- Chi Xu
Paper: C. Xu, L. Zhang, L. Cheng, R. Koch. Pose Estimation from Line Correspondences: A Complete Analysis and A Series of Solutions. IEEE T-PAMI, 2016.
-
Sep. 22, 2016 -- Dr. Jiashi Feng (guest speaker)
Title: Deep Learning Solutions for Visual World Understanding
Abstract: Deep learning has significantly revolutionized computer vision research. In this talk, I will first briefly introduce the deep learning approaches developed by my group for solving several fundamental computer vision research problems as well as their applications in practice, including face analytics, human behavior understanding, urban scene understanding, and generic object recognition. Then I will concentrate on presenting solutions to several practical issues in these applications that must be addressed for computers to understand the visual world more intelligently, such as how to achieve robustness to various noisy signals, how to integrate contextual information effectively, and how to build a network that keeps learning in an open-ended way. Towards solving these issues, I will introduce three types of new deep neural network models in detail: recurrent attentive neural networks, multi-path feedback networks, and self-learning networks. Finally, I will conclude by introducing some trends and problems in modern computer vision research.
-
Sep. 29, 2016 -- Lakshmi Govindaraja
Paper: Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas. Tracking-Learning-Detection. IEEE T-PAMI, 2012.
-
Oct. 06, 2016 -- He Zhao
Paper: Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, and Deva Ramanan. PixelNet: Towards a General Pixel-Level Architecture. arXiv:1609.06694v1, 2016.
-
Oct. 20, 2016 -- Xiaowei Zhang
Paper: Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. CVPR, 2016.
-
Nov. 17, 2016 -- Sai-Kit Yeung (guest speaker)
Title: Data-driven Computer Graphics Modeling
-
Nov. 24, 2016 -- Li Cheng
Paper: J. Caicedo and S. Lazebnik. Active Object Localization with Deep Reinforcement Learning. ICCV, 2015.
-
Dec. 01, 2016 -- Yu Zhang
Paper: Sergey Ioffe and Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015.
-
Dec. 08, 2016 -- Chi Xu
Paper: Jiajun Wu, Chengkai Zhang, Tianfan Xue, William Freeman, and Joshua Tenenbaum. Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling. NIPS, 2016.
-
Dec. 15, 2016 -- Xiaowei Zhang
Title: Learning Structured and Contextual Features for 2D and 3D Filamentary Segmentation
-
Dec. 22, 2016 -- Lakshmi Govindaraja
Paper: Sven Eberhardt, Jonah Cader and Thomas Serre. How Deep is the Feature Analysis underlying Rapid Visual Categorization? NIPS, 2016.