The characteristics of the nuclei are often observed by pathologists when they assess the progression and presence of cancer cells in tissue biopsies. Cancerous tissue typically contains cells with enlarged, irregularly-shaped (pleomorphic) and darkly-stained (hyperchromatic) nuclei with prominent nucleoli. However, at different stages of the disease, the nuclear structure and prominence of nucleoli can change. The Fuhrman grading system for clear cell Renal Cell Carcinoma (ccRCC) was developed around these observed changes in the nuclei. It provides rules to classify the different stages of disease progression. Early-stage ccRCC tumors typically have small, round nuclei with inconspicuous nucleoli, while late-stage tumors have enlarged and irregularly-shaped nuclei with prominent nucleoli. Following on from our work on nucleoli detection, we have developed new machine learning methodologies to perform automatic grading of ccRCC histopathological images. From the histopathological images, we extract features describing the properties of multiple nuclei concurrently. This enables us to train classifiers that can distinguish the level of pleomorphism of the nuclei in the tissue sample, resulting in higher accuracy in the automated grading.
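To illustrate the idea of describing multiple nuclei concurrently, the sketch below pools hypothetical per-nucleus measurements (area and circularity; the function name and values are our own, not taken from the paper) into sample-level statistics, so the resulting feature vector reflects the degree of pleomorphism across the whole tissue sample rather than any single nucleus.

```python
import statistics

# Minimal sketch (assumed feature set, not the paper's): per-nucleus
# measurements are aggregated into a sample-level feature vector.

def multi_nucleus_features(areas, circularities):
    """Sample-level features summarizing all detected nuclei at once."""
    return [
        statistics.mean(areas),
        statistics.stdev(areas),        # high spread -> pleomorphic nuclei
        statistics.mean(circularities),
        min(circularities),             # most irregular nucleus in the sample
    ]

areas = [52.0, 48.5, 95.2, 60.1, 88.7]   # hypothetical nucleus areas (um^2)
circ = [0.91, 0.88, 0.52, 0.85, 0.61]    # 1.0 = perfect circle
print(multi_nucleus_features(areas, circ))
```

A classifier trained on such concurrent statistics can separate samples with uniformly small, round nuclei (low grade) from samples with highly variable, irregular nuclei (high grade).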
The visual assessment and severity grading of acne vulgaris by physicians can be subjective, resulting in inter‐ and intra‐observer variability. In this project, we developed and validated a deep learning algorithm for the automated calculation of the Investigator's Global Assessment (IGA) scale, to standardize acne severity and outcome measurements. The best classification accuracy was 67%. The Pearson correlation between the machine‐predicted score and human labels (clinical scoring and researcher scoring), for each model and various image input sizes, was 0.77. Correlation of predictions with clinical scores was highest when using Inception v4 on the largest image size of 1200 × 1600. The two sets of human labels showed a high correlation of 0.77, verifying the repeatability of the ground-truth labels. Confusion matrices show that the models performed sub‐optimally on the IGA 2 label.
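The agreement measure used above can be made concrete with a short sketch: the Pearson correlation between a model's predicted grades and human-assigned grades. The score lists below are illustrative, not the study's data.

```python
# Sketch: Pearson correlation between predicted and human IGA grades,
# the agreement statistic reported for the grading models.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

predicted = [0, 1, 2, 2, 3, 4, 1, 3]   # model-predicted IGA grades (hypothetical)
clinical = [0, 1, 1, 2, 3, 4, 2, 3]    # clinician-assigned grades (hypothetical)
print(round(pearson_r(predicted, clinical), 3))  # -> 0.917
```

The same statistic, computed between the two sets of human labels themselves, is what verifies the repeatability of the ground truth.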
A weakly supervised learning based clustering framework is proposed in this paper. As the core of this framework, we introduce a novel multiple instance learning task based on a bag level label called unique class count (ucc), which is the number of unique classes among all instances inside the bag. In this task, no annotations on individual instances inside the bag are needed during training of the models. We mathematically prove that with a perfect ucc classifier, perfect clustering of individual instances inside the bags is possible even when no annotations on individual instances are given during training. We have constructed a neural network based ucc classifier and experimentally shown that the clustering performance of our framework with our weakly supervised ucc classifier is comparable to that of fully supervised learning models where labels for all instances are known. Furthermore, we have tested the applicability of our framework to a real world task of semantic segmentation of breast cancer metastases in histological lymph node sections and shown that the performance of our weakly supervised framework is comparable to the performance of a fully supervised Unet model.
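The bag-level label at the core of this framework is simple to state in code: the ucc of a bag is the number of distinct instance classes it contains. In the sketch below, instance labels are used only to derive the bag label; the ucc classifier itself never sees them during training.

```python
# Sketch of the bag-level "unique class count" (ucc) label: the number of
# distinct instance classes present in a bag. No per-instance annotations
# are exposed to the model -- only this single count per bag.

def unique_class_count(instance_labels):
    """Bag-level ucc label: number of unique classes among the bag's instances."""
    return len(set(instance_labels))

bag_a = [0, 0, 0, 0]       # pure bag        -> ucc = 1
bag_b = [0, 1, 1, 0]       # two classes     -> ucc = 2
bag_c = [2, 0, 1, 2, 1]    # three classes   -> ucc = 3
print([unique_class_count(b) for b in (bag_a, bag_b, bag_c)])  # -> [1, 2, 3]
```

In the lymph node application, for example, a section containing only normal tissue or only metastasis forms a ucc = 1 bag, while a section containing both forms a ucc = 2 bag.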
Anomaly detection is a classical problem where the aim is to detect anomalous data that do not belong to the normal data distribution. Current state-of-the-art methods for anomaly detection on complex high-dimensional data are based on the generative adversarial network (GAN). However, the traditional GAN loss is not directly aligned with the anomaly detection objective: it encourages the distribution of the generated samples to overlap with the real data, and so the resulting discriminator is ineffective as an anomaly detector. In this paper, we propose modifications to the GAN loss such that the generated samples lie at the boundaries of the real data distribution. With our modified GAN loss, our anomaly detection method, called Fence GAN (FGAN), directly uses the discriminator score as an anomaly threshold. Our experimental results on the MNIST, CIFAR10 and KDD99 datasets show that FGAN yields the best anomaly classification accuracy compared to state-of-the-art methods.
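A minimal numeric sketch of our reading of the modified generator objective is given below (the function name, the loss decomposition into an "encirclement" and a "dispersion" term, and the `alpha`/`beta` values are assumptions for illustration, not the published implementation): the first term pushes the discriminator's score on generated samples towards a target strictly between the real and fake labels, and the second penalizes generated points collapsing together, so samples spread along the boundary of the real data distribution.

```python
import math

# Sketch (our reading, not the published code) of a Fence GAN style
# generator loss: encirclement pulls D(G(z)) towards alpha in (0, 1);
# dispersion penalizes a small mean distance of generated points from
# their centroid, discouraging mode collapse onto one boundary point.

def fence_generator_loss(disc_scores, gen_points, alpha=0.5, beta=0.1):
    n = len(disc_scores)
    # Encirclement: cross-entropy of the discriminator scores against alpha.
    encirclement = -sum(
        alpha * math.log(s) + (1 - alpha) * math.log(1 - s) for s in disc_scores
    ) / n
    # Dispersion: inverse of the mean distance of points from their centroid.
    dim = len(gen_points[0])
    centroid = [sum(p[d] for p in gen_points) / n for d in range(dim)]
    mean_dist = sum(math.dist(p, centroid) for p in gen_points) / n
    return encirclement + beta / mean_dist

# Scores already at the target alpha, with well-spread points, give a low loss:
print(fence_generator_loss([0.5, 0.5], [(0.0, 0.0), (2.0, 0.0)]))
```

At test time the trained discriminator's score on a new sample is compared against a threshold directly, with low scores flagged as anomalous.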
Adversarial attacks on convolutional neural networks (CNN) have gained significant attention and there have been active research efforts on defense mechanisms. Stochastic input transformation methods have been proposed, where the idea is to recover the image from adversarial attack by random transformation, and to take the majority vote as consensus among the random samples. However, the transformation improves the accuracy on adversarial images at the expense of the accuracy on clean images. While it is intuitive that the accuracy on clean images would deteriorate, the exact mechanism by which this occurs is unclear. In this paper, we study the distribution of softmax induced by stochastic transformations. We observe that with random transformations on the clean images, although the mass of the softmax distribution could shift to the wrong class, the resulting distribution of softmax could be used to correct the prediction. Furthermore, on the adversarial counterparts, with the image transformation, the resulting shapes of the distribution of softmax are similar to the distributions from the clean images. With these observations, we propose a method to improve existing transformation-based defenses. We train a separate lightweight distribution classifier to recognize distinct features in the distributions of softmax outputs of transformed images. Our empirical studies show that our distribution classifier, by training on distributions obtained from clean images only, outperforms majority voting for both clean and adversarial images. Our method is generic and can be integrated with existing transformation-based defenses.
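The two aggregation strategies can be contrasted with a short sketch: plain majority voting over the argmax classes of the softmax samples, versus building a per-class distribution feature vector that a lightweight classifier could be trained on instead. The function names and the histogram-based featurization below are our own illustration, not the paper's exact feature construction.

```python
from collections import Counter

# Sketch: aggregating softmax outputs over N randomly transformed copies
# of one image -- majority voting vs. a distribution feature vector
# (illustrative featurization, not the paper's exact construction).

def majority_vote(softmax_samples):
    """Predict by majority vote over per-sample argmax classes."""
    votes = Counter(max(range(len(s)), key=s.__getitem__) for s in softmax_samples)
    return votes.most_common(1)[0][0]

def distribution_features(softmax_samples, bins=4):
    """Histogram each class's softmax values over the samples -> flat features."""
    n_classes = len(softmax_samples[0])
    feats = []
    for c in range(n_classes):
        hist = [0] * bins
        for s in softmax_samples:
            hist[min(int(s[c] * bins), bins - 1)] += 1
        feats.extend(h / len(softmax_samples) for h in hist)
    return feats

# Three softmax vectors from three random transforms of the same image:
samples = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.4, 0.1]]
print(majority_vote(samples))  # -> 0 (class 0 wins 2 votes to 1)
```

Majority voting discards everything but the argmax of each sample; a classifier trained on the distribution features retains the shape of the softmax distribution, which is what allows wrong-but-recoverable predictions on transformed clean images to be corrected.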
Deep learning methods have shown superior performance on many machine learning tasks and applications, due to their ability to model high-level abstractions in the data through multiple processing layers in the neural network. Drawbacks of deep learning include the need for deep networks, which results in non-intuitive interpretation of results, long training processes and inherent linearity of information propagation. Working on the principle of multiple processing layers, we generalize the traditional neural network so that nodes encode probability distributions instead of real values. With our network, we are able to perform regression and classification tasks on probability distributions instead of on a single variable. Unlike the conventional neural network, our network exhibits non-linear level sets in the transformation in each node, increasing the degree of non-linearity in each processing layer. We have tested our network on several datasets for distribution-to-distribution regression and showed that it uses much less training data than an existing method which uses instance-based learning. Our network easily extends to other tasks such as classification of distributions and distribution-to-real regression, with applications in various fields like population studies, bioinformatics and finance.
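The notion of a node that takes distributions rather than scalars can be pictured with a deliberately simplified sketch: each input is a histogram over a fixed support, the node mixes its inputs with weights, applies a non-linear tilt, and renormalizes so the output is again a distribution. This is our own toy illustration of propagating distributions through a layer, not the paper's actual propagation rule.

```python
import math

# Toy illustration (not the paper's propagation rule): a node that maps
# input probability histograms to an output histogram. The exponential
# tilt makes the node's transformation non-linear, and the final
# normalization keeps the output a valid distribution.

def distribution_node(input_hists, weights, tilt=1.0):
    bins = len(input_hists[0])
    mixed = [sum(w * h[b] for w, h in zip(weights, input_hists)) for b in range(bins)]
    tilted = [math.exp(tilt * m) for m in mixed]
    z = sum(tilted)
    return [t / z for t in tilted]

h1 = [0.7, 0.2, 0.1]   # input distribution over 3 bins
h2 = [0.1, 0.2, 0.7]
out = distribution_node([h1, h2], weights=[0.5, 0.5])
print([round(o, 3) for o in out])
```

Stacking such nodes in layers gives a network whose forward pass carries whole distributions, which is what enables regression and classification directly on distributions.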
The diagnosis and prognosis of cancers are a major task for a trained pathologist. Inter-observer variability and the tediousness of tissue reading hamper the accuracy of the pathologist's assessment. The analysis of prominent nucleoli is one of the main methods of cancer assessment. We have developed intelligent software to improve the accuracy and reduce the labor of prominent-nucleoli assessment on H&E-stained slides. Our method fits easily into the pathologist's existing workflow.