“Deep Learning is an algorithm which has no theoretical limitations of what it can learn; the more data you give and the more computational time you provide, the better it is”
– Geoffrey Hinton (Google)
The beginning ….
The digitization of tissue glass slides was a revolutionary step that opened up exciting possibilities in digital pathology. Over the years, the field has evolved gradually toward less manual intervention and a more automated digital pathology workflow. In the initial phase of digital pathology, traditional computer vision methodologies, which were better suited to radiology, were used for tissue detection, segmentation, morphometry, etc. The main obstacles to accurate interpretation were variability in slide staining and slide preparation, and the different makes of scanners available in the market.
The limited information available in pixel intensity space was insufficient for reliable and consistent outputs. Moreover, domain experts base their interpretations on years of knowledge and experience, which needed to be captured in the image processing tools. The complexity and variety of image analysis problems necessitated the use of machine learning techniques.
The supervised machine learning approach to image analysis problems is to train a statistical model on a set of training images that domain experts have labelled as ground truth. The model maps the features computed by image analysis algorithms to the output labels.
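As a rough illustration of mapping computed features to expert labels, here is a minimal sketch of a nearest-centroid classifier in plain Python. The feature names (nucleus area, mean optical density), the labels and all the numbers are hypothetical, chosen only to make the example readable. A real pipeline would use many more features and a stronger model.

```python
from collections import defaultdict
from math import dist


def fit_centroids(features, labels):
    """Average the feature vectors of each class into one centroid per label."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Hypothetical handcrafted features per nucleus: (area, mean optical density).
# The labels stand in for ground truth supplied by a domain expert.
train_features = [(30.0, 0.40), (35.0, 0.45), (80.0, 0.90), (90.0, 0.85)]
train_labels = ["normal", "normal", "atypical", "atypical"]

model = fit_centroids(train_features, train_labels)
print(predict(model, (32.0, 0.42)))
print(predict(model, (85.0, 0.88)))
```

The training step only summarizes the labelled examples. All of the domain knowledge sits in the choice of features and in the labels, which is exactly where the handcrafted approach depends on experts.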
Efficient algorithms have been researched and developed for the individual image analysis modules, including image pre-processing to improve initial image quality and segmentation techniques for detecting foreground objects. Other issues, such as staining variation, have been handled by stain normalization techniques borrowed from color normalization in photography, such as histogram equalization. Additional techniques like color deconvolution separate the components of histological stains (e.g. DAB, AEC, H&E), which cannot easily be separated by splitting the image into the red, green and blue channels recorded by color cameras.
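Histogram equalization, mentioned above, can be sketched in a few lines. This is a minimal single-channel version in plain Python, assuming an 8-bit grayscale image stored as rows of integers. The sample pixel values are made up, and a real slide would be processed per channel or on a luminance channel with an imaging library.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as rows of ints in [0, levels)."""
    pixels = [value for row in image for value in row]
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    # Cumulative distribution function of the intensity values.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # A constant image has no contrast to stretch; return a copy.
        return [list(row) for row in image]

    # Standard remapping: spread the CDF over the full intensity range.
    lookup = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]


# Hypothetical low-contrast patch: all intensities lie between 100 and 130.
faded = [[100, 100, 110], [110, 120, 120], [120, 130, 130]]
equalized = equalize_histogram(faded)
print(equalized)
```

The remapping preserves the ordering of intensities but stretches them across the whole range, which reduces the effect of a uniformly faded or overly dark scan.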
However, feature engineering is a crucial aspect of conventional machine learning techniques. Traditional handcrafted features rely heavily on the domain expertise of pathologists and clinicians. Finding an informative, discriminative and independent set of features for training the machine learning model is complex. Sometimes it is difficult to explicitly define a feature that is interpretable and intuitive to the user and the domain experts. Since most of the features are derived from pixel intensity space, they are not invariant to differences between input images or to staining variations.
With the advancements in digital pathology, a high volume of quality digitized data is available to algorithm developers, scientists and pathologists around the world. The integration of telepathology into the clinician’s workflow has resulted in greater collaborative work. With the advent of cloud computing and high-end processors, computing resources are available like never before. The environment is conducive to a novel approach to image analysis problems in digital pathology, known as deep learning: a learning system with multi-layered neural network architectures.
We saw the limitations of handcrafted features in conventional machine learning approaches. Deep learning takes feature engineering to the next level by automating it. There are DL methodologies that learn directly from the raw data and map it to the intended goals. Combining handcrafted features with DL-discovered features can lead to reproducible and accurate results for both prediction and classification problems. The system can be partially supervised, in the sense that the initial training is done with labelled ground truth and the system then builds on this training with unsupervised learning. However, some deep learning algorithms can become computationally expensive when dealing with high-dimensional image data, because learning a hierarchy of data abstractions and representations across layers is often slow. Convolutional neural networks scale effectively to high-dimensional data.
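One reason convolutional networks cope with high-dimensional images is weight sharing: the same small kernel is applied at every position, so the number of weights does not grow with image size. The sketch below illustrates this in plain Python under stated assumptions. The tile, the fixed edge kernel and the layer sizes are hypothetical, biases are ignored, and in a real CNN the kernel values are learned rather than written by hand.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (cross-correlation, no padding)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def dense_weight_count(height, width, units):
    """Weights in a fully connected layer reading every pixel of one channel."""
    return height * width * units


def conv_weight_count(kernel_size, filters):
    """Weights in a single-channel convolutional layer, independent of image size."""
    return kernel_size * kernel_size * filters


# Hand-written vertical-edge kernel, standing in for a learned filter.
edge_kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
# Hypothetical tile: dark background on the left, stained region on the right.
tile = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

feature_map = conv2d_valid(tile, edge_kernel)
print(feature_map)
print(dense_weight_count(256, 256, 32), conv_weight_count(3, 32))
```

The feature map responds only where the kernel window contains the boundary between the two regions, and the weight counts show the gap between a dense layer and a convolutional layer on a 256 x 256 tile.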
The development of convolutional neural networks (CNNs), loosely modelled on the human brain, as deep networks for analyzing and classifying image patterns has revolutionized medical imaging through DL. According to studies [Janowczyk et al. 2016], CNNs are the basis of some recent outstanding breakthroughs in the analysis of digital pathology images, with applications such as mitosis detection, nucleus segmentation, gland segmentation and metastasis detection.
Fusing modalities for better predictive modelling could be one application area of deep learning based algorithms. Content-based image retrieval for research and diagnostic purposes, utilizing the vast digitized databases, could be another.
Despite the possibilities presented by CNNs and DL algorithms in general, they have some limitations, which are being overcome. The primary limitations are:
- Deep Learning requires huge amounts of training data
- Deep Learning requires extensive computing power
- Architectures can be complex and must often be highly tailored to a specific application
- The resulting models may not be easily interpretable
Field-programmable gate arrays (FPGAs), graphics processing units (GPUs) and application-specific integrated circuits (ASICs) are being explored to exploit the parallelism in the computational structure of neural networks more effectively than CPUs can.
Besides training time, the major problem with these networks is overfitting in domains that offer very small amounts of data. Many regularization methods are being developed to prevent overfitting.
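The text does not name a specific regularization method, so as one common example, here is a minimal sketch of L2 regularization (weight decay) on a tiny linear model in plain Python. The data, the learning rate and the penalty strength are all hypothetical. Deep networks use the same idea alongside other methods such as dropout and data augmentation.

```python
def train_linear(xs, ys, weight_decay=0.0, learning_rate=0.1, steps=200):
    """Fit y ~ w . x by gradient descent on mean squared error plus an L2 penalty."""
    n_features = len(xs[0])
    weights = [0.0] * n_features
    for _ in range(steps):
        gradient = [0.0] * n_features
        for x, y in zip(xs, ys):
            error = sum(w * xi for w, xi in zip(weights, x)) - y
            for k in range(n_features):
                gradient[k] += 2.0 * error * x[k] / len(xs)
        # The penalty weight_decay * ||w||^2 adds 2 * weight_decay * w to the gradient.
        weights = [
            w - learning_rate * (g + 2.0 * weight_decay * w)
            for w, g in zip(weights, gradient)
        ]
    return weights


def squared_norm(weights):
    """Sum of squared weights, the quantity the L2 penalty keeps small."""
    return sum(w * w for w in weights)


# Tiny hypothetical training set: three samples, two features.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [1.0, -1.0, 0.5]

plain = train_linear(xs, ys)
regularized = train_linear(xs, ys, weight_decay=0.5)
print(squared_norm(plain), squared_norm(regularized))
```

The penalized model ends up with smaller weights, which limits how closely it can chase the few training points available. The price is a slightly worse fit on the training data itself.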
These drawbacks are being addressed, because the possibilities that deep learning opens up in digital pathology are mind-boggling.