Cell types, basement membranes, and the connective structures that organize tissues and tumors span length scales from microscopic organelles to entire organs (roughly 0.1 to >10⁴ µm). Hematoxylin and eosin (H&E) staining and immunohistochemical microscopy have long been the methods of choice for examining tissue architecture, and clinical histopathology remains the primary means of diagnosing and managing diseases such as cancer. However, classical histology provides too little molecular information to properly classify disease genes, analyze developmental pathways, or identify cell subtypes.
High-plex imaging of healthy and diseased tissues (also known as spatial proteomics) makes it possible to identify cell types, assess cell states (dormant, proliferating, dying, etc.), and study cell signaling pathways. In a preserved 3D environment, high-plex imaging also depicts the morphologies and locations of the acellular structures required for tissue integrity. The resolution, field of view, and marker diversity (plex) of high-plex imaging modalities vary, but they all produce 2D images of tissue sections that are typically 5–10 µm thick.
The single-cell data generated by segmentation and quantification of multiplexed images complement single-cell RNA sequencing (scRNA-seq) data, which have greatly advanced our understanding of healthy and diseased cells and tissues. Unlike dissociative scRNA-seq, however, multiplexed tissue imaging preserves morphological and spatial information. At the same time, high-plex tissue images are far more difficult to evaluate computationally than the images of cultured cells that have hitherto been the focus of biologically oriented imaging systems.
Techniques for segmenting metazoan cells have undergone extensive development; however, segmenting tissue images poses a more difficult problem due to cell crowding and the diversity of cell shapes. As with the ubiquitous application of convolutional neural networks (CNNs) in image classification, object recognition, and image synthesis, machine learning-based segmentation algorithms have recently gone mainstream. Architectures such as ResNet, VGG16, and more recently UNet and Mask R-CNN have gained wide acceptance due to their ability to learn millions of parameters and generalize across datasets.
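The core idea behind UNet, an encoder that progressively downsamples the image and a decoder that upsamples it while fusing in skip connections from matching encoder scales, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; real UNets use learned convolutions at every stage, whereas this toy version only mimics the resolution-changing skeleton:

```python
import numpy as np

def downsample(x):
    # 2x2 max pooling: halve spatial resolution (UNet encoder step)
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample(x):
    # Nearest-neighbour upsampling: double spatial resolution (decoder step)
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_like(x):
    # Encoder path: progressively coarser feature maps
    e1 = x
    e2 = downsample(e1)
    bottleneck = downsample(e2)
    # Decoder path: upsample and fuse the skip connection at each scale,
    # reintroducing fine spatial detail lost during pooling
    d2 = upsample(bottleneck) + e2   # skip connection from e2
    d1 = upsample(d2) + e1           # skip connection from e1
    return d1

img = np.random.rand(64, 64)
out = unet_like(img)
print(out.shape)  # (64, 64): output resolution matches the input
```

The skip connections are what make this family of architectures attractive for segmentation: the output must be pixel-accurate, so the decoder cannot rely on coarse features alone.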
Since most cell types have only one nucleus, localizing cell nuclei is an ideal starting point for segmenting cultured cells and tissues, and nuclear stains with high signal-to-noise ratios are widely available. Researchers have previously proposed random forest-based approaches that use an ensemble of decision trees to assign class probabilities on a pixel-by-pixel basis, drawing on multiple image channels for classification. A major disadvantage of random forest models, however, is that they are far less adaptive than CNNs. Much therefore remains to be learned about how CNNs trained on multi-channel data can improve nuclear segmentation.
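The per-pixel random forest approach described above can be sketched with scikit-learn. This is a toy illustration, not the paper's code: each pixel becomes one sample whose features are its intensities in the available channels (here, two hypothetical stain channels), and the forest returns a class probability per pixel:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy two-channel 32x32 image (e.g. a DNA stain and a second marker channel)
dna = rng.random((32, 32))
other = rng.random((32, 32))

# Synthetic ground truth: call a pixel "nucleus" when the DNA channel is bright
labels = (dna > 0.6).astype(int)

# Each pixel is one sample; its features are the per-channel intensities
X = np.stack([dna.ravel(), other.ravel()], axis=1)   # shape (1024, 2)
y = labels.ravel()

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Per-pixel class probabilities, reshaped back into an image
prob_nucleus = forest.predict_proba(X)[:, 1].reshape(32, 32)
print(prob_nucleus.shape)  # (32, 32)
```

Because each pixel is classified from a handcrafted feature vector, the model cannot learn spatial context the way a CNN does, which is the adaptivity gap the paragraph above refers to.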
The most popular way of expanding training data to account for image artifacts is computational augmentation, in which images are randomly rotated, sheared, flipped, etc. during preprocessing. This prevents algorithms from learning irrelevant information about an image, such as its orientation. Focus artifacts have so far been addressed by adding computed Gaussian blur to the training data. However, Gaussian blur is only a rough approximation of the blur produced by any band-limited optical imaging device, such as a real microscope, and of the effects of mismatched refractive indices and light scattering.
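A conventional augmentation pipeline of the kind described above, random geometric transforms plus the Gaussian-blur approximation of defocus, can be sketched with NumPy and SciPy. This is illustrative only; production pipelines typically use dedicated libraries, and the specific probabilities and sigma range here are arbitrary choices:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(42)

def augment(image):
    """Apply one random geometric transform, then an optional Gaussian blur."""
    # Random 90-degree rotation and flips: the network should not learn
    # anything from image orientation
    image = np.rot90(image, k=int(rng.integers(0, 4)))
    if rng.random() < 0.5:
        image = np.flip(image, axis=0)
    if rng.random() < 0.5:
        image = np.flip(image, axis=1)
    # Gaussian blur as a crude stand-in for defocus; as noted above, this only
    # roughly approximates real microscope blur, refractive-index mismatch,
    # and light scattering
    if rng.random() < 0.5:
        image = gaussian_filter(image, sigma=float(rng.uniform(0.5, 2.0)))
    return np.ascontiguousarray(image)

img = rng.random((64, 64))
aug = augment(img)
print(aug.shape)  # (64, 64)
```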
This research examines machine learning methods for improving the segmentation of multiplexed tissue images containing typical imaging artifacts. By manually annotating a variety of normal tissues and tumors, the authors created a training and test set with ground-truth labels. They then used these data to measure the segmentation accuracy of three deep learning networks, each trained and tested independently: UNet, Mask R-CNN, and the Pyramid Scene Parsing Network (PSPNet). The resulting models form a set of Universal Models for Identifying Cells and Segmenting Tissue (UnMICST), each based on a different type of ML network but trained on the same data. The study identified two strategies that increase segmentation accuracy for all three networks. The first combines images of nuclear chromatin stained with DNA-intercalating dyes with images of nuclear envelope staining (NES). The second involves real augmentations, defined here as intentionally defocused and oversaturated images added to the training data to harden models against the kinds of artifacts seen in actual tissue images. The authors find that real augmentation far outperforms conventional Gaussian-blur augmentation and dramatically improves model robustness. The benefits of incorporating NES data and real augmentations are cumulative across tissue types.
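The data-level side of these two strategies can be mimicked in a simplified NumPy sketch: stacking the DNA and NES stains into a multi-channel input, and simulating oversaturation by clipping. Note the hedge in the second part: in the paper, the defocused and oversaturated images are separately *acquired* on the microscope, not computed, so the clipping below is only a rough stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two co-registered stains per field: DNA (chromatin) and nuclear envelope (NES)
dna = rng.random((64, 64))
nes = rng.random((64, 64))

# Strategy 1: present both stains to the network as a two-channel input
multichannel = np.stack([dna, nes], axis=-1)   # shape (64, 64, 2)

# Strategy 2 (proxy): oversaturate by scaling and clipping intensities.
# The paper's "real augmentations" are actual overexposed/defocused
# acquisitions; this synthetic clip only approximates the effect.
def oversaturate(image, gain=3.0):
    return np.clip(image * gain, 0.0, 1.0)

sat = oversaturate(multichannel)
print(multichannel.shape)  # (64, 64, 2)
```

During training, each such multi-channel field and its augmented variants would share the same ground-truth mask, so the network learns that saturated or blurred nuclei are still nuclei.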
Check out the paper and code. All credit for this research goes to the researchers on this project. Also, don’t forget to join our Reddit page and Discord channel, where we share the latest AI research news, cool AI projects, and more.
Aneesh Tickoo is a Consulting Intern at MarktechPost. He is currently pursuing his bachelor’s degree in Data Science and Artificial Intelligence at Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects that aim to harness the power of machine learning. His research interest is image processing and he is passionate about developing solutions for it. He loves interacting with people and working on interesting projects.