Pathology advances with GPUs

The Laboratory for Spectral Diagnosis at Northeastern University, led by Professor Max Diem, focuses its research on the spectral diagnosis of disease. The lab develops methods to investigate human cells, tissues, or body fluids spectroscopically in order to arrive at a diagnosis of disease based on objective measurements and mathematical-statistical procedures. The spectroscopic techniques used are methods of vibrational spectroscopy, namely Raman scattering and Fourier transform infrared absorption spectroscopy. Since disease changes the (bio)chemical composition of cells, tissues, and body fluids, molecular fingerprint techniques such as vibrational spectroscopy can be used to monitor these changes and interpret them in terms of a medical diagnosis.

Since disease changes the chemical composition on the cellular level, it is necessary to carry out the spectroscopic measurements on a microscopic level, matched in size to the typical size of cells. Using modern infrared and Raman micro-spectrometers, in which the vibrational spectrum is measured through microscopes, it is possible to acquire spectroscopic data from very small samples. The spatial resolution permits detection of subcellular components (mitochondria, nucleoli, condensed chromatin) and opens new avenues of monitoring cellular processes using the inherent spectral properties of molecular constituents. This method eliminates the need for stains or marker molecules.


Figure 1. Photomicrograph (left) of a stained lymph node thin section, containing metastatic cancer (circled), and a pseudo-color spectral map of the same section, acquired before staining. The spectral map was assembled by hierarchical cluster analysis.

Professor Diem talks about pathologists and describes what you are seeing in Figure 1.


The lab also works on tissue diagnostics, pursuing the detection of primary tumors in a variety of tissue types (e.g., colon, cervix, breast, prostate) and of secondary (metastatic) tumors in lymph nodes. Figure 1, for example, shows an infrared pseudo-color map based on 40,000 individual infrared spectra acquired from a lymph node thin section and analyzed by an unsupervised technique of multivariate statistics (hierarchical cluster analysis). The map shows the detailed location of a colon adenocarcinoma metastasis in this lymph node. Unsupervised methods of analysis require no reference data sets and are based solely on the detection of spectral similarity or dissimilarity using pattern recognition algorithms. The results shown in Figure 1 demonstrate that small but reproducible spectral differences between different tissue types, and between normal and diseased tissue, can be exploited using the statistical methods indicated. The results of such unsupervised analyses are subsequently correlated with tissue histopathology and used to train diagnostic algorithms based, for example, on Artificial Neural Network (ANN) methodology.


To facilitate algorithm development for these diagnostic methods, the lab has implemented the required mathematical and statistical algorithms in the MATLAB language. The software developed for the analysis of spectral imaging data sets is now commercially available at

One element of the hyperspectral image analysis workflow that requires more than a traditional desktop workstation or personal computer is hierarchical cluster analysis (HCA). HCA requires a large amount of memory and computation time (~11 hours) for typical datasets on a single-processor personal computer. Rather than taking the traditional approach of moving to a lower-level programming language like C or C++ and complex parallel programming paradigms such as OpenMP or the Message Passing Interface (MPI), the lab used graphics processing units (GPUs) and the Jacket software platform. This solution allowed the lab to dramatically increase the performance of the analysis while substantially reducing the calendar time needed to reach the desired performance.
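
To give a rough sense of why HCA needs so much memory, the sketch below estimates the size of a pairwise similarity matrix between all spectra. It assumes double-precision (8-byte) values and either full or upper-triangle storage; the function name and these storage choices are illustrative assumptions, not details of the lab's software.

```python
def pairwise_matrix_gib(n_spectra: int, bytes_per_value: int = 8,
                        condensed: bool = False) -> float:
    """Estimate memory (GiB) for an n x n similarity matrix.

    condensed=True counts only the upper triangle without the diagonal,
    which is enough for a symmetric matrix.
    """
    if condensed:
        n_values = n_spectra * (n_spectra - 1) // 2
    else:
        n_values = n_spectra * n_spectra
    return n_values * bytes_per_value / 2**30


# Dataset sizes mentioned in the text: 10,000 to 20,000 Raman spectra,
# and 40,000 infrared spectra for the map in Figure 1.
for n in (10_000, 20_000, 40_000):
    full = pairwise_matrix_gib(n)
    half = pairwise_matrix_gib(n, condensed=True)
    print(f"{n:>6} spectra: full {full:6.2f} GiB, condensed {half:6.2f} GiB")
```

Under these assumptions, the full matrix for 40,000 spectra comes to roughly 12 GiB, because the matrix grows with the square of the number of spectra.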


Figure 2. The sample is raster-scanned through the focus of the laser light with a step size of about 300 nm, and a complete Raman spectrum is collected at every pixel. The resulting hyperspectral dataset contains between 10,000 and 20,000 spectra. Spectral images are obtained via multivariate methods such as HCA, for which a correlation matrix between all spectra is constructed. Spectra are clustered by similarity, and the spectral similarity is color-coded.
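
The pipeline described in the caption (correlation matrix, clustering by similarity, color-coded map) can be sketched in a few lines. This is a minimal illustration in Python with NumPy and SciPy, not the lab's MATLAB code. The synthetic spectra, the average-linkage method, the distance `1 - correlation`, and the helper name `cluster_spectra` are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_spectra(spectra: np.ndarray, n_clusters: int) -> np.ndarray:
    """Assign each spectrum (one per row) to a cluster via HCA."""
    corr = np.corrcoef(spectra)            # (n, n) correlation between all spectra
    dist = 1.0 - corr                      # similar spectra -> small distance
    dist = np.clip((dist + dist.T) / 2, 0.0, None)  # symmetric, non-negative
    np.fill_diagonal(dist, 0.0)            # remove rounding noise on the diagonal
    condensed = squareform(dist, checks=False)      # upper triangle as a vector
    tree = linkage(condensed, method="average")     # assumed linkage method
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic 20 x 20 pixel "image": two spectral classes that differ by one band.
rng = np.random.default_rng(0)
rows, cols, n_channels = 20, 20, 200
x = np.linspace(0.0, 1.0, n_channels)
base = np.exp(-((x - 0.3) / 0.05) ** 2)
extra = 0.5 * np.exp(-((x - 0.7) / 0.05) ** 2)
truth = np.zeros((rows, cols), dtype=int)
truth[5:12, 8:15] = 1                      # hypothetical region with altered chemistry
spectra = base + truth.reshape(-1, 1) * extra
spectra = spectra + 0.01 * rng.standard_normal(spectra.shape)

labels = cluster_spectra(spectra, n_clusters=2)
cluster_map = labels.reshape(rows, cols)   # one pseudo-color per cluster label
```

Each entry of `cluster_map` is a cluster label that would be drawn as one color in the pseudo-color image. No reference spectra are used at any point, which is what makes the method unsupervised.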


With the goal of dramatically decreasing the runtime of the HCA algorithms, the lab took two steps to maximize performance using the Jacket platform and an NVIDIA Tesla C1060 GPU. HCA was the performance bottleneck and was therefore the target region for optimization. The first step was to move this region of the algorithm to a higher level using well-documented "vectorization" techniques. Vectorization is a key step in any parallelization effort, whether the target platform is GPUs, multi-core CPUs, or multi-node CPU clusters. The vectorization effort resulted in a 100-fold improvement in run time for the analysis. This 100-fold improvement benefits every run of the algorithms, whether the target hardware platform is a CPU or a GPU.
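
As an illustration of what vectorization means in practice, the sketch below computes the same pairwise correlation matrix twice: once pair by pair in explicit loops, and once as a single matrix product. It is written in Python with NumPy as a stand-in for the lab's MATLAB code; the function names and random test data are assumptions, and any speedup on a given machine will differ from the figures reported here.

```python
import numpy as np


def correlation_loop(spectra: np.ndarray) -> np.ndarray:
    """Pairwise Pearson correlation, one pair at a time (slow reference)."""
    n = spectra.shape[0]
    corr = np.empty((n, n))
    for i in range(n):
        a = spectra[i] - spectra[i].mean()
        for j in range(n):
            b = spectra[j] - spectra[j].mean()
            corr[i, j] = (a @ b) / np.sqrt((a @ a) * (b @ b))
    return corr


def correlation_vectorized(spectra: np.ndarray) -> np.ndarray:
    """Same matrix from whole-array operations and one matrix product."""
    centered = spectra - spectra.mean(axis=1, keepdims=True)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    return unit @ unit.T


rng = np.random.default_rng(1)
demo = rng.standard_normal((50, 300))      # 50 hypothetical spectra, 300 channels
assert np.allclose(correlation_loop(demo), correlation_vectorized(demo))
```

The vectorized form replaces the nested loops with operations on whole arrays. Because it reduces to a dense matrix product, the same formulation also maps naturally onto multi-core CPUs and GPUs.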

Although a 100-fold performance gain is significant, it was not sufficient to meet the needs of the lab. The second step was to GPU-enable the algorithm using the Jacket software platform. After adopting Jacket datatypes and leveraging Jacket's run-time system for the CUDA-capable GPU, the lab realized an additional 100-fold performance improvement on top of the improvement from vectorization. Jacket provided the incremental performance gain necessary to allow near-instant delivery of the overall hyperspectral image analysis workflow to medical professionals for immediate disease diagnosis.

Professor Diem and Milos Miljkovic on the results!



The Northeastern University team has analyzed dozens of tissue samples for the presence of cancer using diagnostic algorithms based on ANNs. Figure 3 shows a tissue section with a colon adenocarcinoma that was analyzed in less than a minute by an ANN to detect the cancerous areas shown in red in Panel 2C. The ultimate goal of the team is to establish diagnostic and prognostic methods that can be used in the operating room to define the margins of resection and to screen excised lymph nodes for the presence of metastatic disease. Recently, the method successfully detected micro-metastases (metastases measuring less than 2 mm in size) in lymph nodes, which are exceedingly difficult to detect with standard histopathology.

Figure 3. Neural network results.

The Authors

  • Laboratory Director: Professor Max Diem, PhD
  • Postdocs: Benjamin Bird, Milos Miljkovic
  • Graduate Students: Tatyana Chernenko, Erin Kingston, Antonella Mazur, Jennifer Schubert, Ellen Marcsisin, Evgenia Zuser
  • Undergraduate Students: Kathleen Lenau, Christina Uttero
