Data Analytics

Data Analytics For Protein Crystallization

Protein crystallization is a difficult process that requires setting up thousands of experiments to find a successful crystalline condition.
Some of our products are:

  • Online AED Analysis: Determines the screening factors that are most likely to lead to higher-scoring outcomes (crystals).
  • Crystal X2: An automated system for scanning, analyzing, and scoring protein crystallization trial images.
  • Image segmentation and classification for protein crystallization trial images.
  • Visualization of plates for protein crystallization analysis.

Online tools available in the cloud

OAED Analysis:

To access the tool’s trial version, click this URL link: Online AED Analysis.

Please be aware that the trial URL is under development and will be updated during the development cycle.

This online AED analysis program is hosted by Dr. Truong Tran, Assistant Professor of Computer Science, Penn State Harrisburg, Penn State University.

Main code repository: https://github.com/Truong-X-Tran/OAED

For the AED project, we gratefully acknowledge the contributions of:
  • Imren Dinç, MS in Computer Science, graduate of The University of Alabama in Huntsville.
  • Rucha Daware, graduate student in the MS in Computer Science program at Penn State Harrisburg, Penn State University.

Online Genetic Algorithm for Screenings (OGAS)

The OGAS project aims to generate conditions that are likely to produce successful crystal growth. OGAS uses a variation of the genetic algorithm that explores untested regions of the chemical search space, thereby increasing the probability of finding a new crystalline condition.
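
The idea of steering a genetic algorithm toward untested conditions can be sketched as follows. This is a minimal illustration and not the OGAS implementation: the search space, the factor names, the selection weights, and the functions `crossover`, `mutate`, and `next_generation` are all assumptions made for the example.

```python
import random

# Hypothetical chemical search space; names and values are illustrative only.
SPACE = {
    "precipitant": ["PEG 3350", "PEG 8000", "Ammonium sulfate", "MPD"],
    "salt": ["NaCl", "LiCl", "MgCl2", "None"],
    "pH": [4.5, 5.5, 6.5, 7.5, 8.5],
}
FACTORS = list(SPACE)


def crossover(a, b, rng):
    """Uniform crossover: each factor is inherited from one parent at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(cond, rng, rate=0.2):
    """With probability `rate`, replace a factor with a random value from SPACE."""
    return tuple(
        rng.choice(SPACE[f]) if rng.random() < rate else v
        for f, v in zip(FACTORS, cond)
    )


def next_generation(scored, n_children, seed=0, max_tries=10_000):
    """Propose untested conditions from (condition, score) pairs.

    Parents are drawn with probability proportional to score + 1 (an assumed
    weighting). Any child that was already tested or already proposed is
    discarded, so the result only covers unexplored points of the space.
    """
    rng = random.Random(seed)
    conditions = [c for c, _ in scored]
    weights = [s + 1 for _, s in scored]
    seen = set(conditions)
    children = []
    for _ in range(max_tries):
        if len(children) == n_children:
            break
        a, b = rng.choices(conditions, weights=weights, k=2)
        child = mutate(crossover(a, b, rng), rng)
        if child not in seen:
            seen.add(child)
            children.append(child)
    return children


# Made-up trial results: (precipitant, salt, pH) with an integer score.
scored_trials = [
    (("PEG 3350", "NaCl", 6.5), 4),
    (("PEG 8000", "LiCl", 7.5), 2),
    (("Ammonium sulfate", "None", 5.5), 0),
]
proposals = next_generation(scored_trials, n_children=5)
```

Rejecting children that were already tested is the simplest way to force exploration; a real tool would likely also weigh how far a candidate is from the known conditions.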

To access the tool’s trial version, click this URL link: Online Genetic Algorithm for Screenings.

Please be aware that the trial URL is under development and will be updated during the development cycle.

A tentative flowchart of the program.

Main code repository: https://github.com/Truong-X-Tran/OGAS

See more about Data Analytics for Protein Crystallization here: Data Analytics


Protein Crystallization Segmentation and Classification Using Subordinate Color Channel

Accurate detection of protein crystals in fluorescence microscopy images is critical for high-throughput, automated systems. Although trace fluorescent labeling can highlight protein crystals, reflection and emission from the fluorescent dye do not always come from crystal regions. Therefore, analyzing only the peak wavelength in the emission spectrum of a fluorophore may not always yield effective results.

Trace fluorescent labeling typically involves a fluorescent dye that can re-emit the illumination light at other wavelengths around the principal wavelength.

  • The captured image has a primary color channel determined by the illumination light and the fluorescent dye.
  • Crystals have a higher intensity than non-crystal areas.
  • However, some bright regions may not be crystals, which makes the classification of trial images inaccurate and not robust.

In this study, we use the subordinate color channel in addition to the primary color channel in images of trace fluorescently labeled protein solutions.

This new method extracts suitable features and builds a high-accuracy classifier with a low rate of misclassifying crystals as non-crystals.

Read the full text in the following published articles:
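
A rough sketch of how a second channel can help reject bright non-crystal regions is shown below. It is not the published feature extraction or classifier: ranking channels by mean intensity, the two thresholds, and the function names `rank_channels` and `candidate_mask` are assumptions chosen to keep the example small.

```python
def rank_channels(image):
    """Return channel indices (0=R, 1=G, 2=B) sorted by mean intensity, brightest first."""
    n = sum(len(row) for row in image)
    means = [sum(px[c] for row in image for px in row) / n for c in range(3)]
    return sorted(range(3), key=lambda c: means[c], reverse=True)


def candidate_mask(image, primary_min=200, subordinate_min=80):
    """Mark pixels that are bright in BOTH the primary and the subordinate channel.

    The thresholds are illustrative assumptions, not values from the papers.
    """
    primary, subordinate, _ = rank_channels(image)
    return [
        [px[primary] >= primary_min and px[subordinate] >= subordinate_min for px in row]
        for row in image
    ]


# Tiny made-up RGB image (rows of (R, G, B) tuples).
image = [
    [(10, 60, 5), (40, 230, 120), (12, 240, 20)],
    [(8, 50, 4), (35, 220, 110), (9, 55, 6)],
]
order = rank_channels(image)   # green is primary, blue is subordinate here
mask = candidate_mask(image)
```

In this toy image the pixel `(12, 240, 20)` is bright in the primary channel only, so it is excluded from the mask, while the two pixels that are bright in both channels are kept.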

  • Truong X. Tran, Marc L. Pusey, and Ramazan S. Aygun. Protein Crystallization Segmentation and Classification Using Subordinate Color Channel in Fluorescence Microscopy Images. Journal of Fluorescence 30, 637–656 (2020). https://doi.org/10.1007/s10895-020-02500-7.
  • Tran, Truong X., Ramazan S. Aygun, and Marc L. Pusey. Classifying protein crystallization trial images using subordinate color channel. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, 2017, pp. 1546-1553, DOI:10.1109/BIBM.2017.8217890. https://ieeexplore.ieee.org/document/8217890.

Mobile Fluorescence Imaging and Protein Crystal Recognition

This is the first system proposed for mobile fluorescent imaging and analysis. The contributions of this study are:

  • Presents a mobile fluorescence imaging system that uses a smartphone or tablet camera and an optical lens system to capture the crystallization image at high resolution and magnification.
  • Proposes a deep CNN classification model deployed on a mobile device to recognize the presence of protein crystals in the screening plates.
  • Proposes a mobile application that can capture fluorescence images in normal imaging mode, directly capture images from plates for classification, and classify images from the library on the mobile device.

Read more about Mobile Fluorescence Imaging at:
  • Tran, Truong X., Marc L. Pusey, and Ramazan S. Aygun. Mobile Fluorescence Imaging and Protein Crystal Recognition. 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS), Rochester, MN, USA, 2020, pp. 83-88, DOI: 10.1109/CBMS49503.2020.00023. (Held July 28-30, 2020; virtual event due to COVID-19.)

Mobile Scanner

A protein crystallization well plate is a rectangular platform that contains wells usually organized as a grid structure. The crystallization conditions are studied through a screening process by setting up the trial conditions in the well plate.

We developed a mobile scanner that identifies each well by using a coded template placed under the well plate. The mobile scanner provides two modes: image and video.

  • Image mode is used for a single well analysis.
  • Video mode is used to scan the complete plate. In the video mode, the mobile scanner app generates a tilemap of the plate.
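
The tilemap step in video mode can be pictured as placing each decoded frame into a grid cell. The sketch below assumes a 96-well plate (8 rows by 12 columns) and assumes that the coded template has already been decoded into labels such as `B7`; the decoding itself is not shown, and `well_position` and `build_tilemap` are hypothetical helper names.

```python
import string

ROWS, COLS = 8, 12  # assumed 96-well layout; other plate formats differ


def well_position(well_id):
    """Convert a well label such as 'B7' into a zero-based (row, column) pair."""
    row = string.ascii_uppercase.index(well_id[0].upper())
    col = int(well_id[1:]) - 1
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError(f"well {well_id!r} is outside a {ROWS}x{COLS} plate")
    return row, col


def build_tilemap(frames):
    """Arrange (well_id, image) pairs into a ROWS x COLS grid.

    If a well appears in several frames, the last frame wins.
    Wells that were never seen stay None.
    """
    grid = [[None] * COLS for _ in range(ROWS)]
    for well_id, image in frames:
        r, c = well_position(well_id)
        grid[r][c] = image
    return grid


# Frame names stand in for image data.
frames = [("A1", "frame_000"), ("B7", "frame_013"), ("A1", "frame_001")]
tilemap = build_tilemap(frames)
```

Here `tilemap[0][0]` holds the later `A1` frame and `tilemap[1][6]` holds the `B7` frame; keeping the last frame per well is a simplification, and an app could instead keep the sharpest one.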

Schema Matching and Data Integration

The data representation and naming conventions used in commercial screen files differ between companies, which makes the automated analysis of crystallization experiments difficult and time-consuming.

We present an approach that computationally matches the elements of two schemas using linguistic schema matching methods and then transforms the input screen format into another format with naming defined by the user.
The approach was tested on a number of commercial screens from different companies, and the experiments showed an overall schema matching accuracy of 97%, which is significantly better than the other two matchers we tested. Our tool enables mapping a screen file from one format to another format preferred by the expert, using their preferred chemical names.
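
As a simplified picture of name-based schema matching, the sketch below pairs each source field with the most similar target field. It uses Python's `difflib.SequenceMatcher` as a stand-in string similarity, which is not the matcher from the paper; the field names, the `0.6` threshold, and the functions `normalize`, `similarity`, and `match_schema` are assumptions for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and drop punctuation so 'Salt Conc.' becomes 'salt conc'."""
    return " ".join(re.findall(r"[a-z0-9]+", name.lower()))


def similarity(a, b):
    """String similarity in [0, 1] between two normalized field names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_schema(source_fields, target_fields, threshold=0.6):
    """Map each source field to its best target field, or None if below threshold."""
    mapping = {}
    for s in source_fields:
        best = max(target_fields, key=lambda t: similarity(s, t))
        mapping[s] = best if similarity(s, best) >= threshold else None
    return mapping


# Hypothetical column names from two screen file formats.
source = ["Well_No", "Salt Conc.", "pH", "Buffer_Name", "Tray ID"]
target = ["well number", "salt concentration", "ph", "buffer", "precipitant"]
mapping = match_schema(source, target)
```

Fields with no sufficiently similar counterpart, such as `Tray ID` here, map to `None` so that an expert can resolve them by hand. Once a mapping exists, converting a screen file amounts to renaming its columns to the user's preferred names.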

Read full text at:
  • Shrestha, Midusha, Truong X. Tran, Bidhan Bhattarai, Marc L. Pusey, and Ramazan S. Aygun. Schema Matching and Data Integration with Consistent Naming on Protein Crystallization Screens. IEEE/ACM Transactions on Computational Biology and Bioinformatics (2019). https://doi.org/10.1109/TCBB.2019.2913368.
  • https://ieeexplore.ieee.org/abstract/document/8700291

Visual-X2: interactive visualization and analysis tool for protein crystallization

In high-throughput systems, crystallization experiments require the inspection and analysis of a large number of trial images. Visualization and analysis tools are needed to view and analyze the experimental results and to recommend novel crystalline conditions based on prior results. It is essential to integrate all of these components into a single system. We therefore developed Visual-X2, interactive visualization software that helps the user visualize and analyze experimental results quickly and efficiently. Visual-X2 offers the following features for visualization and analysis:

  • Dual plate view (thumbnail and symbolic).
  • Detailed well view with a scoring option.
  • Multiple-scan and time-course views.
  • Support for screening analysis based on multiple screens.
  • Three novel screen analysis methods (associative experimental design, GenScreen, and novelty methods).
  • Generation of a pipetting file with a family of conditions that vary concentrations based on the stock concentration.
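
Deriving a family of conditions from a stock concentration rests on the dilution relation C1·V1 = C2·V2. The sketch below shows that arithmetic for a single reagent; it does not reproduce the Visual-X2 pipetting file format, and the function name `pipetting_rows`, the units, and the example values are assumptions.

```python
def pipetting_rows(stock_conc, target_concs, well_volume_ul):
    """Return (target concentration, stock volume, diluent volume) per condition.

    Uses C1 * V1 = C2 * V2, so V1 = well_volume * target / stock.
    Concentrations share one unit (e.g. M); volumes are in microliters.
    """
    rows = []
    for target in target_concs:
        if not 0 <= target <= stock_conc:
            raise ValueError(f"target {target} cannot be reached from stock {stock_conc}")
        stock_ul = well_volume_ul * target / stock_conc
        rows.append((target, round(stock_ul, 2), round(well_volume_ul - stock_ul, 2)))
    return rows


# A 2.0 M stock diluted to three target concentrations in 100 µL wells.
rows = pipetting_rows(stock_conc=2.0, target_concs=[0.5, 1.0, 1.5], well_volume_ul=100)
# rows == [(0.5, 25.0, 75.0), (1.0, 50.0, 50.0), (1.5, 75.0, 25.0)]
```

A real condition usually combines several reagents, in which case each reagent gets its own stock volume and the diluent fills the remainder of the well.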

Read full text at:
  • Suraj Subedi, Imren Dinc, Truong X. Tran, Diwas Sharma, Buddha R. Shrestha, Marc L. Pusey & Ramazan S. Aygun. Visual-X2: interactive visualization and analysis tool for protein crystallization. Network Modeling Analysis in Health Informatics and Bioinformatics 9, 15 (2020). https://doi.org/10.1007/s13721-020-0220-6.
  • https://link.springer.com/article/10.1007%2Fs13721-020-0220-6

Healthcare data sources:

  • https://ohdsi.org/
  • https://ohdsi.github.io/TheBookOfOhdsi/
  • https://data.ohdsi.org/
  • TriNetX
  • The ACT Network
  • All of Us Research Program