Educational projects and workshops by Pine-Biotech

Project-based learning is perhaps the most effective way to gain both an understanding of and practical experience in a technical subject. A growing number of public domain datasets is making it possible to conduct bioinformatic analysis of real data without having access to a lab or a sequencer. This is especially true for RNA-seq. The Pine Biotech team has been working on several publication-based educational projects, in which a portion of the original raw data is extracted and reduced so that it can be analyzed by students or professionals who have never run an RNA-seq analysis. Using the beta version of our T-BioInfo platform, a simple RNA-seq analysis can be run in just under an hour, allowing students to experience the power of machine learning for bioinformatics data analysis within the timespan of a single workshop.

Our goal at Pine Biotech is to make big data bioinformatic analysis easier for biologists who are not bioinformaticians. Our main means of doing this is the development of our multi-omics analysis platform, T-BioInfo. Our second, related focus is education in big data analysis using the T-BioInfo platform.

Increased and improved data collection, especially high-throughput data, has driven effective and personalized diagnostics and treatment. While today’s massive and exponentially growing body of data provides an unprecedented level of biomedical detail, the generated datasets are huge, heterogeneous, full of artifacts, and very complex. Realization of the potential of these resources – whether in basic science research, translational research, biotech, or clinical practice – requires practical education in the technological skills needed to harness them. Such education must go beyond theory to provide a practical understanding of the tools, approaches, logic, expected outcomes, and applications related to interpretation of such datasets.

We recently began to ask our contacts (http://goo.gl/forms/8VQhTRqrafZh6eFo1) how important it was for them personally to be able to analyze and interpret omics data. While the survey is ongoing, a wide range of people have already responded – professors and Ph.D. students as well as healthcare and biotech professionals.


While informatics literacy has increased substantially among students, researchers, and clinicians, most are not ready to use a command-line interface and/or are not inclined to invest the significant time needed to learn this skill. In addition, while solutions using a graphical user interface have started to appear, most of these require a good understanding of the inputs, outputs, and configuration of algorithms, as well as the logic of constructing pipelines of algorithms. Further, working with big data requires a foundation in statistics and machine learning as well as substantial computational resources, which are expensive and require hardware expertise to assemble. All these factors hinder the usefulness of available public domain datasets and tools, limit the active adopters to those who have access to resources, and delay the adoption of big data for use in biomedical practice.

This is especially apparent farther away from established clusters of advanced universities and high-tech centers, located disproportionately in the US Northeast and California. Such institutions are supported by large grants and higher incomes, and therefore are already well-positioned in terms of both economic and human resources. In contrast, more isolated academic communities often struggle with a lack of the experience, skills, and resources needed to conduct technologically advanced research. This is also true of many countries outside the US.

To address this gap, we decided to develop a set of practical, modular courses in ‘omics data analysis based on public domain projects and our user-friendly, web-based bioinformatics analysis platform. The goal is to use our GUI-based platform to skip the complexities of coding and in-depth theory; instead, students are given a basic overview of the essential biological and informatics concepts and then jump right into practice. Toward that end, we have begun development of a series of hands-on workshops and online courses, with all of the data accessible online.

Research and Educational Projects:

Here is an overview of our projects on Omics data:

Modeling Precision Medicine on Breast Cancer Cell Lines:
This project combines exome-seq, RNA-seq, and IC50 datasets from a number of cancer cell lines to analyze subtype-specific and pathway-specific responses to anticancer compounds in breast cancer.

PDX Model-RNA-seq:
We are currently analyzing a stroma-tumor interaction RNA-seq dataset. For this project, a new approach was developed by the Tauber Center and is now included in the Machine Learning module of the T-BioInfo platform. It allows the association of two expression tables (in this case, the expression of tumor genes and the expression of stromal genes) and the identification of key elements and their interplay.

American Gut-Microbiome:
This project uses a focused dataset pulled from the original American Gut Microbiome project. Pine Biotech is running metagenomics pipelines on the 16S rRNA data and using machine learning techniques to integrate phenotypic attributes and geolocation.

Asthma-ChIP-seq and RNA-seq:
This is a multi-omics project combining ChIP-seq and RNA-seq analysis of primary human T cells to understand the intricacies of the human immune system in asthma.

Macrophages-RNA-seq:
This project focuses on a previously published dataset. Using RNA-seq, we have focused on defining the distinct expression profile of each macrophage population during cancer growth. With our machine learning techniques, such as Factor Regression Analysis, we are working to identify the specific mechanisms that define these different subtypes of macrophages.

Workshops

Hands-on workshop in RNA-seq:
A 1.5-hour workshop consisting of a lecture-style introduction to the molecular biology and informatics concepts needed to conduct an RNA-seq analysis, followed by a longer, hands-on practical component. For the hands-on portion of the workshop, the participants use their laptops as interfaces to our remote servers.

[Photo: RNA-seq workshop at the New Orleans Center for BioInnovation]

Our live, practical workshops are presented by two main presenters, with other members of the Pine Biotech team circulating among the participants and helping them. The main presenters are Dr. Claudia Copeland and Jaclyn Williams.

Dr. Claudia Copeland:

Dr. Claudia Copeland has a Ph.D. in molecular and cellular biology and has written several academic publications as well as dozens of popular science articles and educational pieces in chemistry and biology. Dr. Copeland directs Pine’s education and communication activities, which include conducting workshops, writing grants, and developing Pine’s online courses together with other Pine team members.

Jaclyn Williams:

Jaclyn has an M.S. from the University of Southern Mississippi (USM) Department of Biological Sciences. Jaclyn assists with the biological interpretation of analysis results, coordinates our pilot projects, and serves as an industry liaison.

Online workshop in RNA-seq:
Starting in January 2017, we will begin offering webinar versions of our live, hands-on RNA-seq workshops. Since our servers are used for the hands-on portion of the analysis, with students using their laptops as an interface, this will allow students in communities remote from technology infrastructure to participate and gain practical experience in RNA-seq analysis.

More details can be found on our events page:
http://edu.t-bio.info/events/transcriptomics-rna-seq-workshop/

Pine-Biotech Educational website:

Our first online course, transcriptomics, will be launched in January 2017. We are developing online courses in genomics, transcriptomics, epigenomics, metagenomics, virology with CirSeq, and machine learning. The courses present conceptual material in a multimedia format and include practical exercises that allow students to conduct hands-on analyses using prepared datasets designed for educational use. These datasets have been trimmed from the original research datasets so that they are small enough and clean enough for a student to run an analysis in a reasonable amount of time. At the end of each course, students can complete a test and earn a certificate for the skill taught in the course.
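
The exact procedure used to trim the research datasets is not described here. As a purely illustrative sketch, assuming the reduction is a random subsample of reads from a FASTQ file, the idea could look like the Python below. The function names read_fastq and subsample_fastq are hypothetical and are not part of the T-BioInfo platform; the sketch covers only the "small enough" part and does no quality cleaning.

```python
import io
import random
from typing import Iterator, List, TextIO, Tuple

# One FASTQ record: header line, sequence, separator line, quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an open text handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (a blank line is also treated as the end)
        seq = handle.readline().rstrip("\n")
        sep = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if (not header.startswith("@") or not sep.startswith("+")
                or len(seq) != len(qual)):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield (header, seq, sep, qual)


def subsample_fastq(handle: TextIO, n_reads: int, seed: int = 0) -> List[FastqRecord]:
    """Keep a uniform random sample of up to n_reads records in one pass.

    Uses reservoir sampling, so the full file never has to fit in memory.
    """
    if n_reads < 0:
        raise ValueError("n_reads must be non-negative")
    rng = random.Random(seed)  # fixed seed makes the subset reproducible
    reservoir: List[FastqRecord] = []
    for i, record in enumerate(read_fastq(handle)):
        if i < n_reads:
            reservoir.append(record)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n_reads:
                reservoir[j] = record  # replace a kept record at random
    return reservoir


# Example with a small in-memory FASTQ of 100 synthetic reads.
demo = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(100))
subset = subsample_fastq(io.StringIO(demo), n_reads=10, seed=42)
print(len(subset))
```

Because the random choices depend only on the record index and the seed, running the same function with the same seed and n_reads on the two files of a paired-end run (with equal record counts) should keep the same positions in both files, so mates stay paired.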

Our Educational website: http://edu.t-bio.info
