Cell Lines: multi-omics network of associations to model precision treatment

Modern advances in personalized medicine have used technology to characterize a patient’s fundamental biology in terms of DNA, RNA, and protein. This can be used to classify a disease (such as breast cancer subtype) or to characterize important details of the patient’s disease (such as genes related to drug response for a particular treatment). These techniques can also be used in research on diseases such as cancer and genetic diseases.


The wide variety of cancer mutations in each individual patient means that effective diagnosis and treatment of cancer must take into account a high degree of complexity. By sequencing individual cancer genomes, researchers and physicians may develop more targeted medical solutions. Cancer is the second major cause of mortality in the United States, and targeted cancer therapies are a growing treatment type for many cancers, as they bring an exponential increase in effectiveness over traditional cancer therapies.


Breast cancer can arise from many different types of mutations. As a result of these mutations, it can be subdivided into a number of subtypes. Six major subtypes, previously identified and documented, are considered particularly useful for prognosis and treatment strategy. These subtypes respond differently to chemotherapy and hormone treatments. Currently, doctors only test for a handful of molecular signatures, and over 40% of those patients’ cancers do not fit into one subtype. Cell lines are often used first in research as pre-clinical models, as they mirror many of the molecular characteristics of tumors and are a less complicated model than a human. Cell lines are used to study cancer in a lab without human or animal subject involvement and to model interactions between cell types and various drugs and therapeutics.


This project was inspired by Daemen et al., 2013, “Modeling precision treatment of breast cancer”, which focuses on over 70 different breast cancer cell lines and over 90 different therapeutic agents. The project included SNP array (a type of microarray that detects variations in the genome), RNA-seq (which profiles the whole transcriptome), exome-seq (exome capture, which sequences the protein-coding regions of the genome), and genome-wide methylation (the study of epigenetic alterations) data, and integrated a number of algorithmic methods, including advanced machine learning algorithms, to identify molecular features.


The T-BioInfo platform has a number of advanced machine learning algorithms, including the BiAssociation algorithm, which was used to integrate several omics data types: RNA expression, cell mutations, and drug effectiveness. The goal was to find relationships among them and better understand how medications affect breast cancer cells. This work was able to develop predictive drug response signatures, and the research can be built upon with future clinical models. One limitation of the study is that a cell line panel does not capture features such as the tumor microenvironment, which is critical to understanding tumors.
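As a rough illustration of the kind of relationship such an integration looks for, the sketch below ranks genes by how strongly their expression tracks a drug-response value across cell lines. It uses plain Pearson correlation, which is a stand-in and not the BiAssociation algorithm itself. The gene names, the numbers, and the function names are all made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no association signal
    return cov / sqrt(var_x * var_y)


def rank_genes_by_association(expression, response):
    """Return (gene, r) pairs sorted by absolute correlation, strongest first."""
    scored = [(gene, pearson(values, response)) for gene, values in expression.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical toy data: expression of three genes across five cell lines,
# and one drug-response value per cell line (assumed here to be -log10 GI50,
# so that higher means more sensitive).
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises with response
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls with response
    "GENE_C": [2.0, 2.0, 2.0, 2.0, 2.0],  # flat, uninformative
}
response = [2.0, 4.0, 6.0, 8.0, 10.0]

ranking = rank_genes_by_association(expression, response)
for gene, r in ranking:
    print(f"{gene}: r = {r:+.2f}")
```

A real analysis would involve thousands of genes and dozens of drugs, so it would also need multiple-testing correction and a method that is robust to outliers. The ranking idea stays the same.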

Introducing Omics Logic

The Human Genome Project showed that mankind’s genetic makeup is 99.1% identical, and that the remaining 0.9% of genetic variability accounts for the vast variability that exists within the human species (Novelli 2010). Personalized medicine is the effort to prescribe the most appropriate drug for each individual patient based on their specific biology. Genetics explains some of the variation in responses seen during clinical trials. The variety of cancer mutations means that effective diagnosis and treatment of cancer must take into account a high degree of complexity. By sequencing individual cancer genomes, researchers and physicians may develop more targeted medical solutions. By using interpreted data in routine patient exams, clinicians can analyze how the symptoms of a disease in a patient correlate with their specific biology, resulting in a more effective treatment. Cancer is the second major cause of mortality in the United States, but targeted cancer therapies are bringing about an exponential increase in effectiveness over traditional cancer therapies. The potential also exists to identify early indicators of disease, including cancer, in the form of biomarkers for early detection.


According to the Global Oncology Trend Report, global spending on cancer medications rose 10.3 percent in 2014, bringing the total to $100 billion, up from $75 billion in 2010. The rising cost of cancer treatment is linked to the emergence of precision therapeutics, which are costly to develop and often fail before they reach the market. While more effective, they target a smaller population which is hard to identify. The pharmaceutical industry recently turned to theoretical and computational modelling to improve the drug discovery process, lowering the cost of care in the process.


The cost of next-generation sequencing and other techniques that provide comprehensive whole-patient data is decreasing rapidly, making personalized multi-omics analysis increasingly cost-effective and accessible. Introducing precision treatment based on molecular data into both health delivery and pharma is nevertheless a long and costly effort. A massive amount of data is available, yet few know how to use the insights it offers effectively. To be truly effective, the data has to be analyzed effortlessly with an integrative approach. Using multi-omics data to develop an informed, personalized approach to treatment, access to effective clinical trials, and preventive strategies can provide major cost savings by avoiding ineffective treatments, expensive diagnostic regimens, and similar costs.


With affordable whole-patient scale data just around the corner, the challenge has now moved into the realm of extraction of meaningful insights from the data. To get the most value from multi-omics data analysis in clinical applications, Pine Biotech is developing an omics-first machine learning platform, OmicsLogic. The platform goes beyond analytics, integrating clinical knowledge with multi-omics raw data analysis for biomarker discovery and personalized molecular studies. As the field evolves and data continues to become available, algorithmic innovation is poised to be a driving force in solving healthcare ecosystem challenges. The wealth of data that is generated should be exploited – ultimately to improve care, and benefit consumers.




Novelli, G. Personalized genomic medicine. Int Emerg Med. 2010;5(Suppl 1):S81-90. doi:10.1007/s11739-010-0455-9

We’re in the News!

We’re excited to share a funding update and new plans for developing and commercializing a proprietary biomedical data analysis and machine learning platform. GenomeWeb, Silicon Bayou, Nola.com, The New Orleans Advocate, and other local and national publications covered the story.

Pine Biotech recently announced that it secured over a million USD in seed funding from investors this May in support of the development and commercialization of its proprietary biomedical data analysis and machine learning platform. Incorporated at the end of 2014, Pine Biotech is commercializing a biomedical data analysis platform in collaboration with the Tauber Bioinformatics Research Center and the University of Haifa.


The platform is designed to serve unmet needs in clinical studies, academic research and education. This solution is different from other biomedical analysis software currently available. The platform enables researchers to conduct comprehensive analysis of large genomic, transcriptomic, proteomic, structural and phenotypic data using an intuitive interface. Molecular data reveals important mechanisms of action that are best studied as a system, making this integrative approach critical for understanding and treating disease. The machine learning platform utilizes algorithms developed over years of research and trained in many academic projects.

In addition to making multi-omics analysis accessible for non-bioinformaticians, the platform includes a machine learning toolkit and interactive visualization. “Our company’s focus is on analysis of molecular data, or ‘omics’ data, because it contains information on an unprecedented level of precision,” says CEO Elia Brodsky. “By enabling researchers and clinicians to extract real insight from omics data, we hope that new and more effective approaches to diagnostics and therapeutics will be developed.” The funding comes in support of newly secured collaborations with government agencies, academic medical centers and technology partners. “Now our team will be able to move our work out of the research space and start addressing clinical challenges together with our partners.”

“Integration of multi-omics and clinical data will be key to implementation of precision medicine. Innovative startups like Pine partnering with academic health centers are the engine that will produce the novel algorithms necessary for this quantum leap in health care.”

Lucio Miele, M.D., Ph.D., Professor and Department Head, LSU School of Medicine, Department of Genetics; Director for Inter-Institutional Programs, LSU Stanley Scott Cancer Center and Louisiana Cancer Research Consortium



Free training in bioinformatic analysis available with Pine Biotech’s T-BioInfo

Data analytics skills are in high demand in clinical research and treatment development, though most bioinformatics education courses focus on technical issues rather than the bigger picture of big data and its potential to change the way we view disease and treatment discovery. Learning how to run bioinformatic analysis is costly, given the high prices of courses through a university or online education program – not to mention the countless hours that must be devoted to learning coding languages, writing scripts, and downloading software packages. The result is an approach to bioinformatics that ignores biology.


Our online educational modules are different – applying our algorithms and visualization tools to real, publicly available molecular data. Through our user-friendly, visual, cloud-based platform, T-BioInfo, we bypass the technical investment of education in bioinformatics. The T-BioInfo platform simplifies the computational approach to allow scientists from all backgrounds to move forward in the world of big data. “Analyzing large datasets can be a challenge for molecular biologists,” said Dr. Christian Pfaller of the Cattaneo Lab in the Mayo Clinic. “T-Bio provides a comprehensive platform with a user-friendly graphical interface that allows a wide range of NGS algorithms.”


The online courses developed by Pine Biotech are centered on eliminating the disconnect between data gathering and data analysis by training professionals and students in bioinformatic analysis using the T-BioInfo platform. Projects crafted using publicly available data contain detailed biological data, broken down into digestible, easy-to-manipulate visualizations, offering scientists and students alike a new means of working with ‘omics data – without the expensive technology and coursework. In a workshop conducted with university students, the students were able to complete our course modules, working with real scientific datasets, and passed content knowledge quizzes in a two-hour session. Programs range from a basic introduction to bioinformatics to in-depth project guides that lead the user through analysis.


With the abundance of data available, it is up to data scientists to derive the value of the information generated daily. The world of big data is constantly evolving, and scientists of all backgrounds need to find ways to integrate the value of the data within their own work in order to keep up with the deluge of new and backlogged information. We seek to streamline skill development in bioinformatics. Our approach focuses on the understanding of biological processes and molecular factors – without introducing complex computer science, empowering non-bioinformatician biologists to take full advantage of the endless data at their fingertips.


With the ability to understand large medical data, the possibilities for research and discovery are endless!


Our online education system is extensive, with full online courses in development for each of the analysis modalities.


Register for free at http://edu.t-bio.info/lp-courses/ to explore course modules:


  • Introduction to Biomedical Data Analysis
  • Transcriptomics: NGS Expression Profiling
  • Genomics: Mutation Variant Analysis
  • Microbiome: Microbial Diversity
  • Epigenetics: ChIP-Seq and WGBS profiling
  • Machine Learning: Understand your data


Each section contains step-by-step explanations from both data and biological perspectives to develop analysis logic that the user can take with them!


A Personalized, Precise Approach to Fighting Cancer

Cancer is the second major cause of mortality in the United States and targeted cancer therapies are bringing about an exponential increase in effectiveness over traditional cancer therapies. Traditional cancer treatment, such as chemotherapy, has come a long way in the past five decades, and care can be delivered comfortably, in an outpatient setting with manageable side-effects.


Cancer is an important challenge for which personalized molecular medicine shows great promise. Recent advances in immunotherapy and genetic testing have been proposed to help transform care from one-size-fits-all to a highly specialized range of options that could be adapted to fit an individual’s molecular features. However, we are still far from understanding and navigating cancer.


The variety of cancer mutations means that effective diagnosis and treatment of cancer must take an individual approach. By sequencing individual cancer genomes, researchers and physicians may develop more effective treatment solutions. A multi-omics approach based on big data analyses could lead to substantial advances, ushering in an exciting new paradigm in cancer treatment.


Accessibility of new technologies such as next-generation sequencing remains a barrier for many patients. The skills needed and the associated costs of technology have so far prevented physicians and patients from using precision oncology to its full potential. The skills and equipment required to collect and analyze genomic data can be expensive. In general, bioinformaticians have to go through rigorous training in Linux, Python, R, databases, visualization and an endless list of new algorithms to even understand the complex datasets.


To encourage more researchers to make use of the potential and power afforded by big data, we designed a free series of lessons which simplify the computational aspects of analysis to allow scientists from all backgrounds to move forward in the world of big data. The cancer series has practical, hands-on projects that allow students to practice analyses with data adapted from real datasets.



The course will cover some of the important aspects of breast cancer:

  1. Molecular indicators of cancer: deregulation of cellular checks and balances, uncontrollable growth, alternate signaling, altered immune responses, etc.
  2. Factors that contribute to cancer heterogeneity – “levels of biological regulation”
  3. Response to treatment studied with cancer cell lines.
  4. Beyond cell lines – PDX models: the role of microenvironment and the use of animals in research.
  5. TCGA data – real patients: miRNA-seq, RNA-seq, Exome-seq and clinical data.
  6. Deeper look into clinical data: combinations of treatments, many ways to diagnose cancer, why molecular data is critical.
  7. Future of Cancer: New treatments and findings that could change current cancer treatments.

Exponential growth of data in biomedicine needs a skilled and experienced workforce

A wealth of public domain biomedical data is available to anyone online; however, data analysis technology can only be utilized by those trained to work with it, leaving researchers without the proper skills and tools drowning in the overwhelming amount of information. It might sound intimidating, but in reality, this is GOOD NEWS!

Data science is a fast-growing branch of the technology landscape, and with IT and analytic skills highly sought after by employers, data scientists pull in hefty salaries for their expertise. There is an urgent need for quick, cost-effective, and accurate data analysis in the field of biomedical research as well. This need is driven by the continually expanding wealth of patient data, coupled with the need to synthesize and understand the data’s importance in medicine.  One of the most exciting areas where new discoveries are being made all the time is in the molecular data field. The datasets there are huge, but every new project brings us closer to understanding complex diseases, evolution and human longevity.

Today, biomedical data is typically analyzed by a trained bioinformatician – a tech-era mash-up of “biologist” and “statistician” who breaks down and interprets large sets of medical data in a clinical or research setting. But for the most part, bioinformaticians have to go through rigorous training in Linux, Python, R, databases, visualization and an endless list of new algorithms that are complex and require deep understanding to use. Just getting to a meaningful project might take years, depending on how well versed you are in all of these technologies. All of these skills essentially turn you into a computer scientist first and then place you under a biologist’s or clinician’s authority to guide “real research”.

But more and more biologists, clinicians and even patients want to play a role in these discoveries. At some point, you had to know Fortran to use a computer – that is, until the Mac and Windows came out with visual interfaces that made the computer a household item. And then kids could play with the infinite scripts (games) that leveraged this complex technology. Well, that’s exactly what we would like to do – let’s make bioinformatics easy, visual and intuitive. We think we can do it together.

To start, we’ve put together a number of projects that one can analyze using our visual bioinformatics platform. Here are a few examples:

“PDX Models: Tumor-stroma interaction” was inspired by the publication “Whole transcriptome profiling of patient-derived xenograft models as a tool to identify both tumor and stromal specific biomarkers” by Dr. James Bradford. The project’s approach focused on comparing several different breast cancer types using RNA-seq and machine learning methods, including 79 PDX mouse models with human primary tumors. PDX models maintain more similarities to the parental tumors than a traditional cell line does. Subsequently, alternative analysis of the experimental data provided deeper insight into the problem and identified new biologically meaningful group-wise associations between tumor and stroma genes.

“Cell Lines: multi-omics network of associations” rests on the publication “Modeling precision treatment of breast cancer” (Daemen et al., 2013). The approach focused on the molecular features associated with responses of a collection of 70 breast cancer cell lines to 90 experimental or approved therapeutic agents. This project included RNA-seq, mutation, and drug response (GI50) data. During analysis, BiAssociation was applied to the expression and mutation variant results together with the drug response values (GI50) to find relationships between the datasets.
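To make the mutation side of this analysis concrete, the sketch below compares drug response between mutant and wild-type cell lines for each gene. It is a simple difference of group means, not the BiAssociation method used in the project. The gene names, mutation flags, and response values are invented for illustration.

```python
from statistics import mean


def mutation_effect(is_mutant, response):
    """Mean response of mutant lines minus mean response of wild-type lines.

    Returns None when either group is empty, since no comparison is possible.
    """
    mutant = [r for flag, r in zip(is_mutant, response) if flag]
    wild_type = [r for flag, r in zip(is_mutant, response) if not flag]
    if not mutant or not wild_type:
        return None
    return mean(mutant) - mean(wild_type)


# Hypothetical toy data for six cell lines. The response is assumed to be
# -log10(GI50), so a larger value means the line is more sensitive to the drug.
response = [7.0, 7.5, 8.0, 5.0, 5.5, 6.0]
mutations = {
    "GENE_X": [True, True, True, False, False, False],  # mutants are the sensitive lines
    "GENE_Y": [True, False, True, False, True, False],  # mutation spread across both groups
    "GENE_Z": [False] * 6,                               # never mutated in this panel
}

effects = {gene: mutation_effect(flags, response) for gene, flags in mutations.items()}
for gene, effect in effects.items():
    label = "not testable" if effect is None else f"{effect:+.2f}"
    print(f"{gene}: mutant minus wild-type response = {label}")
```

With real panels, the group sizes are often very unbalanced, so a rank-based test and a minimum number of mutant lines per gene would be sensible additions.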

Explore these datasets online, and learn more about our bioinformatics platform T-BioInfo, at http://edu.t-bio.info/projects/

Pine Biotech’s T-BioInfo: One of the Top 5 best tools for biomedical data visualization

The completion of the Human Genome Project in 2003 proved to the world the importance and potential of large medical data collection and analysis. Developments in next-generation sequencing and other advances in biotechnology generated a wealth of data, so much that it is sometimes considered a glut, with a bottleneck between data generation and meaningful analysis. A massive amount of data exists online, freely available to anyone, for either direct data mining or combined analysis with self-generated laboratory data.

Harnessing the power of big data is the next frontier for biomedical research. Our product, T-BioInfo, is cutting-edge bioinformatic analytics software, capable of bringing genomic and microbial bioinformatic data to the greater public through our user-friendly interface.

The platform includes analysis tools for RNA-seq, ChIP-seq, bisulfite sequencing, de novo genome and transcriptome assembly, CirSeq, mass spectrometry for proteomics and metabolomics, 3D biopolymer structures and similarity-based docking, unsupervised analyses (machine learning), and more. The platform streamlines the algorithmic steps needed to conduct these analyses and facilitates integration of multiple “omics” analyses. The GUI-based interface is designed so that researchers with little or no bioinformatics background can easily learn to use it.


Check out the article from Labs Explorer here: https://www.labsexplorer.com/c/top-5-best-tools-for-biomedical-data-visualization_45

Educational Seminar At Xavier University

March 6, 2017

Key members of the Pine Biotech team will discuss the promising development of bioinformatics for research and personalized medicine and introduce Pine Biotech’s current educational and research projects. An introduction by CEO Elia Brodsky will be followed by a discussion of Pine Biotech’s curriculum and bioinformatics education approaches by Pine education director Dr. Claudia Copeland. This will be followed by an overview of Pine Biotech’s research projects by Jaclyn Williams, M.S., who has been employing her background in big data bioinformatic analysis in research collaborations between Pine and Haifa University, Stanford University, Boston University, and others.

The Biomedical Data Seminar details are as follows:

I.  Introduction: Elia Brodsky, Pine Biotech CEO (5 mins)

Brief introduction on the importance of big biomedical data analysis for precision medicine diagnostics, therapeutic target discovery and personalized treatment of complex diseases.

II.  Educational curriculum: Dr. Claudia Copeland, Director of Education (15 mins)

Dr. Copeland will focus on training the next generation of data users. In contrast to previous bioinformaticians, who needed to have backgrounds in computer science, this next generation will include biologists and medical professionals in academic research, Pharma R&D, and clinical trials. We emphasize hands-on practical workshops that demystify biomedical data analysis and make it less difficult by giving non-computer-science users supporting tools to get results. The talk will cover the coursework that Dr. Copeland has been preparing as the Director of Education at Pine Biotech.

III. Q&A: 5 minutes

IV.  Project highlights: Jaclyn Williams, M.S., Biologist (15 mins)

Jaclyn’s talk briefly reviews public domain datasets as example projects being prepared to highlight the importance of artifact detection and removal, machine learning for feature selection, and unsupervised analysis methods in heterogeneous datasets. Jaclyn is a biologist and project manager at Pine Biotech.

Projects that could be briefly discussed are:
(1) PDX cancer models for tumor-stroma interaction;
(2) cell line modeling of personalized treatment selection; and
(3) analysis of American Gut microbiome data.

V.  Q&A: 5 minutes


RSVP here: pine-biotech-biomedical-data-seminar

February update from Pine Biotech

This year’s Festival of Genomics in London was a great occasion to meet with companies and research organizations dealing with omics data and hear about the latest technologies and findings in this area. Dr. Priyal de Zoysa attended most of the conference and found that there are many opportunities for diversifying the multi-omics integration functionality already present on the T-BioInfo platform. Prominent among the hot topics presented at FoG was the rapidly evolving area of personalized medicine, which is transforming traditional drug discovery. This topic helped to kick-start our round-table discussion with Pharma and other research representatives on February 2nd and 3rd. We now feel that T-BioInfo can play a major part in that transformation.

Our round-table and one-to-one discussions included over 20 people from various organizations, such as Pistoia Alliance, AstraZeneca, Merck, Intel, ELIXIR, ECRIN, The Institute of Cancer Research, and Genomics England, as well as the Tauber Bioinformatics Research Center. The presentations of our projects and methods were warmly received by our audience. As expected, the interest shown by the participants was fairly diverse, from incorporating real-world evidence into early-stage research and target/therapeutic discovery all the way to complex omics dataset processing in areas such as microbiome or multi-omics integration. It was also exciting to get positive feedback on our newly introduced methods, such as clinical data security and compliance for the clinical version of the T-BioInfo platform and the advanced association algorithm, BiAssociation, which provides new capabilities for integration and multi-stage association network mapping. The presentations can be seen at this link: https://www.dropbox.com/sh/e2kxlc1j9lzk2xs/AAAaWN341UVwBg4KDHzagc-Ia?dl=1


As we continue to build these relationships, our goal is to make sure we are on track for our upcoming commercial release of the T-BioInfo platform for researchers by March. We plan the release of the educational and academic versions of the platform while we continue to work on clinical and translational research versions.

Currently, we are planning another round-table discussion in March in New Orleans that will be linked to strategic partnerships and commercialization plans.