Genetic Diseases



Improvements in genomic medicine have led to quick, accurate, and affordable techniques for diagnosing genetic diseases. Genome-wide studies have increased our understanding of some of the most prevalent chronic diseases such as cardiovascular disease and diabetes.

Attempts to integrate this knowledge into clinical practice are still in the early stages. Improvements are needed in both the integration and implementation of applied genomic medicine.

Diseases such as cystic fibrosis, Huntington’s disease, and X-linked muscular dystrophy have all been characterized in terms of individual gene mutations, yet many genetic diseases have not been characterized at this level.

Additionally, genomic medicine can allow for a more structured approach to diagnosis of diseases by consideration of the genome. This will allow for acceleration of molecular diagnosis and reduction of the duration of empirical treatments and genetic counseling.

Pine Biotech has been involved in a number of studies of important human genetic diseases, and we are currently looking for future partners who would use our platform for their studies. One of these studies used the T-BioInfo platform to perform ChIP-seq, a chromatin immunoprecipitation sequencing analysis, in patients with and without Alzheimer’s disease (AD) to identify single nucleotide polymorphisms (SNPs) that differed across patients. The image above demonstrates the absolute test value for differentiation between AD and control patients. After further analysis, this can be used to identify significant biological networks.




Virtual screening methodologies are used to identify a limited number of promising compounds from the huge number of possibilities present in chemical libraries. This reduces cost and time associated with the identification of good drug targets.

For example, the ZINC database (UCSF) contains over 35 million compounds. Finding a small group of candidate molecules that the researcher should pursue is extremely important. Identifying the right molecule requires a good understanding of its 3D structure, atomic properties, and interactions with proteins and other parts of the cell.

Cancer treatment is an exceptionally challenging area of drug discovery. Targeting cancerous cells while minimizing the toxic effects of the treatment is highly difficult. With targeted cancer therapies, treatment can be personalized and toxicity reduced. With this, a patient has the best chance of successful treatment.

Researchers have recently identified a small molecule called Curaxin as having superior anti-cancer properties and reduced toxicity compared to many “traditional” treatment methods such as chemotherapy. This small molecule works by activating p53 and inhibiting nuclear factor κB (NF-κB).

To test the T-Bioinfo platform’s ability to identify molecules similar in activity to Curaxin, Pine Biotech uses several improved molecular screening methods to guide drug discovery.

The screening of the small-molecule library includes a TRYPOS key-based approach to better characterize the physicochemical properties of the molecules, a linear representation to allow for faster screening, and a clustering process to dramatically increase efficiency while maintaining sufficient accuracy.

This can be seen in the visual above: Pine Biotech (QAlign) identified eighteen molecules with the same active ingredient as Curaxin, while MolSoft, an industry standard, identified only seven.
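The general idea of screening a library by comparing compact molecular representations can be illustrated with a minimal sketch. It assumes each molecule has been reduced to a set of structural keys and that molecules are compared with the Tanimoto coefficient, a common similarity measure in cheminformatics. This is not the QAlign method; the key names, molecule names, and threshold below are hypothetical.

```python
def tanimoto(keys_a, keys_b):
    """Tanimoto (Jaccard) similarity between two sets of structural keys."""
    a, b = set(keys_a), set(keys_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty key sets
    return len(a & b) / len(a | b)


def screen(query_keys, library, threshold=0.5):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(query_keys, keys)) for name, keys in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical structural keys; not real descriptors of any compound.
query = {"aromatic_ring", "tertiary_amine", "carbonyl", "fused_tricycle"}
library = {
    "mol_A": {"aromatic_ring", "tertiary_amine", "carbonyl", "fused_tricycle"},
    "mol_B": {"aromatic_ring", "tertiary_amine", "hydroxyl"},
    "mol_C": {"sulfonamide", "hydroxyl"},
}

hits = screen(query, library, threshold=0.4)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

In this toy library only mol_A and mol_B pass the threshold. A real screen would use computed fingerprints and would typically cluster the library first so that only cluster representatives need to be compared against the query.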


Bioinformatics Agricultural Applications


Grosmannia clavigera is a widespread pathogen of coniferous trees and a common symbiont of the mountain pine beetle (MPB, Dendroctonus ponderosae). G. clavigera is known as blue stain fungus because of the characteristic blue-gray stain that it gives to infected wood. Over the course of an infection, G. clavigera spreads throughout the wood of the host tree. If left unchecked, G. clavigera infections will eventually disrupt the transport of nutrients and water, which will result in the death of the host tree. G. clavigera can also prevent the flow of sap, which drastically lowers the host tree’s resistance to infestation by bark beetles, which, in turn, carry spores and help propagate the fungus. The pine beetle and its associated symbiotic fungus are the cause of a large die-off of pines across North America. The USDA reported an impact on tree life along 900 miles of trail in Colorado and Wyoming and a loss of 18 million hectares of pine forest across western Canada.

To investigate (i) how G. clavigera survives the toxic monoterpenes that trees produce in their bark as a defense mechanism and (ii) which genes are involved in removing monoterpenes and consuming them as a carbon source, wild-type G. clavigera and a G. clavigera ABC transporter mutant were grown on two media: malt extract agar (MEA), an enriched medium containing the full nutrient sources needed for successful fungal growth; and yeast nitrogen base (YNB), a minimal medium lacking both the carbon source and amino acids necessary for fungal growth. Both media were used to grow G. clavigera with and without the addition of monoterpenes.

It was found that, in the absence of common nutrients, the fungus is capable of using monoterpenes as an energy source. The up-regulation of enzymes involved in beta-oxidation of fatty acids in mitochondria suggests that the metabolism of monoterpenes into an energy source relies heavily on the mitochondria. The G. clavigera project shows that the T-Bioinfo platform allows analysis of transcriptome data in poorly understood genomes. The T-Bioinfo platform assists in annotation of the genome and allows for the identification of biochemical mechanisms.



DARPA “Prophecy”

According to the National Institute of Allergy and Infectious Diseases, in the past 20 years, out of the newly recognized pathogens that impact human and animal health, approximately 44 percent are viruses. Many of these pathogens, particularly RNA viruses, are characterized by a high mutation rate that allows them to rapidly adapt to a changing environment, as occurred in the 2009 H1N1 pandemic. Additionally, many viruses undergo more widespread genetic events (e.g., rearrangements, reassortments) that significantly alter the viral genome. These changes can produce virions capable of evading existing vaccine-acquired and convalescent immunity in humans and animals.


The goal of DARPA’s Prophecy (Pathogen Defeat) program [1] was to organize a team of researchers that could identify how genetic events significantly alter the viral genome. Researchers from Stanford University, UCSF, and the Tauber Bioinformatics Research Center at Haifa University teamed up to develop a biological method (CirSeq) and computational approaches to investigate viral adaptation and interaction with hosts (host-pathogen circuitry). As a result, the funding from DARPA helped create the CirSeq method, published in Nature in 2014 [2], and the T-BioInfo platform’s section dedicated to virology.


One of the outcomes of the DARPA project was a close collaboration between researchers studying viral SNVs and host data, including protein-protein interactions and post-translational modifications of proteins. The computational challenges in analyzing these heterogeneous datasets helped define the concept of “fitness”, which can classify a viral mutation as beneficial, neutral, detrimental, or lethal. These characteristics can be mapped onto protein surfaces to show which areas are highly conserved and are therefore better targets for intervention, enabling a more precise approach to vaccine and drug development.
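The four fitness categories can be illustrated with a minimal sketch. It assumes fitness is expressed relative to the wild type (wild type = 1.0) and uses an arbitrary tolerance band around 1.0 for "neutral". The function name, tolerance, mutation labels, and fitness values are hypothetical and are not taken from the project's data or pipeline.

```python
def classify_fitness(relative_fitness, tolerance=0.05):
    """Classify a mutation by its fitness relative to wild type (1.0).

    The cut-offs are illustrative assumptions, not published thresholds.
    """
    if relative_fitness < 0:
        raise ValueError("relative fitness cannot be negative")
    if relative_fitness == 0:
        return "lethal"
    if relative_fitness < 1 - tolerance:
        return "detrimental"
    if relative_fitness <= 1 + tolerance:
        return "neutral"
    return "beneficial"


# Hypothetical mutations and relative fitness values, for illustration only.
example_mutations = {
    "A123G": 1.20,
    "C456U": 1.00,
    "G789A": 0.40,
    "U1011C": 0.00,
}

labels = {name: classify_fitness(w) for name, w in example_mutations.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Positions where most substitutions are labeled detrimental or lethal correspond to the highly conserved regions described above.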


The DARPA project addressed an important challenge posed by dengue, a virus that causes similar symptoms across infections even though four distinct virus types have been defined [3].


Visual Computational Biology

The need for education in computational biology is now greater than ever. Omics and large data sets now underpin many of today’s biggest research questions.

Education initiatives in bioinformatics will allow for communication of digital biology among colleagues and students. Pine Biotech users can utilize both tutorials and guides to act as an introduction into relevant biology that can be understood by users with various backgrounds.

This would allow for researchers at any level to be able to keep up to date with the bioinformatics workflows utilized by Pine Biotech. We believe this will allow for the incorporation of new perspectives in visual computational biology and bioinformatics.

Education for Personalized Medicine




Pine Biotech’s educational courses seek to provide learners with the skills to use this type of technology in their own work. Online educational modules are available on a variety of bioinformatics topics.

Big data represents one of the fastest growing areas in education and one of the biggest challenges currently facing tech companies. Yet this challenge has huge benefits, as it offers the potential to disseminate knowledge to physicians struggling to stay current with the latest clinical practices.

Personalized medicine, made possible by big data, allows treatments to be tailored to an individual’s needs. Testing genetic information before treatment can allow patients to avoid taking medications that may be harmful or ineffective. This allows physicians to tailor each individual patient’s treatment to their genetic information.

For example, the dosage of warfarin, a commonly prescribed anticoagulant, can be adjusted for optimal treatment by understanding the genetic markers that affect a patient’s metabolism. While this field is still developing, a targeted drug therapy could be produced for each individual patient, increasing the success rate of treatment for illnesses such as heart disease and cancer.


To defend against the threats posed by climate change, pathogens, and insects, bioinformatic analysis can be used in the field of agrotechnology to identify and enhance the natural defenses and adaptation of plants to changing conditions.

As the agriculture industry moves into a “data-centric” era, the information we glean will help farmers meet the growing needs of an expanding global population. The demand on agriculture is expected to increase by 70% in the next ten years, making an increase in productivity absolutely vital. Biotechnology, with a focus on “omics”, can help farmers to increase their productivity.

Pine Biotech wants to deliver the best data into farmers’ hands and improve agriculture by providing solutions for farmers to get more value from their crop growth than ever before.

Host-Pathogen Circuitry


Ebola, Chikungunya, Dengue and Zika have all recently been in the news. Many viruses, in particular RNA viruses, have short generation times and relatively high mutation rates (on the order of one point mutation or more per genome per round of replication for RNA viruses). This elevated mutation rate, when combined with natural selection, allows viruses to quickly adapt to changes in their host environment. The rapidity of viral evolution is problematic for the development of antiviral drugs, as resistant mutations often appear within weeks or months after the beginning of treatment. One of the main theoretical models used to study viral evolution is the quasispecies model, which treats the virus population as a quasispecies [1].
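To give a feel for what a rate of about one mutation per genome per replication implies, the following is a small worked calculation. It assumes, as a simplification not stated in the text, that mutations arise independently so that the number of new mutations per genome copy follows a Poisson distribution with a mean of 1.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# Assumed mean: one new point mutation per genome per round of replication.
mean_mutations = 1.0

p_unmutated = poisson_pmf(0, mean_mutations)
p_mutated = 1.0 - p_unmutated

print(f"P(no new mutation)     = {p_unmutated:.3f}")  # 0.368
print(f"P(at least one mutation) = {p_mutated:.3f}")  # 0.632
```

Under this assumption, roughly two out of three newly copied genomes differ from their template, which is why a viral population is better described as a spread of related variants than as a single sequence.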

It is important to remember that the rapid evolution of viruses is not one-sided. The host plays an important role, since the viruses use the cellular mechanisms of the host to replicate. Conceptualizing this interaction as a host-pathogen circuitry is a way to look at the complex relationship between viruses and hosts to better understand viral evolution. In this context, the interaction points and their origin are important to consider, such as viral single nucleotide variations, protein-protein interactions, post-translational modification of proteins, and host response data such as gene expression, genetic variations and even non-coding regions of host RNA.

In such a complex system, a major objective would be to identify key players in a network of dependencies and to select points of contact that could be influenced. This objective can only be achieved if the data that is generated is highly reliable and precise.

Next: Read about Precision Sequencing with CirSeq.

CirSeq: Precision Sequencing


The difficulty in developing a vaccine for RNA viruses lies in their ability to change rapidly and evolve to evade the host immune system and grow resistant to drug treatments. These RNA virus populations are often heterogeneous, composed of a group of related members known as a quasispecies.

Until now, there have been no reliable systems to predict mutations responsible for the emergence of new viral strains. Traditional sequencing methods do not have the resolution needed to identify the mutations that create these subpopulations and thus cannot differentiate the mutations responsible for virus survival and adaptation.

Pine Biotech utilizes a combined biological and computational method to accurately measure the mutations of various RNA viruses, including poliovirus. The biological method, known as CirSeq [1], improves on traditional sequencing methods. CirSeq allows for the identification of individual viral strains within a population through highly accurate sequence data. In parallel, our innovative sequencing platform allows for a new genetic approach to studying the evolution of viruses within the context of their host. CirSeq works by converting viral RNAs into circular molecules. True viral mutations are retained in the circular copies, while sequencing errors can be filtered out.
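One way to see how repeated copies of the same template separate real mutations from errors is a simple consensus calculation. The sketch below assumes that each read contains three tandem copies of one circularized template and that a plain majority vote is taken at each position. It is an illustration of the principle only, not the published CirSeq pipeline, and the function name and example read are hypothetical.

```python
from collections import Counter


def consensus_from_repeats(read, n_repeats=3):
    """Majority-vote consensus across tandem copies contained in one read.

    Positions without a strict majority are reported as 'N'.
    """
    if len(read) % n_repeats != 0:
        raise ValueError("read length must be a multiple of n_repeats")
    unit = len(read) // n_repeats
    copies = [read[i * unit:(i + 1) * unit] for i in range(n_repeats)]

    consensus = []
    for column in zip(*copies):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count > n_repeats // 2 else "N")
    return "".join(consensus)


# Hypothetical read: the template AUGUA is copied three times.
# The second copy carries an isolated error (U -> C at position 4).
read = "AUGUA" + "AUGCA" + "AUGUA"
print(consensus_from_repeats(read))  # AUGUA
```

A genuine mutation in the viral RNA appears in every copy and so survives the vote, whereas an error introduced while copying or sequencing usually affects only one copy and is outvoted.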

When CirSeq was applied to poliovirus, researchers were able to define single nucleotide mutation rates and establish a platform for studying the underlying evolution of virus populations in human cells. Additionally, this method can be used to determine the fitness of each base at every position in the genome, allowing analytical determination of which bases are neutral and which have been positively or negatively selected. This information can, in turn, be used to develop new techniques.

Next: Read about Virology Studies on T-BioInfo.

Virology Studies on T-BioInfo

The T-BioInfo platform was first used by the team at Pine Biotech to conduct virology studies. Genome-wide fitness calculations enabled by CirSeq, combined with structural information, can provide high-definition, bias-free insights into structure-function relationships, potentially revealing novel functions for viral proteins and RNA structures, as well as nuanced insights into a viral genome’s phenotypic space. Such analyses have the power to reveal protein residues or domains that directly correspond to viral functional plasticity and may significantly inform our structural and mechanistic understanding of host–pathogen interactions.



For more information on the T-BioInfo platform, please visit the Pine Biotech T-BioInfo website.