Results of the 2015 Bioinformatics Research Survey

What are some bottlenecks you run into?

  • Resource allocation
  • The non-bioinformatically qualified people making the bioinformatics decisions
  • skills
  • Computational power
  • Lack of computational training and skills make it difficult to use the tools that are relevant & useful for my research
  • File format conversion; determining best approach when many are available (requires a lot of reading); genome assembly and other analyses take a lot of time and even with experience will occasionally fail after like a week of running and need to be restarted
  • data transfer time
  • Dealing with large data volumes
  • low cpu speed
  • slow (old) computer
  • Storage issue, poor documentation
  • other users
  • Data accessibility
  • Space issues – for data and for doing analysis in the server
  • Bandwidth and disk IO
  • Lack of expertise in the field. Lack of easy-to-use software.
  • Little staff, little server
  • Lack of exactness of what is to be done with the data
  • Intense computing time
  • Bioinformatics is prediction. Without wet-lab experiments, in silico drug design has no scope, and wet-lab analysis for some drug design projects is not feasible or not possible to conduct
  • Storage of data
  • IT related problems (installation …)
  • lack of standard data formats
  • File format conversions
  • Lack of training (novice).
  • Visualization, lack of standardized formats, lack of metadata
  • Drive the project towards biological understanding. And to a lesser extent, bioinformatics handling of computer clusters.
  • Computing – multithreading
  • Too many users on the server
  • Adapting tools built for use on human databases. Converting identifiers between databases in a timely manner.
  • I need to move beyond simple scripting.
  • Infrastructure
  • I can find a lot of information on how to do analysis with genomes that are assembled and annotated but little information on the pre-processing steps.
  • Timelines
  • format parsing and installing other people’s software
  • space and memory
  • stand alone tools and software, use some biological databases
  • team
  • Unrealistic expectations
  • complexity of the software and unavailability of required data or computational power
  • Lack of storage space, computational power.
  • PC power
  • Due to limited processing resources it is common to encounter freeze situations where you lose time. Furthermore, it is often not possible to update the software as often as necessary.
  • Expertise
  • data conversion
  • Insufficient knowledge in specific topics
  • memory, storage space
  • Pipeline integration; documentation
  • Lab staff expecting miracles from badly designed experiments
  • lack of good planning from wet lab researchers
  • Processing speed
  • Storage Space, computer speed
  • The vast amount of ram that is required for genome analysis
  • Finding the right tool for the task (due to the huge number of options).
  • Server high demand and waiting time.
  • Mapping miRNA and metabolites to protein coding genes
  • Not enough samples
  • Memory, time for file transfer
  • sequence analysis, searching bibliographic databases etc.
  • Not enough information from PIs
  • Memory limitations. Inability to handle large sequences
  • Big data
  • Downloading software (it’s never as easy as it should be)
  • Inconsistent metadata format
  • Hardware limitations on my laptop
  • Preprocessing standards
  • Institutional server, expertise/training
  • unfamiliarity, lack of knowledge
  • Lack of space
  • Installing and configuring software
  • Figuring out what other people’s tools do. Often this means reading the code, but since I only code in Python and JS, sometimes I don’t/can’t ever know.
  • Waiting for the sequencing core and/or aligners

What problems running software do you most often encounter with your present solution?

  • On graphical tools, not enough command line support
  • Not knowing/understanding syntax rules, file naming rules for existing scripts
  • lack of proper documentation
  • Standard workflows
  • with lack of memory
  • No documentation. Source code is sloppy/unreadable, and there is no clear guide on how to improve specific types of analyses
  • Ram of pc
  • Understanding the process method
  • most bioinformatics software runs on Linux/UNIX
  • Incompatibility or running into walls during installation
  • updating software
  • installation, format issues, and lack of documentation
  • nothing
  • Most of the software that predicts accurately is expensive to use. Even after prediction, results such as drug-like molecules are difficult to test.
  • Standardization of data and lack of ease in handling huge data
  • poor documentation and examples
  • Graphics
  • Using poorly documented programs that are quite difficult to use in a modular way
  • permissions in institutional machines
  • Software suitable for working with polyploid genomes. Solution: combine software or custom scripts.
  • I don’t have any specific information
  • Run-time errors
  • ?
  • installing dependencies, getting all the input files in the correct format
  • not supporting scripts
  • Understanding the output & documentation
  • low speed cpu
  • Bugs in code
  • Unclear input requirements
  • Obsolete versions and functions, lack of clear documentation
  • Indent/variable errors for SQL databases
  • Interconnection with other tools and a lot of headaches trying to solve that
  • galaxy
  • Downloading prerequisite software packages
  • Clear output. Description incomplete.
  • Installation
  • Software crashes
  • Software version differences
  • Mutually incompatible versions
  • Processing time, not actually software related.
  • Failure to run because of missing software dependencies
  • online tools are not reliable; sometimes they work, sometimes they don’t
  • Documentation is terrible and code lacks tests (or test data).
  • Memory usage.
  • Problems with different Linux versions
  • Mysterious errors
  • Limited computational capacity.
  • limited parameters information
  • Horrible documentation
  • installing software on my computer, as well as modifying it according to my needs.
  • None. I’m great at using existing tools.
  • Cpu
  • Downloading. I ask others for help
  • None, though sometimes documentation is lacking
  • CPU issues
  • Nothing
  • Limited resources
  • sometimes it’s hard to understand what’s behind the scripting of the programs
  • installing multiple dependencies
  • Script
  • Difficult to install updated versions of software on the cluster.
  • Poorly documented perl scripts
  • need more data
  • Unfamiliarity with the software.
  • Documentation
  • proper correlation between in silico and in vivo
  • slowness
  • Unusable documentation
  • slow in processing requests or freezing of screen due to low processor speed
  • Ability to configure own workflows
  • many software packages are not compatible with Windows
  • Clear use cases
  • na
  • Differing input/output formats for different programs
  • No problems
  • NA
  • documentation of software, software version compatibility with computer OS, software library dependencies
  • Problems installing software because of dependencies
  • N/A
  • no problem as such
  • Unresolved dependencies, poor documentation of programs
  • Lack of documentation; programs that do not compile;
  • Getting it compiled and working; Having to install same software on multiple systems; Software generally seems to be either too simple/user friendly (very little customization) or very complex with a steep learning curve (10’s of different options with little documentation); it can be difficult to spot minor errors in programming that result in subtle changes in the results; accessing high-memory machines (512GB+); the vast majority of bioinformatics programs have documentation that simply does not explain each and every step taken to get a result
  • software documentation is never explicit
  • Scripting or programming skills.
  • Crashes
  • Most tools are made for Mac users & don’t run properly on Windows. Also, dependencies.
  • I don’t understand this question
  • Unclear installation/performing instructions
  • Lack of adequate documentation, abandonware.
  • Know-how
  • No
  • Lack of documentation
  • big data analytics
  • Nothing serious
  • Installation. Bad/no documentation
  • open source software installation
  • segfaults
  • Not enough compute.
  • Different results obtained with different tools from the same input.
  • Online Forums
  • lack of GUI
  • Documentation, hurdles to learning the software
  • API
  • time
  • I don’t know what the software does. Like I know what it’s supposed to do, but I don’t know the specifics, and that freaks me out.

If you train students, please say what in your view is the biggest challenge for training students in bioinformatics?

  • They lack basic knowledge of bioinformatics
  • I teach a bioinformatics lab section. The hardest thing is keeping folks interested in the more mundane aspects of learning bioinformatics
  • Getting them to stop and think about what they’re doing rather than just randomly trying things until they get an answer they like.
  • fear of programming
  • Lack of interest in research
  • teaching biology as well as Linux
  • scripting
  • That they should stay determined about what they want, because they can get lost in how widely bioinformatics branches out and connects with other fields.
  • lack of programming training
  • If they don’t continue using it, they lose it.
  • My students are ‘wet-lab biologists’; they look at the computer as a black box
  • Lack of computing/programming basics
  • Poor notation/naming
  • Students’ fear of Linux and the CLI
  • Convincing them that once you get over the “hump” of learning about bioinformatics that it is a lot easier.
  • thinking about the broad perspectives
  • Setting their expectations realistically. They view BI as a cure-all solution, not a means to an end. Most don’t realize that processing the data is merely the start
  • Be flexible and ever-changing
  • Establishing the lingo (Sometimes people use the same word to describe 2 different things).
  • commandline
  • Bioinformatics is a field that involves the analysis of biological data as well as coding/programming. But most bioinformatics students lack knowledge of biology.
  • Practicing enough to get over the hump.
  • How diverse you have to be to understand it all
  • Lack of interest in computer science.
  • Lack of jobs in non-NGS data analysis. Lack of substantial mathematical background.
  • For students with a biology or chemistry background, bioinformatics is a dry subject, so they won’t appreciate it.
  • the number of tools to choose from.
  • integration of two different fields like computer and biology is very hard task for them
  • Higher learning curve.
  • Awareness and confidence
  • To perform tasks in a clear manner: don’t leave mess and unclear steps on each step of analysis
  • Lack of software documentation
  • Scope regarding job
  • Programming and interpreting the results
  • Lack of exposure to good infrastructure, which limits their scope of understanding in bioinformatics
  • Script
  • Whole genome analysis and ngs data analysis
  • Connecting students to the concepts
  • Understanding good experimental design
  • lack of CS skills
  • Finding a research topic interesting enough for the frustration of learning to program.
  • Databases not being kept up to date
  • Logical thinking
  • N/A
  • Bringing multi-faceted knowledge at one go/step!
  • Make biologists understand the dry nature of bioinformatics and heuristics and of course, vice versa
  • Future proofing what they learn
  • Understanding the bigger context of the problems they are working on
  • The field of bioinformatics is huge, and it’s hard to impart every required skill to the students. If you teach them how to learn rather than what to do, they will learn the required skills on their own.
  • Good, relevant infrastructure.
  • Getting them to be skeptical about the results of bioinformatics programs
  • interest and commitment
  • getting them command-line competent
  • The lack of computing skills
  • The variety of backgrounds, everyone has different knowledge gaps that need to be filled in.
  • Lack of computational knowledge
  • Overcoming their fear of programming
  • Sequence analysis
View the full results on Google Docs.
