  • DaTo: an atlas of biological databases and tools

  • Nucleotide Sequence Databases

    • NCBI - National Center for Biotechnology Information
    • EBI - European Bioinformatics Institute
    • DDBJ - DNA Data Bank of Japan
  • Protein Sequence Databases

    • SWISS-PROT & TrEMBL - Protein sequence database and computer annotated supplement
    • UniProt - UniProt (Universal Protein Resource) is the world's most comprehensive catalog of information on proteins. It is a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR.
    • PIR - Protein Information Resource
    • HUPO - HUman Proteome Organization
  • Database Searching by Sequence Similarity

  • Sequence Alignment

    • T-COFFEE - multiple sequence alignment
    • ClustalW2 @ EBI - multiple sequence alignment
    • BOXSHADE - pretty printing and shading of multiple alignments
    • Splign - Splign is a utility for computing cDNA-to-Genomic, or spliced sequence alignments. At the heart of the program is a global alignment algorithm that specifically accounts for introns and splice signals.
    • Spidey - an mRNA-to-genomic alignment program
    • SIM4 - a program to align cDNA and genomic DNA
    • PipMaker - computes alignments of similar regions in two (long) DNA sequences
    • VISTA - align + detect conserved regions in long genomic sequences
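Several entries above rely on global alignment (the Splign description, for example, mentions a global alignment algorithm at its core). The following is a minimal sketch of plain Needleman-Wunsch scoring, not the method of any listed tool: it ignores introns and splice signals, and the function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Plain Needleman-Wunsch with a linear gap penalty, keeping only
    two rows of the dynamic-programming table in memory.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "ACT"))
```

A spliced aligner would extend this recurrence with a separate, cheaper penalty for long gaps that start and end at splice signals; that extension is not shown here.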
  • Human Genome Databases


  • Next Generation Sequencing

    • Journal Bioinformatics for Next Generation Sequencing
    • Integrated solutions
      * CLCbio Genomics Workbench - de novo and reference assembly of Sanger, Roche FLX, Illumina, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection, ChIP-seq, browser and other features. Windows, Mac OS X and Linux.
      * Galaxy - interactive and reproducible genomics. A web portal for running analysis jobs.
      * Genomatix - Integrated Solutions for Next Generation Sequencing data analysis.
      * JMP Genomics - Next gen visualization and statistics tool from SAS. They are working with NCGR to refine this tool and produce others.
      * NextGENe - de novo and reference assembly of Illumina, SOLiD and Roche FLX data. Uses a novel Condensation Assembly Tool approach where reads are joined via "anchors" into mini-contigs before assembly. Includes SNP detection, ChIP-seq, browser and other features. Commercial. Win or MacOS.
      * SeqMan Genome Analyser - Software for Next Generation sequence assembly of Illumina, Roche FLX and Sanger data integrating with Lasergene Sequence Analysis software for additional analysis and visualization capabilities. Can use a hybrid templated/de novo approach. Commercial. Win or Mac OS X.
      * SHORE - SHORE, for Short Read, is a mapping and analysis pipeline for short DNA sequences produced on an Illumina Genome Analyzer. A suite created by the 1001 Genomes project. Source for POSIX.
      * SlimSearch - Fledgling commercial product.

      Align/Assemble to a reference
      * BFAST - Blat-like Fast Accurate Search Tool. Written by Nils Homer, Stanley F. Nelson and Barry Merriman at UCLA.
      * Bowtie - Ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of 25 million reads per hour on a typical workstation with 2 gigabytes of memory. Uses a Burrows-Wheeler-Transformed (BWT) index. Written by Ben Langmead and Cole Trapnell. Linux, Windows, and Mac OS X.
      * Exonerate - Various forms of pairwise alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. Authors are Guy St C Slater and Ewan Birney from EMBL. C for POSIX.
      * GenomeMapper - GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. A tool created by the 1001 Genomes project. Source for POSIX.
      * GMAP - GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. Developed by Thomas Wu and Colin Watanabe at Genentech. C/Perl for Unix.
      * gnumap - The Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. It seeks to align reads from nonunique repeats using statistics. From authors at Brigham Young University. C source/Unix.
      * MAQ - Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data. Written by Heng Li from the Sanger Centre. Features extensive supporting tools for DIP/SNP detection, etc. C++ source
      * MOSAIK - MOSAIK produces gapped alignments using the Smith-Waterman algorithm. Features a number of support tools. Support for Roche FLX, Illumina, SOLiD, and Helicos. Written by Michael Strömberg at Boston College. Win/Linux/MacOSX
      * Novocraft - Tools for reference alignment of paired-end and single-end Illumina reads. Uses a Needleman-Wunsch algorithm. Can support Bis-Seq. Commercial. Available free for evaluation, educational use and for use on open not-for-profit projects. Requires Linux or Mac OS X.
      * PASS - Supports Illumina, SOLiD and Roche-FLX data formats and allows fine tuning of alignment sensitivity. Uses a spaced-seed initial filter, then a Needleman-Wunsch dynamic programming step, followed by a Smith-Waterman-like local alignment. Authors are from CRIBI in Italy. Win/Linux.
      * RMAP - Assembles 20 - 64 bp Illumina reads to a FASTA reference genome. By Andrew D. Smith and Zhenyu Xuan at CSHL. (published in BMC Bioinformatics). POSIX OS required.
      * SHRiMP - Assembles to a reference sequence. Developed with Applied Biosystem's colourspace genomic representation in mind. Authors are Michael Brudno and Stephen Rumble at the University of Toronto. POSIX.
      * Slider - An application for the Illumina Sequence Analyzer output that uses the probability files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. Authors are from BCGSC.
      * SOAP - SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The updated version uses a BWT. Can call SNPs and INDELs. Author is Ruiqiang Li at the Beijing Genomics Institute. C++, POSIX.
      * SSAHA2 - SSAHA (Sequence Search and Alignment by Hashing Algorithm) is a tool for rapidly finding near exact matches in DNA or protein databases using a hash table. Developed at the Sanger Centre by Zemin Ning, Anthony Cox and James Mullikin. C++ for Linux/Alpha.
      * SOCS - Aligns SOLiD data. SOCS is built on an iterative variation of the Rabin-Karp string search algorithm, which uses hashing to reduce the set of possible matches, drastically increasing search speed. Authors are Ondov B, Varadarajan A, Passalacqua KD and Bergman NH.
      * SWIFT - The SWIFT suite is a software collection for fast index-based sequence comparison. It contains SWIFT, a fast local alignment search that guarantees to find epsilon-matches between two sequences, and SWIFT BALSAM, a very fast program to find semiglobal non-gapped alignments based on k-mer seeds. Authors are Kim Rasmussen (SWIFT) and Wolfgang Gerlach (SWIFT BALSAM).
      * Vmatch - A versatile software tool for efficiently solving large scale sequence matching tasks. Vmatch subsumes the software tool REPuter, but is much more general, with a very flexible user interface, and improved space and time requirements. Essentially a large string matching toolbox. POSIX.
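Some of the aligners listed above (Bowtie, and the updated SOAP) are described as using a Burrows-Wheeler-transformed index. The sketch below shows the underlying idea only: a naive transform built from sorted rotations and an exact-match count by backward search. It is not the implementation of either tool; the function names are hypothetical, and real indexes use suffix arrays and precomputed rank tables instead of the repeated string scans used here.

```python
def bwt(text):
    """Burrows-Wheeler transform of text using sorted rotations (naive)."""
    text += "$"  # sentinel; sorts before the letters A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str, pattern):
    """Count exact occurrences of pattern in the original text.

    Uses backward search: the pattern is consumed from its last
    character to its first, narrowing a range of sorted rotations.
    """
    # first[c] = number of characters in the text that sort before c
    first = {}
    total = 0
    for c in sorted(set(bwt_str)):
        first[c] = total
        total += bwt_str.count(c)

    def occ(c, i):
        # occurrences of c in bwt_str[:i]; a real index precomputes this
        return bwt_str[:i].count(c)

    lo, hi = 0, len(bwt_str)
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ(c, lo)
        hi = first[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("ACGTACGT")
    print(index)
    print(count_occurrences(index, "ACG"))
```

The search cost depends on the pattern length rather than the genome length once rank lookups are constant time, which is the property short-read aligners exploit.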

      De novo Align/Assemble
      * ABySS - Assembly By Short Sequences. ABySS is a de novo sequence assembler that is designed for very short reads. The single-processor version is useful for assembling genomes up to 40-50 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes. By Simpson JT and others at Canada's Michael Smith Genome Sciences Centre. C++ as source.
      * ALLPATHS - ALLPATHS: De novo assembly of whole-genome shotgun microreads. ALLPATHS is a whole genome shotgun assembler that can generate high quality assemblies from short reads. Assemblies are presented in a graph form that retains ambiguities, such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. Broad Institute.
      * Edena - Edena (Exact DE Novo Assembler) is an assembler dedicated to processing the millions of very short reads produced by the Illumina Genome Analyzer. Edena is based on the traditional overlap layout paradigm. By D. Hernandez, P. François, L. Farinelli, M. Osteras, and J. Schrenzel. Linux/Win.
      * SEQAN - A Consistency-based Consensus Algorithm for De Novo and Reference-guided Sequence Assembly of Short Reads. By Tobias Rausch and others. C++, Linux/Win.
      * SHARCGS - De novo assembly of short reads. Authors are Dohm JC, Lottaz C, Borodina T and Himmelbauer H. from the Max-Planck-Institute for Molecular Genetics.
      * SSAKE - The Short Sequence Assembly by K-mer search and 3' read Extension (SSAKE) is a genomics application for aggressively assembling millions of short nucleotide sequences by progressively searching for perfect 3'-most k-mers using a DNA prefix tree. Authors are René Warren, Granger Sutton, Steven Jones and Robert Holt from Canada's Michael Smith Genome Sciences Centre. Perl/Linux.
      * SOAPdenovo - Part of the SOAP suite. See above.
      * VCAKE - De novo assembly of short reads with robust error correction. An improvement on early versions of SSAKE.
      * Velvet - Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Needs about 20-25X coverage and paired reads. Developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI).
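The SSAKE entry above describes assembly by progressive 3' extension of a seed using perfect overlaps. The following is a simplified sketch of that greedy idea, not SSAKE's actual code: it scans a plain list instead of a prefix tree, ignores reverse complements and sequencing errors, and the function name and the `min_overlap` parameter are assumptions.

```python
def greedy_extend(seed, reads, min_overlap=4):
    """Greedily extend seed in the 3' direction.

    At each step, pick the unused read whose prefix has the longest
    exact overlap (at least min_overlap bases) with the contig's
    3' end, and append the non-overlapping remainder of that read.
    """
    contig = seed
    unused = list(reads)
    if seed in unused:
        unused.remove(seed)  # the seed itself is already in the contig

    while True:
        best = None  # (overlap length, read)
        for read in unused:
            longest = min(len(read) - 1, len(contig))
            for olen in range(longest, min_overlap - 1, -1):
                if contig.endswith(read[:olen]):
                    if best is None or olen > best[0]:
                        best = (olen, read)
                    break  # longest overlap for this read found
        if best is None:
            return contig  # no read extends the contig any further
        olen, read = best
        contig += read[olen:]
        unused.remove(read)


if __name__ == "__main__":
    reads = ["ACGTTG", "GTTGCA", "TGCAAG", "CAAGGC", "AAGGCT"]
    print(greedy_extend("ACGTTG", reads))
```

Greedy extension of this kind is fast but can misassemble repeats, which is one reason graph-based assemblers such as those listed above retain ambiguities instead of resolving them immediately.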

      SNP/Indel Discovery
      * ssahaSNP - ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence. Highly repetitive elements are filtered out by ignoring those kmer words with high occurrence numbers. More tuned for ABI Sanger reads. Developers are Adam Spargo and Zemin Ning from the Sanger Centre. Compaq Alpha, Linux-64, Linux-32, Solaris and Mac
      * PyroBayes - PyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines. It was designed to assign more accurate base quality estimates to the 454 pyrosequences. Developers at Boston College.

      Genome Annotation/Genome Browser/Alignment Viewer/Assembly Database
      * EagleView - An information-rich genome assembler viewer. EagleView can display a dozen different types of information including base quality and flowgram signal. Developers at Boston College.
      * LookSeq - LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data. LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualization of putative single nucleotide and structural variation. From the Sanger Centre.
      * MapView - MapView: visualization of short reads alignment on desktop computer. From the Evolutionary Genomics Lab at Sun-Yat Sen University, China. Linux.
      * SAM - Sequence Assembly Manager. Whole Genome Assembly (WGA) Management and Visualization Tool. It provides a generic platform for manipulating, analyzing and viewing WGA data, regardless of input type. Developers are Rene Warren, Yaron Butterfield, Asim Siddiqui and Steven Jones at Canada's Michael Smith Genome Sciences Centre. MySQL backend and Perl-CGI web-based frontend/Linux.
      * STADEN - Includes GAP4. GAP5, once completed, will handle next-gen sequencing data. A partially implemented test version is available.
      * XMatchView - A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada's Michael Smith Genome Sciences Centre. Python/Win or Linux.

      Counting e.g. ChIP-Seq, Bis-Seq, CNV-Seq
      * BS-Seq - The source code and data for the "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning" Nature paper by Cokus et al. (Steve Jacobsen's lab at UCLA). POSIX.
      * CHiPSeq - Program used by Johnson et al. (2007) in their Science publication
      * CNV-Seq - CNV-seq, a new method to detect copy number variation using high-throughput sequencing. Chao Xie and Martti T Tammi at the National University of Singapore. Perl/R.
      * FindPeaks - performs analysis of ChIP-Seq experiments. It uses a naive algorithm for identifying regions of high coverage, which represent Chromatin Immunoprecipitation enrichment of sequence fragments, indicating the location of a bound protein of interest. Original algorithm by Matthew Bainbridge, in collaboration with Gordon Robertson. Current code and implementation by Anthony Fejes. Authors are from Canada's Michael Smith Genome Sciences Centre. Java/OS independent. Latest versions available as part of the Vancouver Short Read Analysis Package.
      * MACS - Model-based Analysis for ChIP-Seq. MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. Written by Yong Zhang and Tao Liu from Xiaole Shirley Liu's Lab.
      * PeakSeq - PeakSeq: Systematic Scoring of ChIP-Seq Experiments Relative to Controls. A two-pass approach for scoring ChIP-Seq data relative to controls. The first pass identifies putative binding sites and compensates for variation in the mappability of sequences across the genome. The second pass filters out sites that are not significantly enriched compared to the normalized input DNA and computes a precise enrichment and significance. By Rozowsky J et al. C/Perl.
      * QuEST - Quantitative Enrichment of Sequence Tags. Sidow and Myers Labs at Stanford. From the 2008 publication Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. (C++)
      * SISSRs - Site Identification from Short Sequence Reads. BED file input. Raja Jothi @ NIH. Perl.
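The FindPeaks entry above mentions a naive algorithm that identifies regions of high coverage. A minimal sketch of that general idea follows, assuming reads are given as half-open (start, end) intervals on a single sequence. It is not the FindPeaks implementation; the function names and the `min_depth` threshold are hypothetical, and it does no statistical modelling of the kind MACS or PeakSeq describe.

```python
def coverage(reads, genome_length):
    """Per-base read depth from half-open (start, end) intervals."""
    # Difference array: +1 where a read starts, -1 just past its end.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depths = []
    depth = 0
    for delta in diff[:genome_length]:
        depth += delta
        depths.append(depth)
    return depths


def call_peaks(depths, min_depth):
    """Return half-open (start, end) regions with depth >= min_depth."""
    peaks = []
    start = None
    for pos, depth in enumerate(depths):
        if depth >= min_depth and start is None:
            start = pos  # entering an enriched region
        elif depth < min_depth and start is not None:
            peaks.append((start, pos))  # leaving the region
            start = None
    if start is not None:
        peaks.append((start, len(depths)))
    return peaks


if __name__ == "__main__":
    reads = [(0, 5), (2, 7), (3, 8), (12, 15)]
    depths = coverage(reads, 20)
    print(call_peaks(depths, min_depth=2))
```

A fixed depth threshold is the simplest possible criterion; comparing against a control sample or a local background model is what distinguishes the more elaborate tools in this list.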

      Alternate Base Calling
      * Rolexa - R-based framework for base calling of Solexa data. Project publication
      * Alta-cyclic - "a novel Illumina Genome-Analyzer (Solexa) base caller"

      Transcriptomics (RNA-Seq)
      * ERANGE - Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq. Supports Bowtie, BLAT and ELAND. From the Wold lab.
      * G-Mo.R-Se - G-Mo.R-Se is a method aimed at using RNA-Seq short reads to build de novo gene models. First, candidate exons are built directly from the positions of the reads mapped on the genome (without any ab initio assembly of the reads), and all the possible splice junctions between those exons are tested against unmapped reads. From CNS in France.
      * MapNext - MapNext: A software tool for spliced and unspliced alignments and SNP detection of short sequence reads. From the Evolutionary Genomics Lab at Sun-Yat Sen University, China.
      * QPalma - Optimal Spliced Alignments of Short Sequence Reads. Authors are Fabio De Bona, Stephan Ossowski, Korbinian Schneeberger, and Gunnar Rätsch. A paper is available.
      * TopHat - TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. TopHat is a collaborative effort between the University of Maryland and the University of California, Berkeley


  • Rice databases
    BGI-RISe: Beijing Genomics Institute Rice Information System
    RGP: Rice Genome Research Program
    RiceNetDB: Described as the most comprehensive regulatory database on Oryza sativa based on genome annotation. It is organized in three levels (GEM, PPIs and GRNs) to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
    MOsDB: MIPS Oryza sativa database
    Oryzabase: Rice genetics and genomics
    Oryza Tag Line database: T-DNA insertion mutants of rice
    RRIN: Oryza sativa protein-protein interactions network
  • Plant Promoter and Regulatory Element Resources
    AGRIS Currently contains two databases, AtcisDB (Arabidopsis thaliana cis-regulatory database) and AtTFDB (Arabidopsis thaliana transcription factor database).
    AthaMap A genome-wide map of putative transcription factor binding sites in Arabidopsis thaliana.
    AtProbe The Arabidopsis thaliana promoter binding element database, an aid to find binding elements and check data against the primary literature.
    DATF: Database of Arabidopsis Transcription Factors (DATF) contains known and predicted Arabidopsis transcription factors with sequences and many other features including 3D structure templates, EST expression information, transcription factor binding sites and Nuclear Location Signals.
    DoOP: Databases of Orthologous Promoters, a database containing orthologous clusters of promoters from Homo sapiens, Arabidopsis thaliana and other organisms.
    GRASSIUS A public web resource composed by a collection of databases, computational and experimental resources that relate to the control of gene expression in the grasses, and their relationship with agronomic traits. GRASSIUS currently contains regulatory information on maize, rice, sorghum and sugarcane.
    PlantCare Database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.
    PlantProm DB Database with annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start sites (TSS) from various plant species.
    PlantTFDB: Plant Transcription Factor Database, an integrative plant transcription factor database that provides a web interface to access large (close to complete) sets of transcription factors of several plant species, currently encompassing Arabidopsis thaliana (thale cress), Populus trichocarpa (poplar), Oryza sativa (rice), Chlamydomonas reinhardtii and Ostreococcus tauri.
    PPDB (Plant Promoter DB) Database that provides transcription start sites (TSS) and other structural information for Arabidopsis and rice promoters.
    Transfac Database on eukaryotic transcription factors, their genomic binding sites and DNA-binding profiles. Commercial site.


  • Databases of other Organisms


  • Genome-wide Analysis

    • MBGD - comparative analysis of completely sequenced microbial genomes
    • COGs - phylogenetic classification of orthologous proteins from complete genomes
    • STRING - detect whether a given query gene occurs repeatedly with certain other genes in potential operons
    • Pedant - automatic whole genome annotation
    • GeneCensus - various whole genome comparisons


  • Protein Domains: Databases and Search Tools

    • InterPro - integration of Pfam, PRINTS, PROSITE, SWISS-PROT + TrEMBL
    • PROSITE - database of protein families and domains
    • Pfam - alignments and hidden Markov models covering many common protein domains
    • SMART - analysis of domains in proteins
    • ProDom - protein domain database
    • PRINTS Database - groups of conserved motifs used to characterise protein families
    • Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins
    • TIGRFAMs - protein families based on hidden Markov models


  • Motif and Pattern Search in Sequences

    • Gibbs Motif Sampler - identification of conserved motifs in DNA or protein sequences
    • AlignACE Homepage - gene regulatory motif finding
    • MEME  - motif discovery and search in protein and DNA sequences
    • SAM - tools for creating and using Hidden Markov Models
    • Pratt - discover patterns in unaligned protein sequences
    • Motivated Proteins - a web facility for exploring small hydrogen-bonded motifs 


  • Protein 3D Structure

    • PDB - protein 3D structure database
    • RasMol / Protein Explorer - molecule 3D structure viewers
    • UCL BSM CATH classification
    • FSSP - fold classification based on structure-structure alignment of proteins
    • SWISS-MODEL - homology modeling server
    • Structure Prediction Meta-server
    • K2 - protein structure alignment
    • DALI - 3D structure alignment server
    • DSSP - defines secondary structure and solvent exposure from 3D coordinates
    • HSSP Database - Homology-derived Secondary Structure of Proteins
    • PredictProtein & PHD - predict secondary structure, solvent accessibility, transmembrane helices, and other structural features
    • Jpred3 - protein secondary structure prediction
    • PSIpred (& MEMSAT & GenTHREADER) - protein secondary structure prediction (& transmembrane helix prediction & tertiary structure prediction by threading)
    • Structural analysis
      Pfam: A collection of protein families
      SMART: Simple Modular Architecture Research Tool
      SCOP: Structural Classification of Proteins
      MEME: Motif-based sequence analysis tools
      Jpred3: A Secondary Structure Prediction Server
      PSIPRED: A highly accurate method for protein secondary structure prediction
      DiANNA: Cysteine state and Disulfide Bond partner prediction
      Robetta: Full-chain protein structure prediction server
      PhyML: Phylogenetic Tree by Maximum Likelihood
    • Molecular Graphics Software Links
      • BioBlender 
        Open Source viewer that includes features for morphing proteins and visualization of lipophilic and electrostatic potentials.
      • BRAGI 
        A protein visualization and modeling program
      • CCP4mg 
        Create beautiful publication quality images and movies. Users can superpose and analyse structures as well. The program runs 'out of the box' on Linux, MacOSX and Windows platforms.
      • Chimera 
        Interactive molecular modeling system, free to academic/non-profit; displays multiple sequence alignments and associated structures, atom-type and H-bond identification, molecular dynamics trajectories (AMBER format), and offers ligand-screening interface (DOCK), filter by number/position of H-bonds, and extensibility to create custom modules - for Windows, Linux, Mac OS X, IRIX, and Tru64 Unix
      • Cn3D 
        Simultaneously displays structure, sequence, and alignment, with annotation and alignment editing features, for use with 3-D structures from NCBI's Entrez; available for Windows, Macintosh, and Unix
      • CrystalMaker 
        A program for building, displaying and manipulating all kinds of crystal and molecular structures.
      • ePMV 
        Embedded Python Molecular Viewer (ePMV) is an open-source plug-in that runs molecular modeling software directly inside of professional 3D animation applications
      • Foldit 
        Foldit is a crowdsourcing computer game based on protein modeling.
      • iMol 
        Open GL graphics program displays small, large, and multiple molecules; measures distances and angles, superimposes structures, calculates RMSD between atom coordinates, structurally aligns chains, and displays dynamics trajectories. For Mac OS X incl. 10.2
      • Jmol 
        Jmol is a free, open source molecule viewer for students, educators, and researchers in chemistry and biochemistry. It is cross-platform, running on Windows, Mac OS X, and Linux/Unix systems.
      • Mage and Kinemages
        Interactive molecular display for research and educational uses. Free, open source for Windows and Mac (OSX or PPC), Unix, and Linux. A Java version does 3-D Web display without plug-ins.
      • Marvin
        Marvin is a collection of tools for drawing, displaying and characterizing chemical structures, queries, macromolecules and reactions for all operating systems, web pages and custom applications.
      • MembraneEditor
        Interactively generate heterogeneous PDB-based membranes with varying lipid compositions and semi-automatic protein placement. Supports membrane patches and vesicles, microdomains as well as stacking of monolayer and/or bilayer membranes.
      • Molecule World
        Molecule World 2.1 is an iPad application for viewing and manipulating 3D chemical and molecular structures. Structures can be downloaded and displayed from the PubChem, PDB, and NCBI structure databases together with the sequences for proteins and nucleic acids. Structures can be drawn as tubes, ball and stick, or space filling modes. Coloring options include residue, charge, hydrophobicity, rainbow, and molecule. Parts of structures can be hidden or displayed with mixed coloring and drawing modes.
      • Molecule World for iPhone
        Molecule World for iPhone can be used on the iPhone or iPod touch to display and manipulate 3D chemical and molecular structures from the PubChem, PDB, or NCBI structure databases. Drawing options include ball and stick and space fill modes. Coloring options include rainbow, residue, charge, hydrophobicity, and molecule. Proteins, nucleic acids, and heterogens can be displayed in different modes.
      • Molecule World DNA Binding Lab
        A classroom-ready iPad application for exploring the ways chemicals and proteins bind to DNA. The DNA Binding Lab uses Molecule World's rendering engine and display features to highlight different molecules and understand how they interact. The DNA Binding Lab includes instructions, three examples, and 40 unknowns that can be assigned to students. Photo sharing capabilities allow students to share their work with teachers to aid with assessment.
      • Molecules
        An iPhone application for PDB structures
      • MolScript
        A program for displaying structures in both detailed and schematic formats and writing images in various formats for Unix
      • MolviZ.org
        Free, interactive visualization tutorials
      • MVM
        Molecular Visualization Program and GUI of ZMM. MVM is a free molecular viewer that can be used to display proteins, nucleic acids, oligosaccharides, and other small molecules and macromolecules. It has an intuitive interface. In addition to being a molecular viewer, it is the user interface of a very powerful molecular mechanics engine (ZMM).
      • PMV (Python Molecular Viewer) 
        An interactive molecular visualization and modeling environment for manipulation and viewing of multiple molecules.
      • PocketMol 
        Program to view and manipulate PDB files on a PocketPC 
      • POLYVIEW 
        • POLYVIEW-2D
          Protein structure annotation using sequence profiles
        • POLYVIEW-3D
          Versatile annotation and high quality visualization of macromolecular structures
        • POLYVIEW-MM
          Analysis and visualization of macromolecular motions
      • ProteinScope 
        Free viewer to display and manipulate PDB files and create animations and slides of proteins for Windows. Online ordering of protein 3D prints in several color schemes.
      • Prosat 
        Mapping protein sequence annotations onto a protein structure and visualizing them simultaneously with the structure.
      • PyMOL 
        A free and open-source molecular graphics system for visualization, animation, editing, and publication-quality imagery. PyMOL is scriptable and can be extended using the Python language. Supports Windows, Mac OSX, Unix, and Linux
      • QuteMol 
        An open source (GPL), interactive, high quality molecular visualization system. QuteMol exploits current GPU capabilities through OpenGL shaders to offer an array of innovative visual effects.
      • RasMol
        A free viewing system for PDB coordinate files that runs on Mac (PPC), Windows, Unix, and Linux systems. Open source versions are also available.
      • Raster3D
        A set of tools for generating high quality raster images of proteins or other molecules. Freeware for Mac OSX, Windows, Unix, and Linux
      • RasTop (v. 2.0) 
        A free user-friendly graphical interface to the RasMol molecular visualization software, available for Windows and Linux
      • Ribbons
        A program for molecular illustration and error analysis, for Mac OSX, Windows, Unix, and Linux
      • RCSB MBT Viewers 
        The MBT toolkit is a framework for creating various viewers. It is used for 4 different viewers on the RCSB PDB web site.
      • RmscopII 
        A Tcl/Tk script that redirects PDB files or RasMol scripts to multiple RasMol sessions; can be used as a Web browser helper application or as a standalone program for Mac (OSX or PPC), Windows, or Unix
      • Schrödinger Product Suites 
        Schrödinger's full product offerings range from general molecular modeling programs to a comprehensive suite of drug design software, as well as a state-of-the-art suite for materials research. All products are run with Maestro, a unified interface for all Schrödinger software, which is available for Mac, Windows, and Linux.
      • SPADE 
        The Structural Proteomics Application Development Environment (SPADE) provides community tools for development and deployment of essential structure and sequence equipment. Includes a chemical probing suite to support experimental verification of predicted structural models. Written in Python with scripting tools available. Runs on Windows, Linux and Mac.
      • STRAP 
        Align proteins by sequence and 3D structure.
      • Swiss PDB viewer
        A 3D graphics and molecular modeling program for the simultaneous analysis of multiple models and for model-building into electron density maps. The software is available for Mac (OSX or PPC), Windows, Linux, or SGI
      • UGENE 
        A free and open-source tool with PDB format visualization support written in fast memory efficient C++ code. Supports Windows, Mac OSX, Unix, and Linux.
      • VMD
        VMD (Visual Molecular Dynamics) runs on many platforms including MacOS X, and several versions of Unix and Windows. VMD provides visualization, analysis, and Tcl/Python scripting features, and has recently added sequence browsing and volumetric rendering features. VMD is distributed free of charge.
      • YASARA 
        A complete molecular graphics and modeling program, including interactive molecular dynamics simulations, structure determination, analysis and prediction, docking, movies and eLearning for Windows, Linux and MacOSX.
      • Zeus 
        A molecular visualization tool that supports PDB, MOL, MOL2/SYBYL and XYZ file formats. The rendering engine can output high quality molecular graphics. Zeus provides a sequence search that can highlight matches within the molecular structure. Ramachandran plots of internal dihedral angles can be generated and exported. PDB files can be automatically downloaded from the RCSB PDB.


  • Phylogeny & Taxonomy


  • Gene Prediction


  • Gene Expression Databases


  • Gene Regulation

    • TRAFAC - For identifying conserved and shared cis regulatory elements between a pair of genes.
    • CisMols - For identifying conserved and shared cis regulatory elements between a set of co-expressed genes.
    • EPD - eukaryotic promoter database
    • DBTSS - DataBase of Transcriptional Start Sites (human)
    • SCPD - Saccharomyces cerevisiae promoter database
    • DCPD - Drosophila Core Promoter Database
    • RegulonDB - a database on transcriptional regulation in E. coli
    • DPInteract - protein binding sites on E. coli DNA
    • PromoterInspector - prediction of promoter regions in mammalian genomic sequences
    • MatInspector - search for transcription factor binding sites
    • Cister - cis-element cluster finder
    • Gene regulatory Tools


  • Small RNA/MicroRNA小分子RNA
    • microRNA.org: microRNA Targets & Expression Profiles
    • PmirKB - Plant microRNA Knowledge Base
    • MicroRNA Target Prediction
      miRanda - miRNA target prediction for human, Drosophila and zebrafish genomes
      miRBase - a comprehensive repository for miRNAs and their predicted targets
      miRDB - an online database for miRNA target prediction and functional annotations in animals
      miRNAMap - genomic maps of microRNA genes and their target genes in mammalian genomes
      miR2Disease - a database providing a comprehensive resource on miRNA deregulation in various human diseases
      TarBase - a comprehensive database of experimentally supported animal microRNA targets
      PicTar - microRNA targets for vertebrates, fly and nematodes
      TargetScan - a search for the presence of conserved sites that match the seed of each miRNA
      Target Gene Prediction at EMBL - miRNA target predictions for Drosophila miRNAs
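
The seed-matching idea mentioned in the TargetScan entry above can be illustrated with a minimal Python sketch. It assumes the common convention that the seed spans miRNA positions 2-8 and that a candidate site is the reverse complement of that seed in the target RNA. The function names and the toy UTR string are hypothetical, and the sketch does not check conservation across species, which the real tool does.

```python
# Watson-Crick complements for RNA bases
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """Return the target-site motif: reverse complement of the miRNA seed.

    Positions are 1-based and inclusive (assumed seed: nucleotides 2-8).
    """
    seed = mirna[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of seed-match sites in a UTR sequence."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


# hsa-let-7a-5p mature sequence; the UTR below is a made-up toy example
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAGCUACCUCAGGGACUACCUCUU"

site = seed_site(let7a)
hits = find_seed_matches(let7a, toy_utr)
print(site, hits)
```

A plain substring search is enough here because the motif is short and exact. Real predictors also score site context and conservation.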

    • Databases for microRNA Expression
      microRNA.org - predicted microRNA targets and target downregulation scores, plus experimentally observed expression patterns
      HMDD - the Human MicroRNA Disease Database (HMDD) contains experimentally supported miRNA-disease association data, manually curated from publications. The dysfunction evidence of miRNAs and the literature PubMed IDs are also given
      TransmiR - a web query-driven database integrating the experimentally supported transcription factor and miRNA regulatory relations

    • RNA Secondary Structure Prediction
      DIANA MicroTest - prediction of miRNA-mRNA interactions
      mfold - tools for predicting the secondary structure of RNA and DNA, mainly by using thermodynamic methods
      microInspector - a web tool for detection of miRNA binding sites in an RNA sequence
      miRNA Bioinfor - miRNA End Energy calculator, which takes a miRNA duplex and calculates the free energy for 5 base pairs at one end plus a dangling nucleotide
      miRRim - a method for detecting miRNA foldbacks based on a hidden Markov model (HMM)
      MXSCARNA - a multiple alignment tool for RNA sequences using progressive alignment based on the pairwise structural alignment algorithm of SCARNA. Good for large-scale analyses.
      RNAhybrid - a tool for finding the minimum free energy hybridisation of a long and a short RNA

    • MicroRNA Homologous Prediction
      miRNAminer - a web-based tool used for homologous miRNA gene search in several species
      miRviewer - a global view of homologous miRNA genes in many species
      RISCbinder - prediction of the guide strand of microRNAs
      Mireval - sequence evaluation of microRNA properties

    • MicroRNA Deep Sequencing
      miRanalyzer - a microRNA detection and analysis tool for next-generation sequencing experiments
      miRNAkey - a software pipeline for the analysis of microRNA deep sequencing data
      miRDeep - discovering known and novel miRNAs from deep sequencing data


  • Metabolic, Gene Regulatory & Signal Transduction Network Databases代谢、基因调控和信号转导网络数据库

  • Systems Biology系统生物学

  • Synthetic Biology合成生物学

    • DNA Tools

      • BBOCUS (BackTranslation Based On Codon Usage Strategy) by Ferro and Purrello lab. A re-implementation of the algorithm in Graziano Pesole's BACKTR. It is based on cluster analysis (complete-linkage algorithm), which requires a matrix D containing the distance between each pair of mRNA sequences.
      • Benchling by Benchling, Inc. Free online tools for vector editing, restriction analysis, primer search, multi-sequence alignment, and more.
      • Biopolymer calculator by Schepartz lab. Calculate extinction coefficients, Tm's, and base composition for your DNA or RNA; calculate amino acid composition and extinction coefficient for your protein.
      • Clipboard by Austin Che. Web tool for getting complement, reverse complement, translation and restriction enzyme analysis of a DNA sequence.
      • Cytostudio by Molecula Maxima. An integrated development environment and a compiler for a high-level bio-programming language for Synthetic Biology. Based on iGEM conventions.
      • DNAWorks by Hoover and Lubkowski. A web tool for optimizing melting temperature during gene synthesis.
      • Geneious by Biomatters. Comprehensive suite of tools for molecular biology.
      • Genome Compiler is a genetic engineering design tool, described by its makers as the industry's most user-friendly. It allows you to manipulate genetic information, from genes to plasmids to whole genomes. You can rapidly access extensive libraries of genetic parts and easily order your final design from a variety of providers.
      • GeneDesign by Boeke lab. Collection of online (and some command line) tools for codon optimization and shuffling, restriction site editing, and so on.
      • GeneDesigner by DNA2.0. Supports drag-and-drop combination of genetic building blocks, codon optimization, restriction site editing, sequence oligo design, etc.
      • GenoCAD is a design tool that uses collections or libraries of genetic parts and explicit design rules describing how these parts should be combined to engineer genetic constructs.
      • NEB Cutter by New England Biolabs, Inc. Tool for finding restriction sites, et cetera.
      • Synthetic Gene Designer by Gang Wu. A web platform that allows codon optimization to varying extents. Compatible with non-standard genetic codes.
      • Vector NTI by Informax, Inc. Free-to-academics tool for sequence analysis and data management.
      • j5, DeviceEditor, and VectorEditor online tools
        • j5: DNA assembly design automation for (combinatorial) flanking homology (e.g., SLIC/Gibson/CPEC/SLiCE/yeast) and type IIs-mediated (e.g., Golden Gate/FX cloning) assembly methods.
        • DeviceEditor: a visual DNA design canvas that serves as front-end for j5.
        • VectorEditor: a visual DNA editing and annotation tool.
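
The complete-linkage clustering mentioned in the BBOCUS entry above can be sketched in a few lines of Python. This is only an illustration of the general technique on a precomputed distance matrix, not BBOCUS's actual code. The function name, the stopping rule (a fixed number of clusters), and the toy matrix values are assumptions made for the example.

```python
def complete_linkage(dist, n_clusters):
    """Agglomerative clustering with complete linkage.

    dist is a square, symmetric matrix (list of lists) of pairwise distances.
    Clusters are merged until only n_clusters remain.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is the
                # largest distance between any pair of their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy distance matrix for four hypothetical mRNA sequences
D = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.95],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.95, 0.20, 0.00],
]

result = complete_linkage(D, 2)
print(result)
```

The naive pairwise search is slow for large inputs but keeps the merge rule easy to read. For real data, a library routine such as SciPy's hierarchical clustering would be the usual choice.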
    • RNA Tools

      • Appendix by Ambion, Inc. Website with many useful nucleic acid parameters.
      • mFold by Michael Zuker. Predicts RNA and DNA folds and calculates Tm's and free energies.
    • Protein Tools

      • Cn3D by NCBI. A helper application for your web browser that allows you to view 3-dimensional structures from NCBI's Entrez retrieval service. It doesn't read PDB files but can be more straightforward to use than DeepView.
      • DeepView by GlaxoSmithKline & Swiss Institute of Bioinformatics. Awesome program for viewing and studying protein structure.
      • ExPASy Proteomics server by the Swiss Institute of Bioinformatics. Collection of links to many pages to calculate parameters of your favorite proteins.
      • Modeller by Sali Lab. For homology or comparative modeling of protein three-dimensional structures.
    • CAD Tools

      • TinkerCell by Deepak Chandran. Construct computational models using biological parts, cells, and modules.
      • Metabolic Tinker by Kent McClymont and Orkun Soyer. Construct thermodynamically feasible metabolic paths among user-defined compounds.
    • General Tools

      • Colibri by Institut Pasteur. E. coli genome site; get sequences, see the position of your gene in the chromosome, see the function of your gene, and other fun stuff. You can also search for protein sequences/motifs within the E. coli genome.
      • JBEI Registry. A site where you can explore the various features of the JBEI Registry software, and even get some work done! An online repository for DNA parts, plasmids, microbial strains, and Arabidopsis seeds, with physical sample tracking capabilities.
      • PaR-PaR Laboratory Automation Platform allows researchers to use liquid-handling robots effectively, enabling experiments that would not have been considered previously. After minimal training, a biologist can independently write complicated protocols for a robot within an hour.


This page last modified: August 01, 2014