In this page:
Nucleotide Sequence Databases
NCBI - National Center for Biotechnology Information
EBI - European Bioinformatics Institute
DDBJ - DNA Data Bank of Japan
Protein Sequence Databases
SWISS-PROT & TrEMBL - Protein sequence
database and computer annotated supplement
UniProt - UniProt (Universal
Protein Resource) is the world's most
comprehensive catalog of information on
proteins. It is a central repository of
protein sequence and function created by
joining the information contained in
Swiss-Prot, TrEMBL, and PIR.
PIR - Protein Information Resource
- HUPO - HUman Proteome Organization
Database Searching by Sequence Similarity
- T-COFFEE - multiple sequence alignment
ClustalW2 @ EBI - multiple sequence alignment
- BOXSHADE - pretty printing and shading
of multiple alignments
Splign - Splign is a utility for
computing cDNA-to-Genomic, or spliced
sequence alignments. At the heart of the
program is a global alignment algorithm that
specifically accounts for introns and splice signals.
Spidey - an mRNA-to-genomic alignment
SIM4 - a program to align cDNA
and genomic DNA
- PipMaker - computes alignments of
similar regions in two (long) DNA sequences
VISTA - align + detect conserved regions
in long genomic sequences
Human Genome Databases
Next Generation Sequencing
Journal Bioinformatics for Next Generation
- Integrated solutions
CLCbio Genomics Workbench - de novo
and reference assembly of Sanger, Roche FLX,
Illumina, Helicos, and SOLiD data.
Commercial next-gen-seq software that
extends the CLCbio Main Workbench software.
Includes SNP detection, ChIP-seq, browser
and other features. Commercial. Windows, Mac
OS X and Linux.
Galaxy - interactive and reproducible
genomics. A web portal for running analysis jobs.
Genomatix - Integrated Solutions for
Next Generation Sequencing data analysis.
JMP Genomics - Next gen visualization
and statistics tool from SAS. They are
working with NCGR to refine this tool
and produce others.
NextGENe - de novo and reference
assembly of Illumina, SOLiD and Roche FLX
data. Uses a novel Condensation Assembly
Tool approach where reads are joined via
"anchors" into mini-contigs before assembly.
Includes SNP detection, ChIP-seq, browser
and other features. Commercial. Win or MacOS.
SeqMan Genome Analyser - Software for
Next Generation sequence assembly of
Illumina, Roche FLX and Sanger data
integrating with Lasergene Sequence Analysis
software for additional analysis and
visualization capabilities. Can use a hybrid
templated/de novo approach. Commercial. Win
or Mac OS X.
SHORE - SHORE, for Short Read, is a
mapping and analysis pipeline for short DNA
sequences produced on an Illumina Genome
Analyzer. A suite created by the 1001
Genomes project. Source for POSIX.
SlimSearch - Fledgling commercial
Align/Assemble to a reference
BFAST - Blat-like Fast Accurate Search
Tool. Written by Nils Homer, Stanley F.
Nelson and Barry Merriman at UCLA.
Bowtie - Ultrafast, memory-efficient
short read aligner. It aligns short DNA
sequences (reads) to the human genome at a
rate of 25 million reads per hour on a
typical workstation with 2 gigabytes of
memory. Uses a Burrows-Wheeler-Transformed (BWT) index.
Written
by Ben Langmead and Cole Trapnell. Linux,
Windows, and Mac OS X.
Exonerate - Various forms of pairwise
alignment (including Smith-Waterman-Gotoh)
of DNA/protein against a reference. Authors
are Guy St C Slater and Ewan Birney from
EMBL. C for POSIX.
GenomeMapper - GenomeMapper is a short
read mapping tool designed for accurate read
alignments. It quickly aligns millions of
reads either with ungapped or gapped
alignments. A tool created by the 1001
Genomes project. Source for POSIX.
GMAP - GMAP (Genomic Mapping and
Alignment Program) for mRNA and EST
Sequences. Developed by Thomas Wu and Colin
Watanabe at Genentech. C/Perl for Unix.
gnumap - The Genomic Next-generation
Universal MAPper (gnumap) is a program
designed to accurately map sequence data
obtained from next-generation sequencing
machines (specifically that of Solexa/Illumina)
back to a genome of any size. It seeks to
align reads from nonunique repeats using
statistics. From authors at Brigham Young
University. C source/Unix.
MAQ - Mapping and Assembly with
Qualities (renamed from MAPASS2).
Particularly designed for Illumina with
preliminary functions to handle ABI SOLiD
data. Written by Heng Li from the Sanger
Centre. Features extensive supporting tools
for DIP/SNP detection, etc. C++ source
MOSAIK - MOSAIK produces gapped
alignments using the Smith-Waterman
algorithm. Features a number of support
tools. Support for Roche FLX, Illumina,
SOLiD, and Helicos. Written by Michael
Strömberg at Boston College. Win/Linux/MacOSX
Novocraft - Tools for reference
alignment of paired-end and single-end
Illumina reads. Uses a Needleman-Wunsch
algorithm. Can support Bis-Seq. Commercial.
Available free for evaluation, educational
use and for use on open not-for-profit
projects. Requires Linux or Mac OS X.
PASS - It supports Illumina, SOLiD and
Roche-FLX data formats and allows the user
to modulate very finely the sensitivity of
the alignments. Spaced seed initial filter,
then NW dynamic algorithm to a SW(like)
local alignment. Authors are from CRIBI in
RMAP - Assembles 20 - 64 bp Illumina
reads to a FASTA reference genome. By Andrew
D. Smith and Zhenyu Xuan at CSHL. (published
in BMC Bioinformatics). POSIX OS required.
SHRiMP - Assembles to a reference
sequence. Developed with Applied Biosystems'
colourspace genomic representation in mind.
Authors are Michael Brudno and Stephen
Rumble at the University of Toronto. POSIX.
Slider - An application for the
Illumina Sequence Analyzer output that uses
the probability files instead of the
sequence files as an input for alignment to
a reference sequence or a set of reference
sequences. Authors are from BCGSC. Paper is
SOAP - SOAP (Short Oligonucleotide
Alignment Program). A program for efficient
gapped and ungapped alignment of short
oligonucleotides onto reference sequences.
The updated version uses a BWT. Can call
SNPs and INDELs. Author is Ruiqiang Li at
the Beijing Genomics Institute. C++, POSIX.
SSAHA2 - SSAHA (Sequence Search and
Alignment by Hashing Algorithm) is a tool
for rapidly finding near exact matches in
DNA or protein databases using a hash table.
Developed at the Sanger Centre by Zemin
Ning, Anthony Cox and James Mullikin. C++
SOCS - Aligns SOLiD data. SOCS is built
on an iterative variation of the Rabin-Karp
string search algorithm, which uses hashing
to reduce the set of possible matches,
drastically increasing search speed. Authors
are Ondov B, Varadarajan A, Passalacqua KD
and Bergman NH.
SWIFT - The SWIFT suite is a software
collection for fast index-based sequence
comparison. It contains: SWIFT — fast local
alignment search, guaranteeing to find
epsilon-matches between two sequences. SWIFT
BALSAM — a very fast program to find
semiglobal non-gapped alignments based on
k-mer seeds. Authors are Kim Rasmussen
(SWIFT) and Wolfgang Gerlach (SWIFT BALSAM)
Vmatch - A versatile software tool for
efficiently solving large scale sequence
matching tasks. Vmatch subsumes the software
tool REPuter, but is much more general, with
a very flexible user interface, and improved
space and time requirements. Essentially a
large string matching toolbox. POSIX.
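The SOCS entry above refers to the Rabin-Karp string search, which uses hashing to narrow down the set of possible matches. The following is a minimal Python sketch of the plain, single-pattern Rabin-Karp idea only; it is not SOCS's iterative variant and not its code. The function name, the base, the modulus and the DNA-only alphabet are assumptions chosen for illustration.

```python
def rabin_karp(text, pattern, base=4, mod=1_000_003):
    """Return start positions where pattern occurs exactly in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    code = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed DNA alphabet
    high = pow(base, m - 1, mod)  # weight of the leading character

    def full_hash(s):
        value = 0
        for ch in s:
            value = (value * base + code[ch]) % mod
        return value

    target = full_hash(pattern)
    window = full_hash(text[:m])
    hits = []
    for i in range(n - m + 1):
        # Hash equality is only a filter; confirm with a direct comparison.
        if window == target and text[i:i + m] == pattern:
            hits.append(i)
        if i + m < n:
            # Roll the hash: drop text[i], append text[i + m].
            window = ((window - code[text[i]] * high) * base
                      + code[text[i + m]]) % mod
    return hits


print(rabin_karp("ACGTACGTGACG", "ACG"))
```

The rolling update keeps each window hash at constant cost, so most non-matching positions are rejected without comparing characters.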
De novo Align/Assemble
ABySS - Assembly By Short Sequences.
ABySS is a de novo sequence assembler that
is designed for very short reads. The
single-processor version is useful for
assembling genomes up to 40-50 Mbases in
size. The parallel version is implemented
using MPI and is capable of assembling
larger genomes. By Simpson JT and others at
Canada's Michael Smith Genome Sciences
Centre. C++ as source.
ALLPATHS - ALLPATHS: De novo assembly of
whole-genome shotgun microreads. ALLPATHS is
a whole genome shotgun assembler that can
generate high quality assemblies from short
reads. Assemblies are presented in a graph
form that retains ambiguities, such as those
arising from polymorphism, thereby providing
information that has been absent from
previous genome assemblies. Broad Institute.
Edena - Edena (Exact DE Novo Assembler)
is an assembler dedicated to process the
millions of very short reads produced by the
Illumina Genome Analyzer. Edena is based on
the traditional overlap layout paradigm. By
D. Hernandez, P. François, L. Farinelli, M.
Osteras, and J. Schrenzel. Linux/Win.
SEQAN - A Consistency-based Consensus
Algorithm for De Novo and Reference-guided
Sequence Assembly of Short Reads. By Tobias
Rausch and others. C++, Linux/Win.
SHARCGS - De novo assembly of short
reads. Authors are Dohm JC, Lottaz C,
Borodina T and Himmelbauer H. from the
Max-Planck-Institute for Molecular Genetics.
SSAKE - The Short Sequence Assembly by
K-mer search and 3' read Extension (SSAKE)
is a genomics application for aggressively
assembling millions of short nucleotide
sequences by progressively searching for
perfect 3'-most k-mers using a DNA prefix
tree. Authors are René Warren, Granger
Sutton, Steven Jones and Robert Holt from
Canada's Michael Smith Genome Sciences
SOAPdenovo - Part of the SOAP suite. See
VCAKE - De novo assembly of short reads
with robust error correction. An improvement
on early versions of SSAKE.
Velvet - Velvet is a de novo genomic
assembler specially designed for short read
sequencing technologies, such as Solexa or
454. Need about 20-25X coverage and paired
reads. Developed by Daniel Zerbino and Ewan
Birney at the European Bioinformatics Institute.
ssahaSNP - ssahaSNP is a polymorphism
detection tool. It detects homozygous SNPs
and indels by aligning shotgun reads to the
finished genome sequence. Highly repetitive
elements are filtered out by ignoring those
kmer words with high occurrence numbers.
More tuned for ABI Sanger reads. Developers
are Adam Spargo and Zemin Ning from the
Sanger Centre. Compaq Alpha, Linux-64,
Linux-32, Solaris and Mac
PyroBayes - PyroBayes is a novel base
caller for pyrosequences from the 454 Life
Sciences sequencing machines. It was
designed to assign more accurate base
quality estimates to the 454 pyrosequences.
Developers at Boston College.
Browser/Alignment Viewer/Assembly Database
EagleView - An information-rich genome
assembler viewer. EagleView can display a
dozen different types of information
including base quality and flowgram signal.
Developers at Boston College.
LookSeq - LookSeq is a web-based
application for alignment visualization,
browsing and analysis of genome sequence
data. LookSeq supports multiple sequencing
technologies, alignment sources, and viewing
modes; low or high-depth read pileups; and
easy visualization of putative single
nucleotide and structural variation. From
the Sanger Centre.
MapView - MapView: visualization of
short reads alignment on desktop computer.
From the Evolutionary Genomics Lab at
Sun Yat-sen University, China. Linux.
SAM - Sequence Assembly Manager. Whole
Genome Assembly (WGA) Management and
Visualization Tool. It provides a generic
platform for manipulating, analyzing and
viewing WGA data, regardless of input type.
Developers are Rene Warren, Yaron
Butterfield, Asim Siddiqui and Steven Jones
at Canada's Michael Smith Genome Sciences
Centre. MySQL backend and Perl-CGI web-based
STADEN - Includes GAP4. GAP5 once
completed will handle next-gen sequencing
data. A partially implemented test version
XMatchView - A visual tool for analyzing
cross_match alignments. Developed by Rene
Warren and Steven Jones at Canada's Michael
Smith Genome Sciences Centre. Python/Win or
Counting e.g. ChIP-Seq, Bis-Seq, CNV-Seq
BS-Seq - The source code and data for
the "Shotgun Bisulphite Sequencing of the
Arabidopsis Genome Reveals DNA Methylation
Patterning" Nature paper by
Cokus et al. (Steve Jacobsen's lab at
CHiPSeq - Program used by Johnson et al.
(2007) in their Science publication
CNV-Seq - CNV-seq, a new method to
detect copy number variation using
high-throughput sequencing. Chao Xie and
Martti T Tammi at the National University of
FindPeaks - perform analysis of ChIP-Seq
experiments. It uses a naive algorithm for
identifying regions of high coverage, which
represent Chromatin Immunoprecipitation
enrichment of sequence fragments, indicating
the location of a bound protein of interest.
Original algorithm by Matthew Bainbridge, in
collaboration with Gordon Robertson. Current
code and implementation by Anthony Fejes.
Authors are from Canada's Michael Smith
Genome Sciences Centre. Java/OS independent.
Latest versions available as part of the
Vancouver Short Read Analysis Package
MACS - Model-based Analysis for
ChIP-Seq. MACS empirically models the length
of the sequenced ChIP fragments, which tends
to be shorter than sonication or library
construction size estimates, and uses it to
improve the spatial resolution of predicted
binding sites. MACS also uses a dynamic
Poisson distribution to effectively capture
local biases in the genome sequence,
allowing for more sensitive and robust
prediction. Written by Yong Zhang and Tao
Liu from Xiaole Shirley Liu's Lab.
PeakSeq - PeakSeq: Systematic Scoring of
ChIP-Seq Experiments Relative to Controls. A
two-pass approach for scoring ChIP-Seq data
relative to controls. The first pass
identifies putative binding sites and
compensates for variation in the mappability
of sequences across the genome. The second
pass filters out sites that are not
significantly enriched compared to the
normalized input DNA and computes a precise
enrichment and significance. By Rozowsky J
et al. C/Perl.
QuEST - Quantitative Enrichment of
Sequence Tags. Sidow and Myers Labs at
Stanford. From the 2008 publication
Genome-wide analysis of transcription factor
binding sites based on ChIP-Seq data.
SISSRs - Site Identification from Short
Sequence Reads. BED file input. Raja Jothi @
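The FindPeaks entry above describes a naive approach of identifying regions of high read coverage. A minimal Python sketch of that general idea, not the FindPeaks implementation, is shown here. It assumes reads are given as half-open (start, end) intervals on a single sequence of known length; the function and parameter names are hypothetical.

```python
def coverage_peaks(reads, length, min_depth):
    """Return half-open (start, end) runs where read depth >= min_depth."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (length + 1)
    for start, end in reads:
        delta[start] += 1
        delta[end] -= 1

    peaks = []
    depth = 0
    run_start = None
    for pos in range(length):
        depth += delta[pos]  # running sum gives the depth at this position
        if depth >= min_depth and run_start is None:
            run_start = pos
        elif depth < min_depth and run_start is not None:
            peaks.append((run_start, pos))
            run_start = None
    if run_start is not None:
        peaks.append((run_start, length))  # run extends to the sequence end
    return peaks


reads = [(0, 10), (5, 15), (8, 20), (30, 40)]
print(coverage_peaks(reads, 50, 2))
```

A fixed depth threshold is the simplest possible criterion; tools such as MACS and PeakSeq, listed above, instead model background and controls statistically.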
Alternate Base Calling
Rolexa - R-based framework for base
calling of Solexa data. Project
Alta-cyclic - "a novel Illumina
Genome-Analyzer (Solexa) base caller"
ERANGE - Mapping and Quantifying
Mammalian Transcriptomes by RNA-Seq.
Supports Bowtie, BLAT and ELAND. From the
G-Mo.R-Se - G-Mo.R-Se is a method aimed
at using RNA-Seq short reads to build de
novo gene models. First, candidate exons are
built directly from the positions of the
reads mapped on the genome (without any ab
initio assembly of the reads), and all the
possible splice junctions between those
exons are tested against unmapped reads.
From CNS in France.
MapNext - MapNext: A software tool for
spliced and unspliced alignments and SNP
detection of short sequence reads. From the
Evolutionary Genomics Lab at Sun Yat-sen
QPalma - Optimal Spliced Alignments of
Short Sequence Reads. Authors are Fabio De
Bona, Stephan Ossowski, Korbinian
Schneeberger, and Gunnar Rätsch. A paper is
TopHat - TopHat is a fast splice
junction mapper for RNA-Seq reads. It aligns
RNA-Seq reads to mammalian-sized genomes
using the ultra high-throughput short read
aligner Bowtie, and then analyzes the
mapping results to identify splice junctions
between exons. TopHat is a collaborative
effort between the University of Maryland
and the University of California, Berkeley
- Rice databases
BGI-RISe: Beijing Genomics Institute Rice Information System
RGP: Rice Genome Research Program
RiceNetDB: A regulatory database on Oryza sativa based on genome annotation, described as currently the most comprehensive of its kind. It is organized into three levels (GEM, PPIs and GRNs) to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
MOsDB: MIPS Oryza sativa database
Oryzabase: Rice genetics and genomics
Oryza Tag Line database: T-DNA insertion mutants of rice
RRIN: Oryza sativa protein-protein interactions network
- Plant Promoter and Regulatory Element Resources
AGRIS Currently contains two databases, AtcisDB (Arabidopsis thaliana cis-regulatory database) and AtTFDB (Arabidopsis thaliana transcription factor database).
AthaMap A genome-wide map of putative transcription factor binding sites in Arabidopsis thaliana.
AtProbe The Arabidopsis thaliana promoter binding element database, an aid to find binding elements and check data against the primary literature.
DATF: Database of Arabidopsis Transcription Factors (DATF) contains known and predicted Arabidopsis transcription factors with sequences and many other features including 3D structure templates, EST expression information, transcription factor binding sites and Nuclear Location Signals.
DoOP: Databases of Orthologous Promoters, a database containing orthologous clusters of promoters from Homo sapiens, Arabidopsis thaliana and other organisms.
GRASSIUS A public web resource composed of a collection of databases, computational and experimental resources that relate to the control of gene expression in the grasses, and their relationship with agronomic traits. GRASSIUS currently contains regulatory information on maize, rice, sorghum and sugarcane.
PlantCare Database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.
PlantProm DB Database with annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start sites (TSS) from various plant species.
PlantTFDB: Plant Transcription Factor Database, an integrative plant transcription factor database that provides a web interface to access large (close to complete) sets of transcription factors of several plant species, currently encompassing Arabidopsis thaliana (thale cress), Populus trichocarpa (poplar), Oryza sativa (rice), Chlamydomonas reinhardtii and Ostreococcus tauri.
PPDB (Plant Promoter DB) Database that provides transcription start sites (TSS) and other structural information for Arabidopsis and rice promoters.
Transfac Database on eukaryotic transcription factors, their genomic binding sites and DNA-binding profiles. Commercial site.
Databases of other Organisms
MBGD - comparative analysis of
completely sequenced microbial genomes
COGs - phylogenetic classification of
orthologous proteins from complete genomes
STRING - detect whether a given query
gene occurs repeatedly with certain other
genes in potential operons
Pedant - automatic whole genome
GeneCensus - various whole genome
Protein Domains: Databases and Search Tools
InterPro - integration of Pfam, PRINTS,
PROSITE, SWISS-PROT + TrEMBL
PROSITE - database of protein families
Pfam - alignments and hidden Markov
models covering many common protein domains
SMART - analysis of domains in proteins
ProDom - protein domain database
PRINTS Database - groups of conserved
motifs used to characterise protein families
Blocks - multiply aligned ungapped
segments corresponding to the most highly
conserved regions of proteins
TIGRFAMs - yet more protein families
based on Hidden Markov Models
Motif and Pattern Search in Sequences
Gibbs Motif Sampler - identification of
conserved motifs in DNA or protein sequences
AlignACE Homepage - gene regulatory
MEME - motif discovery and search in
protein and DNA sequences
SAM - tools for creating and using
Hidden Markov Models
Pratt - discover patterns in unaligned
Motivated Proteins - a web facility for
exploring small hydrogen-bonded motifs
Protein 3D Structure
PDB - protein 3D structure database
RasMol / Protein Explorer - molecule 3D
- UCL BSM CATH classification
- FSSP - fold classification based on
structure-structure alignment of proteins
SWISS-MODEL - homology modeling server
Structure Prediction Meta-server
K2 - protein structure alignment
DALI - 3D structure alignment server
DSSP - defines secondary structure and
solvent exposure from 3D coordinates
HSSP Database - Homology-derived
Secondary Structure of Proteins
PredictProtein & PHD - predict secondary
structure, solvent accessibility,
transmembrane helices, and other features
Jpred3 - protein secondary structure
PSIpred (& MEMSAT & GenTHREADER) -
protein secondary structure prediction (&
transmembrane helix prediction & tertiary
structure prediction by threading)
- Structural analysis
Pfam: A collection of protein families
SMART: Simple Modular Architecture Research Tool
SCOP: Structural Classification of Proteins
MEME: Motif-based sequence analysis tools
Jpred3: A Secondary Structure Prediction Server
PSIPRED: A highly accurate method for protein secondary structure prediction
DiANNA: Cysteine state and Disulfide Bond partner prediction
Robetta: Full-chain protein structure prediction server
PhyML: Phylogenetic Tree by Maximum Likelihood
Molecular Graphics Software Links
Open Source viewer that includes features for morphing proteins and visualization of lipophilic and electrostatic potentials.
A protein visualization and modeling program
Create beautiful publication quality images and movies. Users can superpose and analyse structures as well. The program runs 'out of the box' on Linux, MacOSX and Windows platforms.
Interactive molecular modeling system, free to academic/non-profit; displays multiple sequence alignments and associated structures, atom-type and H-bond identification, molecular dynamics trajectories (AMBER format), and offers ligand-screening interface (DOCK), filter by number/position of H-bonds, and extensibility to create custom modules - for Windows, Linux, Mac OS X, IRIX, and Tru64 Unix
Simultaneously displays structure, sequence, and alignment, with annotation and alignment editing features, for use with 3-D structures from NCBI's Entrez; available for Windows, Macintosh, and Unix
A program for building, displaying and manipulating all kinds of crystal and molecular structures.
Embedded Python Molecular Viewer (ePMV) is an open-source plug-in that runs molecular modeling software directly inside of professional 3D animation applications
Foldit is a crowdsourcing computer game based on protein modeling.
Open GL graphics program displays small, large, and multiple molecules; measures distances and angles, superimposes structures, calculates RMSD between atom coordinates, structurally aligns chains, and displays dynamics trajectories. For Mac OS X incl. 10.2
Jmol is a free, open source molecule viewer for students, educators, and researchers in chemistry and biochemistry. It is cross-platform, running on Windows, Mac OS X, and Linux/Unix systems.
- Mage and Kinemages
Interactive molecular display for research and educational uses. Free, open source for Windows and Mac (OSX or PPC), Unix, and Linux. A Java version does 3-D Web display without plug-ins.
Marvin is a collection of tools for drawing, displaying and characterizing chemical structures, queries, macromolecules and reactions for all operating systems, web pages and custom applications.
Interactively generate heterogeneous PDB-based membranes with varying lipid compositions and semi-automatic protein placement. Supports membrane patches and vesicles, microdomains as well as stacking of monolayer and/or bilayer membranes.
- Molecule World
Molecule World 2.1 is an iPad application for viewing and manipulating 3D chemical and molecular structures. Structures can be downloaded and displayed from the PubChem, PDB, and NCBI structure databases together with the sequences for proteins and nucleic acids. Structures can be drawn as tubes, ball and stick, or space filling modes. Coloring options include residue, charge, hydrophobicity, rainbow, and molecule. Parts of structures can be hidden or displayed with mixed coloring and drawing modes.
- Molecule World for iPhone
Molecule World for iPhone can be used on the iPhone or iPod touch to display and manipulate 3D chemical and molecular structures from the PubChem, PDB, or NCBI structure databases. Drawing options include ball and stick, space fill, and ball and stick modes. Coloring options include rainbow, residue, charge, hydrophobicity, and molecule. Proteins, nucleic acids, and heterogens can be displayed in different modes.
- Molecule World DNA Binding Lab
A classroom ready iPad application for exploring the ways chemicals and proteins bind to DNA. The DNA Binding Lab uses Molecule World's rendering engine and display features to highlight different molecules and understand how they interact. The DNA Binding Lab includes instructions, three examples, and 40 unknowns that can be assigned to students. Photo sharing capabilities allow students to share their work with teachers to aid with assessment.
An iPhone application for PDB structures
A program for displaying structures in both detailed and schematic formats and writing images in various formats for Unix
Free, interactive visualization tutorials
Molecular Visualization Program and GUI of ZMM. MVM is a free molecular viewer that can be used to display proteins, nucleic acids, oligosaccharides, small molecules and macromolecules. It has an intuitive interface. In addition to being a molecular viewer, it is the user interface of a very powerful molecular mechanics engine (ZMM).
- PMV (Python Molecular Viewer)
An interactive molecular visualization and modeling environment for manipulation and viewing of multiple molecules.
Program to view and manipulate PDB files on a PocketPC
Protein structure annotation using sequence profiles
Versatile annotation and high quality visualization of macromolecular structures
Analysis and visualization of macromolecular motions
Free viewer to display and manipulate PDB files and create animations and slides of proteins for Windows. Online ordering of protein 3D prints in several color schemes.
Mapping protein sequence annotations onto a protein structure and visualizing them simultaneously with the structure.
A free and open-source molecular graphics system for visualization, animation, editing, and publication-quality imagery. PyMOL is scriptable and can be extended using the Python language. Supports Windows, Mac OSX, Unix, and Linux
An open source (GPL), interactive, high quality molecular visualization system. QuteMol exploits current GPU capabilities through OpenGL shaders to offer an array of innovative visual effects.
A free viewing system for PDB coordinate files that runs on Mac (PPC), Windows, Unix, and Linux systems. Open source versions are also available.
A set of tools for generating high quality raster images of proteins or other molecules. Freeware for Mac OSX, Windows, Unix, and Linux
- RasTop (v. 2.0)
A free user-friendly graphical interface to RasMol molecular visualization software (v. 18.104.22.168), available for Windows and Linux
A program for molecular illustration and error analysis, for Mac OSX, Windows, Unix, and Linux
- RCSB MBT Viewers
The MBT toolkit is a framework for creating various viewers. It is used for 4 different viewers on the RCSB PDB web site.
A Tcl/Tk script that redirects PDB files or RasMol scripts to multiple RasMol sessions; can be used as a Web browser helper application or as a standalone program for Mac (OSX or PPC), Windows, or Unix
- Schrödinger Product Suites
Schrödinger's full product offerings range from general molecular modeling programs to a comprehensive suite of drug design software, as well as a state-of-the-art suite for materials research. All products are run with Maestro, a unified interface for all Schrödinger software, which is available for Mac, Windows, and Linux.
The Structural Proteomics Application Development Environment (SPADE) provides community tools for development and deployment of essential structure and sequence equipment. Includes a chemical probing suite to support experimental verification of predicted structural models. Written in Python with scripting tools available. Runs on Windows, Linux and Mac.
Align proteins by sequence and 3D structure.
- Swiss PDB viewer
A 3D graphics and molecular modeling program for the simultaneous analysis of multiple models and for model-building into electron density maps. The software is available for Mac (OSX or PPC), Windows, Linux, or SGI
A free and open-source tool with PDB format visualization support written in fast memory efficient C++ code. Supports Windows, Mac OSX, Unix, and Linux.
VMD (Visual Molecular Dynamics) runs on many platforms including MacOS X, and several versions of Unix and Windows. VMD provides visualization, analysis, and Tcl/Python scripting features, and has recently added sequence browsing and volumetric rendering features. VMD is distributed free of charge.
A complete molecular graphics and modeling program, including interactive molecular dynamics simulations, structure determination, analysis and prediction, docking, movies and eLearning for Windows, Linux and MacOSX.
A molecular visualization tool that supports PDB, MOL, MOL2/SYBYL and XYZ file formats. The rendering engine can output high quality molecular graphics. Zeus provides a sequence search that can highlight within the molecular structure. Ramachandran plots of internal dihedral angles can be generated and exported. PDB files can be automatically downloaded from the RCSB PDB.
Phylogeny & Taxonomy
Gene Expression Databases
TRAFAC - For identifying
conserved and shared cis regulatory elements
between a pair of genes.
CisMols - For identifying
conserved and shared cis regulatory elements
between a set of co-expressed genes.
EPD - eukaryotic promoter database
DBTSS - DataBase of Transcriptional
Start Sites (human)
SCPD - Saccharomyces cerevisiae promoter
DCPD - Drosophila Core Promoter Database
RegulonDB - a database on
transcriptional regulation in E. coli
DPInteract - protein binding sites on E.
PromoterInspector - prediction of
promoter regions in mammalian genomic
MatInspector - search for transcription
factor binding sites
Cister - cis-element cluster finder
Gene regulatory Tools
- Small RNA/MicroRNA
microRNA.org: microRNA Targets & Expression
- PmirKB - Plant microRNA Knowledge Base
- MicroRNA Target Prediction
miRanda — miRNA target prediction for human, drosophila and zebrafish genomes
miRBase — a comprehensive repository for miRNAs and their predicted targets
miRDB — an online database for miRNA target prediction and functional annotations in animals
miRNAMap — genomic maps of microRNA genes and their target genes in mammalian genomes
miR2Disease — a database providing a comprehensive resource of miRNA deregulation in various human diseases
TarBase — a comprehensive database of experimentally supported animal microRNA targets
PicTar — microRNA targets for vertebrates, fly and nematodes
TargetScan — a search for the presence of conserved sites that match the seed of each miRNA
Target Gene Prediction at EMBL — miRNA-Target predictions for Drosophila miRNAs
- Databases for microRNA Expression
microRNA.org— predicted microRNA targets & target downregulation scores. Experimentally observed expression patterns
HMDD — Human MicroRNA Disease Database (HMDD) is a database that contains experimentally supported miRNA-disease association data, manually curated from publications. The dysfunction evidence of miRNAs and the literature PubMed IDs are also given
TransmiR — a web query-driven database integrating the experimentally supported transcription factor and miRNA regulatory relations
- RNA Secondary Structure Prediction
DIANA MicroTest— a prediction of miRNA-mRNA interaction
mfold — tools for predicting the secondary structure of RNA and DNA, mainly by using thermodynamic methods
microInspector —a web tool for detection of miRNA binding sites in an RNA sequence
miRNA Bioinfor —miRNA End Energy calculator which takes miRNA duplex to calculate free energy for 5 base pairs at one end plus a dangling nucleotide
miRRim— a method for detecting miRNA foldbacks based on hidden Markov model (HMM)
MXSCARNA— a multiple alignment tool for RNA sequences using progressive alignment based on pairwise structural alignment algorithm of SCARNA. Good for large scale analyses.
RNAhybrid— a tool for finding the minimum free energy hybridisation of a long and a short RNA
- MicroRNA Homologous Prediction
miRNAminer — a web-based tool used for homologous miRNA gene search in several species
miRviewer —a global view of homologous miRNA genes in many species
RISCbinder— prediction of guide strand of microRNAs
Mireval — Sequence evaluation of microRNA properties
- MicroRNA Deep Sequencing
miRanalyzer— A microRNA detection and analysis tool for next-generation sequencing experiments
miRNAkey— A software pipeline for the analysis of microRNA Deep Sequencing data
miRDeep— Discovering known and novel miRNAs from deep sequencing data
Metabolic, Gene Regulatory & Signal Transduction Network Databases
- BBOCUS (BackTranslation Based On Codon Usage Strategy) by Ferro and Purrello lab, a re-implementation of the algorithm in Graziano Pesole's BACKTR. It is based on cluster analysis (complete-linkage algorithm), which requires a matrix D containing the distance between each pair of mRNA sequences.
- Benchling by Benchling, Inc. Free online tools for vector editing, restriction analysis, primer search, multi-sequence alignment, and more.
- Biopolymer calculator by Schepartz lab. Calculate extinction coefficients, Tm's, and base composition for your DNA or RNA; calculate amino acid composition and extinction coefficient for your protein.
- Clipboard by Austin Che. Web tool for getting complement, reverse complement, translation and restriction enzyme analysis of a DNA sequence.
- Cytostudio by Molecula Maxima. An integrated development environment and a compiler for a high-level bio-programming language for Synthetic Biology. Based on iGEM conventions.
- DNAWorks by Hoover and Lubkowski. A web tool for optimizing melting temperature during gene synthesis.
- Geneious by Biomatters. Comprehensive suite of tools for molecular biology.
- Genome Compiler. A genetic engineering design tool for manipulating genetic information, from genes to plasmids to whole genomes. It gives access to extensive libraries of genetic parts and lets you order your final design from a variety of providers.
- GeneDesign by Boeke lab. Collection of online (and some command line) tools for codon optimization and shuffling, restriction site editing, and so on.
- GeneDesigner by DNA2.0. Combine genetic building blocks by drag-and-drop; also supports codon optimization, restriction site editing, sequence oligo design, etc.
- GenoCAD is a design tool that uses collections or libraries of genetic parts and explicit design rules describing how these parts should be combined to engineer genetic constructs.
- NEB Cutter by New England Biolabs, Inc. Tool for finding restriction sites, et cetera.
- Synthetic Gene Designer by Gang Wu. A web platform that allows codon optimization to varying extents. Compatible with non-standard genetic codes.
- Vector NTI by Informax, Inc. Free-to-academics tool for sequence analysis and data management.
- j5, DeviceEditor, and VectorEditor online tools
- j5: DNA assembly design automation for (combinatorial) flanking homology (e.g., SLIC/Gibson/CPEC/SLiCE/yeast) and type IIs-mediated (e.g., Golden Gate/FX cloning) assembly methods.
- DeviceEditor: a visual DNA design canvas that serves as front-end for j5.
- VectorEditor: a visual DNA editing and annotation tool.
- Appendix by Ambion, Inc. Website with many useful nucleic acid parameters.
- mFold by Michael Zuker. For predicting RNA and DNA folds and calculating Tm's and free energies.
- Cn3D by NCBI. A helper application for your web browser that allows you to view 3-dimensional structures from NCBI's Entrez retrieval service. It doesn't read PDB files but can be more straightforward to use than DeepView.
- DeepView by GlaxoSmithKline & Swiss Institute of Bioinformatics. Awesome program for viewing and studying protein structure.
- ExPASy Proteomics server by the Swiss Institute of Bioinformatics. Collection of links to many pages to calculate parameters of your favorite proteins.
- Modeller by Sali Lab. For homology or comparative modeling of protein three-dimensional structures.
- TinkerCell by Deepak Chandran. Construct computational models using biological parts, cells, and modules.
- Metabolic Tinker by Kent McClymont and Orkun Soyer. Construct thermodynamically feasible metabolic paths among user-defined compounds.
- Colibri by Institut Pasteur. E. coli genome site; get sequences, see the position of your gene in the chromosome, see the function of your gene, and other fun stuff. You can also search for protein sequences/motifs within the E. coli genome.
- JBEI Registry. A site where you can explore the various features of the JBEI Registry software, and even get some work done! A DNA part, plasmid, microbial strain, and Arabidopsis Seed online repository with physical sample tracking capabilities.
- PaR-PaR Laboratory Automation Platform allows researchers to use liquid-handling robots effectively, enabling experiments that would not have been considered previously. After minimal training, a biologist can independently write complicated protocols for a robot within an hour.
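Several of the sequence tools listed above (for example the Biopolymer calculator and Clipboard) report reverse complements, base composition, and melting temperatures. The following is a minimal sketch of such calculations, assuming the simple Wallace rule (Tm = 2(A+T) + 4(G+C), a rough estimate intended for short oligos) rather than whatever method those tools actually use. The function names are hypothetical.

```python
from collections import Counter

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def base_composition(dna):
    """Count each of the four bases in the sequence."""
    counts = Counter(dna.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def wallace_tm(dna):
    """Estimate Tm in degrees Celsius with the Wallace rule (an assumption
    here; only a rough guide, and only for short oligos)."""
    comp = base_composition(dna)
    return 2 * (comp["A"] + comp["T"]) + 4 * (comp["G"] + comp["C"])


oligo = "ATGGCC"  # example input for illustration only
print(reverse_complement(oligo))
print(base_composition(oligo))
print(wallace_tm(oligo))
```

For longer sequences, nearest-neighbor thermodynamic methods are generally preferred over this rule of thumb.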
Other Databases (Annotations, Ontologies,
Human Diseases and Cancer Databases
- OMIM - OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily.
- MalaCards - an integrated database of human maladies and their annotations, modeled on the architecture and richness of the popular GeneCards database of human genes.
- TCGA (The Cancer Genome Atlas) - a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer.
- CCLE (Cancer Cell Line Encyclopedia) - a collaboration between the Broad Institute and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns, and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis and visualization for about 1000 cell lines.
- Cancer Genome Anatomy Project
- The Cancer Genome Atlas
- Cancer Genetics With an Edge
- Cancer Genomes and their Implications for Curing Cancer by Bert Vogelstein
- Cancer Risk Prediction Models and Assessment
- NIH Database of 100,000 Chest X-Ray images, associated data, and diagnoses
- DICOM® (Digital Imaging and Communications in Medicine): the international standard for transmitting, storing, retrieving, printing, processing, and displaying medical imaging information.
- The Cancer Imaging Archive (TCIA), formerly the National Biomedical Imaging Archive (NBIA): Lung Image Database Consortium (LIDC), Reference Image Database to Evaluate Response (RIDER), Breast MRI, Lung PET/CT, Neuro MRI, CT Colonography, Virtual Colonoscopy, Osteoarthritis Initiative (MIA), PET/CT phantom scan collection
- OASIS: Cross-sectional MRI Data in Young, Middle Aged, Nondemented and Demented Older Adults; Longitudinal MRI Data in Nondemented and Demented Older Adults
- ADNI: Alzheimer’s Disease Neuroimaging Initiative (ADNI) unites researchers with study data as they work to define the progression of Alzheimer’s disease. ADNI researchers collect, validate and utilize data such as MRI and PET images, genetics, cognitive tests, CSF and blood biomarkers as predictors for the disease.
- FITBIR: The Federal Interagency Traumatic Brain Injury Research (FITBIR) informatics system: MRI, PET, Contrast, and other data on a range of TBI conditions
- STARE: STructured Analysis of the Retina: This research concerns a system to automatically diagnose diseases of the human eye.
- Cancer Digital Slide Archive: Whole-slide images from The Cancer Genome Atlas's (TCGA) glioblastoma multiforme (GBM) samples
- The Cancer Imaging Archive: The image data in The Cancer Imaging Archive (TCIA) is organized into purpose-built collections of subjects. The subjects typically have a cancer type and/or anatomical site (lung, brain, etc.) in common.
- Johns Hopkins Medical Institute: DTI Atlases: adults, children, ...
- Duke Center for In Vivo Microscopy: Small animal MRI, CT, ...
- UCI Machine Learning Repository: The father of internet data archives for all forms of machine learning.
- Computer Vision Online Image Archive: Large listing of multiple databases in computer vision and biomedical imaging
- Cornell Visualization and Image Analysis (VIA) group: Provides a list of available databases, many of which are also listed here.
- UT Health Science Center Image Collections: List of medical images, atlases, and databases available on the web.
- OmniMedicalSearch.com: Medical Image Databases & Libraries
- Digital Database for Screening Mammography (DDSM): Large collection with normal and abnormal findings and ground truth.
- Digital Retinal Images for Vessel Extraction (DRIVE): Digital images and expert segmentations of retinal vessels.
- Japanese Society of Radiological Technology (JSRT) Database: Digital Chest X-ray images with lung nodule locations, ground truth, and controls.
- Segmentation in Chest Radiographs (SCR) database: Digital Chest X-ray images with segmentations of lung fields, heart, and clavicles.
- Public Lung Database to Address Drug Response: Well documented chest CT images.
- Mammographic Image Analysis Society (mini-MIAS) Database: Mammographic images and markup.
- Standard Diabetic Retinopathy Database (DIARETDB1): Digital retinal images for detecting and quantifying diabetic retinopathy.
- SpineWeb: SpineWeb is an online collaborative platform for everyone interested in research on spinal imaging and image analysis.
- OAinitiative - MR data of hips, knees and other sites affected by OA
- MR pediatric repository - MR brain
- Facebase - 3D photography
- Synapse - NIH-funded datasets. For example, from UNC see https://www.synapse.org/#!Synapse:syn4152456
Bioinformatics on-line course materials and tutorials (not an exhaustive collection)
Intro to bioinformatics and computational
Web Sites for Background Information & News
Bioinformatics Conferences/Meetings
Other Collections of Bioinformatics Resources
Nucleic Acids Research (NAR) database category list
Nucleic Acids Research (NAR) web server category list
OBRC: Online Bioinformatics Resources
Bioinformatics, Databases and Software for Medicine: Covers recent literature, tutorials, links, bioinformatics database, jobs, and news, updated daily
www.liebertonline.com: more than 50,000 papers from more than sixty authoritative journals, all full-text searchable and linked to external bibliographic databases.
bioinformatics.oxfordjournals.org: new developments in genome bioinformatics and computational biology
ExPASy Proteomics tools
- Computational Tools (Nature Reviews Genetics)
- BioThink: Bioinformatics Services, Scientific Editing, Synthetic Biology
Bioinformatics Journals