Fast, scalable generation of high‐quality protein multiple sequence alignments using Clustal Omega
Multiple sequence alignments are fundamental to many sequence analysis methods. Most
alignments are computed using the progressive alignment heuristic. These methods are …
InterProScan 5: genome-scale protein function classification
P Jones, D Binns, HY Chang, M Fraser, W Li… - …, 2014 - academic.oup.com
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic
science, where biologists are frequently trying to characterize many millions of sequences. …
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
Motivation: In 2001 and 2002, we published two papers (Bioinformatics, 17, 282–283,
Bioinformatics, 18, 77–82) describing an ultrafast protein sequence clustering program called cd-…
CD-HIT: accelerated for clustering the next-generation sequencing data
CD-HIT is a widely used program for clustering biological sequences to reduce sequence
redundancy and improve the performance of other sequence analyses. In response to the …
CD-HIT Suite: a web server for clustering and comparing biological sequences
CD-HIT is a widely used program for clustering and comparing large biological sequence
datasets. In order to further assist the CD-HIT users, we significantly improved this program …
A new bioinformatics analysis tools framework at EMBL–EBI
M Goujon, H McWilliam, W Li, F Valentin… - Nucleic acids …, 2010 - academic.oup.com
The EMBL-EBI provides access to various mainstream sequence analysis applications.
These include sequence similarity search services such as BLAST, FASTA, InterProScan and …
Analysis tool web services from the EMBL-EBI
Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide
range of databases and analysis tools via Web Services interfaces. This comprises services …
The EMBL-EBI bioinformatics web and programmatic tools framework
Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of
mainstream sequence analysis applications. These include sequence similarity search …
Clustering of highly homologous sequences to reduce the size of large protein databases
We present a fast and flexible program for clustering large protein databases at different
sequence identity levels. It takes less than 2 h for the all-against-all sequence comparison and …
The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families
Metagenomics projects based on shotgun sequencing of populations of micro-organisms
yield insight into protein families. We used sequence similarity clustering to explore proteins …