Background

The SILVA databases are developed and maintained by the SILVA Team at the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures in Bremen, Germany.

SILVA is an interdisciplinary project of biologists and computer scientists to provide:

fully aligned and up-to-date small (16S/18S, SSU) and large (23S/28S, LSU) subunit ribosomal RNA "Parc" databases, available on the website and as ARB files
preconfigured subsets containing only high-quality, full-length sequences as ARB and FASTA files (SSU/LSU Ref)
extensive browse and search functions for sequence retrieval
a clear rating system for all steps of data processing, with emphasis on sequence and alignment quality
full compatibility with the ARB software package and the latest official alignments released by the ARB project
compatibility with many common programs such as PHYLIP or PAUP, via direct FASTA export or the ARB program
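
As an illustration of working with a FASTA export, the following minimal Python sketch parses FASTA text and removes alignment gap characters to recover the unaligned sequences. The record names and sequences are hypothetical, and the assumption that gaps are written as "-" and "." should be checked against the actual export; the functions read_fasta and strip_gaps are not part of SILVA or ARB.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


def strip_gaps(aligned, gap_chars="-."):
    """Remove gap characters (assumed here to be '-' and '.') from an aligned sequence."""
    return "".join(ch for ch in aligned if ch not in gap_chars)


# Hypothetical two-record export, for illustration only.
example = """>seq1 hypothetical record
..ACG-UAC
GU--GG..
>seq2 hypothetical record
ACGU-ACGU
"""

aligned = read_fasta(example)
unaligned = {name: strip_gaps(seq) for name, seq in aligned.items()}
```

All sequences in an aligned export are expected to have the same length before gaps are stripped, which can serve as a quick sanity check on a downloaded file.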

Release information & Database history

Version 89 of our datasets was made available in February 2007. Version 138.1 was released in August 2020 and increased the number of available SSU sequences to over 9,400,000. Detailed information about the content of the databases and statistics can be found here.

Motivation

Sequencing the ribosomal RNA (rRNA) genes is currently the method of choice for phylogenetic reconstruction and for nucleic acid-based detection and quantification of microbial diversity. The ARB software suite, with its corresponding rRNA databases, has been accepted by researchers worldwide as their standard tool for large-scale ribosomal RNA analysis. Almost 30 years of development have already been invested to extend and maintain the system. To provide high-quality and comprehensive rRNA databases comprising Bacteria, Archaea and Eukarya, the SILVA (from Latin silva, forest) system was implemented in 2007. It is designed as an automatic software pipeline for sequence retrieval, quality assignment and the alignment of nucleic acid sequences based on the latest comprehensive ARB alignments.

SINA: The new SILVA (Web)Aligner

We developed a new aligner called SINA (SILVA INcremental Aligner) that is able to accurately align hundreds of thousands of sequences based on a curated SEED alignment. In a first step, the aligner determines the most closely related reference sequences using an optimized Suffix Tree server. To find the optimal alignment for a new sequence, up to 40 reference sequences are taken into account. While running, the system simulates the manual refinement process to optimize the result.
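
The reference-selection step described above can be illustrated with a minimal Python sketch. It is not SINA's implementation: instead of a suffix tree server it ranks references by the number of k-mers they share with the query, a simpler stand-in technique. The function names, the k-mer length and the example sequences are assumptions made for illustration; only the limit of 40 references comes from the text.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def nearest_references(query, references, k=8, max_refs=40):
    """Rank references by k-mers shared with the query; return up to max_refs names.

    Simplified stand-in for a suffix-tree based search: references that share
    no k-mer with the query are dropped, ties are broken by name.
    """
    query_kmers = kmers(query, k)
    scores = []
    for name, seq in references.items():
        shared = len(query_kmers & kmers(seq, k))
        if shared > 0:
            scores.append((shared, name))
    # Highest number of shared k-mers first, then alphabetical by name.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scores[:max_refs]]


# Hypothetical toy references, for illustration only.
references = {
    "ref_a": "ACGUACGUGGCCAAUU",
    "ref_b": "ACGUACGUGGCCUUUU",
    "ref_c": "GGGGGGGGCCCCCCCC",
}
selected = nearest_references("ACGUACGUGGCCAAUA", references, k=4)
```

In this toy example ref_a shares the most 4-mers with the query and is ranked first. An aligner would then align the new sequence against the selected references rather than against the whole database.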

Features of SINA:

Process and quality values are added to each sequence, indicating, for example, the alignment quality
Only minimal manual revision of the output alignment is required (e.g. no base-spreading at the ends)
Improved alignment quality due to advanced alignment technology, compared to e.g. the ARB Aligner ("Fastaligner")

SINA is also available online for small-scale projects. More information about SINA as well as the corresponding publication can be found here.