1. Paste your sequences.
2. Select the appropriate reference MSA. (The other parameters are described below.)
3. Choose your output format. If you want FASTA files, you must also choose where SINA writes the per-sequence metadata: in the header line (each item enclosed in []), on comment lines between the header and the sequence data (best readability), or in a separate CSV-formatted file.
4. Click on submit.
5. You will be redirected to the download page where you can follow your job's progress and download your file(s) once it is finished.
Old jobs will remain listed until you restart your browser. The links will remain valid for at least one day.
The log file (the second link) contains the console output generated by SINA, including its version number and the exact parameters used for alignment. If some of your sequences are missing from the output file, the log file will probably contain a statement that no reference sequences could be found for them. This usually means that they are not the kind of rRNA you selected.
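If you chose to have the metadata written to the FASTA header line, the bracketed items can be pulled apart again with a few lines of Python. The following is only a sketch: it assumes that each item has the form [key=value], which is not specified above, and the function name parse_header and the example values are made up for illustration.

```python
import re

# Assumption: each bracketed item looks like [key=value]. The exact layout
# written by SINA may differ; adjust the pattern to match your files.
ITEM = re.compile(r"\[([^\]=]+)=([^\]]*)\]")

def parse_header(line):
    """Split a FASTA header line into (sequence name, metadata dict)."""
    text = line.lstrip(">").strip()
    meta = {key.strip(): value.strip() for key, value in ITEM.findall(text)}
    # Whatever remains after removing the bracketed items is the name.
    name = ITEM.sub("", text).strip()
    return name, meta

# Hypothetical header using two field names mentioned on this page.
name, meta = parse_header(">seq1 [slv_cutoff_head=0] [slv_cutoff_tail=12]")
print(name, meta)
```

All values are returned as strings; convert them as needed.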
Basic alignment parameters and advanced alignment parameters are explained below.
Use the sequence search stage of SINA to find related sequences and add them to your cart.
Click on "Align sequence(s)" to submit your request. You will be redirected to the download page, where you can follow your job's progress.
Once your job has completed, a link will appear that lets you add all search results to the cart.
Please be aware that some sequences may be matched multiple times. The number of search results may therefore be lower than the number of query sequences times the number of requested search results. Also, since the cart system is based on accession numbers, if the search matched a genome sequence, all LSU/SSU sequences from that genome will be added to the cart.
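To illustrate why the number of distinct search results can be lower than queries times requested results, here is a small sketch that counts each matched sequence only once. The function name unique_hits and the identifiers in the example are hypothetical.

```python
def unique_hits(hits_by_query):
    """Return the set of distinct search results over all queries."""
    unique = set()
    for hits in hits_by_query.values():
        unique.update(hits)
    return unique

# Two queries with three requested results each (made-up identifiers).
hits = {
    "query1": ["refA", "refB", "refC"],
    "query2": ["refB", "refC", "refD"],
}
print(len(unique_hits(hits)))  # 4 distinct results, not 2 * 3 = 6
```

Note that this only models the first effect described above. The cart may still grow beyond this number when a match belongs to a genome with several LSU/SSU sequences.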
Check "Enable classification" to request LCA classification of your sequences.
The result will be written to metadata fields of the form "slv_lca_tax_<taxonomy-name>".
Search results that are classified as "Unclassified" will be ignored during classification. If no classification could be made (no search results found or results have different domains), the query sequence will be assigned the classification "Unclassified".
Because classification is based on the results of the sequence search, take care when modifying the search parameters. For example, setting the number of search results to 1 will always give you the classification of the best database match. Setting the number of search results high will result in a "shallower" classification. Lowering the required identity with the query may result in misclassification.
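The effect of the number of search results on classification depth can be seen in a minimal sketch of a lowest common ancestor computation. This is not SINA's actual code. It assumes taxonomy paths written as semicolon-separated rank names and a strict LCA in which every search result must agree; the function name lca_classify is made up.

```python
def lca_classify(paths):
    """Return the longest taxonomy prefix shared by all search results."""
    # Search results that are themselves "Unclassified" are ignored.
    ranked = [p.rstrip(";").split(";") for p in paths
              if p and p != "Unclassified"]
    if not ranked:
        return "Unclassified"
    common = []
    for names in zip(*ranked):      # walk down the ranks in parallel
        if len(set(names)) != 1:    # results disagree at this rank
            break
        common.append(names[0])
    # No shared rank at all (e.g. different domains) means no classification.
    return (";".join(common) + ";") if common else "Unclassified"

best = "Bacteria;Proteobacteria;Gammaproteobacteria;"
other = "Bacteria;Proteobacteria;Alphaproteobacteria;"
print(lca_classify([best]))         # one result: classification of that match
print(lca_classify([best, other]))  # more results: shallower classification
```

With a single search result the query inherits the full path of that match. Each additional result can only shorten the shared prefix, never lengthen it.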
The default is to attach the remaining unaligned bases at the ends of your sequences to the first and last aligned base, respectively. This is appropriate if your sequences are full length and properly truncated. The number of unaligned bases at each end is reported by SINA in the fields "slv_cutoff_head" and "slv_cutoff_tail". Alternatively, you may choose to move those bases to the outer columns of the alignment. Our alignments contain 1000 empty columns at both ends, so these bases are easily removable prior to, e.g., tree reconstruction (in ARB, just use the TERMINI filter). Lastly, you may opt to have these bases removed from your sequences.
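The three choices for unaligned end bases can be pictured with a toy example. The sketch below is an illustration only, not SINA's implementation: the function name place_overhang and the mode names are made up, and the toy row has 4 empty columns at each end instead of 1000.

```python
def place_overhang(core, head, tail, mode="attach"):
    """core: aligned row with '-' for gaps; head/tail: unaligned end bases."""
    first = len(core) - len(core.lstrip("-"))  # column of first aligned base
    last = len(core.rstrip("-"))               # one past the last aligned base
    if mode == "remove":                       # drop the unaligned bases
        return core
    if len(head) > first or len(tail) > len(core) - last:
        raise ValueError("not enough free columns for the unaligned bases")
    row = list(core)
    if mode == "attach":     # directly next to the outermost aligned bases
        row[first - len(head):first] = head
        row[last:last + len(tail)] = tail
    elif mode == "edge":     # moved to the outermost alignment columns
        row[:len(head)] = head
        row[len(row) - len(tail):] = tail
    else:
        raise ValueError("unknown mode: " + mode)
    return "".join(row)

for mode in ("attach", "edge", "remove"):
    print(mode, place_overhang("----ACGU----", "gg", "c", mode))
```

The unaligned bases are written in lower case here only to make them easy to spot in the printed rows.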
Insertions will, as long as there is enough room in the reference alignment, always be placed adjacent to the following aligned base. Sometimes, however, the insertion may be longer than the number of free columns between the adjoining aligned bases. The default is to disallow this case during the dynamic programming stage of sequence alignment ("forbidding during alignment"). Alternatively, you may choose to have those insertions fitted into the alignment by pushing the adjoining aligned bases outwards as required. The benefit of this option is that you can be made aware of cases where our alignment contains insufficient free columns. Lastly, you may choose to have the insertion truncated to the number of free columns. This is only appropriate if the sequences will be subjected to column filtering (i.e. prior to tree reconstruction) afterwards. Selecting this option disables sequence search.
Using default settings, SINA will verify that your sequences have the correct orientation. If a different orientation is expected to yield a better alignment, your sequences will be transformed accordingly. If you know that your sequences are correctly oriented and not complemented, you may disable this feature.
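As a rough illustration of what "orientation" means here, the sketch below lists the four candidate orientations of a sequence. It does not reproduce how SINA decides which one is expected to align best. The function name orientations is made up, and the complement is written as RNA (T in the input is treated like U).

```python
# Complement table producing RNA output; input may contain T or U.
COMPLEMENT = str.maketrans("ACGUTacgut", "UGCAAugcaa")

def orientations(seq):
    """Return the four candidate orientations of a nucleotide sequence."""
    comp = seq.translate(COMPLEMENT)
    return {
        "original": seq,
        "reversed": seq[::-1],
        "complemented": comp,
        "reverse-complemented": comp[::-1],
    }

for label, candidate in orientations("AACGU").items():
    print(label, candidate)
```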
If you select "indicate unaligned bases", all insertions and remaining unaligned bases at the ends of your sequences will be set in lower case letters. If you intend to validate or refine the alignment computed by SINA, identifying bases with indeterminate positions in this manner may help you locate sections in the alignment worthy of attention.
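When unaligned bases are marked in lower case, the sections worth reviewing can be located automatically. A minimal sketch, assuming the aligned sequence is available as a plain string; the function name unaligned_runs and the example row are made up.

```python
import re

def unaligned_runs(row):
    """Return (start, end, bases) for each run of lower-case bases.

    start and end are 0-based column indices, end exclusive.
    """
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(r"[a-z]+", row)]

# Hypothetical aligned row: overhang at both ends, one insertion in between.
for start, end, bases in unaligned_runs("--ggACGUuuAC-c--"):
    print(start, end, bases)
```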
If you submitted sequences that are already aligned compatibly with SILVA, such as from a previous alignment run with different parameters or after manually modifying the alignment, checking the "show differences" option will show the sections of the new alignment that differ from the submitted alignment:
Dumping pos 2528 through 2542:
CG-G-ACA 19
CG-G-AUA 0-18 20-23 25-29 31 34-35 37-39
CG-GAAUA 33
CG-GCAUA 43 <---(## NEW ##)
CGGC-AUA 42 <---(%% ORIG %%)
GG-A-AUA 40
GG-G-AUA 41
UG-G-AUA 24 30 32 36
The example above shows all columns containing at least one base between column 2528 and 2542. The row marked with "ORIG" contains the original alignment. The row marked with "NEW" contains the new alignment. In this example, the dimer "GC" was placed one column further to the right than in the original alignment. The other rows show bases and alignment for the selected reference sequences. The numbers to the right of the alignment extract correspond to the position of the respective reference sequence identifiers in the field "align_family_slv".
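If you want to process such a dump from the log file, the rows can be split into the alignment extract, the reference positions and the ORIG/NEW marker. The sketch below is based only on the layout of the example above; the function names parse_ranges and parse_dump are made up, and other dumps may contain variations this does not handle.

```python
import re

SAMPLE = """\
Dumping pos 2528 through 2542:
CG-G-ACA 19
CG-G-AUA 0-18 20-23 25-29 31 34-35 37-39
CG-GAAUA 33
CG-GCAUA 43 <---(## NEW ##)
CGGC-AUA 42 <---(%% ORIG %%)
GG-A-AUA 40
GG-G-AUA 41
UG-G-AUA 24 30 32 36
"""

def parse_ranges(tokens):
    """Expand tokens such as '0-18' or '31' into a list of integers."""
    positions = []
    for token in tokens:
        low, _, high = token.partition("-")
        positions.extend(range(int(low), int(high or low) + 1))
    return positions

def parse_dump(text):
    """Return a list of (alignment extract, positions, marker or None)."""
    rows = []
    for line in text.strip().splitlines():
        if line.startswith("Dumping"):
            continue
        marker = None
        match = re.search(r"<---\(\W*(\w+)\W*\)", line)
        if match:
            marker = match.group(1)       # "NEW" or "ORIG"
            line = line[:match.start()]
        extract, *tokens = line.split()
        rows.append((extract, parse_ranges(tokens), marker))
    return rows

rows = parse_dump(SAMPLE)
new = next(r[0] for r in rows if r[2] == "NEW")
orig = next(r[0] for r in rows if r[2] == "ORIG")
# Offsets within the extract where the new alignment differs from the original.
print([i for i, (a, b) in enumerate(zip(new, orig)) if a != b])
```

The positions are indices into the "align_family_slv" field, so they can be used to look up which reference sequences share a given alignment.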