LingvoCom 1.0: Oligonucleotide usage pattern comparison and visualization |
Program download
|
LingvoCom is written in Python and requires Python 2.5 to be installed on the machine. The program works with DNA sequences in FASTA or GenBank format. Visualizations are saved as SVG vector graphics files, which can be viewed in the Mozilla Firefox or Chrome browsers, or in vector graphics editors such as Adobe Illustrator.
|
|
The pattern type is set in the format nX_Ymer, where Y is the length of the oligonucleotides to be counted and X is the length of the shorter constituent words used to calculate the expected frequencies of Y-mers. The default is n0_4mer, meaning that the program analyzes the given sequences for frequencies of tetranucleotides (4-mers), assuming that all words are equally expected (0-order normalization). If the n1_3mer pattern is set, the program analyzes frequencies of trinucleotides (3-mers), assuming that the expected frequencies correlate with the GC-content of the DNA sequence (1-order normalization). Users may set Y values from 2 to 7 and X values from 0 to Y-1. When <P>+<Enter> is typed, the program prompts for comma-separated X and Y values.
|
|
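The pattern-type rules above can be illustrated with a short sketch. It is not LingvoCom's own code: the function names are hypothetical, it targets a current Python 3 interpreter rather than Python 2.5, and it assumes that 1-order normalization means multiplying the observed frequencies of the individual bases in each word. Orders above 1 are left out.

```python
import re
from collections import Counter
from itertools import product

PATTERN_RE = re.compile(r"^n(\d+)_(\d+)mer$")


def parse_pattern_type(text):
    """Return (X, Y) from a string such as 'n0_4mer', enforcing the documented ranges."""
    match = PATTERN_RE.match(text.strip())
    if match is None:
        raise ValueError("expected the form nX_Ymer, got %r" % text)
    x, y = int(match.group(1)), int(match.group(2))
    if not 2 <= y <= 7:
        raise ValueError("Y must be within 2..7")
    if not 0 <= x <= y - 1:
        raise ValueError("X must be within 0..Y-1")
    return x, y


def count_words(seq, k):
    """Count overlapping k-mers that consist only of A, C, G and T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def expected_counts(seq, x, y):
    """Expected Y-mer counts for X = 0 or X = 1 (an illustrative assumption)."""
    total = sum(count_words(seq, y).values())
    words = ["".join(p) for p in product("ACGT", repeat=y)]
    if x == 0:
        # 0-order: every word is equally expected.
        return dict((w, total / 4.0 ** y) for w in words)
    if x == 1:
        # 1-order (assumed): product of the observed base frequencies.
        mono = count_words(seq, 1)
        n = sum(mono.values())
        if n == 0:
            raise ValueError("sequence contains no A, C, G or T")
        expected = {}
        for w in words:
            p = 1.0
            for base in w:
                p *= mono[base] / float(n)
            expected[w] = total * p
        return expected
    raise NotImplementedError("only X = 0 and X = 1 are sketched here")
```

For example, `parse_pattern_type("n1_3mer")` yields X = 1 and Y = 3, while a string such as `n4_4mer` is rejected because X must be smaller than Y.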
Type <T>+<Enter> and select the type of analysis/visualization to be performed:
|
Query/Subject file |
Query and subject files may be in FASTA or GenBank format and must be stored in the folder "input" before the analysis. The result files are written as text and SVG into the folder "output". LingvoCom can also be used to analyze predicted GIs in GenBank or FASTA format as generated by SWGIS. Alternatively, it can extract DNA fragments from a whole genome using user-defined genomic coordinates. Query and subject inputs are treated differently depending on the task performed.
|
Output file | A generic name for the output files. The text output file is saved under this name; for the SVG output file, the corresponding extension is added. |
Graphical output | This parameter may be set to either Yes or No. If set to Yes, an additional SVG output file is saved for data visualization. |
Input folder | The name of an existing folder in which the program looks for input files. The default is "input". |
Output folder | The name of an existing folder in which the output files are saved. The default is "output". |
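The way these file and folder parameters fit together can be sketched as follows. This is an illustration under stated assumptions, not the program's actual logic: `detect_format` and `output_paths` are hypothetical helpers, the SVG name is assumed to be the generic name with ".svg" appended, and the format check relies only on the usual conventions that a FASTA file starts with ">" and a GenBank record starts with "LOCUS".

```python
import os


def detect_format(path):
    """Guess 'fasta' or 'genbank' from the first non-empty line, else None."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                return "fasta"
            if line.startswith("LOCUS"):
                return "genbank"
            return None
    return None


def output_paths(generic_name, graphical=True, output_folder="output"):
    """Return the text path and, if requested, the SVG path (assumed '.svg' suffix)."""
    if not os.path.isdir(output_folder):
        # The folder must already exist, as the parameter description states.
        raise IOError("output folder does not exist: %s" % output_folder)
    text_path = os.path.join(output_folder, generic_name)
    paths = [text_path]
    if graphical:
        paths.append(text_path + ".svg")
    return paths
```

With Graphical output set to No, only the text path is returned; the input folder would be checked for existence in the same way before any file in it is read.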