SeqWord Project - genome linguistics approaches for 
comparative genomics, phylogenomics and mobilomics
            Principal Investigator: Dr Oleg Reva    
            This project addresses the development of an integrated 
research environment for data mining in DNA sequences by using genome 
linguistics. 
            This is a non-commercial academic project supported by the 
National Research Foundation of South Africa (NRF).
            The project was started at the University of Pretoria in 
2006. All tools of the SeqWord project were created by post-graduate students as 
part of their Hons, MSc or PhD projects. The following tools are currently available:
            
 - Genome browser – a tool to visualize genomic loci belonging to different functional categories (genomic islands, functionally indispensable regions, non-coding loci, etc.);
                
 - Genomic Island Sniffer – a tool for the automatic identification of horizontally acquired genomic islands in bacterial genomes;
                
 - Sniffer GI Browser – a database and viewer of genomic islands predicted in multiple bacterial genomes;
                
 - GI Databases – databases of predicted genomic islands in (i) prokaryotes and archaea, (ii) eukaryotes and (iii) the human genome;
                
 - Interactive GI maps – several interactive SVG files representing phylogenetic links between genomic islands identified in different bacteria;
                
 - SWPhylo – phylogenetic inference using parametric whole-genome versus gene-based comparison;
                
 - GenomeBarcoder – an interactive program for the creation and application of diagnostic genetic barcodes;
                
 - OligoDBViewer – a Python program with a GUI to count and store the frequencies of signature 8-14-mer oligonucleotides in whole sequences of bacterial chromosomes (beta version);
                
 - MetaLingvo – a general genome linguistic analysis program written in Python (beta version);
                
 - LingvoCom – command-line tools for linguistic analysis (Python 2.5).
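
To illustrate what counting oligonucleotide frequencies means, here is a minimal sketch in Python. It is not the SeqWord implementation: the function names `count_oligos` and `oligo_frequencies` are hypothetical, and the approach (a sliding window over one strand, skipping words that contain ambiguous bases) is an assumption made for this example only. A short word length is used for readability; the same code works for 8-14-mers.

```python
from collections import Counter


def count_oligos(sequence, k):
    """Count overlapping words of length k in a DNA sequence.

    Assumption for this sketch: only one strand is scanned, and words
    containing anything other than A, C, G or T are skipped.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= valid:
            counts[word] += 1
    return counts


def oligo_frequencies(sequence, k):
    """Convert raw word counts into relative frequencies (summing to 1)."""
    counts = count_oligos(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    for word, freq in sorted(oligo_frequencies("ACGTACGT", 4).items()):
        print(word, freq)
```

For a whole chromosome, the sequence would typically be read from a FASTA file first; storing the resulting table in a database is outside the scope of this sketch.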
  			
 
            
            Visits to the project websites are monitored by Google 
Analytics. A graphical report for the last year is shown below:
            