Revised 15/04/2005

README for Inparanoid Eukaryotic Ortholog Groups data download directory

********************************************************************************
This is a description of the download directory of the Inparanoid website.
For more information on how to use the Inparanoid program as well as the
online tool, please go to "http://inparanoid.cgb.ki.se/ehelp.html"
********************************************************************************

-----------------------------------------------------------------------------
A - Main Inparanoid section using Ensembl and other datasets
-----------------------------------------------------------------------------

All files can be downloaded from http://inparanoid.cgb.ki.se/download/

---- Original fasta files:
Location: http://inparanoid.cgb.ki.se/download/current/sequences/processed/

---- Processed fasta files:
Location: http://inparanoid.cgb.ki.se/download/current/sequences/processed/

Species and their fasta files are abbreviated as follows:
--------------------------------------------
Name     Species
ensAG    Anopheles gambiae (Ensembl)
ensAM    Apis mellifera (Ensembl)
ensCE    Caenorhabditis elegans (Ensembl)
ensCF    Canis familiaris (Ensembl)
ensDM    Drosophila melanogaster (Ensembl)
ensDR    Danio rerio (Ensembl)
ensFR    Fugu rubripes (Ensembl)
ensGG    Gallus gallus (Ensembl)
ensHS    Homo sapiens (Ensembl)
ensMM    Mus musculus (Ensembl)
ensPT    Pan troglodytes (Ensembl)
ensRN    Rattus norvegicus (Ensembl)
ensTN    Tetraodon nigroviridis (Ensembl)
modCB    Caenorhabditis briggsae (Model organism database)
modCE    Caenorhabditis elegans (Model organism database)
modDD    Dictyostelium discoideum (Model organism database)
modDM    Drosophila melanogaster (Model organism database)
modDP    Drosophila pseudoobscura (Model organism database)
modMM    Mus musculus (Model organism database)
modOG    Oryza sativa (Model organism database)
modRR    Rattus norvegicus (Model organism database)
modSC    Saccharomyces cerevisiae (Model organism database)
modSP    Schizosaccharomyces pombe (Model organism database)
ncbAT    Arabidopsis thaliana (NCBI)
ncbEC    Escherichia coli (NCBI)
sanPF    Plasmodium falciparum (Sanger)

FASTA    Combined fasta file containing all the above species
         (FASTA.phr/FASTA.pin/FASTA.psq can be used in combination
         to create a local BLAST library)

SQL tables used to construct the main Inparanoid database:
Location:
-----------------------------------------------------------------------
Name            Description
sqltable_full   Table of all Ensembl proteins used.
                Each column represents:
                 1 - Gene identifier
                 2 - Transcript identifier
                 3 - Peptide identifier
                 4 - External database identifier, e.g. HUGO
                 5 - Uniprot acc no.
                 6 - Ensembl Family identifier
                 7 - Ensembl Family description
                 8 - Gene/Protein description
                 9 - Species abbreviation (see fasta list above)
                10 - Original fasta filename:Original fasta header

Main section Inparanoid clustering results:
----------------------------------------------------
Name            Description

orthologs.?????-?????.html
                Output files containing all Inparanoid clusters for each
                species pair in HTML format. See the species fasta file
                list above for species abbreviations.
                e.g. orthologs.ensHS-ensDM.html; All Inparanoid clusters
                between Homo sapiens and Drosophila melanogaster.
                Location: http://inparanoid.cgb.ki.se/download/current/html/

orthologs.?????-?????.xml
                Output files containing all Inparanoid clusters for each
                species pair in XML format. See the species fasta file
                list above for species abbreviations.
                e.g. orthologs.ensHS-ensDM.xml; All Inparanoid clusters
                between Homo sapiens and Drosophila melanogaster.
                Contains the most information of all result files.
                Contains an internal description.
                Location: http://inparanoid.cgb.ki.se/download/current/xml/

sqltable.?????-?????
                Output files containing all Inparanoid clusters for each
                species pair in table format. See the species fasta file
                list above for species abbreviations.
                e.g. sqltable.ensHS-ensCE; All Inparanoid clusters
                between Homo sapiens and Caenorhabditis elegans.
                Each column represents:
                 1 - Cluster number
                 2 - Seed ortholog-pair BLAST score in bits
                 3 - Species abbreviation
                 4 - Inparanoid score
                 5 - Protein identifier
                 6 - Bootstrap value for seed inparalog/ortholog
                Location: http://inparanoid.cgb.ki.se/download/current/sqltables/

longsqltable.?????-?????
                Output files containing all Inparanoid clusters for each
                species pair in table format. See the species fasta file
                list above for species abbreviations.
                e.g. longsqltable.ensHS-ensCE; All Inparanoid clusters
                between Homo sapiens and Caenorhabditis elegans.
                Each column represents:
                 1 - Cluster number
                 2 - Seed ortholog-pair BLAST score in bits
                 3 - Species abbreviation
                 4 - Inparanoid score
                 5 - Protein identifier
                 6 - Bootstrap value for seed inparalog/ortholog
                 7 - Description/name
                 8 - Original fasta filename:Original fasta header
                 9 - External ID
                Location: http://inparanoid.cgb.ki.se/download/current/sqltables/
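
Example: reading a sqltable file
----------------------------------------------------
The six-column sqltable layout described above can be read with a few lines
of Python. This is only a minimal sketch, not part of the Inparanoid
distribution. It ASSUMES that columns are tab-separated (this README does not
state the delimiter) and that a file URL is formed by appending the file name
to the sqltables location given above. The names Member, sqltable_url and
parse_sqltable are hypothetical, and the sample rows are made up for
illustration. The bootstrap column is kept as raw text because its exact
format is not specified here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple

# Download location for sqltable files, as listed in this README.
BASE_URL = "http://inparanoid.cgb.ki.se/download/current/sqltables/"


class Member(NamedTuple):
    cluster: int       # column 1: cluster number
    bit_score: float   # column 2: seed ortholog-pair BLAST score in bits
    species: str       # column 3: species abbreviation, e.g. ensHS
    score: float       # column 4: Inparanoid score
    protein_id: str    # column 5: protein identifier
    bootstrap: str     # column 6: bootstrap value, kept as raw text


def sqltable_url(species_a: str, species_b: str) -> str:
    """Build the assumed URL of the sqltable file for one species pair."""
    return f"{BASE_URL}sqltable.{species_a}-{species_b}"


def parse_sqltable(lines: Iterable[str]) -> Dict[int, List[Member]]:
    """Group sqltable rows by cluster number.

    ASSUMPTION: columns are separated by tabs.
    """
    clusters: Dict[int, List[Member]] = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        bootstrap = fields[5] if len(fields) > 5 else ""
        member = Member(
            cluster=int(fields[0]),
            bit_score=float(fields[1]),
            species=fields[2],
            score=float(fields[3]),
            protein_id=fields[4],
            bootstrap=bootstrap,
        )
        clusters[member.cluster].append(member)
    return dict(clusters)


if __name__ == "__main__":
    # Made-up rows for illustration only; not real Inparanoid output.
    sample = [
        "1\t500\tensHS\t1.000\tPROT_A\t100%",
        "1\t500\tensCE\t1.000\tPROT_B\t100%",
        "2\t300\tensHS\t1.000\tPROT_C\t",
    ]
    print(sqltable_url("ensHS", "ensCE"))
    for number, members in parse_sqltable(sample).items():
        print(number, [m.protein_id for m in members])
```

The same function could be pointed at a downloaded file with
open("sqltable.ensHS-ensCE"). For longsqltable files, columns 7 to 9 would
need to be added to the record in the same way.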