Downloadable packages
- MyWEST.zip (1843 KB): software application package containing all files necessary to run MyWEST. It also includes MS-Access and MySQL databases for storing the aggregated extracted annotations. The MS-Access database also contains two sets of purpose-built queries for mining the extracted annotations stored in the database and for providing comprehensive integrated views of them. A set of SQL queries implemented for the MySQL database is stored separately in text files, which are also included in this package. (MyWEST.zip content...)
- TestingResults.zip (15109 KB): test result package containing a list of genes differentially expressed in U937 cells after 4 hours of treatment with 10^-6 M Retinoic Acid (RA), derived from a microarray experiment. The package also contains the templates created with MyWEST for extracting annotations of interest for this gene list from several web databanks, and the corresponding extracted data, both as single text excel files and aggregated in an MS-Access and a MySQL database. Tables of combined annotations for the test genes, obtained by running purpose-built queries in the MS-Access database of the aggregated mined data, are also provided in text excel file format. (TestingResults.zip content...)
Required software
MyWEST is implemented in Java and therefore requires a Java 2 Virtual Machine, version 1.3.1 or above, installed on the computer used.
A Java 2 Runtime Environment (JRE) self-extracting file for Windows 95/98/NT4.0/2000/XP, containing the Java 2 Virtual Machine, can be downloaded directly from here, or by choosing the appropriate JRE version for your computer platform at http://java.sun.com/products/archive/.
In both cases, execute the downloaded JRE self-extracting file to run the installation program, and follow the installation instructions. The installation notes for JRE version 1.3.1 for Windows 95/98/NT4.0/2000/XP can be found here.
Apple users need at least Mac OS X to run MyWEST, since only Mac OS X includes, as a core component, a full version of the Java 2 Virtual Machine (specifically Java 2 Standard Edition, version 1.4.2).
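The version requirement above can be checked by comparing the version string reported by the installed Java against 1.3.1. The following is a minimal Python sketch, not part of MyWEST; the function names are hypothetical, and it assumes the version is reported in the dotted form used above (for example 1.4.2).

```python
import re

MINIMUM_JAVA = (1, 3, 1)  # minimum version stated in the requirements above


def parse_java_version(text):
    """Return the first dotted number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text, minimum=MINIMUM_JAVA):
    """True if the reported version is at least the required minimum."""
    version = parse_java_version(version_text)
    # Pad the shorter tuple with zeros so that (1, 4) compares as (1, 4, 0).
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum
```

The string to check can be taken from the output of the command `java -version`, for example `meets_minimum('java version "1.4.2"')`.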
To use the database functionality implemented in MyWEST, a suitable Relational Data Base Management System (e.g. MS-Access, MySQL, MS-SQL Server, Oracle, Informix, ...) must be available. However, if an RDBMS is not available, the extracted data can be saved in text excel file format (ASCII text files with ".xls" extension).
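Because the text excel files are plain ASCII text, they can also be read without a spreadsheet program. The following is a minimal Python sketch; the column delimiter is not stated here, so the tab character is an assumption (adjust the `delimiter` argument if the files use something else), and the function names are hypothetical.

```python
import csv


def parse_text_xls(lines, delimiter="\t"):
    """Split an iterable of text lines into rows of cells.

    The tab delimiter is an assumption, not a documented MyWEST setting.
    """
    return [row for row in csv.reader(lines, delimiter=delimiter)]


def read_text_xls(path, delimiter="\t"):
    """Read an ASCII text file saved with an .xls extension into rows."""
    with open(path, newline="", encoding="ascii", errors="replace") as handle:
        return parse_text_xls(handle, delimiter=delimiter)
```

For example, `read_text_xls("datafiles/ExampleCodes.xls")` would return one list of cell strings per line of the file, provided the tab assumption holds.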
MyWEST.zip content:
- MyWEST\AllTools.jar: required system Java package.
- MyWEST\Config.txt: default extraction configuration file.
- MyWEST\DatabasesURL.txt: configuration file for accessing the web databanks and the storage databases.
- MyWEST\jtds-0.5.jar: Java driver for connecting to the MS-SQL Server 7 databases in which the aggregated extracted data are stored.
- MyWEST\mysql-connector-java-3.0.10-stable-bin.jar: Java driver for connecting to the MySQL databases in which the aggregated extracted data are stored.
- MyWEST\MyWEST.bat: My Web Extraction Software Tool start file.
- MyWEST\MyWEST.ico: My Web Extraction Software Tool icon file.
- MyWEST\MyWEST.jar: My Web Extraction Software Tool program package.
- MyWEST\MyWEST-Up.bat: start file of the updating module of the My Web Extraction Software Tool program.
- MyWEST\MyWEST-Up.jar: updating module package of the My Web Extraction Software Tool program.
- MyWEST\ParseClasses.jar: required system Java package.
- MyWEST\data\: system subdirectory where the local database(s) used to aggregate and save the extracted data should be located.
- MyWEST\data\DBgene.mdb: default MS-Access database for storing the extracted data; it also contains the implemented queries for retrieving such data.
- MyWEST\data\Empty-DBgene.mdb: empty copy of the default DBgene.mdb database, to be duplicated to create new, ready-to-use databases.
- MyWEST\data\MySQL\: system subdirectory where the MySQL databases are located.
- MyWEST\data\MySQL\GeneData\: MySQL database for storing the extracted data.
- MyWEST\data\MySQL\Empty-GeneData\: empty copy of the MySQL GeneData database, to be duplicated to create new, ready-to-use MySQL databases.
- MyWEST\data\MySQL\SQLqueries\: system subdirectory containing the ASCII text files with the implemented SQL queries for extracting data from the MySQL databases in use. These files are:
- List Data Categories.sql
- List ID Code Types.sql
- List ID Codes.sql
- List Tables.sql
- List Templates.sql
- List Web Databases.sql
- Mine data of 1 given category and with 1 given word (+ links).sql
- Mine data of 1 given category and with 1 given word.sql
- Mine data of 1 given category and with 2 given words (+ links).sql
- Mine data of 1 given category and with 2 given words.sql
- Mine data of 1 given ID Code (+ links).sql
- Mine data of 1 given ID Code.sql
- Mine data of 1 given Table (+ Data Categories + links).sql
- Mine data of 1 given Table (+ Data Categories).sql
- Mine data of 1 given Table (+ links).sql
- Mine data of 1 given Table.sql
- Mine data of 1 of 2 given categories and with 1 given word each.sql
- Mine data of 1 of 2 given ID Codes (+ links).sql
- Mine data of 1 of 2 given ID Codes.sql
- Mine data with 1 given word (+ links).sql
- Mine data with 1 given word.sql
- Mine data with 1 of 2 given words (+ links).sql
- Mine data with 1 of 2 given words.sql
- Mine data with 2 given words (+ links).sql
- Mine data with 2 given words.sql
- Mine dataSets of 1 given category and with 1 given word.sql
- Mine dataSets of 1 given category and with 2 given words.sql
- Mine dataSets of 1 of 2 given categories and with 1 given word e.sql
- Mine dataSets of 2 given categories and with 1 given word each.sql
- Mine dataSets with 1 given word (+ links).sql
- Mine dataSets with 1 given word.sql
- Mine dataSets with 1 of 2 given words (+ links).sql
- Mine dataSets with 1 of 2 given words.sql
- Mine dataSets with 2 given words in different dataRow cells.sql
- Mine dataSets with 2 given words in the same dataRow cell.sql
- Mine links with 1 given word.sql
- Mine links with 1 of 2 given words.sql
- Mine links with 2 given words.sql
- MyWEST\datafiles\: system subdirectory where the files with the mined data are saved in text excel format.
- MyWEST\datafiles\ExampleCodes.xls: text excel file containing the GenBank Accession Numbers, LocusLink IDs, and Swiss-Prot Accession Numbers of an example short list of genes, which can be used for running MyWEST.
- MyWEST\datafiles\ValidationData.xls: text excel file containing a list of 729 genes found to be induced or repressed by RA treatment. Of these 729 candidate regulated genes, 339 were induced and 390 were repressed.
- MyWEST\logs\: system subdirectory where the extraction log files are saved in text excel format.
- MyWEST\templates\: system subdirectory where the created template files are saved.
- MyWEST\templates\Example-GC.pro and MyWEST\templates\Example-GC_config.txt: example template and corresponding configuration file for mining annotations from the GeneCards web databank.
- MyWEST\templates\Example-LL.pro and MyWEST\templates\Example-LL_config.txt: example template and corresponding configuration file for mining annotations from the LocusLink web databank.
- MyWEST\templates\Example-MGI.pro and MyWEST\templates\Example-MGI_config.txt: example template and corresponding configuration file for mining annotations from the Mouse Genome Informatics web databank.
- MyWEST\templates\Example-SP.pro and MyWEST\templates\Example-SP_config.txt: example template and corresponding configuration file for mining annotations from the Swiss-Prot web databank.
- MyWEST\templates\Example-SS.pro and MyWEST\templates\Example-SS_config.txt: example template and corresponding configuration file for mining annotations from the SourceSearch web databank.
- MyWEST\templates\Example-UG.pro and MyWEST\templates\Example-UG_config.txt: example template and corresponding configuration file for mining annotations from the UniGene web databank.
- MyWEST\updating\: system subdirectory used by the MyWEST module that updates the extracted data.
- MyWEST\updating\Codes.txt: text file containing a short sample list of nucleotide sequence codes, which can be used for running the MyWEST updating module.
- MyWEST\updating\Templates.txt: text file containing a sample list of templates to use for running the MyWEST updating module.
- MyWEST\updating\datafiles\: system subdirectory where the files with the updated mined data are saved in text excel format.
- MyWEST\updating\logs\: system subdirectory where the updating extraction log files are saved in text excel format.
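The listing above describes Empty-DBgene.mdb as an empty database to be copied whenever a new database is needed. That copy step could be scripted as in the following minimal Python sketch; the function name and the new database name are hypothetical, and only the file names Empty-DBgene.mdb and the data directory come from the listing.

```python
import shutil
from pathlib import Path


def new_database_from_template(data_dir, new_name,
                               template_name="Empty-DBgene.mdb"):
    """Copy the empty template database to a new file in the same directory."""
    data_dir = Path(data_dir)
    template = data_dir / template_name
    target = data_dir / new_name
    if not template.is_file():
        raise FileNotFoundError(f"template database not found: {template}")
    if target.exists():
        # Refuse to overwrite a database that may already hold extracted data.
        raise FileExistsError(f"database already exists: {target}")
    shutil.copyfile(template, target)
    return target
```

For example, `new_database_from_template("MyWEST/data", "MyExperiment.mdb")` would create a fresh MS-Access database next to the default one. The sketch copies a single file, so it does not cover the MySQL Empty-GeneData directory.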
TestingResults.zip content:
Test gene list:
(MyWEST\MT_RA-4h\ directory)
- MT_RA-4h.xls: excel file containing a list of 729 genes found to be induced or repressed by RA treatment. Of these 729 candidate regulated genes, 339 were induced and 390 were repressed.
Mined data database:
(MyWEST\MT_RA-4h\data\ directory)
- WebExtractionResults-DBgene.mdb: an MS-Access 2000 database containing, in aggregated form, all the annotations mined from the UniGene, LocusLink, Swiss-Prot, SOURCE, and GeneCards web databanks for the considered test genes.
- MySQL\WebExtractionResults-GeneData\: a MySQL database containing, in aggregated form, all the annotations mined from the UniGene, LocusLink, Swiss-Prot, SOURCE, and GeneCards web databanks for the considered test genes.
Mined data files:
(MyWEST\MT_RA-4h\datafiles\ directory)
- annotations mined from the SOURCE (SS) web databank for the considered test genes (a text excel file for each annotation type).
- global-SS-Aliases_001_01.xls
- global-SS-Chromosomal Location_001_01.xls
- global-SS-External Resources_001_01.xls
- global-SS-External Resources_more_001_01.xls
- global-SS-External Resources_more1_001_01.xls
- global-SS-External Resources_more2_001_01.xls
- global-SS-Gene Ontologies_001_01.xls
- global-SS-Gene Symbol&Title_001_01.xls
- global-SS-Gene Symbol&Title_more_001_01.xls
- global-SS-Gene Symbol&Title_more1_001_01.xls
- global-SS-Gene Symbol&Title_more2_001_01.xls
- global-SS-Locus Link Summary_001_01.xls
- global-SS-Normalized expression _001_01.xls
- global-SS-Orthologs_001_01.xls
- global-SS-Summary Function_001_01.xls
- global-SS-SwissProt Information_001_01.xls
- global-SS-UniGene Cluster_001_01.xls
- annotations mined from the LocusLink (LL) web databank for the considered test genes (a text excel file for each annotation type).
- global-LL-Additional Links_002_01.xls
- global-LL-Cytogenetic location_002_01.xls
- global-LL-EC Number_002_01.xls
- global-LL-EC Number_more_002_01.xls
- global-LL-Gene Ontology_002_01.xls
- global-LL-Gene References into Function_002_01.xls
- global-LL-Haplotype_002_01.xls
- global-LL-Locus ID_002_01.xls
- global-LL-Map Information_002_01.xls
- global-LL-NCBI Genome Annotation_002_01.xls
- global-LL-Overview_002_01.xls
- global-LL-Phenotype_002_01.xls
- global-LL-Phenotype_more_002_01.xls
- global-LL-RefSeq - NCBI Genome Annotation_002_01.xls
- global-LL-RefSeq - REVIEWED_002_01.xls
- global-LL-Symbol, Name and Alias_002_01.xls
- annotations mined from the Swiss-Prot (SP) web databank for the considered test genes (a text excel file for each annotation type).
- global-SP-Comments_003_01.xls
- global-SP-Features_003_01.xls
- global-SP-General information_003_01.xls
- global-SP-Keywords_003_01.xls
- global-SP-Name and origin_003_01.xls
- global-SP-Sequence information_003_01.xls
- global-SP-Taxonomy_003_01.xls
- annotations mined from the GeneCards (GC) web databank for the considered test genes (a text excel file for each annotation type).
- global-GC-Additional Gene or cDNA sequences_004_01.xls
- global-GC-Aliases and Additional Descriptions_004_01.xls
- global-GC-Chromosomal Location_004_01.xls
- global-GC-Disorders & Mutations_004_01.xls
- global-GC-Disorders & Mutations_more_004_01.xls
- global-GC-Disorders & Mutations_more1_004_01.xls
- global-GC-Gene Symbol & Title_004_01.xls
- global-GC-Genome Wide Resources_004_01.xls
- global-GC-Proteins_004_01.xls
- global-GC-REFSEQ mRNAs_004_01.xls
- global-GC-Unigene Cluster_004_01.xls
- global-GC-Unigene Representative Sequence_004_01.xls
- annotations mined from the UniGene (UG) web databank for the considered test genes (a text excel file for each annotation type).
- global-UG-Expression Information_005_01.xls
- global-UG-Gene UC_ID-Symbol&Title_005_01.xls
- global-UG-Mapping Information_005_01.xls
- global-UG-mRNA Sequences_005_01.xls
- global-UG-Protein Similarities_005_01.xls
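The mined data file names listed above appear to follow a common pattern: the template name (for example global-SS), the annotation type, a three-digit number that matches the numbering of the extraction log files (LogFile_001.xls for SOURCE, and so on), and a two-digit suffix whose meaning is not stated here. The following is a minimal Python sketch for splitting a name into those parts, assuming that pattern holds; the field names `run` and `part` are hypothetical labels chosen for illustration, not MyWEST terminology.

```python
import re

# Pattern inferred from the file names listed above; it is an assumption,
# not a documented MyWEST naming rule.
MINED_FILE_PATTERN = re.compile(
    r"^(?P<template>[^-]+-(?P<databank>[A-Z]+))"  # e.g. global-SS
    r"-(?P<annotation>.+)"                        # e.g. Aliases
    r"_(?P<run>\d{3})"                            # matches LogFile_NNN.xls
    r"_(?P<part>\d{2})\.xls$"                     # meaning not stated
)


def parse_mined_file_name(file_name):
    """Split a mined data file name into its apparent components."""
    match = MINED_FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    # Strip stray spaces, as in 'Normalized expression _001_01.xls'.
    return {key: value.strip() for key, value in match.groupdict().items()}
```

Grouping the parsed names by `run` would, under the same assumption, pair each data file with the log file of the extraction that produced it.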
Extraction log files:
(MyWEST\MT_RA-4h\logs\ directory)
- LogFile_001.xls: the log file of the extraction from the SOURCE web databank for the considered test genes.
- LogFile_002.xls: the log file of the extraction from the LocusLink web databank for the considered test genes.
- LogFile_003.xls: the log file of the extraction from the Swiss-Prot web databank for the considered test genes.
- LogFile_004.xls: the log file of the extraction from the GeneCards web databank for the considered test genes.
- LogFile_005.xls: the log file of the extraction from the UniGene web databank for the considered test genes.
Mined data tables:
(MyWEST\MT_RA-4h\results\ directory)
Each table is stored in a separate excel file and is the result of a specific complex query implemented in the database of the aggregated extracted data.
- _MyRegulatedGenes.xls: expression regulations in the considered experiment, and gene symbols and titles of the test genes. Out of the 729 identified candidate regulated genes, 513 (221 induced, 292 repressed) were classified genes and 216 (118 induced, 98 repressed) ESTs, according to the UniGene build Hs.160 - release 14 June 2003.
- A_ClusterID-Symbol&Title_UG.xls: comprehensive view of the expression regulations of the test data set and the UniGene cluster ID, gene symbol, and gene title annotations, mined from the UniGene web databank.
- A_CytogeneticLocation_LL.xls: comprehensive view of the expression regulations of the test data set and the cytogenetic location annotations, mined from the LocusLink web databank.
- A_Disorders&Mutations_GC.xls: comprehensive view of the expression regulations of the test data set and the disorder and mutation annotations, mined from the GeneCards web databank.
- A_FunctionalAnalysis-Transcription_LL_SP.xls: comprehensive integrated view of the expression regulations of the test data set and the functional analysis results presenting the transcription-related genes of the test list (according to the Gene Ontology, RefSeq Summary, and Gene Reference Into Function annotations mined from the LocusLink web databank, and the keyword and function annotations mined from the Swiss-Prot web databank).
- A_FunctionalAnalysis-Differentiation_LL_SP.xls: comprehensive integrated view of the expression regulations of the test data set and the functional analysis results presenting the differentiation-related genes of the test list (according to the Gene Ontology, RefSeq Summary, and Gene Reference Into Function annotations mined from the LocusLink web databank, and the keyword and function annotations mined from the Swiss-Prot web databank).
- A_GeneOntology_LL.xls: comprehensive view of the expression regulations of the test data set and the Gene Ontology functional categories, mined from the LocusLink web databank.
- A_GeneRIF_LL.xls: comprehensive view of the expression regulations of the test data set and the Gene Reference Into Function annotations, mined from the LocusLink web databank.
- A_GenomeWideResources_SS_GC.xls: comprehensive integrated view of the expression regulations of the test data set and the reference codes in several resources, mined from the SOURCE and GeneCards web databanks.
- A_GenomicContig-Ortholog-EC_Number_LL_SS.xls: comprehensive integrated view of the expression regulations of the test data set and the genomic contig reference codes, mouse ortholog reference codes, and EC numbers, mined from the LocusLink and SOURCE web databanks.
- A_Keywords_SP.xls: comprehensive view of the expression regulations of the test data set and the protein product keywords, mined from the Swiss-Prot web databank.
- A_mRNASequences_UG.xls: comprehensive view of the expression regulations of the test data set and the mRNA sequence reference codes, mined from the UniGene web databank.
- A_Name&Origin-SequenceInformation_SP.xls: comprehensive view of the expression regulations of the test data set and the protein product names, origins, molecular weights, and lengths, mined from the Swiss-Prot web databank.
- A_NormalizedExpression_SS.xls: comprehensive view of the expression regulations of the test data set and the normalized expressions in different tissues, mined from the SOURCE web databank.
- A_OverviewRefSeqSummary_LL.xls: comprehensive view of the expression regulations of the test data set and the functional RefSeq Summary annotations, mined from the LocusLink web databank.
- A_Phenotype_LL.xls: comprehensive view of the expression regulations of the test data set and the phenotype annotations, mined from the LocusLink web databank.
- A_ProteinSimilarities_UG.xls: comprehensive view of the expression regulations of the test data set and the protein product similarities in different organisms - with the percent identity and length of the aligned regions - mined from the UniGene web databank.
- A_RefSeq_LL.xls: comprehensive view of the expression regulations of the test data set and the genomic contig and mRNA Reference Sequence codes, mined from the LocusLink web databank.
- A_SwissProtInfo-Disease_SP.xls: comprehensive view of the expression regulations of the test data set and the disease annotations, mined from the Swiss-Prot web databank.
- A_SwissProtInfo-Function_SP.xls: comprehensive view of the expression regulations of the test data set and the function annotations, mined from the Swiss-Prot web databank.
- A_SwissProtInfo-ProteinExpression_SP.xls: comprehensive view of the expression regulations of the test data set and the protein product expression annotations - including tissue specificity, developmental stage, and induction - mined from the Swiss-Prot web databank.
- A_SwissProtInfo-ProteinFunction_SP.xls: comprehensive view of the expression regulations of the test data set and the protein product functional annotations - including pathway, function, subcellular location, enzyme regulation, catalytic activity, and cofactor - mined from the Swiss-Prot web databank.
- A_SwissProtInfo-ProteinStructure_SP.xls: comprehensive view of the expression regulations of the test data set and the protein product structure annotations - including domain, similarity, ptm, subunit, and polymorphism - mined from the Swiss-Prot web databank.
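The gene counts quoted in the _MyRegulatedGenes.xls entry above (729 genes in total; 339 induced and 390 repressed; 513 classified genes and 216 ESTs) can be cross-checked with simple arithmetic. The following Python sketch only restates the figures given above and verifies that they add up; the variable names are chosen for illustration.

```python
# Counts quoted in the _MyRegulatedGenes.xls description.
TOTAL_GENES = 729
INDUCED, REPRESSED = 339, 390
CLASSIFIED = {"induced": 221, "repressed": 292}  # classified genes
ESTS = {"induced": 118, "repressed": 98}         # ESTs

classified_total = sum(CLASSIFIED.values())
est_total = sum(ESTS.values())

checks = {
    "induced + repressed = total": INDUCED + REPRESSED == TOTAL_GENES,
    "classified + ESTs = total": classified_total + est_total == TOTAL_GENES,
    "induced split": CLASSIFIED["induced"] + ESTS["induced"] == INDUCED,
    "repressed split": CLASSIFIED["repressed"] + ESTS["repressed"] == REPRESSED,
}
all_consistent = all(checks.values())
```

All four checks hold for the quoted figures, so the breakdown by regulation and by gene class is internally consistent.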
Templates:
(MyWEST\MT_RA-4h\templates\ directory)
- global-GC.pro and global-GC_config.txt: template and corresponding configuration file used for mining the annotations from the GeneCards web databank for the considered test genes.
- global-LL.pro and global-LL_config.txt: template and corresponding configuration file used for mining the annotations from the LocusLink web databank for the considered test genes.
- global-SP.pro and global-SP_config.txt: template and corresponding configuration file used for mining the annotations from the Swiss-Prot web databank for the considered test genes.
- global-SS.pro and global-SS_config.txt: template and corresponding configuration file used for mining the annotations from the SOURCE web databank for the considered test genes.
- global-UG.pro and global-UG_config.txt: template and corresponding configuration file used for mining the annotations from the UniGene web databank for the considered test genes.
© Marco Masseroli, PhD masseroli@biomed.polimi.it - Last update on June 10, 2004 - 17:38:50