First time use considerations (here for a tutorial)
- To use GAAS, a user must be registered and must log in through the "GAAS logon window". To register a user, specify the user name and password in the "Users" table of the MasterDB system database. This can be done in the "Users" section of the MasterDB management window of the Analyzer program, which is opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window. To enable the first use of GAAS, a master user and a demo user have already been inserted in MasterDB. The demo user has the following identifiers:
User Name: demoUser Password: demoUser
The master user should be the administrator of the GAAS system. To have master user privileges, this user must be the first in the list of users in the "Users" table of MasterDB. Unlike the other users, the master user can also remove registered users and public labels, parameter sets, or output structure sets. The registered master user has the following identifiers:
User Name: masterUser Password: masterUser
After the first use, log in as master user and change the master user identifiers in the "Users" section of the MasterDB management window of the Analyzer program, which is opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window.
- All label databases must be registered in the "Labels" table of the MasterDB system database. To facilitate the first use of GAAS, the label databases "LabelFilters.mdb" and "LabelMicro.mdb" of the provided testing data have already been registered in MasterDB.mdb. However, the first time each of these label databases is used in the Assembler or Analyzer program, the user is asked to define the local path of the label database. A user-friendly interface guides the user through this step.
- Both the Assembler and the Analyzer program automatically write an "ini" file (GeneArray*.ini), in which useful information is stored and reloaded every time the program is run. The first time each program is run, its ini file is not yet present, so the system asks the user to locate the MasterDB.mdb file. When each program is closed, the corresponding ini file is written to disk in the same directory as the corresponding GAAS program.
Example of GeneArrayAnalyzer.ini file:
[NAMEPATH_MASTERDB_LOGIN] D:\GAAS\SystemDB\MasterDB.mdb
[LAST-USER] demoUser
[PATH_INPUT] D:\GAAS\Data\Filters
[PATH_OUTPUT] D:\GAAS\Data\Filters
[PATH_LABELS] D:\GAAS\Data\Filters
[LAST-DATAFILE] D:\GAAS\Data\Filters\Filter.mdb
NumExperiments: 6
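For readers who want to inspect such a file outside GAAS, the following is a minimal Python sketch of a reader. It assumes that each setting occupies one line in the form `[KEY] value`, with `NumExperiments: 6` on a line of its own; this layout is inferred from the example above and is not a documented format. The function name `parse_gaas_ini` is hypothetical.

```python
import re

# Assumption: one "[KEY] value" entry per line. This layout is inferred from
# the example file and is not a documented GAAS format.
ENTRY = re.compile(r"^\[(?P<key>[^\]]+)\]\s*(?P<value>.*)$")


def parse_gaas_ini(text):
    """Return a dict of the settings found in GAAS-style ini text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match:
            # Bracketed entry such as "[LAST-USER] demoUser".
            settings[match.group("key")] = match.group("value").strip()
        elif ":" in line:
            # Plain "name: value" entry such as "NumExperiments: 6".
            key, _, value = line.partition(":")
            settings[key.strip()] = value.strip()
    return settings


SAMPLE = r"""[NAMEPATH_MASTERDB_LOGIN] D:\GAAS\SystemDB\MasterDB.mdb
[LAST-USER] demoUser
[PATH_INPUT] D:\GAAS\Data\Filters
[PATH_OUTPUT] D:\GAAS\Data\Filters
[PATH_LABELS] D:\GAAS\Data\Filters
[LAST-DATAFILE] D:\GAAS\Data\Filters\Filter.mdb
NumExperiments: 6
"""

settings = parse_gaas_ini(SAMPLE)
print(settings["LAST-USER"])       # demoUser
print(settings["NumExperiments"])  # 6
```

All values are kept as strings; converting `NumExperiments` to an integer is left to the caller.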
GAAS package use
To use GAAS, every input dataset must be accompanied by a label dataset specifying the identifiers and the location on the spotted array of the clones to be analyzed. In GAAS, the Analyzer program uses a built-in, database-based gene expression data structure to perform fast differential gene expression analyses. Any input data structure in MS-Excel format can be transformed into this built-in data structure, in MS-Access format, using the Assembler program.
To achieve the data structure transformation, the input data structure and its label data structure must be specified in the "InputStructure" and "LabelStructure" tables of the MasterDB database, respectively. This can be done as specified here and here, respectively. To manage input and label data structures, use the "Input Structure" and "Label Structure" sections of the MasterDB management window of the Analyzer program, which is opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window.
Assembler program use (here for a tutorial)
- Start the Assembler program and login through the GAAS logon window.
- In the Assembler main window, the user must click the "Open data file ..." button to select the set of files to convert. The selected files must be in MS-Excel format and have a homogeneous data structure.
- The program automatically matches the input file structure against those stored in MasterDB. If more than one matching structure is found, the user is asked to select one.
- Based on the matched input data structure, the system provides the list (Labels "Name" list) of all label databases registered in MasterDB that are suitable for the input data structure. The user must select one of these label databases. If the selected label database is not located in the directory registered for it in MasterDB, the program automatically asks the user to indicate the new label database location through a user-friendly interface.
If no suitable label database type has previously been registered in MasterDB, the user can select and register a new label database by clicking the "Select Database" button.
- For microarray data that use a label database containing information on all clones available in the clone plate library used to spot the microarrays, the user must click the "Load Plate List" button to select and locate an additional MS-Excel data file. This file contains the list of plates with the clones actually spotted on the specific microarrays that the selected input data files refer to. The structure (name and content of columns) of this plate list file must agree with the structure and type of content of the table in the label database that contains the same information. An example of this type of label database and the corresponding plate list file is provided among the testing data files.
- Clicking the "Process..." button the user can start the pre-processing by defining the name of the MS-Access database to be created, containing the input data in the transformed GAAS structure. If the user stops the pre-processing execution before its end (by clicking the "Stop" button), no output database is stored.
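The structure check and matching steps above can be pictured with a short Python sketch. How the Assembler actually compares structures is not stated in this document; the sketch assumes a case-insensitive comparison of column names in order, and every structure name and column name in it is a hypothetical placeholder.

```python
# All structure and column names below are hypothetical placeholders; they
# are not the contents of the real "InputStructure" table in MasterDB.
REGISTERED_STRUCTURES = {
    "FilterLayoutA": ["Location", "Spot", "Bkgr"],
    "FilterLayoutB": ["Location", "Spot", "Bkgr"],
    "MicroLayout": ["Plate", "Row", "Column", "Spot", "Bkgr"],
}


def _normalize(columns):
    """Trim and lower-case column names so the comparison ignores case."""
    return tuple(name.strip().lower() for name in columns)


def is_homogeneous(headers_per_file):
    """True when all selected files share the same column structure."""
    return len({_normalize(header) for header in headers_per_file}) <= 1


def matching_structures(file_columns, registered):
    """Names of the registered structures whose columns match the file."""
    wanted = _normalize(file_columns)
    return [name for name, cols in registered.items() if _normalize(cols) == wanted]


# Column headers read from two selected files (assumed already extracted).
selected_headers = [["Location", "Spot", "Bkgr"], ["location", "spot", "bkgr "]]

if is_homogeneous(selected_headers):
    matches = matching_structures(selected_headers[0], REGISTERED_STRUCTURES)
    # More than one match corresponds to the case in which the user is asked
    # to select one structure.
    print(matches)  # ['FilterLayoutA', 'FilterLayoutB']
```

An empty result would correspond to the case in which no registered structure fits the selected files.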
Analyzer program use (here for a tutorial)
- Start the Analyzer program and login through the GAAS logon window.
- Select the data file to analyze in the "Input Database" panel on the upper left side of the Analyzer main window. This panel contains the database data files previously generated by the Assembler program, or previously loaded into the Analyzer program. If the Analyzer was already opened when the Assembler generated a database file, to load this database file into the "Input Database" panel select the "MasterDB Reloading" command in the "Tools" menu of the Analyzer main window.
If database data files with a suitable GAAS structure that were not created by the Assembler are available, they can be loaded into the "Input Database" panel by selecting the "Open ..." command in the "Database" menu of the Analyzer main window. In this case an "Open Data Source" window appears, in which the user must specify the name and location of the database data file to load, the type of data it contains (e.g. Filter, Microarray), the data structure of the input data (selected among those stored in MasterDB), and the label database registered in MasterDB that suits the data to load. If no suitable data structure is registered in MasterDB, a new input data structure (and, if needed, its label data structure) can be specified in the "InputStructure" table (and "LabelStructure" table) of the MasterDB database, as described here (and here). Defined input and label structures can be managed through the "Input Structure" (and "Label Structure") section of the MasterDB management window, which is opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window. Likewise, if the specified label database is not in the file system directory registered in MasterDB, the program automatically asks the user to indicate the new label database location through a user-friendly interface. Conversely, if no suitable label database has previously been registered in MasterDB, the user can select and register a new label database by clicking the "Select Database" button in the "Open Data Source" window. The information on input and label databases stored in MasterDB can be managed through the "Info" and "Labels" sections, respectively, of the MasterDB management window.
- When a database data file is selected in the "Input Database" panel, a new window (Input array data type assignment window) appears. In this window, the user must define the correspondence between each data set contained in the selected database data file and the experiment types (Control or Test), and which experimental replica each data set belongs to. The defined correspondence is then automatically stored in the selected database to avoid repeating this step in future analyses. However, it can always be modified.
- Confirming the defined correspondence ("OK" button), the Input array data type assignment window closes and name and status of the single loaded data sets are displayed in the "Tables" panel on the bottom left side of the Analyzer main window.
- The Analyzer program automatically loads the related label database into memory.
- Selecting in the "Tables" panel one of the loaded data sets, the Input data window appears visualizing the loaded array spot expression data.
- Selecting the "Parameter Settings" command in the "Tools" menu of the Analyzer main window opens the Parameter setting window. In this window the user can choose whether to apply background correction and can set values for the analysis parameters (e.g. background and expression level thresholds, experiment regulation confidence levels, replica regulation probability cut-offs). The parameter setup can be stored in MasterDB by clicking the "Save" button in the Parameter setting window. Conversely, a parameter set already stored in MasterDB can be selected and used in the bottom right corner of the Parameter setting window. The parameter sets in MasterDB can be managed through the "Parameters" section of the MasterDB management window, which is opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window.
- Selecting the "Expression Bounds" command in the "Tools" menu of the Analyzer main window the Gene expression bound window can be opened. In it the user can define upper and lower thresholds of expression levels to be used to exclude low (noise affected) or high (saturated) intensity levels.
- Selecting the "Normalization Types" command in the "Tools" menu of the Analyzer main window opens the Normalization option window. In this window the user can select the type of normalization to be applied to the expression data sets before comparing expression values from distinct arrays or experimental conditions. A normalization factor can be computed considering all clone intensities in the array, or a subset of clone intensities (e.g. housekeeping genes or control spikes of heterologous genes). In either case, low quality clones can be excluded by checking the "Valid clones" check box in the Normalization option window. The subset of array clone intensities to be used for data normalization can be chosen at the bottom of the Normalization option window by selecting one of the clone filters stored in the "FilterCloneType" table of the MasterDB database. These clone filters can be built in terms of the clone attributes present in the label database (see here how) and managed through the "Clone Filter" section of the MasterDB management window, which is opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window.
- Selecting one of the commands in the "Analysis" menu of the Analyzer main window, the user can run the data analysis, which is divided into consecutive processing steps:
  - Background: Background quality labels are automatically assigned to exclude array regions altered by stains.
  - Single: Spot quality labels are automatically determined to reject expression values affected by spotting errors. Clone quality labels are computed automatically by matching background and spot quality labels.
    Background correction is performed to eliminate possible topological hybridization differences and to improve the signal (expression intensity level) to noise ratio.
    Expression level normalization is performed to enable comparing expression values from distinct arrays or experimental conditions.
  - Pair: Statistically significant differential expressions in a single experiment are evaluated on the basis of expression intensity ratios (test vs. control condition).
  - Replica: Significant differential expressions across multiple replica experiments are determined.
Alternatively, selecting the "All Analyses" command runs all the previous processing steps in sequence.
When a data analysis is run, the parameter set used is automatically associated with the input database to avoid repeating the setup of analysis parameter values in future analyses of the same input data. However, the parameter set to be used can always be modified.
- Analysis results can be viewed in tabular form in the Background, Single, Pair, and Replica panels, respectively, of the Result interface. Selecting the "Show Columns" command in the "Tools" menu of the Analyzer main window opens the Show column window. In this window each user can customize the data visualization by selecting the columns to be shown in each result data panel.
Analysis results can also be seen in graphical form in the Histogram and Scatter Plot panels of the Result interface. In both graphic panels, clicking on the panel with the right mouse button opens an option menu in which the user can select the following options to customize the graph visualization:
- Variable enables selecting the variable values to display (i.e. Bkgr, Spot, Spot-Bkgr, Clone, Clone-Bkgr, Clone-Bkgr Normalized, and, only for histogram plots, Ratio);
- Data enables selecting the data graphic layout (i.e. Bars or Curve for histograms, and Scatter Points for scatter plots);
- Scale enables selecting the scale of the data graphic layout (i.e. Linear, Logarithmic);
- Classes (for histogram plots only) enables defining the number of classes used to build the histogram;
- Clone Navigation (for scatter plots only) activates or deactivates graphic selection of clones in the scatter plot by mouse picking (i.e. pressing the "n" key while clicking with the left mouse button on the graphical representation of a clone);
- Show/Hide enables showing or hiding the Reference regulation bounds (the lines graphically delimiting the 2-fold bounds of test vs. control gene expressions), the Regulation bounds (the lines identifying the regulation bounds considered during the analysis for the ratio values), the Selected gene (for scatter plots only, the green cross graphically identifying a searched clone), the Mouse Coordinates, and the Grid;
- Zoom enables selecting/removing and applying the zoom tool to magnify interactively the data displayed in the graph;
- Fit enables fitting the graphic view to the Width, Height, and Page dimension of the graphic panel;
- Properties enables managing the graphical General (mainly coordinate ranges), Axis, and Graphs properties;
- View enables displaying the numerical Values of the plotted data.
Two key sequence commands are available:
- pressing the "shift" key and the left mouse button at the same time and dragging the mouse on the graph moves the plot;
- pressing the "z" key and the left mouse button at the same time removes the zoom tool used to magnify interactively the data displayed in the graph.
When the Scatter Plot panel is selected, a Plotbar automatically appears above the Result interface (this bar can also be shown by selecting the "Plotbar" command in the "View" menu of the Analyzer main window). In this bar the user can select the datasets to plot on each axis of the scatter plot. Moreover, this bar can be used to search for a clone in the scatter plot by specifying either the Location (in the array), the Accession number, or the Clone ID of the clone to search. By clicking on the "Go" button in the Plotbar, the searched clone is identified in the scatter plot with a cross. In addition, the rows containing the data of the searched clone are highlighted in the Input data window and in all tabular analysis result panels (Background, Single, Pair, and Replica).
- Analysis results can also be searched and interactively navigated through input data (Input data window), tabular result data (Background, Single, Pair, and Replica panels), and scatter plot (Scatter plot panel) by selecting the "Clone Search" command in the "Tools" menu of the Analyzer main window, which opens the Clone search window. In this window the user can specify either the Location (in the array), the Accession number, or the Clone ID of the clone in the analyzed array whose analysis results the user wants to find. By clicking on the corresponding "Find" button in the Clone search window, the rows with the data of the searched clone are highlighted in the Input data window and in all tabular analysis result panels (Background, Single, Pair, and Replica). If the Scatter plot panel is active, the searched clone is identified with a green cross.
- Analysis results can be stored in an output database by selecting the "Save" command in the "Database" menu of the Analyzer main window and defining the name of the MS-Access database to be created. The output data structure can be customized by the user (through the Saving Options window that can be opened by selecting the "Saving Options" command in the "Database" menu of the Analyzer main window) and stored in MasterDB by clicking the "Save" button in the Saving Options window. Conversely, an output data structure already stored in MasterDB can be used and selected in the bottom right corner of the Saving Options window. The output data structures in MasterDB can be managed through the "Output Structure" section in the MasterDB management window that can be opened by selecting the "MasterDB Management" command in the "Tools" menu of the Analyzer main window.
When analysis results are saved, the used output data structure is associated to the input database to avoid repeating the same output data structure customization in future analyses of the same input data. However, the output structure to be used can be always modified.
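To make the processing steps in this section more concrete, the following Python sketch chains background subtraction, expression bounds, normalization, and a test vs. control ratio compared against 2-fold reference bounds. It is only an illustration under stated assumptions: this document does not give the formulas GAAS uses, so spot-minus-background correction, scaling by the mean intensity, and the simple fold test are stand-ins, and the statistical significance evaluation of the Pair and Replica steps is not reproduced. All function names and data values are hypothetical.

```python
from statistics import mean


def background_corrected(spot, bkgr):
    """Subtract background from spot intensity, clamping at zero (assumed)."""
    return [max(s - b, 0) for s, b in zip(spot, bkgr)]


def normalization_factor(values, valid, reference=None):
    """Return 1 / mean intensity over the valid clones.

    `reference` optionally restricts the computation to a subset of clone
    indices (e.g. housekeeping genes). Mean scaling is an assumption.
    """
    indices = range(len(values)) if reference is None else reference
    chosen = [values[i] for i in indices if valid[i]]
    if not chosen:
        raise ValueError("no valid clones available for normalization")
    return 1.0 / mean(chosen)


def classify_pair(control, test, lower, upper, reference=None, fold=2.0):
    """Label each clone as up, down, unchanged, or excluded."""
    if lower <= 0:
        raise ValueError("lower bound must be positive to keep ratios defined")
    # Expression bounds: drop low (noise affected) or high (saturated) values.
    valid = [lower <= c <= upper and lower <= t <= upper
             for c, t in zip(control, test)]
    f_control = normalization_factor(control, valid, reference)
    f_test = normalization_factor(test, valid, reference)
    labels = []
    for c, t, ok in zip(control, test, valid):
        if not ok:
            labels.append("excluded")
            continue
        ratio = (t * f_test) / (c * f_control)
        if ratio >= fold:
            labels.append("up")
        elif ratio <= 1.0 / fold:
            labels.append("down")
        else:
            labels.append("unchanged")
    return labels


# Hypothetical intensities for five clones on a control and a test array.
control = background_corrected([120, 220, 520, 1020, 60000], [20] * 5)
test = background_corrected([420, 220, 145, 1020, 61000], [20] * 5)

labels = classify_pair(control, test, lower=50, upper=50000)
print(labels)  # ['up', 'unchanged', 'down', 'unchanged', 'excluded']
```

Passing `reference=[1, 3]` would compute the normalization factors from those two clones only, mirroring the option of normalizing on a subset of clones rather than on the whole array.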
© Marco Masseroli, PhD - masseroli@elet.polimi.it