Download

GMQL software is freely available. Downloads for GNU/Linux systems:


Please refer to:


GMQL Services installation

To install the GMQL Web Services, please do the following, in order:



The provided GMQL packages (both v.1.0.1 and v.1.2) include four example GMQL queries that address typical biological use cases:

  1. Finding ChIP-seq peaks in promoter regions
  2. Finding distal bindings in transcription regulatory regions
  3. Associating transcriptomics and epigenomics
  4. Finding somatic mutations in exons

The packages also include a few small-scale datasets with data and metadata from the ENCODE and TCGA projects; we provide them only to test the examples and to demonstrate the power and flexibility of GMQL across a rich set of biological use cases.

Note that GMQL is designed for cloud-based processing of big data in the Hadoop framework (i.e. when used in MapReduce mode).
Its strengths show in particular when it is applied to numerous data samples with many genomic regions and multiple data types, in order to identify the genomic regions that satisfy given distance constraints.

GMQL can also be used with small data and on non-parallel computing frameworks (i.e. when used in Local mode); in these cases other available tools may run much faster, but they fail on massive data.


Run examples in two clicks

The packages include two shortcut commands that let even first-time GMQL users quickly run the provided examples in Local mode and see their results.

First click: After installing GMQL (see GMQL Quick Start) and before running the four provided examples, create the datasets used as input in the examples from the data in the folder GMQLPackage/EXAMPLES/data/. To do so, execute:
./GMQLPackage/EXAMPLES/createInputDataSets.sh
This makes the following four input datasets available in your GMQL user account:

HG19_ANN
HG19_MUT
HG19_PEAK
HG19_RNA

Second click: To run all four example GMQL queries together, execute:
./GMQLPackage/EXAMPLES/runScriptExamples.sh
After the execution finishes (a few seconds), all generated result datasets are shown in the printout:

PROM_HM_TF
TF_res
Genome_space
Exon_res
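
The two clicks above can also be chained in a small wrapper that stops at the first failure. This is only a minimal sketch: the variable GMQL_PKG and the function run_examples are hypothetical names introduced here, and the sketch assumes the package was unpacked in the current directory, as the commands above do.

```shell
#!/bin/sh
# Sketch: run the two "clicks" in order, stopping at the first failure.
# GMQL_PKG is a hypothetical variable (not part of GMQL) naming the folder
# where GMQLPackage was unpacked; it defaults to the path used above.
GMQL_PKG="${GMQL_PKG:-./GMQLPackage}"

run_examples() {
    for script in createInputDataSets.sh runScriptExamples.sh; do
        path="$GMQL_PKG/EXAMPLES/$script"
        # Refuse to continue if a script is missing or not executable.
        if [ ! -x "$path" ]; then
            echo "not found or not executable: $path" >&2
            return 1
        fi
        # Run the script; abort if it reports an error.
        "$path" || return 1
    done
}

run_examples || echo "examples did not complete" >&2
```

Running the input-creation step first matters because the example queries read the four HG19_* datasets it creates.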

Each GMQL example materializes only one dataset, so in total four output datasets, one per example, are generated in the GMQL repository. From there, their data files can be extracted and placed in a local user folder for use outside GMQL by executing the following command (see Section 1.5 of the GMQL Tutorial):
repositoryManagerV1 CopyDSToLocal <DatasetName> <DestinationLocalFolder>
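
As an illustration, the command above can be applied to all four result datasets in a loop. The following is a sketch under stated assumptions, not part of the GMQL package: it assumes repositoryManagerV1 is on the PATH, and DEST, DRY_RUN and copy_results are hypothetical names introduced here. Using one destination subfolder per dataset is also a choice made for this example; whether CopyDSToLocal needs the destination folder to exist beforehand is not stated here, so the sketch creates it.

```shell
#!/bin/sh
# Sketch: copy the four example result datasets out of the GMQL repository.
# Assumptions (not from the GMQL documentation): repositoryManagerV1 is on
# PATH; DEST and DRY_RUN are hypothetical variables used only in this example.
DEST="${DEST:-./gmql_results}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

copy_results() {
    for ds in PROM_HM_TF TF_res Genome_space Exon_res; do
        if [ "$DRY_RUN" = "1" ]; then
            # Show what would be executed without touching the repository.
            echo "repositoryManagerV1 CopyDSToLocal $ds $DEST/$ds"
        else
            # One destination subfolder per dataset (a choice made here).
            mkdir -p "$DEST/$ds" || return 1
            repositoryManagerV1 CopyDSToLocal "$ds" "$DEST/$ds" || return 1
        fi
    done
}

copy_results
```

By default the sketch only prints the four commands; set DRY_RUN=0 to execute them.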

Thanks to the standard data formats used, both input data samples and generated results can be loaded directly into a genome browser (e.g. UCSC Genome Browser, Integrated Genome Browser (IGB), or Integrative Genomics Viewer (IGV)) for easy visualization, browsing and evaluation.