GMQL System

The GenoMetric Query Language (GMQL) system is a novel high-throughput software system for multi-sample integration and processing of large, heterogeneous datasets of genomic features and known annotations from Next Generation Sequencing.

Designed to meet efficiency, flexibility and usability requirements, it operates on large amounts of heterogeneous genomic region data in their usual formats. It can compare billions of genomic regions, mainly on the basis of metric properties (such as the distance between regions), but also on arbitrary region attributes and metadata content, with enhanced accessibility, portability, scalability and performance.
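To make the idea of comparing regions by metric properties concrete, the following is a minimal Python sketch, not GMQL code. The `Region` type, the 0-based half-open coordinate convention, and the `distance` and `within` helpers are all assumptions introduced for illustration; GMQL's own operators and their exact distance semantics are described in its operator documentation.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    """Hypothetical genomic region; coordinates assumed 0-based, half-open."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def distance(a: Region, b: Region) -> Optional[int]:
    """Number of bases between the closest ends of two regions.

    Assumed convention for this sketch: adjacent regions have distance 0,
    overlapping regions have a negative distance, and regions on different
    chromosomes have no defined distance (None).
    """
    if a.chrom != b.chrom:
        return None
    return max(a.start, b.start) - min(a.end, b.end)


def within(a: Region, b: Region, max_dist: int) -> bool:
    """True if both regions share a chromosome and lie within max_dist bases."""
    d = distance(a, b)
    return d is not None and d <= max_dist


if __name__ == "__main__":
    peak = Region("chr1", 100, 200)
    gene = Region("chr1", 250, 300)
    print(distance(peak, gene))      # gap between the two regions
    print(within(peak, gene, 100))   # proximity test on that gap
```

A query that selects, for example, peaks near annotated genes amounts to applying a predicate like `within` across very many region pairs, which is why a dedicated, scalable system is needed at the scale of billions of regions.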

Together, its implemented features (see About) make the GMQL system an easy-to-use, versatile and interoperable tool for processing big data of heterogeneous genomic features in order to extract candidate targets for biomedical knowledge discovery. The many biological use case examples provided, based on public ENCODE, Roadmap Epigenomics and TCGA datasets, demonstrate the relevance of the GMQL system.

A public installation of the GMQL system, available at http://www.gmql.eu/, runs on a cluster at CINECA, where it can be freely tested through its Web and REST interfaces. The user manual is available here.

Full documentation of all GMQL operators and examples of their use are available here.
GMQL System source code and documentation are available here.


The former version of GMQL is available here.


The GMQL System is supported by the Data-Driven Genomic Computing (GeCo) project, funded by the European Research Council (ERC) (Advanced ERC Grant 693174).