This page explains how to use the read-only Web API, which is designed both to provide complete insight into the Genomic and Proteomic Knowledge Base (GPKB) and to allow its comprehensive querying via HTTP requests.
A Web application for the GPKB is also available at the GPKB Web site.
The API is made up of two distinct parts: the metadata API and the public data API.
Multiple databases are available to the user on both parts of the API. The usage of the different databases is described in the Database Selection section.
The data warehouse has a complex structure that changes over time in order to preserve the information of the original data sources and to extract the data efficiently.
The REST services are divided into two sub-sections in order to support the data warehouse.
The first service (metadata API) provides the structural information of the selected data warehouse version. The structure of the features and of their annotations is retrieved through this service, so each client should always start using the API by calling it.
The second service (public API) provides the annotations that are imported into the data warehouse. Clients retrieve the data from this service by filling in the XML structure defined by the metadata API.
The API is available to the user over different databases. In order to get the list and the description of the databases, the user should call the resource below:
/GPKB-REST/rest/resources/registered-databases
The result contains for each database:
In order to call the API for a specific database, the user calls the URI with the parameter "db-handle".
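As a rough illustration of database selection, the sketch below builds the URI of the registered-databases resource and appends a "db-handle" to a request. The host name is hypothetical, the handle value is a placeholder, and passing "db-handle" as a query-string parameter is an assumption, since the text does not say how the parameter is transmitted.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical host: replace with the server that actually exposes the API.
BASE_URL = "http://localhost:8080"
REGISTERED_DATABASES = "/GPKB-REST/rest/resources/registered-databases"


def with_db_handle(path, db_handle=None, base_url=BASE_URL):
    """Return the absolute URI for `path`, optionally selecting a database.

    Assumption: "db-handle" is sent as a query-string parameter.
    """
    url = urljoin(base_url, path)
    if db_handle is None:
        return url
    return url + "?" + urlencode({"db-handle": db_handle})


# URI listing the available databases.
print(with_db_handle(REGISTERED_DATABASES))
# Same resource, addressed to a database whose handle is a placeholder here.
print(with_db_handle(REGISTERED_DATABASES, "example_handle"))
```

A client would typically fetch the first URI once, pick a handle from the response, and reuse it on every later call.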
The XSD definition of the API:
Database XSD, database XSD documentation and root element registered_databases.
The metadata API first exposes the list of all the features contained in the GPKB: genes, proteins, DNA sequences and so on. For each feature, it is then possible to get the metadata details of the selected feature, namely the list and structure of the tables containing the information about the feature, and the associations with other features in which it is involved.
Content negotiation: XML, Application-XML.
Parameter | Description |
---|---|
feature_name | Name of the feature of interest. E.g.: gene, dna_sequence, enzyme and so on. |
Returned tag | Description |
---|---|
attribute_groups | The list of all the tables regarding the feature |
values | This tag, inside the attribute tag of each attribute group, is non-empty only for encoded fields, namely fields that can assume a predefined range of values. For encoded fields with a small range of values, the values are contained directly in value tags. Otherwise, if the range of values is too broad, a link is provided that can be queried to get further details about the values of the specific encoded field. |
value | This tag, inside the values tag, contains a hyperlink to the full list of values if the number of values exceeds a certain threshold; otherwise it contains the list of values for the specific encoded field. An encoded field value has the following structure: an id that identifies the encoded field value, a name that represents the actual value of the field, and a count that represents the total number of elements assuming that specific value. Both id and name can be used, alternatively or together, in the public data queries to specify filter options for the encoded field. |
associated_features | Contains the list of associations in which the feature is involved. For each association, the association_basic_info tag holds some basic information about the association; in particular, its name tag contains the name of the actual table that stores the information about the association in the data warehouse. |
source | Contains the name of the source(s) that provide the information about the feature. |
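To make the id/name/count structure of an encoded field value more concrete, here is a minimal parsing sketch. The sample fragment is entirely hypothetical: the tag names values and value come from the table above, but rendering id, name and count as child elements (rather than XML attributes), and the sample data itself, are assumptions. The case where value holds a hyperlink instead of a list is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the real nesting and data may differ.
SAMPLE_ATTRIBUTE = """
<attribute>
  <values>
    <value><id>1</id><name>example_value_a</name><count>120</count></value>
    <value><id>2</id><name>example_value_b</name><count>95</count></value>
  </values>
</attribute>
""".strip()


def encoded_values(attribute_xml):
    """Collect (id, name, count) for every value tag found in the fragment."""
    root = ET.fromstring(attribute_xml)
    found = []
    for value in root.iter("value"):
        found.append(
            {
                "id": value.findtext("id"),
                "name": value.findtext("name"),
                "count": int(value.findtext("count", default="0")),
            }
        )
    return found


for entry in encoded_values(SAMPLE_ATTRIBUTE):
    print(entry["id"], entry["name"], entry["count"])
```

Either the id or the name collected this way could then be used as a filter option in a public data query.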
Parameter | Description |
---|---|
feature_name | Name of the feature of interest. E.g.: gene, dna_sequence, enzyme and so on. |
feature_name | The second path parameter represents the name of the first feature involved in the association. |
associated_feature_name | Name of the second feature involved in the association. |
Returned tag | Description |
---|---|
attribute_groups | Inside the association portion, it represents the list of all the tables regarding the feature association |
values | This tag, inside the attribute tag of each attribute group, is non-empty only for encoded fields, namely fields that can assume a predefined range of values. For encoded fields with a small range of values, the values are contained directly in value tags. Otherwise, if the range of values is too broad, a link is provided that can be queried to get further details about the values of the specific encoded field. |
value | This tag, inside the values tag, contains a hyperlink to the full list of values if the number of values exceeds a certain threshold; otherwise it contains the list of values for the specific encoded field. An encoded field value has the following structure: an id that identifies the encoded field value, a name that represents the actual value of the field, and a count that represents the total number of .... Both id and name can be used in the public data queries to specify filter options for the encoded field. |
associated_features | Contains the list of associations in which the feature is involved. For each association, the association_basic_info tag holds some basic information about the association; in particular, its name tag contains the name of the actual table that stores the information about the association in the data warehouse. |
source | Contains the name of the source(s) that provide the information about the feature association. |
Parameter | Description |
---|---|
table_name | The name of the table that contains the field. |
encoded_field_name | The name of the encoded field. |
Returned tag | Description |
---|---|
link_plain_search | Link to query in order to discover, sequentially, all the values of the specific field. To discover the values it is necessary and sufficient to specify an offset (the starting point) and a limit on the number of values returned. E.g.: a request to /GPKB-REST/rest/resources/features/encoded/expasy_enzyme/cofactor/limit/10/offset/0 will return the first 10 values for the encoded field "cofactor" of table "expasy_enzyme". |
search_string_links | Links to query in order to discover, by means of a search string, all the values of the specific field. To discover the values it is sufficient to specify: the search string, in place of the {search_string} placeholder, for the desired position (startsWith, contains or endsWith), an offset (the starting point) and a limit on the number of values returned. E.g.: a request to /GPKB-REST/rest/resources/features/encoded/expasy_enzyme/cofactor/search/startsWith/b/limit/10/offset/0 or its equivalent /GPKB-REST/rest/resources/features/encoded/expasy_enzyme/cofactor/search/startsWith/B/limit/10/offset/0 will return the first 10 values of the "cofactor" field of table "expasy_enzyme" that start with the letter b/B. |
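The two link patterns above can be assembled mechanically. The helper functions below are a sketch, not part of the API: their names are made up for illustration, and the path layout is taken only from the example URIs in the table.

```python
from urllib.parse import quote

ENCODED_ROOT = "/GPKB-REST/rest/resources/features/encoded"
SEARCH_POSITIONS = ("startsWith", "contains", "endsWith")


def plain_search_path(table, field, limit=10, offset=0):
    """Path listing the values of an encoded field sequentially."""
    return (
        f"{ENCODED_ROOT}/{quote(table, safe='')}/{quote(field, safe='')}"
        f"/limit/{int(limit)}/offset/{int(offset)}"
    )


def string_search_path(table, field, position, text, limit=10, offset=0):
    """Path listing the values of an encoded field that match a search string."""
    if position not in SEARCH_POSITIONS:
        raise ValueError(f"position must be one of {SEARCH_POSITIONS}")
    return (
        f"{ENCODED_ROOT}/{quote(table, safe='')}/{quote(field, safe='')}"
        f"/search/{position}/{quote(text, safe='')}"
        f"/limit/{int(limit)}/offset/{int(offset)}"
    )


# Reproduces the two example requests from the table.
print(plain_search_path("expasy_enzyme", "cofactor", limit=10, offset=0))
print(string_search_path("expasy_enzyme", "cofactor", "startsWith", "b"))
```

Percent-encoding the path segments is a precaution for search strings containing spaces or slashes; whether the service expects that encoding is not stated in the source.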
The public data API, by means of a single parametric service, allows for the comprehensive querying of the data warehouse.
Content negotiation: XML, Application-XML.
The resource URI for the public API is /GPKB-REST/rest/resources/selections. All the selections are to be posted in the request payload, and the posted XML has to conform to the following XML Schema grammar:
Download public XSD, public XSD documentation and root element feature_details.
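As an illustration of posting a selection, the sketch below prepares (but does not send) an HTTP POST request with Python's standard library. The host is hypothetical, and the application/xml content type is an assumption based on the content negotiation note above; the payload must be a document that validates against the public XSD, which is not reproduced here.

```python
from urllib.request import Request

SELECTIONS_PATH = "/GPKB-REST/rest/resources/selections"


def selection_request(payload_xml, base_url="http://localhost:8080"):
    """Prepare a POST request carrying the selection XML in its body.

    `base_url` is a hypothetical host; the media type is an assumption.
    """
    return Request(
        base_url + SELECTIONS_PATH,
        data=payload_xml.encode("utf-8"),
        headers={
            "Content-Type": "application/xml",
            "Accept": "application/xml",
        },
        method="POST",
    )


# The payload is supplied by the caller; an empty root element is used here
# only to show the call, and would not be a meaningful query.
request = selection_request("<feature_details/>")
print(request.get_method(), request.full_url)
```

The prepared request could then be sent with `urllib.request.urlopen(request)` and the XML response read from the returned object.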
The following table describes only the most important tags. The explanation is omitted for the intuitive ones.
Xml Schema tag | Description |
---|---|
feature | By adding this element to the selections and specifying the proper feature name (in the name attribute), the user chooses the specific feature(s) to include in the query. For each feature the user can then choose the tables to query (attribute_groups element), and for each table the attributes of interest (attribute element). Finally, the user can also specify filter options on these attributes, using the options available in the selected_options element. Furthermore, in queries involving different instances of the same feature (the so-called "alias queries"), the user has to specify an alias for each feature involved, so as to distinguish among the different instances of features having the same name. The alias name has to conform to the following syntax: "[feature_name]_[counter]". |
feature_association | By adding this element to the selections and specifying the proper feature association name (in the name attribute), the user chooses the specific feature association(s) to include in the query. For each association the user can then choose the tables to query (attribute_groups element), and for each table the attributes of interest (attribute element). Finally, the user can also specify filter options on these attributes, using the options available in the selected_options element. Furthermore, in queries involving different instances of the same feature association (the so-called "alias queries"), the user has to specify an alias for each feature association involved, so as to distinguish among the different instances of associations having the same name. The alias name has to conform to the following syntax: "[feature1_name]_[counter]TO[feature2_name]_[counter]" or "[feature1_name]_[counter]to[feature2_name]_[counter]". |
queries_general_options | This element specifies general options for the query, e.g. DISTINCT, LIMIT, OFFSET, ORDER BY. The only_matching element, when set to "TRUE", makes the query use only INNER JOINs between all the selected tables. The counting element chooses between an exact and an estimated total count of the rows in the query's result set. However, for counting queries whose cost is higher than a predefined threshold, the service returns an approximate count even if the user chose an exact one, so as not to degrade the response time. |
alias_general_options | This element specifies advanced options for joining tables in alias queries. In particular, by choosing the join tables and join attributes of interest, it is possible to join tables that are not directly joined with each other, by adding WHERE clauses to the generated SQL statement. |
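The alias syntax described in the table can be checked on the client side before a selection is posted. The following is a sketch only: the regular expressions are one reading of the "[feature_name]_[counter]" and "[feature1_name]_[counter]TO[feature2_name]_[counter]" patterns, assuming feature names consist of letters, digits and underscores and that the counter is a non-negative integer. The service itself may accept or reject other forms.

```python
import re

# One possible reading of the alias syntax (assumption, see the lead-in).
FEATURE_ALIAS = re.compile(r"(?P<feature>[A-Za-z]\w*)_(?P<counter>\d+)")
ASSOCIATION_ALIAS = re.compile(
    r"(?P<f1>[A-Za-z]\w*)_(?P<c1>\d+)(?:TO|to)(?P<f2>[A-Za-z]\w*)_(?P<c2>\d+)"
)


def parse_feature_alias(alias):
    """Split a feature alias into (feature_name, counter)."""
    match = FEATURE_ALIAS.fullmatch(alias)
    if match is None:
        raise ValueError(f"not a feature alias: {alias!r}")
    return match["feature"], int(match["counter"])


def parse_association_alias(alias):
    """Split an association alias into (feature1, counter1, feature2, counter2)."""
    match = ASSOCIATION_ALIAS.fullmatch(alias)
    if match is None:
        raise ValueError(f"not an association alias: {alias!r}")
    return match["f1"], int(match["c1"]), match["f2"], int(match["c2"])


print(parse_feature_alias("dna_sequence_2"))
print(parse_association_alias("gene_1TOprotein_2"))
```

These functions check syntax only; they do not verify that the feature names exist in the selected database.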
The XML returned by the service, containing the results of the query together with some metadata about the results, conforms to the following XML Schema grammar:
Download result XSD, schema documentation and root element result.
The following table describes only the most important tags. The explanation is omitted for the intuitive ones.
Xml Schema tag | Description |
---|---|
entries_total_count | This element holds the total number of rows matched by the query. N.B. Unless requested otherwise, the count returned is exact. There are cases, however, in which the service returns an approximate count even if the user selected an exact one, namely when the exact total count query would exceed a predefined response time threshold. In that case the value is extracted from the information returned by an SQL EXPLAIN query. |
entries_showed_count | This element holds the number of rows actually returned by the query, according to the limit option specified at selection time. |
attributes_group_names_list | This element contains the metadata about the results returned, namely the selected columns ("attribute" tag) and the names or aliases of the tables they belong to. |
rows_group | This element contains the collection of rows that represent the actual result of the query. In the result set, each row carries its row number and the list of tables involved, each of which contains the result data of the query. |
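The two count elements are enough to decide whether more rows remain to be fetched. The sketch below reads them from a result document; the sample XML is hypothetical and kept to the two count tags under the result root element, and the lookup searches at any depth so that it does not depend on the real nesting. Treating OFFSET as the number of rows skipped is an assumption, and since the total may be approximate, the check is only a hint.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed response: only the two count tags are shown.
SAMPLE_RESULT = """<result>
  <entries_total_count>250</entries_total_count>
  <entries_showed_count>10</entries_showed_count>
</result>"""


def result_counts(result_xml):
    """Return (total_count, showed_count) read from a result document."""
    root = ET.fromstring(result_xml)
    total = root.findtext(".//entries_total_count")
    showed = root.findtext(".//entries_showed_count")
    if total is None or showed is None:
        raise ValueError("count elements not found in the result document")
    return int(total), int(showed)


def has_more_rows(total, showed, offset):
    """True if rows remain after this page (assumes offset counts rows skipped)."""
    return offset + showed < total


total, showed = result_counts(SAMPLE_RESULT)
print(total, showed, has_more_rows(total, showed, offset=0))
```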