The 3rd International Workshop
on Conceptual Modeling for Life Sciences


17 October, 2022 in Hyderabad, India

Online workshop


In conjunction with the 41st International Conference on Conceptual Modeling (ER 2022)

17-20 October, 2022

About

Recent advances in unraveling the secrets of human conditions and diseases have encouraged new paradigms for their prevention, diagnosis, and treatment. The amount of available information is increasing at an unprecedented rate, which directly impacts the design and future development of information and data management pipelines; thus, new ways of processing data, information, and knowledge in health care environments are strongly needed.
The third edition of the workshop aims to continue being a meeting point for Information Systems (IS), Conceptual Modeling (CM), and Data Management (DM) researchers working on health care and life science problems. It is also an opportunity to share, discuss, and find new approaches to improve promising fields, with a special focus on Genomic Data Management - how to use the information from the genome to better understand biological and clinical features - and Precision Medicine - giving each patient an individualized treatment by understanding the specific aspects of the disease.
From the precise ontological characterization of the components involved in complex biological systems to the modeling of the operational processes and decision support methods used in the diagnosis and prevention of diseases, the joint research communities of IS, CM, and DM have an important role to play; they must help in providing feasible solutions for high-quality and efficient health care.
The COVID-19 pandemic has attracted increasing attention to the genetic mechanisms of humans and viruses. CMLS may become an additional forum for discussing the responsibility of the conceptual modeling community in supporting the life sciences related to this new reality. This year – more than in the previous editions – we aim to welcome topics that have become of particular interest for the general ER community: conceptual modeling for big data analytics and AI-driven systems, particularly beneficial to life sciences disciplines.

Topics of interest

The third edition of the workshop focuses on Conceptual Modeling as a means of facing the challenges that emerge when designing and developing systems for life sciences, with an emphasis on genomics and precision medicine. The workshop is not restricted to specific research methods; we will consider both conceptual and empirical research, as well as novel applications.

The topics of interest include, but are not limited to:

  • Conceptual modeling for genomics
  • Modeling of complex biological systems and of health ecosystems
  • Information systems for healthcare, genomics, or precision medicine
  • Design, implementation, and evaluation of health information systems
  • Electronic/digital health information systems
  • Life science-related domain-specific modeling languages
  • Data management and integration for genomics and biology
  • Ontologies and workflows for life sciences
  • Clinical and biological data interoperability
  • Interoperability of health information systems
  • Knowledge representation for genetics
  • Business process modeling for genetic/clinical diagnosis
  • Conceptual model-driven big data analytics for genomics, clinical diagnosis or biological problems
  • Conceptual models for data-driven AI systems in life sciences
  • Models for digital transformation of healthcare systems

As we wish to stimulate more discussion in the ER community regarding the use of models for life sciences, we also welcome “discussion papers”, particularly related to the following topics:

  • Conceptual models in life sciences: from theory to practice
  • Models to facilitate multidisciplinary exchange in healthcare contexts

Accepted papers

The list of accepted papers for CMLS 2022:

  • Ana Xavier Fernandes, Filipa Ferreira, Ana León and Maribel Yasmina Santos. Towards a Model-driven Approach for Big Data Analytics in the Genomics Field
  • Mireia Costa, Alberto García S. and Oscar Pastor. Conceptual Modeling-based Cardiopathies Data Management
  • Francisco Manuel García Moreno, Maria Bermudez-Edo, José Manuel Perez-Marmol, Jose Luis Garrido and Rodríguez Fórtiz María José. A Conceptual Model of Health Monitoring Systems Centred on ADLs Performance in Elderly People
  • Mireia Costa, Alberto García S. and Oscar Pastor. A Comparative Analysis of the completeness and concordance of data sources with cancer-associated information
  • Pietro Cinaglia and Mario Cannataro. A Flexible Automated Pipeline Engine for Transcript-level Quantification from RNA-seq

Paper submission guidelines

We invite submissions of high-quality papers describing original and unpublished results regarding any of the workshop’s topics of interest.

CMLS 2022 proceedings will be part of the ER 2022 Workshop volume published by Springer in the LNCS series. The authors must submit manuscripts using the Springer-Verlag LNCS style for Lecture Notes in Computer Science. For style files and details, see the page http://www.springer.de/comp/lncs/authors.html. The page limit for workshop papers is 10 pages. Papers must be submitted as PDF files using EasyChair at https://easychair.org/conferences/?conf=cmls2022.

To ensure high quality, all papers will be thoroughly peer reviewed by the Program Committee. Manuscripts that are not submitted in the LNCS style or that exceed 10 pages will not be reviewed and will thus be automatically rejected. The papers must be original and must not be submitted to or accepted for publication in any other workshop, conference, or journal. Submission to CMLS 2022 is electronic only.

Post-conference publication

The papers accepted to the latest edition of the workshop, CMLS 2022, will be published within the Springer volume, following the tradition of the ER conference joint events (see last year's Advances in Conceptual Modeling 2021).
In addition, their authors will be invited to submit a revised and extended version for a post-conference supplement in the journal BMC Bioinformatics or a similar venue (e.g., BMC Medical Informatics and Decision Making), depending on the topic.

The authors of accepted papers interested in submitting an extended article to a BMC supplement will need to fill in, sign, scan, and submit the supplement submission letter of interest. This document is important because it tells us how many extended articles to expect.
The authors should consult the BMC Bioinformatics guidelines and the BMC Medical Informatics and Decision Making guidelines. There are no page limits for the BMC journals. The articles can be submitted on the BMC Supplements website.
The submitted extended articles will go through two review phases. In the first phase, the organizers of the CMLS 2022 workshop will serve as guest editors. Additional colleagues might be involved as guest editors in case of a high number of submissions. At the end of the first review phase, the guest editors will recommend each submitted extended article for acceptance or rejection and transfer it to the editor-in-chief of the relevant BMC journal, who will take care of the second review phase.
The BMC editor-in-chief can decide to confirm the guest editors' recommendation or to start a new review phase (with new reviewers invited, new reviews, new requests, etc.). The BMC editor-in-chief will eventually make the final decision on the acceptance or rejection of each extended manuscript.
Please note that, even if the guest editors recommend acceptance of a specific paper, the BMC editor-in-chief can still decide to reject it (this is quite rare, but it can happen).
Deadline. Submission of extended articles for the BMC supplements: 20 December 2022.

Important dates

  • Paper submission: July 4th, 2022 (extended from June 15th, 2022)
  • Notification: August 12th, 2022 (previously July 14th, 2022)
  • Camera-ready version: August 26th, 2022 (firm deadline; previously July 31st, 2022)
  • CMLS/EmpER online workshop date: October 17, 2022
  • ER online conference dates: October 17-20, 2022
  • Extended article submission to BMC Bioinformatics supplements: December 20th, 2022

Organizers

Anna Bernasconi, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)
Politecnico di Milano, Italy
anna.bernasconi@polimi.it

Anna Bernasconi works as a researcher at Politecnico di Milano, within the “Data-driven Genomic Computing” ERC Awarded project (2016-2021), under the supervision of Professor Stefano Ceri. In 2015 she obtained a Master of Science in Computer Engineering from Politecnico di Milano and a Master of Science in Computer Science from University of Illinois at Chicago with a thesis on Formal Methods. Her research is on bioinformatics data and metadata integration methodologies to support complex biological query answering. Her main areas of expertise include conceptual data design, data integration, data cleaning, the semantic web, and data analysis; she is passionate about the formalization of models and methods.



Arif Canakoglu, Dipartimento di Anestesia, Rianimazione ed Emergenza-Urgenza,
Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, Milan, Italy
arif.canakoglu@policlinico.mi.it

Arif Canakoglu currently works as a data scientist at Policlinico di Milano, mainly on the electronic health records of intensive care unit patients in the Lombardy region. With the support of the medical group, he leads research analyzing the quality of life of patients after hospital discharge. Previously, he was involved in the "Data-driven Genomic Computing" ERC Awarded project (2016-2021), where he contributed to the integration of heterogeneous genomic data and to the development of computational methods for genomic applications. In 2016, he received his PhD with work on biomolecular knowledge data integration using a modular schema data warehouse. His research interests include data integration and data-driven genomic computing, big data analysis and processing on cloud computing, and artificial intelligence applications. His main areas of expertise are heterogeneous data integration, data-driven models and machine learning approaches in genomics, and big data processing, especially on cloud computing.



Ana León Palacio, Research Center on Software Production Methods (PROS)
Universitat Politècnica de València, Spain
aleon@pros.upv.es

Ana León, PhD in Computer Science (2019, Universitat Politècnica de València), is also a University Expert in Medical Genetics and Genomics from the Universidad Católica de Murcia. Her main research topics include Conceptual Modeling, Genomic Data Science, Explainable AI, Data Quality, and Information Systems. Currently, she is a researcher at the Research Center on Software Production Methods (PROS-UPV), where her research activity is focused on the use of conceptual models for the development of Genomic Information Systems, as well as the definition of a systematic process for the search, identification, load, and exploitation of DNA variants in the context of Precision Medicine.



José Fabián Reyes Román, Research Center on Software Production Methods (PROS)
Universitat Politècnica de València, Spain
jreyes@pros.upv.es

José F. Reyes R. is a researcher at the PROS Research Center at Universitat Politècnica de València (Spain). He holds a Ph.D. in Computer Sciences (2018) from Universitat Politècnica de València (UPV, Spain), an MSc in Software Engineering, Formal Methods and Information Systems (2013) from UPV (Spain), a Diplomate of Analysts and Systems Designers (2011), and a University Degree in System Engineering (2010) from Universidad Central del Este (Dominican Republic). Currently, his main research activities are centered on the use of Conceptual Models for the development of Genomic Information Systems (GeIS). His main research interests include Conceptual Modeling, Genomic Data Science, Requirements Engineering, SE, and Information Systems.

Program Committee

  • Giuseppe Agapito, Magna Graecia University, Italy
  • Samuele Bovo, University of Bologna, Italy
  • Bernardo Breve, Università degli Studi di Salerno, Italy
  • Mario Cannataro, Magna Graecia University, Italy
  • Stefano Cirillo, Università degli Studi di Salerno, Italy
  • Johann Eder, University of Klagenfurt, Austria
  • Jose Luis Garrido, University of Granada, Spain
  • Giancarlo Guizzardi, University of Twente, Netherlands
  • Khanh N.Q. Le, Taipei Medical University, Taiwan
  • Sergio Lifschitz, Pontifical Catholic University of Rio de Janeiro, Brazil
  • Paolo Missier, Newcastle University, United Kingdom
  • José Palazzo, Federal University of Rio Grande do Sul, Brazil
  • Ignacio Panach, University of Valencia, Spain
  • Barbara Pernici, Polytechnic University of Milan, Italy
  • Rosario Michael Piro, Polytechnic University of Milan, Italy
  • Monjoy Saha, National Cancer Institute, USA
  • Domenico Vito, Università degli Studi di Pavia, Italy
  • Emanuel Weitschek, Uninettuno University, Italy

Invited Talk

Pietro Pinoli (Dept. of Electronics, Information and Bioengineering - Politecnico di Milano).

Pietro Pinoli works as a Research Fellow and lecturer at the Department of Electronics, Information and Bioengineering at the Politecnico di Milano (Italy). He received his PhD cum laude in 2017, with a thesis titled “Modeling and Querying Genomic Data”, in which he proposed and benchmarked data structures and algorithms to manage, search, and process huge collections of genomic datasets by means of cloud and distributed technologies. He was a visiting PhD candidate at Harvard University (Cambridge, MA, US). His research interests include bioinformatics and computational biology, databases and data management, big data technology and algorithms, machine learning and natural language processing, and drug repurposing. He participated in the Italian PRIN GenData, ERC GeCo, and EIT VirusLab projects.

Topic: Modeling machine learning pipelines for life sciences to put the human back in the loop

Talk abstract: Automatic Machine Learning (AutoML) is an emerging sub-field of machine learning (ML). It uses Artificial Intelligence (AI) to generate and optimize ML pipelines to respond to the data analysis needs of the user. One of the main aims of AutoML is to democratize ML, by allowing users without a strong computational background to adopt ML solutions. However, despite these advantages, AutoML is rarely adopted in high-risk applications like healthcare, government, and justice. This is mainly because humans are excluded from the process of creating the ML model, which leads to decreased trust in the results of the generated ML solution. Among the several approaches proposed to mitigate these issues, one of the most promising is "putting the human back in the loop" so that the final user can provide their domain expertise and contribute to building trustworthy ML solutions.
In this talk, I present several approaches that exploit different conceptual models of ML pipelines to build human-in-the-loop ML solutions. In particular, I will describe GeCoAgent (a conversational agent for genomic data analysis), DSBot (a system based on Natural Language Processing for general ML applications), and ALFriend (a tool to analyze event-based temporal data).

Program

CMLS sessions (held jointly with the EmpER workshop session this year) are on Monday, October 17th.

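The workshop will have two time slots: 10:00-12:00 and 12:30-14:30 (IST). Note that these times correspond to 06:30-08:30 and 09:00-11:00 Central European time (CET in the schedule; strictly, summer time CEST, UTC+2, is in effect on that date).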

10.00-10.15 (IST) ---> 6.30-6.45 (CET)
Welcome and introduction to the CMLS and EmpER workshops

10.15-11.00 (IST) ---> 6.45-7.30 (CET)
Keynote talk by Pietro Pinoli, Politecnico di Milano. Modeling machine learning pipelines for life sciences to put the human back in the loop.

11.00-11.30 (IST) ---> 7.30-8.00 (CET)
Ana Xavier Fernandes, Filipa Ferreira, Ana León and Maribel Yasmina Santos. Towards a Model-driven Approach for Big Data Analytics in the Genomics Field.

11.30-12.00 (IST) ---> 8.00-8.30 (CET)
Mireia Costa, Alberto García S. and Oscar Pastor. Conceptual Modeling-based Cardiopathies Data Management.

Break 12.00-12.30 (IST) ---> 8.30-9.00 (CET)

12.30-13.00 (IST) ---> 9.00-9.30 (CET)
Francisco Manuel García Moreno, Maria Bermudez-Edo, José Manuel Perez-Marmol, Jose Luis Garrido and Rodríguez Fórtiz María José. A Conceptual Model of Health Monitoring Systems Centred on ADLs Performance in Elderly People.

13.00-13.30 (IST) ---> 9.30-10.00 (CET)
Mireia Costa, Alberto García S. and Oscar Pastor. A Comparative Analysis of the completeness and concordance of data sources with cancer-associated information.

13.30-14.00 (IST) ---> 10.00-10.30 (CET)
Pietro Cinaglia and Mario Cannataro. A Flexible Automated Pipeline Engine for Transcript-level Quantification from RNA-seq.

14.00-14.30 (IST) ---> 10.30-11.00 (CET)
Alberto García, Anna Bernasconi, Giancarlo Guizzardi, Oscar Pastor, Veda Storey and Mireia Costa. An Initial Empirical Assessment of an Ontological Model of the Human Genome.

Collaborations

This workshop is supported by the Data-driven Genomic Computing group at Politecnico di Milano and by the VRAIN Research Center at Universitat Politècnica de València (INNEST/2021/57 - Agència Valenciana de la Innovació and PDC2021-121243-I00 - Spanish State Research Agency).
