10.3 Machine learning approach for metagenomic data interpretation in water process control
Drinking water companies constantly monitor the quality of the water in their distribution networks, producing large amounts of data for continuous quality and process control. Many different approaches are used to combine sensor and analytical data with process data in order to understand the process better and derive better steering parameters. Ultimately, these systems help tell operators what actions to take when deviations from the standard operating procedures occur. Quality and health threats may be caused by biology (bacteria, viruses, protozoa) in the system or in the produced water. Using genomic data on this biology enables tighter control of the purification and distribution steps in producing potable water from groundwater or surface water. Many of these systems are still operated as black boxes. Since metagenomics data are complex in their own right, incorporating and translating them is a challenge. However, with a systems biology approach combined with machine learning, significant new steps can be made that allow for more control over operations and water quality.
The research challenge is to maximize the information extracted from metagenomics data for understanding, controlling, and optimizing a process. You will have to connect metagenomics data and process conditions using systems biology and machine learning. The two types of data differ markedly in specific versus holistic information, frequency, sensitivity, specificity, and dynamics. However, there are many dependencies between the data that can be described in a mathematical model, allowing for the systematic evaluation of parameter changes linked to altered process conditions or loss of process control. With this approach, we expect to find parameter changes that are ‘logical’ for operators and microbiologists, and also to identify new dependencies.
The approach is case-based and focuses on a specific question or feature in order to develop the principal methods. Examples of relevant cases are “biological removal of pharmaceuticals in one but not the other plant” or “re-growth of bacteria in one and not the other drinking water distribution system”.
In this project, the PhD candidate will:
· Build a process dynamics model for a specific treatment or distribution challenge;
· Collect, sort, weigh, classify and translate relevant data sets for the process;
· Run supervised and unsupervised learning models iteratively to identify parameter influence;
· Define and complement missing data together with process owners and microbiological Wetsus staff;
· Optimize the model following a recurrent design cycle;
· Construct a user-friendly (graphical) output system for operators.
We are looking for a candidate with an MSc degree in systems biology, mathematical modeling, machine learning, or a similar field, and with a strong interest in molecular microbiology (microbial genetics, bioinformatics, microbial physiology). The candidate should have excellent analytical skills and good cooperation skills.
The research project will be carried out in close cooperation with drinking water and wastewater treatment companies.
Promotor: Prof. dr. Bayu Jayawardhana (University of Groningen, Faculty of Science and Engineering, Mechatronics and Control of Nonlinear Systems)
Co-promotor: Prof. dr. G.J.W. Euverink (University of Groningen, Faculty of Science and Engineering, Products and Processes for Biotechnology)
Wetsus supervisor: Dr. Inez Dinkla
Wetsus, Leeuwarden, The Netherlands
For more information contact