Integrated process monitoring and control of Active Pharmaceutical Ingredient production via tensor-based data-driven modeling methods

Category

PhD Defense

Date

2024-12-12 17:00

Venue

KU Leuven, Technologiecampus Gent, L226 - Polyvalente zaal, 02.L226 - Gebroeders De Smetstraat 1, 9000 Gent
Belgium

Website

https://www.kuleuven.be/doctoraatsverdediging/fiches/3E16/3E160592.htm

Doctoral candidate: Carlos André Muñoz López

Supervisor(s): Prof. dr. ir. Jan Van Impe, Ms. Kristin Peeters

Across all sectors of industry there is a need for methods that improve our understanding of the phenomena occurring in large-scale production systems and that provide ways to monitor and control them. Over the last decades, data-driven modeling methods have served as a platform for application-oriented solutions and have shown the potential to generate robust and accurate characterizations of real-life systems. However, many aspects still can and need to be improved to exploit the full potential of learning from data. Within this context, this research contributed by pushing the boundaries of what can be learned from process data. The PhD was structured around one goal, one application, and one fundamental approach. First, the goal was to address three core challenges in the data-driven modeling of batch processes: (i) performing batch and phase alignment, (ii) determining the optimal scale of the data to learn from, and (iii) determining the right model complexity to reproduce the system variability. Second, the application is the pharmaceutical industry, in particular the production of active pharmaceutical ingredients (APIs). Finally, the approach was to formulate the modeling structure on the basis of tensors and tensor decomposition methods, since a tensor is the fundamental structure of the data obtained during the operation of batch processes.
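To make the tensor view of batch data concrete, the following is a minimal sketch, not the method developed in the thesis. It assumes batch data arranged as a three-way array (batches × variables × time points) and swaps in a plain truncated higher-order SVD (a Tucker-type decomposition) written with NumPy. The array sizes, the multilinear ranks, and the function names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are all hypothetical choices made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """n-mode product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # bring the mode to the last axis
    result = moved @ matrix.T               # contract over that axis
    return np.moveaxis(result, -1, mode)    # put the new axis back in place


def hosvd(tensor, ranks):
    """Truncated higher-order SVD: one factor matrix per mode plus a core."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])         # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each subspace
    return core, factors


def reconstruct(core, factors):
    """Expand a core tensor back to full size with the factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Synthetic, noise-free "batch" tensor (assumed sizes, not real process data):
# 20 batches x 6 measured variables x 50 time points, multilinear rank (3, 2, 4).
rng = np.random.default_rng(0)
shape = (20, 6, 50)
true_ranks = (3, 2, 4)
true_core = rng.standard_normal(true_ranks)
true_factors = [rng.standard_normal((n, r)) for n, r in zip(shape, true_ranks)]
X = reconstruct(true_core, true_factors)

core_hat, factors_hat = hosvd(X, true_ranks)
rel_err = np.linalg.norm(X - reconstruct(core_hat, factors_hat)) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this sketch the three factor matrices summarize batch-to-batch, variable-to-variable, and time-wise variation separately, which is the structural advantage of keeping the data as a tensor rather than flattening it into a single matrix.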

The first challenge undertaken in this research was the formulation of a novel strategy for automated phase identification and alignment of batch process data. Manifold learning and clustering were used to learn the most appropriate phase separation points in each batch from the data itself. The proposed strategy not only reduces the need for training data but also solves phase identification and alignment simultaneously, without the side effects caused by shrinking or expanding the time axis, and it guarantees the alignment of process events.

The second challenge addressed was the improvement of model identification through better strategies for data scaling. Two novel strategies were designed and evaluated in this research. The first is an empirical approach that aims to meet the condition of evenly distributed errors; the second is an optimization approach based on the optimality criteria of the Fisher information matrix. The proposed methods demonstrate that determining the right scale for the data is key to unlocking the potential for learning meaningful information from data. Additionally, the formulation based on tensor decomposition not only allowed the development of the scaling methods but also made clear the significant advantages of tensor-based methods.
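For readers unfamiliar with the term, the standard textbook optimality criteria of the Fisher information matrix can be written as follows. This is a sketch under stated assumptions: a model output $\hat{y}(\theta)$ with parameters $\theta$ and additive Gaussian measurement errors with covariance $\Sigma$. The symbols $J$, $\Sigma$, and $F$ are notation introduced here. The abstract does not state which criterion the thesis uses or how the scaling factors enter the formulation.

```latex
% Fisher information matrix for Gaussian errors (assumed setting)
F(\theta) = J(\theta)^{\top} \Sigma^{-1} J(\theta),
\qquad
J(\theta) = \frac{\partial \hat{y}(\theta)}{\partial \theta}

% Common scalar optimality criteria
\text{D-optimality:} \quad \max \; \det F(\theta)
\text{A-optimality:} \quad \min \; \operatorname{tr}\!\left( F(\theta)^{-1} \right)
\text{E-optimality:} \quad \max \; \lambda_{\min}\!\left( F(\theta) \right)
```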

Finally, the problem of determining the appropriate complexity of the data-driven model to describe the system behavior, which in terms of tensor decomposition means selecting the best low multilinear rank, is addressed by exploiting the information captured in the Fisher information matrix of the multilinear model. Assessing the information criteria during model training provides valuable insight into the robustness of models with different levels of complexity. This robustness is associated with the joint confidence region of the estimated parameters of each model, which means that models can be discriminated based on the uncertainty associated with them. This criterion proves to be much less ambiguous than standard strategies for determining the best model complexity, which use different estimates of the model error as criteria. The results demonstrated that the models selected on the basis of their error/complexity balance, although they offer the lowest approximation error, are the ones with the largest uncertainty in their parameters.
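The link between the Fisher information matrix and the joint confidence region can be sketched with the usual asymptotic (linearized) approximation. The notation is an assumption introduced here: $\hat{\theta}$ is the estimate of $p$ parameters, $F$ is the Fisher information matrix evaluated at $\hat{\theta}$, the noise covariance is taken as known, and $1-\alpha$ is the confidence level. The exact criterion used in the thesis is not given in the abstract.

```latex
% Approximate joint confidence region (an ellipsoid centered at the estimate)
\mathcal{C}_{1-\alpha}
  = \left\{ \theta :
      (\theta - \hat{\theta})^{\top} F \, (\theta - \hat{\theta})
      \le \chi^{2}_{p,\,1-\alpha}
    \right\}

% Its volume shrinks as the information grows
\operatorname{vol}\!\left( \mathcal{C}_{1-\alpha} \right)
  = \frac{\pi^{p/2}}{\Gamma\!\left( \tfrac{p}{2} + 1 \right)}
    \left( \chi^{2}_{p,\,1-\alpha} \right)^{p/2}
    \det(F)^{-1/2}
```

Under this approximation, a model whose Fisher information matrix has a small determinant has a large confidence ellipsoid, that is, poorly determined parameters, even when its fitting error is low.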

The implementation of the methods proposed in this research resulted in two main applications to the large-scale production of APIs. First, a tensor-based model that monitors the spray drying process and predicts the particle size of the product was trained and validated on the production of two intermediate drug products. This model has been implemented at the production site to support the real-time release of one of the intermediate drug products. Second, a multi-block tensor regression strategy was developed for quality prediction and root cause analysis in the plant-wide production process of an API. These models have successfully identified deviations during the production of intermediate products.
