Scientific machine learning for predicting plasma concentrations in anti-cancer therapy

A variety of classical machine learning approaches have been developed over the past ten years with the aim of individualizing drug dosages based on measured plasma concentrations. However, the interpretability of these models is challenging as they do not incorporate information on pharmacokinetic (PK) drug disposition. In this work we compare well-known population PK modelling with classical machine learning and a newly proposed scientific machine learning (SciML) framework, which combines knowledge on drug disposition with data-driven modelling. Our approach lets us estimate population PK parameters and their inter-individual variability (IIV) using multimodal covariate data of each patient. A dataset of 549 fluorouracil (5FU) plasma concentrations as an example of an intravenously administered drug and a dataset of 308 sunitinib concentrations as an example of an orally administered drug were used for analysis. Whereas classical machine learning models were not able to describe the data sufficiently, the proposed model allowed us to obtain highly accurate predictions even for new patients. Additionally, we demonstrated that, given enough training data, our model could outperform traditional population PK models in terms of accuracy and offers greater flexibility when learning population parameters.

1. Introduction
The vision of precision medicine is to provide the right drug to the right patients at the right time and dosage. The choice of the optimal drug dosage is based on different criteria, among which the pharmacokinetics (PK) in an individual patient is an essential factor. For this purpose, population PK models are traditionally formulated as compartmental differential equations. In these models, individual differences relative to the population average are described as random statistical effects. A limitation of this established approach is that the formulation of model equations can become challenging when knowledge about specific PK processes (e.g. absorption) is incomplete or missing. Moreover, these models do not always provide accurate predictions for new patients who have not been part of the training dataset and, therefore, might come from a different distribution.
Thus, during the last decade various machine learning techniques have been studied as a complementary strategy for the estimation of drug plasma concentrations, aiming to individualize doses. The most frequently used algorithms for dose optimization are decision trees and their ensembles, support vector machines and neural networks. Additionally, reinforcement learning plays an increasing role.
Recently, several models have been proposed using neural networks for the prediction of drug concentrations [4,18-20]. Lu et al. proposed to learn the initial conditions of a neural ordinary differential equation (ODE) system trained to predict PK profiles [4]. A hybrid of an ODE and a neural network was employed by Qian [18]. Similarly, Janssen et al. used a neural network to learn covariate effects which were employed in a compartment model for describing drug concentrations [19]. Valderrama et al. [20] introduced PK-SciML, a Scientific Machine Learning (SciML) [21,22] approach for learning an unknown absorption mechanism while simultaneously estimating PK parameters. Although their model showed promising results, it was only tested on synthetic data and did not consider the inter-individual variability (IIV) of the clinical data, hence only generating population-level predictions for different dose groups. Considering that a SciML framework has the advantage of not needing to establish the relationships between covariates and parameters a priori while allowing the integration of domain expertise, we here introduce a multimodal pharmacokinetic SciML (MMPK-SciML) approach, an extension of PK-SciML, which aims to learn the IIV based on multimodal covariate data of individual patients.
As a case study, we use two real datasets for two different oncology treatments as examples of an intravenous (IV) and an oral treatment route and compare our model with different classical machine learning and population PK models. We demonstrate that our model produces reliable predictions, while being able to simulate new patients.

In this work, plasma concentrations of patients who received fluorouracil (5FU)-based infusional chemotherapy at the Oncological Outpatient Clinic UnterEms in Leer, Germany, were retrospectively analyzed. This study was approved by the local medical ethics committee, but trial registration was not conducted due to the retrospective nature. Patients with documented therapeutic drug monitoring (TDM) of 5FU were included in the analysis. Plasma 5FU concentrations were obtained at steady state during continuous infusion and quantified using the My5-FU™ immunoassay (Saladax Biomedical Inc., Bethlehem, PA, USA) [23]. The dataset included 549 TDM samples from 157 patients and further information on demographics, blood counts and adverse events. Adverse events (AE) were graded at each patient visit according to the Common Terminology Criteria for Adverse Events (CTCAE) version 5.0 [24]. Outliers were defined as individuals with a concentration below the lower limit of quantification (< 52 ng/mL) or a clearance above 1478 L/h [25] and were excluded from the analysis. Missing data were imputed using the last observation carried forward approach within the same treatment cycle.

Sunitinib
Sunitinib PK data were pooled from two PK/PD studies focusing on sunitinib treatment in patients with metastatic renal cell carcinoma (mRCC) and patients with metastatic colorectal cancer (mCRC) [26,27]. The C-IV-001 study (EudraCT-No: 2012-001415-23, date of authorisation: 17.10.2012) was a phase IV PK/PD substudy of the non-interventional EuroTARGET project, which recruited patients with mRCC at nine medical centres in Germany and the Netherlands [26]. Sunitinib doses ranged from 37.5 to 50 mg daily, administered orally on a 4-week on/2-week off schedule. The C-II-005 study (EudraCT-No: 2008-00151537, date of authorisation: 11.06.2008) was conducted to investigate the beneficial effect of sunitinib added to biweekly folinate, fluorouracil and irinotecan in patients with mCRC and liver metastases. Patients were prescribed a daily dose of 37.5 mg sunitinib on a 4-week on/2-week off schedule taken orally [27]. Both studies were performed in accordance with the Declaration of Helsinki.
A total of 308 sunitinib TDM samples were obtained from 26 mRCC and 21 mCRC patients. In addition, 345 samples were analysed for the pharmacodynamic biomarker sVEGFR2 and 337 for sVEGFR3 [28]. Sunitinib measurements below the lower limit of quantitation (< 0.06 ng/mL) and patients without plasma concentration data were excluded from the analysis.

Data preprocessing
The total dataset was split using a 10-fold cross-validation setting with a training-test split of 80/20, keeping all data from one patient strictly in the same set to avoid a splitting bias. For the classical machine learning algorithms, categorical features were one-hot encoded and continuous features were scaled between zero and one. Additionally, missing data were imputed using MissForest within the cross-validation process for each split if the 'last observation carried forward' approach was not applicable.
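The patient-level splitting and the scaling step can be sketched as follows. This is a minimal illustration, assuming ten repeated random 80/20 splits at the patient level; the function names, the seed and the toy patient identifiers are hypothetical and not taken from the study code.

```python
import random

def grouped_splits(patient_ids, n_splits=10, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs; all samples of a patient stay in one set."""
    rng = random.Random(seed)
    unique_patients = sorted(set(patient_ids))
    n_test = max(1, round(test_fraction * len(unique_patients)))
    for _ in range(n_splits):
        # Draw test patients, then assign every sample by its patient
        test_patients = set(rng.sample(unique_patients, n_test))
        train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
        test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
        yield train_idx, test_idx

def min_max_scale(train_values, values):
    """Scale values with the minimum and maximum of the training data only."""
    lo, hi = min(train_values), max(train_values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    return [(v - lo) / span for v in values]

# Toy data: 6 samples from 5 hypothetical patients
patient_ids = ["p1", "p1", "p2", "p3", "p4", "p5"]
splits = list(grouped_splits(patient_ids))
```

Fitting the scaler on the training part of each split only, as done here, keeps information from the test patients out of the preprocessing.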

Population pharmacokinetic modeling
The population pharmacokinetic (PopPK) model for 5FU consisted of a one-compartment model with linear elimination to describe 5FU disposition [29]. An IIV term was implemented on 5FU clearance, and the volume of distribution together with its IIV were fixed to previously estimated values [29]. The residual variability was modelled as proportional and the body surface area, centered on the population median, was included as a linear covariate on clearance. In contrast to the original model [29], the skeletal muscle index was not included as a covariate, because it was not available for all included patients. While Schmulenson et al. used the FOCE-I method to estimate the parameters [29], we compared this method to stochastic approximation expectation maximisation (SAEM) using NONMEM®. First, we estimated the pharmacokinetic parameters for the training patients of each split using the different methods and the same initial estimates as in [29]. In the next step, these values were used to simulate the expected concentrations for the test data. Mean concentration values were calculated from 1000 simulations.
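The simulation step can be illustrated with a small Monte Carlo sketch. It assumes a steady-state concentration during continuous infusion (infusion rate divided by clearance), a log-normal IIV on clearance, a linear body surface area effect centered on the median and a proportional residual error. All numerical values and the function name are hypothetical placeholders, not estimates from [29].

```python
import math
import random

def simulate_css(dose_mg, infusion_h, bsa, n_sim=1000, seed=1,
                 tv_cl=200.0, bsa_median=1.9, theta_bsa=0.5,
                 omega_cl=0.3, sigma_prop=0.2):
    """Return the mean simulated steady-state concentration (mg/L).

    All parameter defaults are hypothetical placeholders.
    """
    rng = random.Random(seed)
    rate = dose_mg / infusion_h  # zero-order infusion rate (mg/h)
    # Linear covariate effect of BSA, centered on the population median
    cl_typical = tv_cl * (1.0 + theta_bsa * (bsa - bsa_median))
    total = 0.0
    for _ in range(n_sim):
        eta = rng.gauss(0.0, omega_cl)      # inter-individual variability on CL
        cl_i = cl_typical * math.exp(eta)   # individual clearance (L/h)
        css = rate / cl_i                   # at steady state: rate in = CL * C
        eps = rng.gauss(0.0, sigma_prop)    # proportional residual error
        total += css * (1.0 + eps)
    return total / n_sim

mean_css = simulate_css(dose_mg=4000.0, infusion_h=46.0, bsa=1.9)
```

Averaging over many simulated individuals, rather than using the typical clearance alone, accounts for the fact that the mean of a log-normally distributed quantity differs from its typical value.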
The structure of the population pharmacokinetic model for sunitinib is depicted in Figure S1. A two-compartment model for sunitinib disposition and a biphasic distribution for its active metabolite SU12662 were used [28,30]. Presystemic formation of SU12662 was modelled via a hypothetical enzyme compartment incorporated into the central compartment of sunitinib. An intercompartmental clearance connected the central compartment and the enzyme compartment and was fixed to the liver blood flow. Inter-individual variability (IIV) was included for the central volumes of distribution for sunitinib and SU12662, the clearance of sunitinib and the fraction metabolized in a block matrix. Proportional errors for the parent drug and metabolite were used to describe the residual unexplained variability. Originally, the FOCE-I method was used for parameter estimation. Again, we compared this method to SAEM. Moreover, we simulated the expected concentrations for the test data and applied the same evaluation methods as described for 5FU.

Classical Machine Learning Algorithms
Various classical machine learning methods were used for concentration prediction. We compared Random Forests, Gradient Boosting, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting (LightGBM), Support Vector Machines (SVM) with a radial basis function kernel and simple neural networks with a possible range of 2-10 units per layer in terms of performance. The input variables consisted of dose, weight, lean body mass (LBM), fat mass (FM), body surface area (BSA), age, sex, height and time since last dose for 5FU, and of sex, age, weight, height, BSA, time since last dose and the sVEGFR-2 and sVEGFR-3 plasma concentrations for sunitinib. Although most of these potential covariates had been excluded in stepwise covariate modeling (SCM), they were included to enable the machine learning algorithms to make use of potentially missed relationships within the data, as such algorithms have been shown to outperform SCM in some cases [31]. All algorithms were applied with and without feature selection using XGBoost. Hyperparameter tuning was performed using the Bayesian hyperparameter optimisation framework Optuna [32], and models were selected applying 5-fold cross-validation with the mean squared error as the objective function. The neural networks were regularised applying common techniques such as drop-out, L1 regularization and gradient clipping to avoid overfitting.
To investigate whether the model performance of the classical machine learning methods could be further improved by synthetic data, the training datasets were augmented for each split according to Table S1. To simulate patient data according to the correct distributions for each cross-validation fold, a PK model was fitted with NONMEM® on each fold and the obtained estimates were used for simulation. For each fold, 1000 synthetic patients were created and added to the clinical training data.

Multimodal Pharmacokinetic SciML model (MMPK-SciML)
The main motivation of our MMPK-SciML model was to overcome the existing limitations of PK-SciML [20] predictions, i.e., we wanted to build a model that learns the IIV using neural networks and multimodal patient information. Following the classical pharmacokinetic framework, individual parameters are defined from the typical population value and the IIV (Eq. 1), where $TV_\vartheta$ is the typical population value of the parameter $\vartheta$ and $\eta$ is the IIV, also referred to in this paper as the eta value.
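For illustration, a common way to write this relation in population PK is the log-normal parameterization below. It is given here as an assumption, since it is the usual convention rather than a form confirmed by the text, and the symbol $\omega_\vartheta$ for the standard deviation of $\eta_i$ is introduced only for this sketch.

```latex
\vartheta_i = TV_{\vartheta} \cdot e^{\eta_i}, \qquad \eta_i \sim \mathcal{N}\left(0, \omega_{\vartheta}^{2}\right)
```

Under this form, $\eta_i = 0$ recovers the typical value, and individual parameters remain positive for any value of $\eta_i$.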
Our proposed architecture is composed of two main blocks: i) a neural network encoder which aims to predict the $\eta$ values using patient covariates and ii) a structurally well-defined ODE system to describe the PK dynamics. Therefore, given a set of $k$ predicted patient parameters $\{\vartheta_j\}_{j=1:k}$, a dose regimen, and a time horizon, the individual concentration profiles were predicted by solving the initial value problem of the ODE system.
Following PK-SciML [20] and Lu et al. [4], the dosage was added to the first compartment of the ODE system. Additionally, we fixed the initial conditions to zero to guarantee a plausible ODE system. Model implementation is available on GitHub at https://github.com/SCAI-BIO/MMPK-SciML.

Variational Inference
Let $y_{it}, t \in \mathcal{T}$ denote the concentration profile measured at time points $\mathcal{T}$ for patient $i$.
The initial value problem can then be solved by sampling from the distribution $N(\mu_\eta, \sigma_\eta^2)$ while taking advantage of the re-parametrization trick [34]. With that it is possible to formulate a loss function for training $\phi_\theta$ by maximizing the so-called Evidence Lower Bound (ELBO) on the true posterior on the log scale, in which $D_{KL}(q(\eta_i \mid x_i, y_{it}) \,\|\, p(\eta_i))$ denotes the Kullback-Leibler divergence (a statistical distance measure) between the approximate posterior distribution $q(\eta_i \mid x_i, y_{it})$ and a prior distribution $p(\eta_i)$. Supposing $p(\eta_i) = N(0, \lambda^2)$ and Gaussian noise, the negative ELBO can be re-written as a loss function $\ell(\{y_{it}\}, \{\hat{y}_{it}\})$, where $\epsilon^2$ is the variance of the measurement noise and $\hat{y}$ denotes the ODE predictions. For the following experiments we assumed a proportional error $\epsilon^2 \propto \hat{y}$. Note that a smaller $\lambda > 0$ results in a stronger regularization of $\sigma_\eta^2$ and $\mu_\eta^2$ towards their values in the prior distribution.
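The ingredients of this loss can be sketched in plain Python. The sketch assumes a single scalar $\eta$, a single Monte Carlo sample per patient, the closed-form Kullback-Leibler divergence between two univariate Gaussians and a Gaussian likelihood whose noise variance is passed in by the caller. The function names are hypothetical and the weighting of the terms in the original implementation may differ.

```python
import math
import random

def sample_eta(mu, log_var, rng):
    """Reparametrization trick: eta = mu + sigma * z with z ~ N(0, 1)."""
    sigma = math.exp(0.5 * log_var)
    return mu + sigma * rng.gauss(0.0, 1.0)

def kl_to_prior(mu, log_var, lam):
    """Closed-form KL( N(mu, sigma^2) || N(0, lam^2) )."""
    var = math.exp(log_var)
    return 0.5 * ((var + mu ** 2) / lam ** 2 - 1.0 - math.log(var / lam ** 2))

def gaussian_nll(y_obs, y_pred, noise_var):
    """Negative log-likelihood of the observations under Gaussian noise."""
    return sum(
        0.5 * math.log(2.0 * math.pi * v) + (y - p) ** 2 / (2.0 * v)
        for y, p, v in zip(y_obs, y_pred, noise_var)
    )

def negative_elbo(y_obs, y_pred, noise_var, mu, log_var, lam):
    """Single-sample negative ELBO: data misfit plus KL penalty."""
    return gaussian_nll(y_obs, y_pred, noise_var) + kl_to_prior(mu, log_var, lam)

# Toy example: noise variance proportional to the prediction (factor 0.1 is hypothetical)
y_pred = [0.40, 0.45]
y_obs = [0.42, 0.43]
noise_var = [0.1 * p for p in y_pred]
loss = negative_elbo(y_obs, y_pred, noise_var, mu=0.1, log_var=-2.0, lam=1.0)
```

In this sketch, `kl_to_prior(0.5, 0.0, 0.5)` is larger than `kl_to_prior(0.5, 0.0, 1.0)`, which mirrors the statement that a smaller $\lambda$ regularizes more strongly.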
In the 5FU model, $D$ is the dose and $IT$ is the infusion time.
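A one-compartment model with linear elimination and an infusion input can be sketched as below. The sketch assumes that the dose enters the compartment as a zero-order input $D/IT$ during the infusion and that the amount is eliminated at rate $CL/V$; it uses a classical fourth-order Runge-Kutta scheme instead of the solver of the original implementation, and all parameter values in the example are hypothetical.

```python
import math

def infusion_rate(t, dose, infusion_time):
    """Zero-order input D / IT while the infusion runs, zero afterwards."""
    return dose / infusion_time if t <= infusion_time else 0.0

def simulate_one_compartment(dose, infusion_time, cl, v, t_end, dt=0.01):
    """Integrate dA/dt = R(t) - (CL / V) * A with RK4, starting from A(0) = 0."""
    def dadt(t, a):
        return infusion_rate(t, dose, infusion_time) - (cl / v) * a

    n_steps = int(round(t_end / dt))
    t, a = 0.0, 0.0  # initial condition fixed to zero
    for _ in range(n_steps):
        k1 = dadt(t, a)
        k2 = dadt(t + 0.5 * dt, a + 0.5 * dt * k1)
        k3 = dadt(t + 0.5 * dt, a + 0.5 * dt * k2)
        k4 = dadt(t + dt, a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return a / v  # concentration in the compartment

# Hypothetical values: 4000 mg over 46 h, CL = 200 L/h, V = 40 L
conc = simulate_one_compartment(dose=4000.0, infusion_time=46.0, cl=200.0, v=40.0, t_end=10.0)
# Analytical solution during the infusion, used here as a check
analytic = (4000.0 / 46.0 / 200.0) * (1.0 - math.exp(-200.0 / 40.0 * 10.0))
```

With these values the concentration after 10 h is already at its plateau, which is consistent with sampling at steady state during continuous infusion.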
To learn the $\eta$ values, we defined $\phi_\theta$ as an encoder network whose first layer takes the concatenation of the measured concentration, dose, weight, LBM, FM, BSA, age, sex and height as input.
Specifically, only the IIV on CL was learnt for 5FU. Figure 1 (top) shows an overview of our model architecture for 5FU. Model hyperparameters and more details can be found in Appendix A.

Sunitinib
Unlike the 5FU dataset, for sunitinib we took advantage of having measurements at different time points during the PK profile. As the structural ODE system we used the model proposed by Diekstra et al. [28]. As in the original work, the ODE parameters were first calculated following equation 1 and then scaled based on the weight of the patient. Specifically, $CL_S$, $Q_S$, $CL_M$, $Q_M$ and $Q_H$ were scaled by one weight-based factor and $V2_S$, $V3_S$, $V2_M$ and $V3_M$ by another, both depending on the body weight $Wt$ as in [28].

$\phi_\theta$ was defined as a multimodal encoder containing three blocks. The first block was an encoder for static covariates $\{x_i\}$. The second block encoded the longitudinal covariates $\{y_{it}\}$, and for this purpose we used the Time-LSTM [35]. The output of both encoders was concatenated and used by a third block, the projection encoder, with two subnetworks each producing $K$ outputs which define $\{\mu_{\eta_k}; \log(\sigma_{\eta_k}^2)\}_{k=1:K}$. We defined $K=4$, corresponding to the IIV for $CL_S$, $V2_S$, $F_M$ and $V2_M$. Figure 1 (bottom) shows an overview of our model architecture for sunitinib. Model hyperparameters and more details can be found in Appendix A.
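The role of the projection encoder can be sketched with plain Python lists. The sketch assumes that each of the two subnetworks is a single linear layer and that the static and longitudinal encoders have already produced fixed-length embeddings; the embedding sizes, the random weights and all function names are hypothetical, and the Time-LSTM itself is not implemented here.

```python
import random

K = 4  # IIV terms for CL_S, V2_S, F_M and V2_M

def linear(weights, bias, x):
    """Dense layer without activation: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_layer(n_out, n_in, rng):
    """Small random weights as a placeholder for trained parameters."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def projection_encoder(static_emb, longitudinal_emb, mean_head, log_var_head):
    """Concatenate both embeddings and map them to K means and K log-variances."""
    joint = static_emb + longitudinal_emb  # list concatenation
    return linear(*mean_head, joint), linear(*log_var_head, joint)

rng = random.Random(0)
static_emb = [0.2, -0.1, 0.4]     # hypothetical output of the static encoder
longitudinal_emb = [0.05, 0.3]    # hypothetical output of the Time-LSTM block
mean_head = init_layer(K, 5, rng)
log_var_head = init_layer(K, 5, rng)
mu_eta, log_var_eta = projection_encoder(static_emb, longitudinal_emb, mean_head, log_var_head)
```

Predicting the log-variance rather than the variance keeps the implied variance positive without constraining the network output.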

Model Comparison
To assess predictive performance, the mean absolute error (MAE) and the root mean squared error (RMSE) were calculated and compared for the different approaches used in this project. Goodness-of-fit (GOF) plots were used to support the quantitative results. In the case of the population PK models, mean concentration values were calculated from 1000 simulations using parameters estimated on the training data as initial estimates and then plotted. For the classical ML methods, the final predictions on the test set were used for the calculation of the metrics. For the MMPK-SciML approach, the results were obtained by using the means predicted by the model for each of the eta values because these represent the expected value.
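Both metrics follow their standard definitions and can be computed as in the short sketch below; the toy values are hypothetical.

```python
import math

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

observed = [1.0, 2.0, 4.0]
predicted = [1.0, 3.0, 2.0]
```

Because the RMSE squares the residuals before averaging, it weights large deviations more heavily than the MAE, which is why the two metrics can rank models differently when outliers are present.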
Furthermore, to evaluate how well the models perform in simulating new patients, prediction-corrected visual predictive checks (pcVPCs) were generated. These graphs could only be obtained for the population PK and the MMPK-SciML models because the classical machine learning approaches are not generative.
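The idea behind a visual predictive check can be sketched numerically. The simplified example below compares the median of observed concentrations with the interval spanned by the medians of many simulated replicates. It omits the prediction correction and the binning over time that a full pcVPC requires, and the simulated data, function names and distribution settings are hypothetical.

```python
import random
import statistics

def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def vpc_interval(simulated_datasets, statistic=statistics.median, level=90.0):
    """Interval of a summary statistic across simulated replicates."""
    stats = [statistic(dataset) for dataset in simulated_datasets]
    tail = (100.0 - level) / 2.0
    return percentile(stats, tail), percentile(stats, 100.0 - tail)

rng = random.Random(0)
# Hypothetical replicates: 200 simulated trials of 50 concentrations each
simulated = [[rng.lognormvariate(0.0, 0.3) for _ in range(50)] for _ in range(200)]
observed = [rng.lognormvariate(0.0, 0.3) for _ in range(50)]
low, high = vpc_interval(simulated)
covered = low <= statistics.median(observed) <= high
```

A model that simulates new patients well should place the observed statistic inside this interval for most time bins.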

Dataset characteristics
A dataset of 549 fluorouracil (5FU) plasma concentrations from 157 patients as an example of intravenous (IV) administration and another dataset of 308 sunitinib concentrations from 47 patients as an example of oral (po) administration were used for analysis. Baseline characteristics of all patients included in our analyses can be seen in Table 1.

Population pharmacokinetic modeling results
In the Pop-PK analyses, all PK parameters and their IIVs as defined in the original publications [28,29] could be estimated for all data splits. The mean estimated parameter values were in a similar range to the original estimated values for the whole datasets as depicted in Table 2, and the $\eta$ values appeared to be normally distributed for all tested methods. There were no relevant differences between the FOCE-I and SAEM methods in the estimated parameters or in the simulated concentrations for the test data.
For 5FU, using both FOCE-I and SAEM, we observed balanced $\eta$ distributions. However, the GOF was still relatively poor and showed wide confidence intervals (Figures 2, 4). In the case of sunitinib, initial convergence problems for some splits with the SAEM algorithm had to be resolved by setting the initial estimates closer to the final values obtained by Diekstra et al. [28]. Once this issue was resolved, we obtained similar results to FOCE-I and comparatively good fits for both methods (Figures 3, 4).

Classical machine learning methods
The proposed classical ML methods were not able to learn relationships within the data and accurately predict plasma concentrations of either drug, as can be seen in the GOF plots in Figure 2 and the cross-validated accuracy metrics in Table 3. Results did not differ relevantly between algorithms with and without feature selection. Therefore, we only depict results of the algorithms where feature selection was applied. The only comparatively well performing classical algorithm on the sunitinib data was the Light Gradient Boosting algorithm. However, the performance differed vastly across splits and GOF plots suggest a rather poor performance (Figure 2).
To investigate whether the poor model performance could be improved with additional synthetic data, we augmented the training data with data that were simulated by the population PK model and repeated the model runs. As a limitation, the sunitinib biomarker data had to be simulated using the final parameter estimates from Diekstra et al. [28] due to matrix singularity in parameter estimation. However, as can be seen in Table 3, the prediction accuracy was not substantially improved for either dataset. Optimized hyperparameters and selected features are reported in Supplementary Tables S2 and S3.

MMPK-SciML
Our proposed MMPK-SciML model generates accurate predictions for the different oncology treatment routes used in this paper. Figure 2, bottom row, illustrates the GOF plots for 5FU. On the right side we depict a second version of our model using a fixed population volume parameter $TV_V = 40.0$, which is referred to as MMPK-SciML* in the tables and figures. In contrast to the classical machine learning methods, a close correlation between the predictions and the real data was found. At the same time, cross-validated RMSE and MAE metrics were even lower than those of the population PK model (Tables 3, S4). Importantly, results were highly accurate for both versions of our model, indicating that the model in both cases was able to learn the typical population values ($TV$). Especially for 5FU, we observed an MAE at least three times lower than that of all other methods, and an improvement in RMSE of at least 30% for all folds. At the same time there was a considerable difference in the predicted population parameters across model versions (Table 2). Although $TV_V$ on average was close to the fixed volume, the variance of our standard MMPK-SciML model was larger than in the second version. Hence, the $TV$ estimates differed during the cross-validation.
Similarly, Figure 3 (bottom row right) illustrates the GOF plots of our MMPK-SciML model for sunitinib. Although the GOF plots were not as good as those for 5FU, our model still showed comparable performance to the FOCE-I method, and much lower variance in prediction accuracy across data splits than the best performing classical machine learning method LightGBM (Tables 3, S4).
As can be seen in Figure 4, the MMPK-SciML models performed well in simulating new patients, as the associated statistics of the real data are within the 90% confidence intervals (shaded region) of the predictions. Additionally, our models approximated the posterior $\eta$ distributions in a reliable manner (Figure S2).

Discussion
Using two examples of oncological treatments with different administration routes (IV and oral), our results demonstrate that generally a compartmental model structure is required to make accurate predictions of drug plasma concentrations. Overall, only the MMPK-SciML as well as the Pop-PK methods were able to adequately describe the underlying drug disposition. In contrast to MMPK-SciML, other ML-based PK models are entirely data-driven and cannot accurately learn the PK profile when working with a small number of measurements and dose schedules [4,20,36]. Moreover, these models have the limitation that they need to be initialized with test data to generate simulations, which in real application scenarios is usually not possible. Although our previously proposed PK-SciML [20] does not have this limitation, it only generates population-level predictions and does not process multimodal data. Altogether, this demonstrates a clear advantage of the MMPK-SciML architecture proposed in this paper.
Remarkably, LightGBM performed well for the sunitinib analysis. Nevertheless, its performance was rather inconsistent across splits and not good enough to be used in practice, as can be seen in the GOF plots. The comparatively good performance of LightGBM may be explained by the fact that gradient boosting machines construct an ensemble of decision trees. Decision trees implicitly split the data into different subspaces, resulting in an implicit discretization. This behavior could be advantageous in situations in which the data are of a multimodal nature, as is typical in clinical studies.
Overall, the classical machine learning models were not able to appropriately learn key aspects of the data-generating process to produce accurate PK predictions. Additionally, data augmentation could not further improve the performance of classical machine learning algorithms, suggesting that the inherent complexity of the temporal dynamics, the variance and the presence of outliers are difficult to learn for methods that have been designed for comparably simple tabular data only and thus use no information about the PK-related processes. However, the results could possibly have been improved if we had had more clinical training data.

5FU
Our proposed MMPK-SciML model was able to predict the population volume for 5FU, which was impossible to obtain with FOCE-I due to convergence problems [29]. We observed that the predicted volume was often overpredicted, which could be due to outliers in the training data. We hypothesize that FOCE-I is less robust to outliers than the MMPK-SciML method and thus required the population volume to be fixed. This behavior illustrates one of the main advantages of using a SciML model for PK modeling, namely greater flexibility compared to traditional modeling approaches. However, a major limitation of our analyses was that the genotypes and the activity of the main metabolizing enzyme of 5FU, dihydropyrimidine dehydrogenase, which are important predictors for 5FU pharmacokinetics, were not available for our patient cohort. This information could probably have improved the performance of all tested models and should be reported in future studies.

Sunitinib
For the sunitinib dataset we observed relatively wide confidence intervals of the MMPK-SciML estimates, and the predicted population parameters differed from those reported by Diekstra et al. [28]. We found that although the absorption rate was predicted higher compared to Diekstra et al. (0.13/h vs. 0.31/h) and the central volume of distribution was predicted lower (1820 L vs. 1352 L), the elimination and redistribution rates were similar across models in most cases. Especially large differences (> 40%) were observed in the estimates for the population parameters defining the concentration of the metabolite. To address that issue, we trained a model adding a proportional error between the metabolite concentration and its predictions in Eq. 4. However, this modification did not improve the performance of the model. The sunitinib analysis was more challenging than the 5FU analysis because the small dataset was composed of two different study populations, which increased the variability. Moreover, more parameters had to be predicted due to the absorption process and the inclusion of metabolite concentrations, increasing the task complexity. However, considering that our model performed well despite these limitations, we expect that our MMPK-SciML method would produce more confident parameter estimates and better predictions if more training data were available.

Conclusion
This work shows the need to use a structural model to effectively capture the time course of plasma concentrations in patients. In this regard we proposed a novel hybrid machine learning framework, which combines the flexibility of modern neural network architectures with a compartmental model structure describing pharmacokinetic drug disposition. A limitation of our approach is the need for larger datasets compared to standard population PK modeling approaches. On the other hand, we offer the modeler the advantage that our approach does not require precisely specifying in which way PK parameters are influenced by covariates. This results in a simplification of the modeling process. A possible direction of future research is to incorporate our model architecture into more complex frameworks for dosage adjustment, e.g. via reinforcement learning.

Study Highlights
• What is the current knowledge on the topic?
Machine Learning and Scientific Machine Learning (SciML) frameworks have shown promising results for pharmacokinetic modeling. However, methods for learning the inter-individual variability have not been widely investigated.
• What question did this study address?
How well do population pharmacokinetic (Pop-PK) and classical machine learning (ML) approaches perform in comparison to a SciML approach for PK modeling? Can a neural network be employed in a SciML framework to learn inter-individual variability while making accurate PK predictions?
• What does this study add to our knowledge?
The proposed MMPK-SciML model learns population PK parameters and their inter-individual variability and outperforms classical ML and Pop-PK approaches. Our approach also addresses common drug development challenges such as missing values and different time sampling.

Acknowledgements
We thank Ana Victoria Ponce Bobadilla for her input and explanations on how to create meaningful diagnostic plots for ML-enabled population PK models and Sven Stodtmann for the idea of encoding the concentration-time trajectory into parameters. We also thank Ana Socorro Rodríguez Báez for proof-reading the manuscript and aiding with its structure.

Use of AI
The tool DeepL Write was used by O.T. to refine the text. The tool ChatGPT was used by O.T. alongside Stack Overflow to assist in the coding process.

• How might this change clinical pharmacology or translational science?
Our methodology focuses on comparing different machine learning methods in the PK field. Our final framework allows for the development of novel comprehensible and trustworthy strategies for individual dose adjustment.

Author contributions
U.J. and H.F. designed the research. D.V., O.T. and L.M.K. performed the research and analyzed the data. E.S. and A.F. developed the original population pharmacokinetic models, E.T. guided O.T. in coding the classical machine learning algorithms, and D.V., O.T., L.M.K., U.J. and H.F. wrote the manuscript.

Fig 1. MMPK-SciML overview. The mean and log variance of the patient's eta distribution are predicted with a neural network. At the same time, the population parameters are learned and used together with an eta sample to define the patient-specific parameters, which are used with the patient's dose regimen to predict the PK profile.

Fig 2. Goodness-of-fit (GOF) plots for the 5FU dataset showing predicted versus observed concentrations for all trained models.

Fig 3. Goodness-of-fit (GOF) plots for the sunitinib dataset showing predicted versus observed concentrations for all trained models.
For the classical machine learning methods, we depict the results of the versions with feature selection. In green, blue and purple we show the best, second-best and third-best model for each dataset. In brackets, we show the results with data augmentation. MMPK-SciML* refers to the model with a fixed volume for 5FU. Metrics are reported using samples from the patient-specific distribution for test subjects.
Furthermore, $x_i$ are patient-specific covariates. From a Bayesian inference perspective, we are interested in the posterior $p(\eta_i \mid \{x_i, y_i(t)\})$. Unfortunately, the integral required to compute it is analytically intractable. Approximations via Markov Chain Monte Carlo techniques are possible but very time-consuming. To overcome this problem, Kingma et al. [33] introduced a stochastic variational inference framework for neural networks which allows quantification of epistemic uncertainty, i.e. uncertainty due to missing data. The key idea is to approximate $p(\eta_i \mid \{x_i, y_i(t)\})$ by a distribution $q(\eta_i \mid \{x_i, y_i(t)\})$, which is typically supposed to be Gaussian. The mean and (log) variance of this distribution are learned from the observed data via an encoder neural network $\phi_\theta$.
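As a sketch of the quantities described here, the posterior can be written with Bayes' rule and the encoder as a map from the patient data to the Gaussian parameters. The exact conditioning and notation are assumptions based on the surrounding definitions rather than a reproduction of the original equations.

```latex
p\left(\eta_i \mid \{x_i, y_i(t)\}\right)
  = \frac{p\left(y_i(t) \mid \eta_i, x_i\right)\, p(\eta_i)}
         {\int p\left(y_i(t) \mid \eta_i, x_i\right)\, p(\eta_i)\, d\eta_i},
\qquad
\left(\mu_{\eta_i}, \log \sigma_{\eta_i}^{2}\right) = \phi_\theta\left(x_i, y_i(t)\right)
```

The integral in the denominator runs over all possible values of $\eta_i$ and involves the ODE solution inside the likelihood, which is why it has no closed form.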

Table S1. Description of data augmentation for 5FU and sunitinib.

Table 2. Cross-validation average population parameters for the 5FU and sunitinib datasets. F: fixed parameter. MMPK-SciML* refers to the model with a fixed volume for 5FU.

Table 3. Cross-validation average metrics for the 5FU and sunitinib datasets.