Identifying risk factors associated with death among patients with MDR-TB in KwaZulu-Natal, South Africa: an illustration using Weibull parametric model.

The aim of this study was to identify the risk factors associated with death among patients with multidrug-resistant tuberculosis (MDR-TB). The Weibull model has been shown to perform better than the Cox proportional hazards (PH) model with respect to the accuracy and efficiency of the estimates. Therefore, a Weibull parametric model was employed to identify predictors of death in patients with MDR-TB and to assess the efficiency of the models using the current dataset. Patients diagnosed with MDR-TB were studied in four decentralised sites located in rural areas and one centralised hospital in KwaZulu-Natal, South Africa, from July 2008 to July 2012. Patients were followed from the date of MDR-TB diagnosis until death or the last follow-up date. A total of 1 542 patients were included in the analyses: 812 and 730 from the centralised hospital and decentralised sites, respectively. Of the 1 542 patients enrolled, 15.9% died. We found that the hazard of death was significantly higher among patients treated in decentralised sites (adjusted hazard ratio (aHR) = 1.84, 95% CI = 1.38 to 2.75; SE = 0.81) than among those who were treated in the centralised hospital. However, the results from the Cox PH model showed no significant difference in the hazard of death between the decentralised sites and the centralised hospital (aHR = 1.46, 95% CI = 0.69 to 2.36; SE = 0.92). Patients between 31 and 40 years of age had an increased hazard of death compared to those between 18 and 30 years (aHR = 1.52, 95% CI = 1.04 to 2.23). The hazard of death in female patients was 24% higher than in male patients (aHR = 1.24, 95% CI = 0.93 to 1.63). Furthermore, patients with previous MDR-TB episodes had an increased hazard of death (aHR = 1.79, 95% CI = 0.23 to 0.62) compared to those with no previous MDR-TB episodes. The hazard of death in HIV-negative patients was lower than in those who were HIV positive (aHR = 0.95, 95% CI = 0.57 to 0.77).
More health facilities are needed, especially in decentralised areas, which could support the 2030 World Health Organisation strategy to reduce or end TB infection.
Keywords: MDR-TB, Cox proportional model, Weibull parametric model

A number of global studies, including two systematic reviews, have reported higher costs associated with managing MDR-TB patients in hospital [9][10][11][12][13]. South Africa remains one of the highest burdened countries in all three WHO-defined tuberculosis categories, including drug-susceptible TB, MDR-TB, and TB/HIV coinfection cases. The previous tuberculosis drug resistance survey done in South Africa during 2001-2002 reported the prevalence of MDR-TB as decentralized model of care has proven to be effective, and is advisable [16]. In this latter case, only complicated cases are referred to specialized centres or proposed to local/international TB consilia.
Elimination of TB by 2035 will only be possible if countries address the emergence of drug-resistant (DR) strains of Mycobacterium tuberculosis effectively. According to the WHO 2018 report, not all DR-TB cases are diagnosed (only 51% of people with bacteriologically confirmed TB were tested for rifampicin resistance (RR) in 2018), and not all DR-TB cases were treated (only one in three of the approximately half a million people who developed MDR/RR-TB in 2018 were treated). DR-TB continues to be an important public health priority [17], and an estimated 19 million people are latently infected with MDR-TB [18]. The main aim of this study was to employ a Weibull parametric model, which can give better estimates and is more flexible than the Cox semi-parametric model that many researchers use because of its fewer assumptions. The outcome considered here was time from MDR-TB diagnosis until death.

Source of data and description
This was a prospective health systems study including all patients with a confirmed diagnosis of MDR-TB who commenced treatment between 1 July 2008 and 30 June 2010. Data were sourced from five sites: Greytown, Manguzi, Murchison and Thulasizwe (decentralised sites) and King George (centralised hospital). The data set consists of 1 542 patients, aged 18 years and older, diagnosed with MDR-TB. The target population was defined as all MDR-TB patients diagnosed and treated in the TB centres during the study period. Patients receiving care at more than one site were excluded in order to guarantee the quality of information on MDR-TB treatment episodes. An automatic monitoring method adopted by [19] sought to eliminate duplicates and correct classification errors of different treatment episodes from the same patient. Inclusion criteria for the comparison study required that patients reside within the catchment area of the site. No data were collected after 1 October 2012, as the study period was from 1 July 2008 to 30 June 2012.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Each participant was examined and followed through regular sputum smear and culture tests for MDR-TB outcomes. Conversion of sputum culture from positive to negative was considered a useful early indicator of programme effectiveness, as treatment outcomes were only available 18-24 months after treatment started. Culture conversion was defined as two consecutive negative sputum cultures taken at least one month apart [20][21][22]. These patients were followed from the date of MDR-TB diagnosis until death or the last follow-up date.
Medical records were reviewed to collect patient-related demographic, clinical, pharmaceutical and laboratory data. All data were collected prospectively, prior to knowledge of patient treatment outcomes. Health system data were collected from different components of the health system (laboratory, pharmaceutical and transport services and human resources) using existing records and databases, structured questionnaires, observation and interviews. An iterative approach was used, which enabled the team to identify new health system data required and to develop appropriate data collection methodologies. Over the four-year study period each patient was visited monthly for a day. During each visit, data from each health system component were collected, the functioning of the MDR-TB unit was observed, and informal discussions were held with the nurse-in-charge of the MDR-TB unit, the clinician responsible for MDR-TB and the hospital pharmacist. Through a process of ongoing reflection, feedback and discussion with facility and district level staff, problems were investigated to determine their origin and cause, and possible solutions were identified. Field notes detailing the visit and documenting observations and discussions with staff were written up after returning from the site. Notes were also made of concerns, opinions and issues which needed follow-up.
The study protocol was approved by the University of KwaZulu-Natal Biomedical Research Ethics Committee (Ref: BF052/09), and by the KwaZulu-Natal Department of Health. Only secondary data, that is, the data routinely collected by health workers for clinical care, were used in this study. To protect patient confidentiality and anonymity, the databases were de-identified and access was strictly limited. Informed consent was waived by the ethics committee, since all patient data used were previously collected during the course of routine medical care and did not pose any additional risks to the patients.

Statistical analyses
In this study, a multivariable analysis was performed using the Weibull distribution in a parametric survival model. Survival time was calculated as the time interval between the date of MDR-TB diagnosis and the date of death, or the date of the last follow-up for those who did not experience the event of interest. The data set was analysed using both the Weibull parametric model and the Cox proportional hazards model, for comparison purposes, to show the superiority of the former model. All analyses were carried out in STATA version 16 [23].
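The survival-time definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `survival_time`, the use of days as the time unit, and the two example patients are hypothetical and do not come from the study data, which were analysed in STATA.

```python
from datetime import date
from typing import Optional, Tuple


def survival_time(diagnosis: date,
                  death: Optional[date],
                  last_follow_up: date) -> Tuple[int, int]:
    """Return (time in days, event indicator).

    The event indicator is 1 if the patient died and 0 if the
    observation is right-censored at the last follow-up date.
    """
    if death is not None:
        return (death - diagnosis).days, 1
    return (last_follow_up - diagnosis).days, 0


# Hypothetical patients, for illustration only.
died = survival_time(date(2009, 1, 10), date(2009, 7, 9), date(2009, 7, 9))
censored = survival_time(date(2009, 1, 10), None, date(2010, 1, 10))
print(died, censored)
```

Patients without a recorded death contribute their follow-up time as a censored observation, which is the information both the Cox and the Weibull likelihoods use.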

The Cox proportional hazard model
The most well-known model used in survival analysis is the Cox proportional hazards (PH) model [24]. It is a survival analysis regression model that describes the relation between the event incidence, as expressed by the hazard function, and a set of covariates. In brief, let $T$ be the survival time, subject to possible right censoring. The Cox PH model is written as

$$h(t) = h_0(t)\exp(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p),$$

where the hazard function $h(t)$ is dependent on (or determined by) a set of $p$ covariates $(x_1, x_2, \dots, x_p)$, whose impact is measured by the size of the respective coefficients $(\beta_1, \beta_2, \dots, \beta_p)$. In Cox's PH model, the unknown baseline hazard function $h_0$ is the non-parametric part and the unknown $\beta$ is the parametric part, which together create a semi-parametric model.
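To connect the Cox model to the hazard ratios reported in this paper, the following short derivation may help. It assumes the usual form of the model, $h(t) = h_0(t)\exp(\beta_1 x_1 + \dots + \beta_p x_p)$, and a binary covariate $x_j$ (for example, decentralised site versus centralised hospital); the index notation $k \ne j$ is introduced here for illustration.

```latex
\[
\mathrm{HR}_j
= \frac{h(t \mid x_j = 1)}{h(t \mid x_j = 0)}
= \frac{h_0(t)\exp\!\big(\beta_j + \sum_{k \ne j}\beta_k x_k\big)}
       {h_0(t)\exp\!\big(\sum_{k \ne j}\beta_k x_k\big)}
= \exp(\beta_j)
\]
```

The baseline hazard $h_0(t)$ cancels, so the ratio does not depend on time. This is the proportional hazards property, and it is why an adjusted hazard ratio can be read directly from a single coefficient with the other covariates held fixed.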
Unfortunately, the simplicity of the Cox PH model imposes unrealistic assumptions on the data. Most significantly, the model assumes that the samples are independent and identically distributed. There are situations, however, in which these two assumptions are not supported. For example, subjects may be exposed to different risk levels, even after controlling for known risk factors; this is because some relevant covariates are often unavailable to the researcher or even unknown (univariate case).

medRxiv preprint; this version posted March 4, 2022; doi: https://doi.org/10.1101/2022.03.01.22271638

The Weibull parametric survival model
The Weibull distribution, which was developed by Waloddi Weibull in 1951, originated in engineering and was later adopted for the analysis of survival data [25]; it has been used to predict the proportion of future failures after observing a failure process up to a given point in time [26]. It has a hazard rate which is either increasing, decreasing, or constant [27]. If the hazard rate is constant, the model reduces to the exponential model. The Weibull is the only parametric model which has both a proportional hazards and an accelerated failure-time representation [27]. In addition, the suitability of the Weibull model can be checked via graphical assessment [28].
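The statement that the Weibull model has both an accelerated failure-time (AFT) and a proportional hazards (PH) representation can be made concrete with a short derivation. It is a sketch under assumed notation, not taken from the source: $\gamma$ denotes the shape parameter, $\lambda(\mathbf{x})$ a covariate-dependent scale parameter, $\mu$ an intercept, $\boldsymbol{\alpha}$ the AFT coefficients and $\boldsymbol{\beta}$ the PH coefficients.

```latex
\begin{align*}
S(t \mid \mathbf{x}) &= \exp\!\left[-\left(\frac{t}{\lambda(\mathbf{x})}\right)^{\gamma}\right],
  \qquad \lambda(\mathbf{x}) = \exp(\mu + \boldsymbol{\alpha}^{\top}\mathbf{x})
  && \text{(AFT form)} \\
h(t \mid \mathbf{x}) &= -\frac{d}{dt}\log S(t \mid \mathbf{x})
  = \gamma\, t^{\gamma-1}\exp\!\left[-\gamma\,(\mu + \boldsymbol{\alpha}^{\top}\mathbf{x})\right] \\
&= \underbrace{\gamma\, t^{\gamma-1} e^{-\gamma\mu}}_{h_0(t)}\,
   \exp(\boldsymbol{\beta}^{\top}\mathbf{x}),
  \qquad \boldsymbol{\beta} = -\gamma\,\boldsymbol{\alpha}
  && \text{(PH form)}
\end{align*}
```

Because $\gamma > 0$, a covariate that shortens survival time (negative $\alpha$) raises the hazard (positive $\beta$), so the two representations describe the same fitted model on different scales.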
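The Weibull distribution is the generalization of the exponential distribution to include a shape parameter. Writing $\gamma$ for the shape parameter and $\lambda$ for the scale parameter, the cumulative distribution function of the Weibull distribution is

$$F(t) = 1 - \exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right], \quad t \ge 0,$$

and the probability density function of the Weibull distribution is

$$f(t) = \frac{\gamma}{\lambda}\left(\frac{t}{\lambda}\right)^{\gamma-1}\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right].$$

For large values of the shape parameter, say $\gamma \ge 10$, it is close to the smallest extreme value distribution [29]. When $\gamma > 1$ the hazard rate increases as time increases, and for $\gamma < 1$ the hazard rate decreases.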
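The behaviour of the hazard rate for different shape values can be checked numerically. The sketch below assumes the common parameterisation with survival function S(t) = exp(-(t/scale)^shape); the function names, the evaluation times and the parameter values are illustrative choices, not values estimated in this study.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """S(t) = exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_cdf(t: float, shape: float, scale: float) -> float:
    """F(t) = 1 - S(t)."""
    return 1.0 - weibull_survival(t, shape, scale)


def weibull_pdf(t: float, shape: float, scale: float) -> float:
    """f(t) = (shape/scale) * (t/scale)^(shape-1) * S(t)."""
    return (shape / scale) * (t / scale) ** (shape - 1) * weibull_survival(t, shape, scale)


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """h(t) = f(t) / S(t) = (shape/scale) * (t/scale)^(shape-1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Illustrative time points (t > 0) and a hypothetical scale of 2.0.
times = [0.5, 1.0, 2.0, 4.0]
increasing = [weibull_hazard(t, 1.5, 2.0) for t in times]  # shape > 1
constant = [weibull_hazard(t, 1.0, 2.0) for t in times]    # shape = 1 (exponential)
decreasing = [weibull_hazard(t, 0.5, 2.0) for t in times]  # shape < 1
print(increasing, constant, decreasing)
```

With shape equal to 1 the hazard is constant and the model coincides with the exponential distribution, which matches the description in the text.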

Exploratory data analysis
In this analysis, a total of 1 542 patients diagnosed with MDR-TB were included (Table 1).

Results from the Weibull parametric survival model: multivariable model
Since very few patients had previous MDR-TB episodes and very few had other comorbidities, we categorized these two factors as 'yes' or 'no' in the analysis. Results of the univariable analysis showed that study site, age at diagnosis, gender, previous MDR-TB episodes, and HIV status had a highly significant effect on survival time (Table 2).
The graph of -ln(-ln(survival probability)) against the log of failure time followed an approximately linear trend, which indicated that the Weibull model is appropriate for the dataset (Figure 2).
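The reasoning behind this graphical check can be sketched as follows. Under a Weibull model with S(t) = exp(-(t/scale)^shape), ln(-ln S(t)) = shape * ln(t) - shape * ln(scale), which is a straight line in ln(t); plotting the negative, as in the figure, only flips the sign of the slope. The code below uses exact Weibull survival probabilities with hypothetical parameter values in place of the Kaplan-Meier estimates that would be used in practice, and the helper names are assumptions made for this illustration.

```python
import math
from typing import List, Sequence, Tuple


def loglog_points(times: Sequence[float],
                  survival_probs: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (ln t, ln(-ln S(t))) pairs, skipping undefined values."""
    return [(math.log(t), math.log(-math.log(s)))
            for t, s in zip(times, survival_probs)
            if t > 0 and 0.0 < s < 1.0]


def ols_slope_intercept(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least squares fit of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical shape and scale; times are in arbitrary units.
shape, scale = 1.3, 20.0
times = [1, 3, 6, 12, 24, 36]
surv = [math.exp(-((t / scale) ** shape)) for t in times]
slope, intercept = ols_slope_intercept(loglog_points(times, surv))
print(slope, intercept)
```

For data that truly follow a Weibull distribution, the fitted slope recovers the shape parameter; marked curvature in the plot would instead suggest that the Weibull assumption is questionable.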
A multivariable analysis was then performed using study site, age group, gender, previous MDR-TB episodes, comorbid conditions and HIV status. These variables were included regardless of whether they were found to be significant in the univariable analysis.
We found that the hazard of death was higher among patients treated in decentralised sites (adjusted hazard ratio (aHR) = 1.84, 95% confidence interval (CI) = 1.38 to 2.75) than among those who were treated in the centralised hospital. This was also supported by the graph in


Discussion
Sub-Saharan Africa is currently seeing a very large change in the major health problems it faces.
In this study, the interest was in estimating survival with multiple explanatory variables, using the Weibull parametric model. Parametric models provide an interpretation based on a particular distribution of the time to event, irrespective of proportional hazards assumptions.
Most researchers prefer the Cox proportional hazards model, which in many situations ends up using the last observation of a time-varying covariate. In the case of long-term survival, death rate models are better than the Cox model [30]. The standard errors of the estimates were large in the Cox model and small in the Weibull parametric model, which makes the latter a better model for analysing survival time. Furthermore, the fact that the Weibull model takes time-varying covariates into account allows researchers some flexibility.
The Weibull model was advantageous for multivariable modelling because it has a hazard rate which is either increasing, decreasing, or constant [27].
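The practical consequence of a smaller standard error can be illustrated with a Wald-type confidence interval for a hazard ratio. The sketch below assumes that the interval is built on the log-hazard scale as exp(b ± 1.96 × SE), which is a common convention; the source does not state the scale on which its standard errors are reported, and the coefficient and standard errors used here are hypothetical, not the study's estimates.

```python
import math
from typing import Tuple


def hazard_ratio_ci(log_hr: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (HR, lower, upper) for a Wald interval on the log-hazard scale."""
    return (math.exp(log_hr),
            math.exp(log_hr - z * se),
            math.exp(log_hr + z * se))


# Same hypothetical coefficient, two hypothetical standard errors.
hr_a, lo_a, hi_a = hazard_ratio_ci(0.5, 0.15)  # smaller standard error
hr_b, lo_b, hi_b = hazard_ratio_ci(0.5, 0.40)  # larger standard error
print((hr_a, lo_a, hi_a), (hr_b, lo_b, hi_b))
```

With the same point estimate, the smaller standard error gives an interval that excludes 1, while the larger one gives an interval that includes 1. This is one way a covariate can appear significant under one model and not under another.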
The main aim of this study was to identify risk factors of death among patients with MDR-TB disease. Based on the multivariable Weibull model, site, age at diagnosis, previous MDR-TB episodes and HIV status were significant. Overall, 56% of the patients were cured of MDR-TB, 15.9% died during the follow-up, 21.7% defaulted and 6.4% were lost to follow-up. Results from the Cox model showed no statistically significant difference between the centralised hospital and decentralised sites in the hazard of death over time, although patients in decentralised sites had a higher estimated hazard of death than those in the centralised hospital. Results from the Weibull parametric model indicated that the death rate was significantly different between the centralised hospital and the decentralised sites, and the standard errors were small compared to those obtained in the Cox model. Most risk factors identified by the Weibull model were also identified by the Cox model, but the Weibull model was superior because it produced parameter estimates with small standard errors. Carroll's study indicated that, in the analysis of survival data, the Weibull model can provide a useful parametric alternative to Cox's regression modelling [31].
There are some limitations to this study. We did not have any information about the socioeconomic factors of patients, so the effect of these factors could not be investigated and remains of interest for future work. In addition, our study was based on data collected in KwaZulu-Natal province, South Africa, and therefore the findings cannot be generalised.

Conclusion
The Weibull model enables the assessment of the effect of factors while taking into account the distribution of the data. It is the best option for analyzing lifetime data if the distributional assumptions can be met and the shape parameter is known. However, when the shape parameter is unknown, the Cox proportional hazards model is a good alternative. It requires fewer assumptions than the parametric Weibull model and provides comparable mean square errors of the PH slope estimates. We conclude that even in resource-limited settings and in the presence of HIV co-infection, community-based care is more effective as care in either a centralized or decentralized hospital setting for patients who do not require hospitalization, and that this could decrease the number of deaths due to tuberculosis as the number of infections continues to increase.

Consent for publication
Not applicable

Availability of data and materials
Data will be made available upon request but will be controlled.

Competing interests
All authors report no competing interests.

Funding
The National Research Foundation (NRF) Grant (SFH160712177401) of South Africa has funded me to be able to expand my knowledge in data analysis using different statistical methods and programs.

Authors' contributions
All the authors contributed to the study. SVM planned the study, wrote the initial draft of the article and did the analysis. HW and RC assisted with data analysis and interpretation. SVM did the revisions to the paper, assisted by HW and RC. All authors approved submission of this article. The author(s) read and approved the final manuscript.