ABSTRACT
Background Effective management of type 2 diabetes is an increasing challenge for health services globally. Integrated care, targeted at increasing capacity in primary care for earlier intervention in type 2 diabetes, may reduce adverse health outcomes, with benefits accrued over time. This protocol aims to describe a causal approach for the evaluation of a specialist-led integrated model of care delivered in Australian general practices, using linked administrative health data.
Methods This protocol outlines an observational cohort study using the Lumos data asset (New South Wales Health), linking general practice and hospital data. It follows the target trial framework, emulating a parallel cluster-randomised trial, and applies the estimands framework proposed by the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. It describes the data structure and statistical analysis plan required for causal inference. The primary causal estimand is the effect of the Diabetes Alliance Program Plus, an integrated care intervention, on all-cause hospitalisation rates for adults actively attending general practice with current or newly diagnosed type 2 diabetes over a 5-year follow-up period.
Discussion This protocol applies the target trial framework to emulate a parallel cluster-randomized trial of the Diabetes Alliance Program Plus using linked administrative data. This approach addresses confounding and selection bias inherent in observational evaluations of practice-level interventions and will generate robust real-world evidence to inform policy regarding type 2 diabetes management in primary care settings.
Trial registration ACTRN12622001438741; 10th November 2022, retrospectively registered: https://www.anzctr.org.au/ACTRN12622001438741.aspx.
INTRODUCTION
Diabetes is a major global health challenge, contributing significantly to disability, premature death, and economic burden across all regions(1). In Australia, approximately 1.3 million of people were living with type 2 diabetes mellitus (T2DM) in 2021, with higher prevalence in remote and socioeconomically disadvantaged areas(2). It is estimated that up to 500000 remain undiagnosed, increasing the risk of complications and premature mortality(3).
T2DM is associated with a wide range of long-term health complications including cardiovascular disease, chronic kidney disease, retinopathy, neuropathy, and psychological distress(4–6). These complications reduce quality of life, increase demand on health services(2), and increase healthcare costs(7). However, early diagnosis and appropriate escalations in treatment can prevent or delay these outcomes.
Despite the central role of primary care in diabetes management in Australia, an estimated 37% of patients do not receive appropriate care for diabetes from general practitioners(8). Barriers include limited access to decision-support tools, inadequate clinical information systems, and insufficient opportunities for continuing professional education(9). These challenges underscore the need for system-level interventions that strengthen primary care capability and capacity, through integrated care pathways, improved digital infrastructure, and stronger linkages with specialist and community services.
One such initiative is the Diabetes Alliance Program Plus (DAP+), an integrated care model originally launched as the Diabetes Alliance Program in 2015 as a partnership between the Hunter New England Local Health District and Hunter New England and Central Coast Primary Health Network(10). The program expanded to DAP+ in 2023 to include the University of Newcastle and Hunter Medical Research Institute. DAP+ aims to build general practice capability and capacity in the Hunter New England Local Health District, New South Wales through a quality-improvement program including: (i) whole-practice data analysis and feedback; (ii) specialist-led case conferencing; and (iii) structured clinician education(10,11).
Early evaluations of DAP+ demonstrated significant short-term improvements in glycaemic control (HbA1c), systolic blood pressure, and cardiovascular risk among participating patients, and the program was well received by both providers and patients(10,12,13). These findings suggest that DAP+ enhances the quality and consistency of diabetes care in general practice by supporting data-driven monitoring, multidisciplinary collaboration, and proactive case management. Improvements in these intermediate markers are clinically meaningful, yet the durability of these effects and their translation into long-term outcomes remain uncertain. Sustained improvements through continued health service engagement could reduce diabetes-related complications, hospitalisations, and premature mortality, with corresponding benefits for health system efficiency and patient wellbeing.
Although randomised controlled trials (RCTs) are the gold standard for assessing intervention effectiveness, they are often impractical in real-world settings. DAP+ was not designed as an RCT and has since been widely implemented as routine practice across the Hunter New England Local Health District. Therefore, this study will apply the target trial emulation framework(14), which enables causal inference from observational data.
This protocol outlines an emulated trial to be conducted using the Lumos data asset (NSW Health) to evaluate the long-term effectiveness of DAP+ on health service use and health outcomes. Specifically, it aims to estimate the causal effect of the DAP+ cluster intervention on all-cause hospital admissions (primary outcome), emergency department presentations, hospital length of stay, lower-limb loss, adherence to testing guidelines, and biomarker levels. It will also determine the feasibility of using the Lumos data asset for evaluating pragmatic, system-level interventions delivered in primary care.
METHODS AND ANALYSIS
Target trial emulation
Target trial emulation applies the design principles of RCTs to observational data to strengthen causal inference and reduce bias. The protocol of a target trial specifies eligibility criteria, treatment strategies, assignment procedures, follow-up, outcomes, and the statistical analysis plan. It also defines the causal contrast of interest, typically representing the observational analogues of intention-to-treat and per-protocol effects. In this study, the target trial framework is used to emulate a cluster trial using the Lumos data asset, with all elements of trial design, including eligibility, treatment strategy, assignment, follow-up, outcomes, causal contrast, and analysis, defined a priori to minimise bias and enhance reproducibility.
Data sources and structure
The Lumos data asset is an ethically approved data linkage program led by the New South Wales Ministry of Health, in partnership with Primary Health Networks and the Centre for Health Record Linkage (CheReL)(15). Lumos securely links general practice data with multiple New South Wales (NSW) health data collections, including hospital admission, emergency department, ambulance, non-admitted patient and mortality records, which enables a comprehensive view of patients’ health care access and outcomes. Participating practices provide informed consent for patient records to be linked with statewide health service data, which is enabled through a waiver of individual patient consent under the Lumos Governance Framework(16). All patient and practice level data are deidentified.
The Lumos dataset undergoes regular data quality assessments to ensure completeness, accuracy, and consistency across contributing general practices and linked health system datasets. Previous evaluations have shown the Lumos cohort to be broadly representative of the NSW population with respect to key demographic and clinical characteristics(17).
Use of the Lumos data asset is ethically approved to support public health programs in NSW, including chronic disease management and early intervention. Authorised collaborators access data through the cloud-based Secure Analytics Health Environment (SAPHE) platform. Ethical use of Lumos data is restricted to planning, funding, management, and evaluation of health services(16).
Target trial specification
A comparative summary of the ideal randomised trial and its emulated target trial is presented in Table 1
Summary of the design features of the target trial protocol for the Diabetes Alliance Program Plus (DAP+) using the Lumos data asset.
Population and eligibility
The target population includes people with a clinical diagnosis of T2DM who attend general practices participating in the Lumos data linkage program, within NSW. Eligibility is determined dynamically. Patients aged 18 years or older with a current T2DM diagnosis and an active patient status are eligible. Patients not meeting these criteria at a given time may become eligible once the criteria are satisfied. The eligibility definition captures both prevalent and incident T2DM diagnoses within the Lumos data asset.
Baseline covariates will include demographic, socioeconomic, and comorbidity measures defined at the index date, and frequency of visits to general practices, and prior hospitalisations up to two years preceding the index date. Patients must have complete baseline data on all variables required for propensity score estimation.
Recruitment and follow-up period
Recruitment to the target trial will begin on 1 January 2015 and continue until one year prior to the latest hospital admission record date in the Admitted Patient Data Collection. This ensures that all eligible individuals contribute at least one year of potential follow-up within the five-year outcome window.
Eligible patients from DAP+ intervention practices will enter the cohort on the date of their first eligible visit (index date) after participating in the program. Propensity-matched controls will be selected from eligible patients attending non-DAP+ practices outside the Hunter New England Local Health District, to minimise intervention contamination when practitioners change practices. Outcomes will be evaluated for three fixed intervals relative to the index date: baseline to 1 year, years 2–3, and years 4–5.
Follow-up will begin at the index date and continue until the earliest of (i) five-years post-index, (ii) death (ascertained from the Registry of Births, Deaths, and Marriages), or (iii) administrative censoring at the data cut-off date. Baseline variables for demographic, socioeconomic, and comorbidities will be measured at the index date, and health service access variables within the preceding two years of the index date.
Treatment assignment procedure
The treatment is DAP+ intervention, delivered in general practice as per the original clinical trial registration (ACTRN12622001438741). The comparator is standard general practice care. Exposure to treatment is defined as the first attendance at a DAP+ practice after meeting eligibility. Exposure to standard care is defined as attendance at a non-DAP+ practice after meeting eligibility.
Because practices are not randomly assigned, treatment assignment will be emulated using propensity score matching to balance measured confounders between exposure groups. The resulting matched cohort will approximate randomisation conditional on observed covariates. Details of the propensity score estimation, matching specification, and balance diagnostics are described in the Primary Analyses section.
Outcomes
The primary outcome is total all-cause hospitalisations, defined as the number of unique patient episodes of care. All-cause hospitalisations will be identified in from the New South Wales Admitted Patient Data Collection (APDC). Multiple same-day admissions will be counted as one episode of care to obtain unique per patient per day events. Totals will be separately calculated for each follow-up interval (baseline to 1 year, years 2-3, and years 4-5).
Secondary outcomes include:
hospital length of stay: number of days from admission start date to discharge date as recorded in the APDC. When multiple admissions occur on the same date, the episode with the longest length of stay is used.
potentially preventable hospitalisations: defined according to Australian Institute of Health and Welfare indicators for selected potentially preventable hospitalisation (Table S4 in Additional file 1), with the same definition of episode of care as applied to the primary outcome.
T2DM-specific hospitalisations, defined as a hospitalisation with T2DM diagnosis recorded as the principal diagnosis in the APDC. ICD-10 clinical codes for determining T2DM diagnoses is available in Table S1 in Additional file 1.
ED presentations: identified using the Emergency Department Data Collection (EDDC), with the same definition of episode of care as applied to the primary outcome.
potentially avoidable ED presentations: defined according to the Australian Institute of Health and Welfare indicators for selected potentially avoidable GP-type presentations to emergency departments (Table S5 in Additional file 1), with the same definition of episode of care as applied to the primary outcome.
T2DM-specific ED presentations, defined as ED presentations with T2DM diagnosis recorded as the principal diagnosis in the EDDC. The EDDC records diagnosis codes using the ICD-9, ICD-10, or SNOMED-CT code sets. Codes used to identify T2DM-specific ED presentations in each system are listed in Tables S2 and S3 in Additional file 1.
T2DM-related lower-limb loss: defined according to the Australian Institute of Health and Welfare indicator for Hospitalisation for lower limb amputation with T2DM as principal or additional diagnosis (Table S6 in Additional file 1).
all-cause mortality: death dates will be obtained from Registry of Births, Deaths, and Marriages data set.
Adherence to testing frequency guidelines for T2DM pathology monitoring: testing frequency will use definitions from the Royal Australian College of General Practitioners’ annual cycle of care guidelines for T2DM(18). Separate adherence measures will be constructed for HbA1c, lipid profile, and renal function testing. For HbA1c, each test initiates a 6-month coverage window extending from test date. For lipid profile and renal function tests, each test initiates a 12-month coverage window. If a subsequent test occurs within an active coverage window, the coverage is extended from the new test date for the relevant duration. Coverage from tests conducted before the start of a follow-up interval is carried forward into the interval. Coverage intervals that extend beyond the end of a follow-up interval will be truncated at the interval’s end. For each test type, the proportion of months within the interval covered by any active window is calculated per participant. Participants with more than 90 percent coverage for a given test type are classified as adherent to that guideline.
T2DM-related biomarker levels (HbA1c, lipid profile, eGFR): Laboratory measurements will be obtained from General Practice Electronic Health Records. For each biomarker, the mean value will be calculated separately for each patient within each follow-up interval. When multiple tests occur within an interval, all available results will contribute to the per patient interval mean. Analyses will be performed separately for each biomarker. Additionally, biomarkers levels may be classified according to clinically relevant thresholds (e.g., HbA1C <7%, 7-8.9%, ≥9% etc.).
Clarifying the estimands
An estimand defines the treatment effect that a study aims to quantify. Clear specification of estimands clarifies the research question and guides appropriate methodological choices.
Estimands in this protocol are defined according to the International Council for Harmonisation (ICH) E9(R1) guidelines and the framework proposed by Kahan (2024)(19) . The matched population will be used to estimate the intention-to-treat (ITT) effect. Handling of intercurrent events, which are post-baseline events that may affect interpretation, will also be specified. The corresponding estimands are summarised in Table 2.
Description of the estimands for testing the effectiveness of the Diabetes Alliance Program Plus (DAP+) on hospitalisations compared to standard care in general practice.
Primary Analyses
General principles
All analyses will comply with Lumos data governance requirements, including suppression of cells with counts ≤5, which will be reported as ‘≤5’.
Causal analysis principles
An observational analogue of the ITT effect will be applied to the primary analysis. DAP+ exposure will be compared to the relevant comparator (standard care) using a propensity matched cohort.
Relevant covariates for creating the propensity matched cohort were identified using the directed acyclic graph (DAG) in Figure 1, which was informed by literature review(20–23) and clinical domain knowledge. It was determined that treatment assignment is predominantly affected by geographic factors and general practice characteristics.
Directed Acyclic Graph representing relations between DAP+ exposure, outcome, and confounders
Access to the DAP+ enhanced care intervention depends not only on individual characteristics but also on whether a patient’s general practice self-nominated to participate in DAP+. Individual exposure can occur only within participating practices. Self-nomination may be influenced by the implementation of DAP+, which primarily targets, though is not limited to, regional general practices(10). Practice-level aggregates of Index of Relative Socioeconomic Advantage and Disadvantage (IRSAD, an Australian national measure ranking areas by relative socioeconomic conditions based on income, education, and employment)(24), geographic classification under the Australian Modified Monash Model (MMM, which categorises areas by remoteness and population size)(25), and operational features such as high-frequency servicing (>30% of patients attending ≥12 times over two years)(20) were identified as key determinants of participation. Because these factors are also associated with health outcomes, they form cluster-level biasing pathways upstream of the exposure–outcome relationship.
At the individual level, factors influencing general practice attendance include sociodemographic status via IRSAD, geographic remoteness via MMM, comorbidities, frequency of general practice visits, and prior hospitalisations. In Australia, multimorbidity is more prevalent in regional and remote areas than urban settings(26). As these factors are associated with both health outcomes and the likelihood of attending practices in different socioeconomic or rural settings, they may also form biasing pathways between the intervention and outcomes. Although BMI, age, and sex lie on the causal pathway, they are not required in the minimal adjustment set. Adjustment for individual MMM, IRSAD, and comorbid conditions is sufficient to close the identified individual-level biasing paths in the DAG.
Descriptive statistics
Summary statistics of baseline characteristics will be stratified by treatment group and will include where available in the Lumos data asset; the age at study entry, sex, BMI, IRSAD, MMM, comorbidities, and health service access indicators. Summary measures will be mean and standard deviation (SD) for continuous variables, and frequencies and percentages for categorical variables.
Balancing cluster and individual confounders via propensity scores
The minimal adjustment set required to identify the causal effect of receiving the intervention includes practice-level MMM, aggregate IRSAD, high-frequency servicing indicator, and individual-level MMM, IRSAD, comorbidities, and health service use indicators. To improve the precision of estimates, individual BMI, sex, and age will also be included as additional covariates.
Before individual-level matching, intervention and control practices will be exactly matched on MMM classification, aggregate IRSAD, and high-frequency servicing status. Each intervention cluster may be paired with one or more control clusters to balance cluster-level confounding on these variables.
After balancing cluster-level confounders by forming cluster-similar groups, intervention-exposed individuals will be matched to control individuals within these groups. Individual matching will be based on propensity scores estimated using a logistic regression model, with individual exposure to the intervention at baseline as the outcome and individual level confounders as predictors. Each DAP+ participant will be matched to the control with the closest propensity score who did not attend a DAP+ practice at baseline. Matches will have propensity scores within 0.2 pooled standard deviations of the logit of the propensity score to improve covariate balance between matched pairs(27). Matching will be performed one-to-one without replacement, and unmatched individuals will be excluded.
Covariate balance will be assessed in terms of standardised mean differences after propensity matching to determine whether balancing was successful. Interactions of variables and non-linear functional forms of continuous variables will be considered to iteratively improve covariate balance, as necessary.
Outcome analysis
All outcomes will be analysed using longitudinal Bayesian hierarchical models to account for clustering and repeated measurements within individuals and dyad pairs. Models will be fit using the propensity-matched sample. All models will include fixed effects for treatment, time, and their interaction, and random intercepts to account for repeated measures within clusters and dyad pairs. Weakly informative priors will be specified for all fixed and random effects to improve model regularisation and ensure stable Markov chain Monte Carlo sampling without imposing restrictive assumptions.
Primary outcome
The primary outcome is the total number of hospitalisations in all time intervals. For each patient, all hospitalisations within an interval will be summed to obtain a single patient-level total. This outcome will be modelled using a Bayesian hierarchical Poisson, negative-binomial, or zero-inflated Poisson model, as selected by model fit, evidence of overdispersion, or excess zeroes. The model will estimate incidence rate ratios (IRR) with 95% credible intervals (CrI).
Secondary outcomes
Secondary outcomes include total hospital length of stay, total potentially preventable hospitalisations, total T2DM-specific hospitalisations, total ED presentations, total potentially avoidable ED presentations, and T2DM-specific ED presentations in all time intervals. Each will be analysed using the same modelling considerations as the primary outcome to estimate IRR with 95% CrI. Statewide data indicate low event rates for T2DM-specific hospitalisations (3.7%)(2), raising the possibility of nonconvergent models in the matched cohort. If models fail to converge, results will be reported descriptively by treatment group and time interval. Similar considerations apply to T2DM-specific ED presentations.
T2DM-related lower limb loss will be analysed using a logistic regression model. Posterior predictions of event probabilities will be used to estimate relative risks (RR) with 95% CrI.
All-cause mortality will be analysed using a pooled logistic regression model for discretised time-to-event. Cumulative survival probabilities will be obtained via posterior predictions to estimate risk differences (RD) with 95% CrI at 1, 3, and 5 years.
Pathology biomarker outcomes will be analysed using normal or log-normal regression models based on empirical distribution of each biomarker. For biomarkers with clinically informative thresholds, secondary analyses will be conducted using logistic, multinomial, or ordinal regression models where appropriate.
Compliance with annual cycle-of-care testing for metabolic markers will be analysed using logistic regression models.
DISCUSSION
Evidence indicates that the risk of hospital admission among patients with type 2 diabetes mellitus (T2DM) is associated with elevated HbA1c levels(28,29) and other routinely measured primary care markers, including increased uACR, elevated systolic blood pressure, and reduced high-density lipoprotein (HDL) cholesterol(29). The association between elevated HbA1c and macrovascular and microvascular complications has been well established since 2000, with each 1% HbA1c reduction associated with 21% lower diabetes-related endpoints and 37% lower microvascular complication (30),underscoring the importance of metabolic control in preventing hospitalisations and other adverse outcomes. In the pilot evaluation of DAP+, improvements in diabetes-related metabolic markers, particularly HbA1c, were observed among T2DM patients attending DAP+-enabled general practices(10,11). Given the established link between these markers and diabetes-related complications, these findings suggest that DAP+ may contribute to improved long-term outcomes for patients with T2DM through enhanced metabolic control. However, as the pilot study was not designed to assess long-term endpoints, the effect of DAP+ on longer-term outcomes remains an important question warranting further investigation. Potentially preventable hospitalisations are widely used to monitor the performance of primary health care systems and to evaluate the effectiveness of primary care interventions in Australia and internationally(31). As the DAP+ program aims to strengthen chronic disease management in general practice, reductions in potentially preventable hospitalisation rates represent a plausible downstream indicator of its effectiveness in improving diabetes care and long-term health outcomes.
Ideally, the effectiveness of an intervention would be assessed using a RCT. However, as the DAP+ program has been progressively implemented across the Hunter and New England region since 2015, conducting a new RCT in this setting is not feasible due to potential contamination and spillover from prior program exposure, and the cost of running such a large trial. Implementing an RCT in other health districts would also pose substantial practical and resource challenges, as it would require establishing the DAP+ model of care de novo within a different health service context. There are also ethical considerations, given the existing evidence supporting its intermediate benefits (REFS), making it questionable to withhold access to enhanced care. Furthermore, in regions without the DAP+ model, patients have limited access to metropolitan specialist care and remain at increased risk of premature and progressive diabetes-related complications due to undertreated disease.
DAP+ has been progressively implemented across the Hunter New England Local Health District in partnership with the Hunter New England and Central Coast Primary Health Network since 2015, and administrative health data from participating general practices are routinely captured through the Lumos data linkage program in New South Wales(17). This linkage provides a rich longitudinal dataset suitable for evaluating the long-term effects of DAP+. Using these existing data within a target trial emulation framework(14) offers a practical and robust alternative to conducting a new RCT. Previous studies have shown that well-designed emulated trials using observational data can yield treatment effect estimates consistent with those obtained from true RCTs(32).
Although this protocol employs propensity score methods to balance key confounders, residual confounding cannot be fully eliminated and remains a limitation of any observational study(33–36). Unmeasured differences may exist across Local Health Districts and Primary Health Networks in which practices are nested, potentially influencing diabetes management. However, all operate under New South Wales Health governance with uniform policy, funding, and clinical frameworks. Matching within geographic strata (metropolitan, regional, rural, remote) further limits systematic region-level bias.
Lumos participation is voluntary among general practices, introducing potential selection bias in the data collections. As of November 2025, more that 7 million people across NSW are represented with approximately one-third of all general practices having consented to data-sharing across the state (Add Ref). The 2023 Lumos evaluation reported 91% demographic representativeness to the broader New South Wales population(37), providing reasonable assurance that the Lumos data set captures statewide demographic diversity.
While recognising the limitations inherent to observational analyses, this study will generate the best available evidence to inform general practice-centred models of care for type 2 diabetes mellitus in rural and remote communities in NSW.
DECLARATIONS
Ethics approval and consent to participate
The Diabetes Alliance Program Plus has ethics approval from the Hunter New England Human Research Ethics Committee (2024/ETH01649), registered with the University of Newcastle (R-2024-0073). The Lumos data asset has ethics approval from the Population Health Services Research Ethics Committee (2019/ETH00660) with a waiver of individual participant consent for approved secondary data uses.
Consent for publication
Not applicable.
Availability of data and materials
No new datasets were generated in this study.
Competing interests
None declared.
Funding
The Diabetes Alliance Program Plus (DAP+) and this work is supported by a philanthropic gift from the Colonial Foundation, administered by the Hunter Medical Research Institute. Joshua Dizon is supported by an Australian Government Research Training Program (RTP) scholarship administered through the University of Newcastle’s Strategic Engagement Scheme.
Authors’ contributions
JAD, DB, CO and AH contributed to the statistical analysis plan and statistical rationale described in this article. AH and SA provided clinical rationale for this protocol. All authors were involved in the discussions and editing of the manuscript and provide approval for publication.
Patient and public involvement
Patients and/or the public were not involved in the design, conduct, reporting, or dissemination plans of this research.
Acknowledgements
The authors acknowledge the Lumos program, a partnership of the NSW Ministry of Health and NSW Primary Health Networks made possible through the participation of general practices across NSW. Record linkage was carried out by the Centre for Health Record Linkage.
Footnotes
Major revisions to overall manuscript. Supplemental files updated.






