Weekly Intra-Treatment Diffusion Weighted Imaging Dataset for Head and Neck Cancer Patients Undergoing MR-linac Treatment

Radiation therapy (RT) is a crucial treatment for head and neck squamous cell carcinoma (HNSCC); however, it can have adverse effects on patients’ long-term function and quality of life. Biomarkers that can predict tumor response to RT are being explored to personalize treatment and improve outcomes. While tissue and blood biomarkers have limitations, imaging biomarkers derived from magnetic resonance imaging (MRI) offer detailed information. The integration of MRI and a linear accelerator in the MR-linac system allows for MR-guided radiation therapy (MRgRT), offering precise visualization and treatment delivery. This data descriptor offers a repository of weekly intra-treatment diffusion-weighted imaging (DWI) data obtained from head and neck cancer patients. By analyzing the sequential DWI changes and their correlation with treatment response, as well as oncological and survival outcomes, the study provides insights into the clinical implications of DWI in HNSCC.

Design Type(s): Data integration objective
Measurement Type(s): Primary tumor and nodal target volumes
Technology Type(s): MRI scan (DWI)


Background & Summary
Radiotherapy (RT) is a cornerstone of head and neck squamous cell carcinoma (HNSCC) treatment, both as a primary treatment option and as a post-operative therapy [1]. While RT is effective in treating this type of cancer, it can also have adverse effects that may impair patients' long-term function and quality of life [2]. Tumor response to RT varies, and predicting this response with a specific biomarker could help tailor RT doses and potentially improve treatment outcomes while reducing toxicity [3].
Various biomarkers derived from tissue or blood have been extensively studied to determine their role in guiding personalized clinical decisions in HNSCC [4]. Among these biomarkers, only human papillomavirus (HPV) has recently been accredited as a predictive and prognostic biomarker, specifically for oropharyngeal cancer [5]. Despite their usefulness, tissue and blood biomarkers have certain limitations. Tissue markers provide information from a small region of the tumor, usually obtained at a single time point, resulting in limited spatial and temporal resolution. Blood biomarkers, such as liquid biopsies, offer a comprehensive overview of the tumors' secreted factors but lack spatial resolution entirely. In contrast, imaging biomarkers can evaluate each tumor volume, including primary tumors and lymph node metastases, individually. Furthermore, they can be obtained at multiple time points, thereby offering superior temporal and spatial resolution compared with either tissue or blood biomarkers [6].
Magnetic resonance imaging (MRI) provides detailed anatomical and functional information regarding the tumor. The potential benefits of using MRI to delineate tumors and assess their response to RT have prompted international collaborations to develop MR-linac technology [7]. The MR-linac is an innovative RT device that combines MRI and a linear accelerator, allowing quantitative images to be acquired on a daily basis. This integration establishes the basis of MR-guided radiation therapy (MRgRT), enabling precise, real-time visualization and treatment delivery [8].

Diffusion-weighted imaging (DWI) is an MRI technique with potential utility for assessing tumor response by providing functional information regarding the movement of water molecules within intra- and intercellular spaces, which is largely affected by cellularity within tumors. The diffusion of water in each voxel is quantified using the apparent diffusion coefficient (ADC), a quantitative imaging biomarker [9]. Multiple studies have investigated the potential of ADC as a biomarker in head and neck cancer [10][11][12][13][14][15]. However, no studies in head and neck cancer have yet incorporated regular-interval DWI scans throughout the course of RT using an MR-linac device.
Herein, we aim to analyze the sequential quantitative changes in DWI within the primary tumor and lymph node metastases in patients with head and neck cancer who received RT using an MR-linac device. This dataset offers a unique opportunity to leverage frequent DWI scans throughout the entire course of RT, enabling the quantification of weekly ADC changes (ΔADC) (Figure 1). Additionally, these changes could be correlated with RT response, as well as oncological and survival outcomes, providing insights into the clinical implications of DWI in head and neck cancer treatment.
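As an illustration of the weekly change quantity, a minimal sketch follows. It assumes that ΔADC is the difference between a given week's mean ADC and the baseline value, reported alongside a relative change; the text does not state the exact definition, and the function name and sample values are hypothetical.

```python
from typing import List, Tuple


def weekly_delta_adc(mean_adc: List[float]) -> List[Tuple[float, float]]:
    """Return (absolute, percent) ADC change per time point relative to baseline.

    mean_adc[0] is treated as the baseline value; this baseline convention
    is an assumption, not something stated in the dataset description.
    """
    if not mean_adc:
        raise ValueError("at least one ADC value is required")
    baseline = mean_adc[0]
    if baseline <= 0:
        raise ValueError("baseline ADC must be positive")
    changes = []
    for value in mean_adc:
        absolute = value - baseline
        percent = 100.0 * absolute / baseline
        changes.append((absolute, percent))
    return changes


# Hypothetical mean ADC values (in 10^-3 mm^2/s) for baseline and weeks 1-3.
series = [1.0, 1.1, 1.25, 1.5]
for week, (absolute, percent) in enumerate(weekly_delta_adc(series)):
    print(f"week {week}: dADC = {absolute:+.3f}, {percent:+.1f}%")
```

Reporting both the absolute and the relative change keeps the sketch neutral about which form a downstream analysis would prefer.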

Patient Population
In this pilot study, 30
Patients included in this study had to fulfill the following criteria:
1. Diagnosis of HNSCC was pathologically confirmed.
2. Curative-intent intensity-modulated radiation therapy (IMRT), with or without chemotherapy, was received using an MR-linac device.
3. Performance status (PS) was 0-2 according to the Eastern Cooperative Oncology Group (ECOG) score [16].
The following data were gathered utilizing the EPIC electronic medical record system.

Demographic data
The patients' demographic data included age at diagnosis, gender, ethnicity, and smoking status. Smoking status at diagnosis was documented as current smoker, former smoker, or never-smoker, based on the 2023 ICD-10 definitions [18].

Disease-related Data
The initial evaluation of the disease involved a comprehensive history and physical examination. Subsequently, nasopharyngolaryngoscopy was conducted to assess the site and extent of the primary tumor, with biopsies taken from suspicious areas. For better staging of HNSCC, all patients underwent contrast-enhanced CT (CECT), MRI, and/or positron emission tomography-computed tomography (PET/CT) scans of the head and neck. Surgery was performed primarily for diagnostic purposes, typically before RT.
Disease characteristics encompassed tumor laterality, head and neck subsite of origin, tumor histology and grade, tumor stage, and HPV status. TNM (tumor, node, and metastasis) classification was also provided according to the American Joint Committee on Cancer (AJCC), 8th edition [19].
Patients with squamous cell carcinoma (SCC) of the oropharynx, neck metastases without an identifiable primary tumor (carcinoma of unknown primary), or laryngeal SCC in the absence of a smoking history were recommended to undergo HPV testing [20]. Initially, p16 expression was assessed using immunohistochemistry (IHC) for HPV detection. Subsequently, tumors that exhibited positive p16 expression underwent more rigorous HPV-specific detection methods, such as HPV DNA in situ hybridization (ISH) and/or a PCR-based assay. Conversely, tumors that were negative for p16 expression were considered HPV-unrelated and did not require further testing [21]. For the whole cohort, HPV status was categorized as positive, negative, or unknown.

Treatment Data
Details of the RT course included:
• Total radiation dose each patient received, in Gray (Gy).
• Dose per fraction, in Gy.
• Total number of daily radiation treatment fractions.
• Start and end dates of RT.
Systemic treatment eligibility and the choice of treatment regimen were based on factors such as disease extent, PS, and comorbidities. Accordingly, patients with a significant tumor burden and/or large metastatic lymph nodes were commonly prescribed induction chemotherapy (i.e., before the start of the radiation treatment course) and/or concurrent chemoradiation (i.e., during the course of RT). The administration of systemic treatment (for the induction phase and/or the concurrent phase) was reported as a binary variable: "0=yes" or "1=no".

Details of Treatment Technique
Initially, simulation was conducted using a non-contrast CT scan obtained with the patient in the supine, neck-extended position. External room lasers and a scout film were used for patient alignment. A mouth-opening, tongue-depressing stent was inserted to aid positioning and immobilization. Custom thermoplastic masks, a posterior head cradle, and non-custom shoulder pulls were used to ensure a reproducible set-up for radiation treatments. The isocenter for treatment planning was placed superior to the arytenoids, and CT images from the vertex to the carina were obtained.
Subsequently, the patient was brought to the MR-linac (Elekta AB, Stockholm, Sweden) and placed in the treatment position using the previously mentioned immobilization devices from the initial simulation.Axial MRIs were obtained for the region of interest, and these images were then transferred to the Monaco 5.4 (Elekta AB, Stockholm, Sweden) treatment planning system (TPS).
IMRT was delivered using the Elekta Unity system, which combines a 1.5 T MRI system and a gantry positioned around the isocenter. The gantry houses a 7 MV linear accelerator with a flattening-filter-free (FFF) configuration [22]. Blanchard et al. provided a detailed description of the IMRT scheme that was administered to all patients [23].

Clinical Outcomes
Response to treatment was categorized as "0=occurrence of complete remission (CR)" or "1=no CR" based on the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines, version 1.1 [24]. Treatment response was assessed either weekly during RT, using the MR images obtained from the MR-linac system, or after completion of RT.
Confirmation of recurrence required pathological analysis and was denoted as "0=tumor recurrence" or "1=no tumor recurrence". Recurrence was further subcategorized as "local" if it occurred within the same subsite as the primary tumor, "regional" if it occurred in the neck, or "distant" if it occurred outside the head and neck region. Similarly, the patient's vital status was reported as a binary outcome, "0=alive" or "1=dead", as an indicator of overall survival status.
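Because the binary codes do not all follow the same direction (for example, 0 denotes a recurrence but also denotes a living patient), it may help to decode them explicitly before analysis. The following is a minimal sketch, not part of the released dataset: the code-to-label pairs are transcribed from the text, while the column names and the `decode_row` helper are hypothetical and may not match the actual CSV headers.

```python
# Codebook transcribed from the coding described in the text. The column
# names are hypothetical placeholders and may differ from the released CSV.
CODEBOOK = {
    "systemic_treatment": {0: "yes", 1: "no"},
    "treatment_response": {0: "CR", 1: "no CR"},
    "recurrence": {0: "tumor recurrence", 1: "no tumor recurrence"},
    "vital_status": {0: "alive", 1: "dead"},
}


def decode_row(row):
    """Return a copy of one CSV row with coded columns replaced by labels.

    Values other than 0 or 1 (for example, empty cells) become "unknown".
    """
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        if column in row:
            raw = str(row[column]).strip()
            decoded[column] = labels[int(raw)] if raw in {"0", "1"} else "unknown"
    return decoded


example = {"patient_id": "001", "recurrence": "0", "vital_status": "0"}
print(decode_row(example))
```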

Image Details
MRI scans were obtained using the Unity system (Elekta AB, Stockholm, Sweden), based on a
• Total number of scans = 1.
It is noteworthy that we followed the consensus echo-planar imaging (EPI) protocol provided by the MR-linac Consortium [25].

ADC maps
ADC maps were created using b-values of 150 and 500 s/mm², with the exclusion of b=0 images. This approach was chosen to reduce the impact of perfusion on ADC calculations and to align with the guidelines set forth by the MR-linac Consortium [26].
The imaging data were presented in the standardized format of Digital Imaging and Communications in Medicine (DICOM).
The ADC is commonly calculated from DWI data acquired with at least two b-values (b-values = 0, 150, and 500 s/mm²). However, since the b0 image (the non-diffusion-weighted image) is affected by perfusion, an alternative method is used to calculate the ADC. The following is a general overview of the approach:
1. Acquire DWI data: Obtain multiple DWIs with different non-zero b-values.
2. Preprocessing: Apply the necessary preprocessing steps to correct for artifacts and distortions and to reduce noise.
The software and techniques specific to our research and clinical setting are based on the above principles for accurate and reliable ADC calculation.

Target Volume Segmentation
DICOM data (images and RTS files) were anonymized using an in-house Python script that implements the RSNA CTP DICOM Anonymizer software. All DICOM header information and metadata containing protected health information (PHI) were removed or replaced with dummy entries. Notably, patient medical record numbers were mapped to new randomized numeric values, which serve as the anonymized identifiers, and any corresponding dates were mapped to new randomized dates that preserve the relative order of, and time between, the image acquisitions.
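One common way to randomize dates while preserving order and intervals is to shift all of a patient's dates by a single random offset. The sketch below illustrates that idea only; it is not the authors' in-house script, and the function name, the offset range, and the example dates are assumptions.

```python
import random
from datetime import date, timedelta


def shift_dates(dates, rng, max_shift_days=3650):
    """Shift all of one patient's dates backwards by a single random offset.

    Using one offset per patient keeps the order of the dates and the number
    of days between them unchanged.
    """
    offset = timedelta(days=rng.randint(1, max_shift_days))
    return [d - offset for d in dates]


# Hypothetical weekly acquisition dates for one patient.
acquisitions = [date(2021, 3, 1), date(2021, 3, 8), date(2021, 3, 15)]
shifted = shift_dates(acquisitions, random.Random())
for original, new in zip(acquisitions, shifted):
    print(original, "->", new)
```

Drawing a fresh offset for each patient, rather than one offset for the whole cohort, prevents dates from being recovered across patients once a single true date is known.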

Structure Name Cleaning
Because the exported DICOM RTS files contained misspelled or duplicate structure names, the structure names were harmonized using the Pydicom v. 2.2.2 Python library [27].
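The name-matching part of such a harmonization step can be sketched with the Python standard library alone. This is an illustrative assumption about how misspellings might be mapped to a canonical list, not the authors' procedure: the canonical names, the `harmonize_name` helper, and the similarity cutoff are hypothetical, and reading or writing the names in the RTS files is not shown.

```python
import difflib

# Hypothetical canonical structure names; the dataset's actual names may differ.
CANONICAL_NAMES = ["GTVp", "GTVn"]


def harmonize_name(raw_name, canonical=CANONICAL_NAMES, cutoff=0.8):
    """Map a raw structure name to a canonical name, or None if nothing is close.

    Matching ignores case, spaces, and punctuation, then falls back to fuzzy
    matching for simple misspellings.
    """
    lookup = {name.lower(): name for name in canonical}
    key = "".join(ch for ch in raw_name.lower() if ch.isalnum())
    if key in lookup:
        return lookup[key]
    matches = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


for raw in ["gtv_p", " GTVn ", "GTVpp", "Brain"]:
    print(repr(raw), "->", harmonize_name(raw))
```

Names that match nothing are returned as None so they can be reviewed manually instead of being silently renamed.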

Segmentation Data (Structures & ADC maps)
In total, 7537 anonymized DICOM ADC map image files and 200 anonymized RTS files are provided for this dataset. Figure 4 and Figure 5 show the total number of images and structures, respectively, included in this dataset when stratified by timepoint. The folder structure for the anonymized dataset is shown in Figure 6.

Clinical Data
We provide a comma-separated values (CSV) file that includes the clinical, pathological, and demographic data. This file serves as a consolidated resource for exploring the relationship between clinical variables, ADC measurements, and the response of head and neck cancer to treatment. The anonymized ADC maps, segmented structures, and clinical data are available on Figshare; doi: 10.6084/m9.figshare.22766783.

Segmentation
Target volumes were segmented and reviewed by two trained radiation oncologists: DE and ASRM, possessing 9 and 15 years of experience, respectively.

MR-linac
The technical validation of quantitative images acquired from the 1.5 T MR-linac device is a critical step in establishing the reliability and accuracy of these images as biomarkers. This validation is essential to ensure that the acquired quantitative data can be used consistently as reliable indicators or measurements of specific biological characteristics or processes. Several studies have been conducted for this specific purpose [28][29][30].

EPIC (Electronic Medical Record System)
Clinical data (patient and disease characteristics) were manually collected from the University of Texas MD Anderson Cancer Center clinical databases through the EPIC electronic medical record system.

Usage Notes
These data (images and segmentations) are provided in DICOM format, with an accompanying CSV file containing additional clinical information. We invite all interested researchers to download this dataset for research on DWI kinetics in cancer patients receiving RT.

3. ADC calculation: The ADC can be calculated without the b0 image using the following equation:
$$\mathrm{ADC} = \frac{\ln(S_1 / S_2)}{b_2 - b_1}$$
where $S_1$ and $S_2$ are the signal intensities of two diffusion-weighted images acquired with different b-values ($b_1$ and $b_2$, respectively). The natural logarithm (ln) is applied to the ratio of the signal intensities, and the result is divided by the difference in b-values.
4. ADC map generation: Apply the calculated ADC values to create an ADC map, where each pixel represents the ADC value.
5. Data interpretation: Analyze the ADC map to assess tissue characteristics. Lower ADC values indicate restricted diffusion, which may be associated with increased cellularity or tissue pathology, while higher ADC values suggest increased diffusion and decreased tissue density.
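The two-point calculation can be sketched as follows, assuming mono-exponential signal decay S(b) = S0·exp(−b·ADC), which gives ADC = ln(S1/S2) / (b2 − b1) for signals S1 and S2 acquired at b1 and b2. The function names, the nested-list image representation, and the synthetic intensities are illustrative assumptions, not the authors' pipeline; the default b-values of 150 and 500 s/mm² are those stated for the ADC maps.

```python
import math


def two_point_adc(s1, s2, b1=150.0, b2=500.0):
    """ADC in mm^2/s from two diffusion-weighted signals (b-values in s/mm^2)."""
    if b1 == b2:
        raise ValueError("b-values must differ")
    if s1 <= 0 or s2 <= 0:
        return float("nan")  # background or noise-dominated voxel
    return math.log(s1 / s2) / (b2 - b1)


def adc_map(image_b1, image_b2, b1=150.0, b2=500.0):
    """Apply the two-point formula voxel by voxel to two 2D images (lists of rows)."""
    return [
        [two_point_adc(s1, s2, b1, b2) for s1, s2 in zip(row1, row2)]
        for row1, row2 in zip(image_b1, image_b2)
    ]


# Synthetic check: generate signals from a known ADC and recover it.
true_adc = 1.0e-3  # mm^2/s
s0 = 1000.0
s_150 = s0 * math.exp(-150.0 * true_adc)
s_500 = s0 * math.exp(-500.0 * true_adc)
print(two_point_adc(s_150, s_500))
```

Voxels with non-positive signal are returned as NaN rather than zero so that background is not mistaken for strongly restricted diffusion.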