harmonization_clinicalcombat
ComBAT harmonization of clinical MRI data. It is a ComBAT implementation for adapting sites to a reference site, along with ready-to-run scripts to prepare datasets, fit a model, apply the harmonization, and analyze the outputs. Two harmonization methods are available: pairwise (linear) and clinic (non-linear).
Keywords : Harmonization, Pairwise, Clinical, ComBAT
Inputs
Input 1
Format : tuple path(ref_site), path(move_site)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| ref_site | file | CSV of reference site data for one metric, covering all bundles and subjects. Must include the columns sid, site, bundle, metric, mean, age, sex, handedness, disease. The disease column must include at least the label HC. | True | *.{csv,csv.gz} |
| move_site | file | CSV of moving site data for one metric, covering all bundles and subjects. Must include the columns sid, site, bundle, metric, mean, age, sex, handedness, disease. The disease column must include at least the label HC. | True | *.{csv,csv.gz} |
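
The column requirements above can be checked before launching the module. The following is a minimal sketch using only the Python standard library; the helper names `check_site_rows` and `check_site_csv` are hypothetical and not part of the module. It assumes that column names and the HC label are matched case-sensitively and that the metric column should hold a single value per file.

```python
import csv
import gzip
import io

# Columns listed as required in the input table above.
REQUIRED_COLUMNS = ["sid", "site", "bundle", "metric", "mean",
                    "age", "sex", "handedness", "disease"]


def check_site_rows(rows, fieldnames):
    """Return a list of problems found in one site table (empty list = OK)."""
    missing = [c for c in REQUIRED_COLUMNS if c not in (fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    # Assumption: the HC label is matched exactly (case-sensitive).
    if not any(row["disease"] == "HC" for row in rows):
        problems.append("no row labelled HC in the disease column")
    # Each file is described as holding one metric only.
    metrics = {row["metric"] for row in rows}
    if len(metrics) > 1:
        problems.append("more than one metric: " + ", ".join(sorted(metrics)))
    return problems


def check_site_csv(path):
    """Open a .csv or .csv.gz file and run check_site_rows on its content."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return check_site_rows(rows, reader.fieldnames)


# Tiny in-memory example (made-up values, for illustration only).
example = io.StringIO(
    "sid,site,bundle,metric,mean,age,sex,handedness,disease\n"
    "sub-01,siteA,AF_L,fa,0.45,34,F,R,HC\n"
)
reader = csv.DictReader(example)
print(check_site_rows(list(reader), reader.fieldnames) or "no problems found")
```

Running `check_site_csv` on both the reference and the moving file catches a missing column or a missing HC group early, before any model is fitted.
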
Outputs
Format : path(*.model.csv)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| *.model.csv | file | Fitted harmonization model. | True | *.model.csv |
harmonizedsite
Format : path(*.harmonized.csv.gz)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| *.harmonized.csv.gz | file | Harmonized moving site data. | True | *.harmonized.csv.gz |
Format : path(*bhattacharrya.txt)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| *bhattacharrya.txt | file | Bhattacharyya distance QC reports for harmonized data. Each file contains the Bhattacharyya distance between the reference and moving site distributions (one file for pre-harmonization and one for post-harmonization). Lower values indicate better alignment between the two distributions. The first column gives the number of HC subjects used to compute the distance; the other columns correspond to the bundles. The first row contains the bundle names, and the second row contains the HC subject count followed by the Bhattacharyya distance for each bundle. | True | *bhattacharrya.txt |
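
To give a feel for the values reported in these files, here is a minimal sketch of a Bhattacharyya distance between two samples. It assumes each site's values for a bundle are modelled as a one-dimensional Gaussian, which has a closed-form distance; this page does not state how the module itself estimates the distance, so the function `bhattacharyya_gaussian` is illustrative only.

```python
import math
import statistics


def bhattacharyya_gaussian(ref_values, mov_values):
    """Bhattacharyya distance between two samples, each modelled as a 1-D Gaussian.

    Assumption: a Gaussian fit per site; the module may use another estimator.
    """
    mu_r = statistics.fmean(ref_values)
    mu_m = statistics.fmean(mov_values)
    var_r = statistics.variance(ref_values)  # sample variance, needs >= 2 values
    var_m = statistics.variance(mov_values)
    if var_r <= 0 or var_m <= 0:
        raise ValueError("each sample needs a non-zero variance")
    # Term driven by the difference between the two means.
    mean_term = 0.25 * (mu_r - mu_m) ** 2 / (var_r + var_m)
    # Term driven by the mismatch between the two variances.
    var_term = 0.5 * math.log((var_r + var_m) / (2.0 * math.sqrt(var_r * var_m)))
    return mean_term + var_term


# Made-up numbers: the moving site is shifted before harmonization
# and close to the reference afterwards.
reference = [0.42, 0.45, 0.47, 0.50, 0.44]
moving_pre = [0.52, 0.55, 0.57, 0.60, 0.54]
moving_post = [0.43, 0.45, 0.48, 0.50, 0.44]
print(bhattacharyya_gaussian(reference, moving_pre))
print(bhattacharyya_gaussian(reference, moving_post))
```

Under this assumption, the distance is zero for identical distributions and grows as the means or variances drift apart, which matches the "lower is better" reading of the QC report.
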
figures
Format : path(*.png)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| *.png | file | Figures generated to visualize harmonization results. | True | *.png |
plot_data_json
Format : path(*.json)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| *.json | file | JSON files used to properly plot the harmonization results in a downstream MultiQC report. These files contain the regression curves and the percentiles computed before and after harmonization for all bundles of a given metric. The files have the following structure: | True | *.json |
versions
Format : path(versions.yml)
| | Type | Description | Mandatory | Pattern |
|---|---|---|---|---|
| versions.yml | file | File containing software versions | True | versions.yml |
Arguments (see process.ext)
| | Type | Description | Default | Choices |
|---|---|---|---|---|
| method | string | Harmonization strategy to use: clinic (non-linear) or pairwise (linear). | clinic | |
| bundles | list | Subset of bundles to plot. By default, all bundles in the input CSV file are used to compute the harmonization model. | all | |
| limit_age_range | boolean | Drop reference subjects outside the moving site's age range. | disabled | |
| ignore_sex | boolean | Remove the sex variable when estimating the model. Applied automatically if all subjects have the same value. | disabled | |
| ignore_handedness | boolean | Remove the handedness variable when estimating the model. Applied automatically if all subjects have the same value. | disabled | |
| no_empiral_bayes | boolean | Skip the empirical Bayes estimate. | disabled | |
| regul_ref | integer | Ridge penalty applied to the reference regression. Used by the clinic method only. | 0 | |
| regul_mov | string | Moving site penalty, or auto-tuning. Used by the clinic method only. | -1 | |
| degree | integer | Polynomial degree used for age. The default depends on the method (1 for pairwise, 2 for clinic). | None | 1, 2 |
| nu | integer | Variance hyperparameter for the moving site. Used by the clinic method only. | 5 | |
| tau | integer | Covariate hyperparameter for the moving site. Used by the clinic method only. | 2 | |
| degree_qc | integer | QC model degree override (0 reuses the harmonization degree). | 0 | |
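
A few of the arguments above interact: `degree` defaults to a value that depends on `method`, `degree_qc` falls back to the harmonization degree when set to 0, and `limit_age_range` trims the reference site. The sketch below restates those rules in plain Python. The functions `resolve_degrees` and `limit_age_range` are hypothetical helpers written for this illustration, and treating the age bounds as inclusive is an assumption.

```python
# Default polynomial degree per method, as described in the arguments table.
DEFAULT_DEGREE = {"pairwise": 1, "clinic": 2}


def resolve_degrees(method="clinic", degree=None, degree_qc=0):
    """Return (harmonization_degree, qc_degree) following the table's rules."""
    if method not in DEFAULT_DEGREE:
        raise ValueError(f"unknown method: {method!r}")
    # degree=None means "use the default for the chosen method".
    harmonization_degree = DEFAULT_DEGREE[method] if degree is None else degree
    # degree_qc=0 means "reuse the harmonization degree".
    qc_degree = harmonization_degree if degree_qc == 0 else degree_qc
    return harmonization_degree, qc_degree


def limit_age_range(ref_rows, mov_rows):
    """Keep reference rows whose age lies within the moving site's age range.

    Assumption: both bounds are inclusive.
    """
    mov_ages = [float(row["age"]) for row in mov_rows]
    low, high = min(mov_ages), max(mov_ages)
    return [row for row in ref_rows if low <= float(row["age"]) <= high]


print(resolve_degrees("pairwise"))
print(resolve_degrees("clinic", degree_qc=1))
```

Keeping the fallback logic in one place makes it easier to see which degree the QC model ends up using when only `method` is set.
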
| | Description | DOI |
|---|---|---|
| clinical-ComBAT | Method for harmonizing MRI data across different sites. | 10.48550/arXiv.2511.04871 |
Authors
Last updated : 2026-02-12