
harmonization_combat2stats

Formats the CSV stats files produced by combat into TSV files that MultiQC and other downstream tools can consume easily. The combat module currently outputs one CSV file per metric, each with the following columns: “sid,site,bundle,metric,mean,age,sex,handedness,disease”.

This module reformats these CSV files into TSV files with the following columns: “sid,roi,<covariates>,metric1,metric2,…”, and aggregates all metrics from the same site into a single TSV file per site.
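The reshaping described above is a long-to-wide pivot. The following is a minimal sketch of that idea with pandas, not the module's actual code: the sample rows, the metric names (`fa`, `md`) and the variable names are hypothetical, and the renaming of `bundle` to `roi` is assumed from the input and output column lists given above.

```python
import io

import pandas as pd

# Hypothetical contents of two per-metric CSV files, using the documented columns.
fa_csv = """sid,site,bundle,metric,mean,age,sex,handedness,disease
sub-01,siteA,AF_L,fa,0.45,30,F,R,HC
sub-02,siteA,AF_L,fa,0.47,41,M,L,HC
"""
md_csv = """sid,site,bundle,metric,mean,age,sex,handedness,disease
sub-01,siteA,AF_L,md,0.00071,30,F,R,HC
sub-02,siteA,AF_L,md,0.00069,41,M,L,HC
"""

# Stack the per-metric files into one long table.
long_df = pd.concat(
    [pd.read_csv(io.StringIO(text)) for text in (fa_csv, md_csv)],
    ignore_index=True,
)

# Columns that identify a row and must not be pivoted (subject, ROI, covariates).
index_cols = ["sid", "bundle", "site", "age", "sex", "handedness", "disease"]

# One column per metric, filled with the values of the "mean" column.
wide_df = (
    long_df.pivot_table(
        index=index_cols, columns="metric", values="mean", aggfunc="first"
    )
    .reset_index()
    .rename(columns={"bundle": "roi"})  # assumed renaming, see lead-in
)
wide_df.columns.name = None  # drop the leftover "metric" axis label

print(wide_df.to_csv(sep="\t", index=False))
```

Each subject and ROI ends up on a single row with one column per metric, which is the layout a tabular report expects.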

Keywords : harmonization, combat, stats, format, tsv, csv, qc


Format : path(harmonized_stats)

| Name | Type | Description | Mandatory | Pattern |
|------|------|-------------|-----------|---------|
| harmonized_stats | file | CSV stats files to be reformatted after harmonization. Each CSV should have the following columns: “sid,site,bundle,metric,mean,<covariates>”. These input files are the output of the harmonization modules. | True | *.csv |

Format : path(*.tsv)

| Name | Type | Description | Mandatory | Pattern |
|------|------|-------------|-----------|---------|
| *.tsv | file | Harmonized stats files in TSV format (MultiQC-ready format). | True | *.tsv |

Format : path(versions.yml)

| Name | Type | Description | Mandatory | Pattern |
|------|------|-------------|-----------|---------|
| versions.yml | file | File containing software versions | True | versions.yml |

| Name | Type | Description | Default | Choices |
|------|------|-------------|---------|---------|
| covariates | list | List of columns that are not considered metrics in the input tabular files. These columns are kept unchanged (apart from being renamed), while the other columns (i.e. the metrics and their values) are pivoted or unpivoted. | ['sample', 'roi', 'site', 'age', 'sex', 'handedness', 'disease'] | |
| value_col_name | string | Name of the column containing the metric values in the CSV files. | mean | |
| metric_col_name | string | Name of the column containing the metric names in the CSV files. | metric | |
| suffix | string | Suffix to use for the output files based on the operation performed. The output files will be named “site.<suffix>.tsv”. | harmonized | |
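As an illustration of how `suffix` could drive the per-site output naming, here is a hedged sketch. It assumes that `site` in “site.<suffix>.tsv” stands for the site identifier found in the data; the helper `write_per_site`, its parameters and the sample rows are hypothetical and not part of the module.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_per_site(wide_df, out_dir, suffix="harmonized", site_col="site"):
    """Write one TSV per site, named <site>.<suffix>.tsv (assumed naming)."""
    out_dir = Path(out_dir)
    written = []
    # Grouping by a single column name yields (site value, sub-table) pairs.
    for site, site_df in wide_df.groupby(site_col):
        out_path = out_dir / f"{site}.{suffix}.tsv"
        site_df.to_csv(out_path, sep="\t", index=False)
        written.append(out_path)
    return written


# Hypothetical wide table, already pivoted to one column per metric.
wide_df = pd.DataFrame(
    {
        "sid": ["sub-01", "sub-02", "sub-03"],
        "roi": ["AF_L", "AF_L", "AF_L"],
        "site": ["siteA", "siteA", "siteB"],
        "fa": [0.45, 0.47, 0.44],
        "md": [0.00071, 0.00069, 0.00073],
    }
)

out_dir = Path(tempfile.mkdtemp())
paths = write_per_site(wide_df, out_dir, suffix="harmonized")
names = sorted(p.name for p in paths)
print(names)
```

Passing a different `suffix` would only change the middle part of each file name; the table contents stay the same.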

| Name | Description | DOI |
|------|-------------|-----|
| pandas | Python package used to manipulate the input and output tabular data for this module. | |


Last updated : 2026-02-12