Equating, or correction for between-block effects with application to body fluid LC-MS and NMR metabolomics data sets

Combination of data sets from different objects (for example, from two groups of healthy volunteers from the same population) that were measured on a common set of variables (for example, metabolites or peptides) is desirable for statistical analysis in "omics" studies because it increases power. However, this type of combination is not directly possible if nonbiological systematic differences exist among the individual data sets, or "blocks". Such differences can, for example, be due to small analytical changes that are likely to accumulate over large time intervals between blocks of measurements. In this article we present a data transformation method, which we refer to as "quantile equating", that corrects, per variable, for linear and nonlinear differences in distribution among blocks of semiquantitative data obtained with the same analytical method. We demonstrate the successful application of the quantile equating method to data obtained on two typical metabolomics platforms, i.e., liquid chromatography-mass spectrometry and nuclear magnetic resonance spectroscopy. We suggest uni- and multivariate methods to evaluate similarities and differences among data blocks before and after quantile equating. In conclusion, we have developed a method to correct for nonbiological systematic differences among semiquantitative data blocks and have demonstrated its successful application to metabolomics data sets.
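
As an illustration of the general idea only, the following minimal Python sketch maps one variable's values in one block onto the distribution of a reference block by matching empirical quantiles. It is not the authors' published algorithm: the function names (`empirical_quantile`, `quantile_equate`), the use of a single reference block, the mid-rank handling of ties, and linear interpolation between reference values are all assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def empirical_quantile(sorted_vals, p):
    """Linearly interpolated quantile of an ascending list at probability p in [0, 1]."""
    n = len(sorted_vals)
    if n == 0:
        raise ValueError("reference block must not be empty")
    pos = p * (n - 1)          # fractional index into the sorted values
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def quantile_equate(block, reference):
    """Map each value of one variable in `block` onto the `reference` distribution.

    Hypothetical helper: each value is replaced by the reference value that
    sits at the same empirical quantile. The original sample order is kept.
    """
    src = sorted(block)
    ref = sorted(reference)
    n = len(src)
    equated = []
    for x in block:
        # Mid-rank: tied values share the average of their sorted positions.
        first = bisect_left(src, x)
        last = bisect_right(src, x) - 1
        rank = (first + last) / 2
        p = rank / (n - 1) if n > 1 else 0.5
        equated.append(empirical_quantile(ref, p))
    return equated


# One variable measured in two blocks; block_b is shifted and stretched
# relative to the reference block (made-up numbers for illustration).
reference_block = [1.0, 2.0, 3.0]
block_b = [40.0, 10.0, 20.0]
print(quantile_equate(block_b, reference_block))  # [3.0, 1.0, 2.0]
```

Because a mapping of this kind matches the whole distribution rather than only a mean or a variance, it can remove both linear offsets and nonlinear distortions between blocks. It would be applied separately to each variable, and it presumes that the blocks sample the same underlying population, as in the example of two groups of healthy volunteers above.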

 

Authors: H.H. Draisma, T.H. Reijmers, F.M. van der Kloet, I. Bobeldijk-Pastorova, E. Spies-Faber, J.T. Vogels, J.J. Meulman, D.I. Boomsma, J. van der Greef, T. Hankemeier
DOI: 10.1021/ac902346a
Pages: 2010; 82 (3): 1039–1046
Published in: Analytical Chemistry
Date of publication: January 2010
Status of the publication: Published/accepted