Phenotype standardization is a prerequisite for enabling and promoting data sharing and future joint research in Europe. Efficient, valid pooling of data and samples across multiple biobanks and cohorts is particularly important when statistical power is critical: for example, when exploring the joint effects (interactions) of genes and the social/physical environment, and/or when aetiological heterogeneity requires that the participants whose data and samples are to be used be sub-sampled on the basis of demography, exposure history or phenotypic subtype (e.g. thrombotic versus haemorrhagic stroke).
But even when the research focus is on simple main effects, practical experience to date demonstrates that many of the most important replicable findings from genome-wide association studies have been generated by collaborative pooled analysis. When pooled analysis is required, users must have access to, and advice on, the approaches they may adopt to ensure that apparently similar data and samples from different studies can reasonably be pooled. Specifically, it is important to ensure that the data from the various sources really are inferentially comparable in the particular scientific setting in which they are to be used.
Within the European Biobanking and Biomolecular Research Infrastructure - Large Prospective Cohorts (BBMRI-LPC) project, two work packages are in charge of phenotype standardization:
WP5 on ‘Exposure Data Harmonization’, led by Dr Nadia Slimani from the International Agency for Research on Cancer (IARC), and WP6 on ‘Clinical Endpoint Data Harmonization’, led by Prof Elio Riboli from Imperial College London. The tasks and activities of WP5 and WP6 are closely synchronised.