Hello,
I've been using REGENIE to perform a GWAS of a continuous trait 'pheno' in 50k EUR ancestry UK Biobank participants (relatedness allowed) using the Genomics England (GEL) imputed genotypes after sample- and variant-level QC. Imputed variant filters are permissive (INFO > 0.3 only, MAC >=5) to allow for post-GWAS filtering.
The phenotype has been rank-based inverse normal transformed, so it follows a standard normal distribution (mean = 0, SD = 1).
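For clarity, here is a minimal sketch of what a rank-based inverse normal transform does. It assumes the common Blom offset (c = 3/8) and average ranks for ties; the function name `rank_inverse_normal` is only illustrative, and the exact offset used for 'pheno' may differ.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform.

    Assumption: Blom offset (c = 3/8) by default; ties get their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n

    # Assign 1-based ranks, averaging over runs of tied values.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Map each rank to a quantile of the standard normal distribution.
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


if __name__ == "__main__":
    print(rank_inverse_normal([3.0, 1.0, 2.0]))
```

The transformed values depend only on the ranks of the input, so the result is unaffected by the units of the original phenotype.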
The model includes the following covariates: age, sex, PCs 1-4, 3 dummy variables (for a 4-level categorical imaging site covariate), genotyping array (binary) and an estimate of intracranial volume (in mm3, ranging from about 1,000,000 to 2,000,000). This model yields lambda GC = 1.24 (Step 1 Rsq = 0.0425).
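For reference, the lambda GC values quoted here follow the usual definition: the median of the 1-df chi-square test statistics divided by the median of the chi-square distribution with 1 degree of freedom (about 0.455). A minimal sketch, assuming the step 2 p-values have already been read into a list (the function name `lambda_gc` is illustrative and not part of REGENIE):

```python
from statistics import NormalDist, median


def lambda_gc(pvalues):
    """Genomic inflation factor from two-sided p-values in the open interval (0, 1].

    Each p-value is converted to a 1-df chi-square statistic via the
    squared normal quantile, then the median is compared with the
    expected median under the null.
    """
    nd = NormalDist()
    chisq = [nd.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    expected_median = nd.inv_cdf(0.75) ** 2  # median of chi-square(1), ~0.455
    return median(chisq) / expected_median


if __name__ == "__main__":
    # Evenly spaced p-values mimic a null distribution, so lambda is close to 1.
    null_like = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(lambda_gc(null_like), 3))
```

If the results file stores -log10(p) rather than p, each value would need converting with `10 ** (-x)` first.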
When I convert the volume to cm3 by dividing by 1000, keeping the same parameters and other covariates, lambda GC drops to 1.16 (Step 1 Rsq = 0.0650). I get similar results (lambda GC = 1.17; Step 1 Rsq = 0.0642) when removing the volume covariate from the model. I also get the same lambda GC when:
- z-scoring the volume covariate first (mean = 0, SD = 1) without changing other covariates
- using the first 10 PCs instead of the first 4 PCs, with the volume still in mm3 and the other covariates unchanged
The selected Rsq from step 1 is around 0.06 for these three models (cm3, z-scored, 10 PCs), whereas the problematic model with the covariate in mm3 yields Rsq = 0.04.
Is it possible that the large values of the covariate in mm3 cause numerical fitting issues in step 1?
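My reasoning for suspecting a numerical issue: dividing a covariate by a constant or z-scoring it does not change the space spanned by the covariates, so in exact arithmetic the covariate-adjusted phenotype and genotypes should be identical. The sketch below does not reproduce REGENIE's step 1; it only illustrates, on simulated values (not my data), how a covariate on the mm3 scale inflates the condition number of a small design matrix compared with the cm3 and z-scored versions. It assumes NumPy is available, and all names and distribution parameters are illustrative.

```python
import numpy as np


def design_condition_number(age, volume):
    """Condition number of the design matrix [intercept, age, volume]."""
    X = np.column_stack([np.ones_like(age), age, volume])
    return np.linalg.cond(X)


rng = np.random.default_rng(0)
n = 5000

# Simulated covariates (assumed distributions, for illustration only).
age = rng.normal(60.0, 8.0, size=n)
vol_mm3 = rng.normal(1.5e6, 1.5e5, size=n)  # roughly 1,000,000 to 2,000,000

# The two rescalings tried in the analyses above.
vol_cm3 = vol_mm3 / 1000.0
vol_z = (vol_mm3 - vol_mm3.mean()) / vol_mm3.std()

cond_mm3 = design_condition_number(age, vol_mm3)
cond_cm3 = design_condition_number(age, vol_cm3)
cond_z = design_condition_number(age, vol_z)

print(f"mm3:      {cond_mm3:.3g}")
print(f"cm3:      {cond_cm3:.3g}")
print(f"z-scored: {cond_z:.3g}")
```

A larger condition number means more precision can be lost when solving the regression, which would be consistent with the mm3 model behaving differently from the rescaled ones.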
I've attached the log files from the analysis with the volume in mm3 for details (step 1 and step 2 for chr4).
Thanks in advance.