Abstract #M64

# M64
Approximate generalized least squares method for large-scale genome-wide association study.
L. Ma¹, J. Jiang¹, D. Prakapenka², J. Cole³, Y. Da*²; ¹University of Maryland, College Park, MD; ²University of Minnesota, Saint Paul, MN; ³USDA/ARS, Beltsville, MD.

Using genomic relationships among individuals is an effective way to correct for population stratification in genome-wide association studies (GWAS), but the matrix inversion required for statistical testing of SNP effects limits the sample size that can be analyzed by GWAS methods based on relationship matrices. We propose an approximate generalized least squares (AGLS) method for GWAS using large samples. AGLS uses the mixed-model result that the least squares (LS) solution for fixed SNP effects is the GLS solution, i.e., the best linear unbiased estimate, if the best linear unbiased prediction of polygenic effects is removed from the phenotypic observations. Because the LS method is computationally efficient, no sample size limitation is expected for this method in the foreseeable future, even though dairy genomic and phenotypic data are growing at a fast pace. Combined with a previous method and computing tool for epistasis testing, the AGLS method can test and estimate additive, dominance and epistasis effects, and can estimate allelic and genotypic effects, in large-scale GWAS. AGLS was compared with BOLT-LMM, which is capable of large-scale GWAS for testing additive effects. In a sample of 294,079 cows, AGLS and BOLT-LMM identified the same significant additive effects with only minor differences. For the same sample, GWAS without polygenic correction lacked sensitivity, i.e., different chromosomes and different SNP within each chromosome had similar effects, except for SNP in and around the DGAT1 gene on chromosome 14. These results showed that polygenic correction is necessary for large-scale GWAS and that AGLS is an efficient and versatile method for large-scale GWAS analysis, especially in dairy cattle, where the polygenic animal effect is routinely estimated.
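
The core idea described above (subtract the predicted polygenic effect from each phenotype, then fit ordinary least squares for each SNP) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `agls_additive_scan`, the 0/1/2 genotype coding, the single-SNP additive model with an intercept, and the assumption that polygenic predictions are already available from a separate evaluation are all illustrative choices made here, and the sketch covers additive effects only.

```python
import numpy as np

def agls_additive_scan(y, polygenic_blup, genotypes):
    """Hypothetical helper: per-SNP least squares on adjusted phenotypes.

    y              : (n,) phenotypic observations
    polygenic_blup : (n,) predicted polygenic effects (assumed precomputed)
    genotypes      : (n, m) additive genotype codes, assumed 0/1/2 with
                     no monomorphic SNP (otherwise sxx would be zero)
    Returns (effects, t_stats), each of shape (m,).
    """
    n = y.shape[0]
    # Remove the polygenic prediction from the phenotypes.
    y_adj = y - polygenic_blup
    # Centering absorbs the intercept of each single-SNP regression.
    y_c = y_adj - y_adj.mean()
    x_c = genotypes - genotypes.mean(axis=0)
    sxx = (x_c ** 2).sum(axis=0)           # per-SNP sum of squares
    sxy = x_c.T @ y_c                      # per-SNP cross products
    effects = sxy / sxx                    # LS slope for each SNP
    rss = (y_c ** 2).sum() - effects * sxy  # residual sum of squares
    se = np.sqrt(rss / (n - 2) / sxx)      # standard error of each slope
    return effects, effects / se
```

Because every step is a column-wise sum or a single matrix-vector product, the cost grows linearly with the number of individuals and SNPs, and no relationship matrix is formed or inverted inside the scan.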

Key Words: genome-wide association study (GWAS), SNP, generalized least squares