Odontogenesis, the process of tooth development, is a complicated process that starts early in embryogenesis (1, 2). Mouse and human models indicate that interactions between the epithelium and mesenchymal cells during odontogenesis occur under the control of different families of signaling molecules and their receptors, including transforming growth factors (TGF), bone morphogenetic proteins (BMP), fibroblast growth factors (FGF), epidermal growth factor (EGF), and the hedgehog (Hh) and wingless (Wnt) families (3, 4). There are several known genes that regulate the communication and interaction between epithelium and mesenchymal cells, including MSX1, MSX2, PAX9, RUNX2, AXIN2, EDA, GLI2, and GLI3. Disturbances during the signaling process or changes in any of the regulating genes may result in dental anomalies, including changes in tooth development, structure, number, size, and morphology (5, 6).

Structural dental anomalies, such as tooth agenesis, enamel hypoplasia, impaction, rotation, displacement, mamelons, and supernumerary teeth (Supplementary Table 1 provides definitions for the dental anomalies used in this study) are presumed to be caused by interactions between genetic, epigenetic, and environmental factors during the process of tooth development (7). There is evidence that these dental defects tend to co-occur in the same patient, which could indicate a shared genetic etiology (8, 9). Although some dental anomalies might be asymptomatic, these anomalies can also lead to serious clinical problems, including delayed eruption or impaction of the teeth; temporomandibular joint pain and dysfunction; malocclusion; periodontal disease due to excessive occlusal force; and increased susceptibility to dental caries due to defects in tooth structure and/or crowding (10).

Different types of studies have taken advantage of next generation sequencing technologies to investigate the causes of genetic diseases and anomalies that affect the craniofacial complex. However, those studies to date have mostly focused on one disease or anomaly without taking into account that some genetic variants could cause multiple different anomalies. Therefore, the goal of this study was to better understand the inter-relationships between multiple structural dental anomalies and dental caries on a phenotypic level by identifying patterns of correlations among traits. We then conducted a multivariate genome wide association study (GWAS) to identify common variants that impact clusters of dental traits identified from these observed patterns of correlation.

Methods and Materials


The cohort for this study is part of the Pittsburgh Orofacial Clefts Study (POFC) which studies risk factors for orofacial cleft birth defects (OFCs). Participants for POFC were recruited from multiple cleft centers in the United States, including Colorado, Iowa, Pennsylvania, Texas, and Puerto Rico, and internationally, from Argentina, the Philippines, Colombia, Guatemala, and Hungary. Institutional review board (IRB) approval was obtained at each site by the appropriate IRB process and committee, with a coordinating IRB at the University of Pittsburgh (IRB 0405013). The same data collection protocols were used for every site.

The total sample used for this study (n = 3,579, ages 8–82) included only OFC-unaffected individuals: 1,392 control individuals, with no personal nor family history of craniofacial anomalies, and 2,187 unaffected relatives of cases with OFCs, including case parents, siblings and spouses. Excluded were participants 7 years of age or younger; any case affected with an overt OFC, edentulous participants, or participants having any facial trauma or surgery. Table 1 summarizes the basic descriptive statistics for the participants involved in our study.

Table 1. POFC Cohort (1,505 male, 2,074 female).

Data regarding the participant’s dental history, including dental extractions and orthodontic treatment, were collected from all participants by self-report. In addition, each participant had an in-person dental exam and/or extensive intraoral photos taken. The reliability of data from intraoral photos was compared to in-person dental exams in a subset of individuals with both dental exams and photos, prior to the start of data collection. Both the Intra-rater reliability and Inter-rater reliabilities were excellent (anomalies kappa = 0.91–0.95; caries kappa = 0.91–0.99) (11).

Data Collection and Genotyping

All information regarding the data collection process, DNA collection, genotyping and quality control are summarized in the Supplementary Materials.

Structural Dental Anomalies and Dental Caries

For the current study we investigated enamel hypoplasia, microdontia, rotated and displaced teeth, supernumerary teeth, tooth agenesis, mamelons, and dental caries, all as binary (“yes/no”) traits.

Each of these traits was assigned a binary value of yes in each subject if there was at least one instance of a trait across the dentition. Then the prevalence of each trait was averaged over the total study for unaffected relatives and controls separately. We tested if there was a significant difference in the prevalence values between unaffected relatives and controls. Further, sex differences were evaluated for each dental anomaly and for dental caries. Lastly, we tested for correlations amongst the dental anomalies and dental caries.

Statistical Methods

All descriptive and comparative statistical analyses were performed using the R v3.4.1 statistical analysis environment (12). The prevalence of dental anomalies and dental caries, and comparisons between these traits were performed using χ2-tests. The Spearman correlation coefficient was used to evaluate the associations between the traits, with a P < 0.05 was considered as significant correlation.

We utilized a multivariate GWAS approach in this study, the multivariate test of association implemented in PLINK (MV-PLINK) (13), with adjustment for age, age2, sex, subject types (Control or unaffected relatives), site, and principal components (PCs) of genetic ancestry. Principle components (PCs) can be used to help clarify the differences among the sample participants in the genetic data (14). MV-PLINK uses canonical correlation analysis (CCA), a multivariate generalization of the Pearson product-moment correlation, in order to measure the association between sets of variables. CCA extracts the linear combination of traits that explain the largest possible amount of the covariation between the genetic variants and all traits. The CCA method implemented in MV- PLINK is equivalent to multivariate analysis of variance and has been shown to outperform other methods (15). Wilks’ lambda was used to test the significance of the canonical correlations. Wilk’s lambda (F) corresponds to the linear combination of traits with maximum correlation with the genetic variant; the correlation coefficients (Weights) for each individual trait indicate the contributions of each trait to the association result (13). Because our data comes from a large family study, we incorporated permutation testing implemented in MV-PLINK to correct for family structure.

Possible genomic inflation was assessed by calculating the genomic inflation factor, lambda (λ), visualized in a quantile-quantile (QQ) plot in R. In addition, we used R to create Manhattan plots to visualize the association results. Single nucleotide polymorphisms (SNP variants) with minor allele frequencies lower than 5%, as well as SNPs with genotyping call rates <10% were filtered out. The threshold for genome-wide significance was set to p ≤ 5 × 10−8 (Bonferroni correction for a million tests), and for suggestive significance to p-values between 10−5 and 5 × 10−8. The top associated loci were then annotated.

We investigated all the genes within ±500 kilobases (16) of the top association signals (index SNPs) for putative connections to dental development or specific dental anomalies. To visualize regions showing genome-wide significance and regions showing suggestive significance, we used the regional plots generated by LocusZoom. For investigation of functionally, genes were investigated using the resources such as the Gene and PubMed databases at the National Center for Biotechnology Information (NCBI). To examine the genes near the association signals, we searched these multiple databases including OMIM, UCSC, Genome Browser, EMAGE, Ensembl, ENCODE and the Mouse Genome Informatics database by using terms including the name of the gene or corresponding protein, plus terms relevant to dental development and dental anomalies.


Summary of Dental Caries and Dental Traits in the Study Cohort

The study cohort included 1,505 males (42.05%), and 2,074 females (57.95%) with an age range of 8–82 years and a mean age of 31 years (Table 1), with 991 participants (27.69%) from U.S recruitment sites, and 2,588 participants (72.31%) from international sites. The distribution by age and recruitment site is summarized in Supplementary Table 2.

The distribution of the different dental anomalies is presented in Supplementary Figure 1. Rotation (70.80%) and Displacement (40.77%) were the most prevalent dental anomalies in the study cohorts. Females had significantly more rotation (P = 6.478−09), and displacement (P = 1.003−06) than males in the cohort (Table 2). As shown in Table 3, there were some differences in anomaly incidence between the OFC relatives and the unrelated controls, however these differences were not statistically significant after correcting for multiple testing.

Table 2. Distribution of dental anomalies by sex.

Table 3. Distribution of dental anomalies by subject type.

Dental Caries

Previous studies indicated that relatives of individuals with clefts do not have an increased risk of dental caries when compared to the general population (1719). Our study showed that there was a trend for higher rates of dental caries among unaffected relatives vs. controls (DFT/dft percentage = 62.58 vs. 37.42%, P = 0.0018).

There was also a trend for an increased incidence for dental caries among females vs. males (DFT/dft percentage =1.13 vs. 1.54, P = 0.002177) across the different POFC sites. Higher dental caries prevalence among females when compared to males has been reported before (20, 21). Hypothesized reasons include (a) earlier dental eruption in females, leading to a longer exposure to a cariogenic oral environment, (b) hormonal changes during pregnancy, (c) higher intake of snacks that might contain high sucrose amongst pregnant women, (d) and behavioral factors such as poor oral hygiene (22, 23). In addition, previous studies hypothesized that sex differences in dental caries could be explained by the differential effects of genes that influence dental caries risk (24, 25).

Dental Anomalies

We observed that unaffected relatives displayed more rotation (P = 0.0016) than controls. We also observed a trend for an increased rate in unaffected relatives for impaction (80.49 vs. 19.51%, P = 0.01), supernumerary teeth (84.62 vs. 15.38%, P = 0.014), and dental caries (78.40 vs. 62.58%, P = 0.0018) compared with controls.

Sex differences for all the dental anomalies plus dental caries as a binary trait, were tested. Females displayed significantly more rotation (P = 6.478E−09) and displacement (P = 1.003E−06) when compared to males. The tendency to develop more rotation and displacement in females could be explained by previous observations that females have different characteristics of the dentition (differences in the dentin, enamel, and mesio-distal teeth width) when compared to males (22, 26). A previous study by our group investigated the sex differences in certain dental anomalies within this cohort (POFC) and no significant sex difference were found, however note that tooth rotation and displacement were not included as separate traits in that previous study (11), they were both treated as one trait (malposition) and were excluded from the overall analysis.

Multivariate Patterns of Dental Traits

Based on the Spearman rank correlation coefficient, shown in Table 4, we found statistically significant evidence of correlation between the following groups of anomalies: tooth agenesis, impaction, and rotation (AIR); enamel hypoplasia, displacement, and rotation (HDR); and displacement, rotation, and mamelons (DRM). Further, dental caries was correlated with tooth agenesis and enamel hypoplasia (CAH). Notably, 84.4% of participants who had enamel hypoplasia had dental caries (Supplementary Table 3). However, note that the magnitude of most of the correlation coefficients indicate relatively weak correlation (i.e., we analyzed each of the four correlated groups of traits using MV-PLINK).

Table 4. Correlation results between the different dental anomalies.

A total of 5,802,671 SNPs was available for analysis after applying quality control criteria. Although no SNPs reached genome-wide significance (P < 5 × 10−8) for any of the four correlated groups of traits (AIR, HDR, DRM, and CAH) there were several regions of the genome that showed suggestive significance (P ≤ 10−5). Association results for the top SNPs for the four correlation groups are shown in Table 5 and Figure 1. The suggestive associations for each of the correlated groups are summarized in the following sections.

Table 5. Top GWAS hits for the correlated patterns of dental anomalies and dental caries.

Figure 1. Manhattan plots and (Q-Q) plots showing multivariate GWAS results for the correlated dental anomalies patterns and dental caries. In the Manhattan plot the red lines represent thresholds for genome-wide significance (p < 5 × 10−8). Blue lines represent thresholds for suggestive significance (p < 5 × 10−6) and it shows negative log10-transformed p-values (y-axis) across the whole genome (x-axis). Genotyped and imputed SNPs are plotted together. (A) On the left is the Manhattan plot for Tooth Agenesis, impaction and Rotation (AIR), on the right is the quantile–quantile plot (Q-Q), and the genomic inflation factor (λ) is 0.87. (B) On the left is the Manhattan plot for Hypoplasia, displacement and rotation (HDR), on the right is the quantile–quantile plot (Q-Q), and the genomic inflation factor (λ) is 0.87. (C) On the left is the Manhattan plot for Displacement, Rotation and Mamelon (DRM), on the right is the quantile–quantile plot (Q-Q), and the genomic inflation factor (λ) is 0.82. (D) On the left is the Manhattan plot for Dental caries, Agenesis and Hypoplasia (CAH), on the right is the quantile–quantile plot (Q-Q), and the genomic inflation factor (λ) is 1.09.

Tooth Agenesis, Impaction and Rotation

There were 100 SNPs showing suggestive associations with AIR, of which five were in regions with possible relevance to dental development, summarized below with regional plots shown in Figure 2.

Figure 2. Regional Association Plots from (AIR) results. LocusZoom were used to generate plots for regions of Interest. (A) Suggestive locus near ADAMTS9 and PRICKLE2 on chromosome 3. (B) suggestive locus near ARNT2 gene on chromosome 15. (C) Suggestive locus near NTN1 gene on chromosome 17. (D) Suggestive locus near TRPC4 on chromosome 13. (E) Suggestive locus near WNT2B gene on chromosome 1. The genome build used for the recombination rate was based on 1,000 Genomes November 2014 EUR data. All of the gene positions and directions of transcription are annotated on the plots.

Chr3:64541255 (P = 1.00E-06) is located in an intron of ADAMTS9, which codes a protein located in the extracellular matrix (ECM) and interacting tightly with ECM proteins. ADAMTS9 may modify the extracellular environment, influencing proliferation and survival of neural crest-derived cells (27). In mice, Adamts9 expression is seen in developing craniofacial structures such as teeth and mandible (28). In addition, this variant is also ~300 kb upstream of PRICKLE2, a planar cell polarity protein involved in amelogenesis. In rats, Prickle2 in rats is expressed in the differentiating inner enamel epithelial cells and early inner enamel-secretory ameloblasts (29).

Variant rs140220410 (P = 3.00E-06) is <100 kb downstream from ARNT2, a transcriptional regulator involved in several biological functions, including regulation of developmental genes. In mouse, Arnt2 was expressed in the molar and incisor teeth, the odontoblasts, and both inner and outer enamel epithelium (30).

Variant rs9913511 (P = 3.00E-06) is intronic to NTN1 which encodes a protein in the family of laminin-related secreted proteins. Although little is known about this gene with respect to dental development, it affects development of the craniofacial region in animal models (31), mediating a critical step in palatal fusion in mouse embryo (32).

Variant rs2251904 (P = 4.00E-06) is downstream of TRPC4 which encodes a protein involved in multiple biological processes, including neurotransmitter release and cell proliferation. Notably, in rats Trpc4 was highly expressed in the rat dental follicle and stellate reticulum cells during the early stage, and moderately expressed in odontoblasts (33).

Variant rs1838002 (P = 6.00E-06) is upstream of WNT2B, one of the WNT (wingless-type MMTV integration site) family of signaling factors that play an important role in human development, including the differentiation and proliferation of cementoblasts and odontoblasts (34).

Enamel Hypoplasia, Displaced, Rotation

There were 105 SNPs showing suggestive associations with HDR of which four were in regions with possible relevance to dental development, summarized below (regional plots shown in Figure 3).

Figure 3. Regional Association Plots from (HDR) results. LocusZoom were used to generate plots for regions of Interest. (A) Suggestive locus near GLIS3 on chromosome 9. (B) Suggestive locus near ROR2 and OMD genes on chromosome 9. (C) Suggestive locus near WDR72 gene on chromosome 15. (D) Suggestive locus near BMP7 on chromosome 20. The genome build used for the recombination rate was based on 1,000 Genomes November 2014 EUR data. All of the gene positions and directions of transcription are annotated on the plots.

Variant rs12379966 (P = 1.00E-06) is <100 kb upstream from the zinc finger protein GLIS. The encoded protein of GLIS3 regulates and improves osteoblast differentiation by acting interdependently with BMP2 and Shh. In addition, GLIS3 promotes an increase in FGF18 expression during osteoblast differentiation (35). GLIS3 has not been studied in tooth development.

Variant rs6479408 (P = 1.00E-06) is downstream from ROR2, an orphan tyrosine kinase that mediates Wnt5a-initiated non-canonical signaling and Wnt5a-inhibition. Wnt canonical signaling, required during the growth, patterning, and differentiation of teeth. Ror2 expression has been observed during the development of teeth in mice, with tooth developmental retardation seen in Ror2 mutant mice (36). This variant is also located upstream to the Osteomodulin gene (OMD), a member of the small leucine-rich proteoglycan family distributed in the extracellular matrix (ECM). OMD is expressed in the polarized odontoblasts and alveolar bone during early crown formation and plays an essential role in modulating the osteo/odontoblastic differentiation of human dental pulp stem cells (37).

Variant rs141429354 (P = 6.00E-06) is downstream of WDR72, a protein coding gene is essential during the maturation phase of amelogenesis for normal formation of the enamel (38). Wdr72 expression was detected in the enamel organ of mouse incisors, and knockout mice (Wdr72−/−) have defects in enamel maturation (39). Mutations in this gene are associated with the hypomaturation seen in amelogenesis imperfecta (38).

Variant rs404727 (P = 8.00E-06), is upstream to BMP7, which encodes a secreted ligand of transforming growth factor-beta (TGF-beta) a superfamily of proteins that play a role in ectopic bone formation and odontogenesis. Bmp7 is highly expressed during tooth development in mice; Bmp7 knock-out mice had morphological and functional changes in teeth (40). In studies of human dental mesenchyme BMP7 was expressed in the late bell-stage dental papilla and thus might play a role in inducing the odontogenic differentiation of human dental pulp stem cells (41).

Displaced, Rotation, Mamelons

There were 100 SNPs showing suggestive associations with DRM of which four were in regions with possible relevance to dental development, summarized below (regional plots shown in Figure 4).

Figure 4. Regional Association Plots from (DRM) and (CAH) results. LocusZoom were used to generate plots for regions of Interest. (A) Suggestive locus near GLIS3 on chromosome 9. (B) Suggestive locus near WDR72 gene on chromosome 15. (C) Suggestive locus near ROR2 and OMD on chromosome 9. (D) Suggestive locus near ROBO2 on chromosome 3. (E) Suggestive locus near SMAD2 on chromosome 18. (F) Suggestive locus near REL on chromosome 2. (G) Suggestive locus near IL1A and IL1B on chromosome 2. (H) Suggestive locus near MSX2 on chromosome 5. (I) Suggestive locus near ROBO1 on chromosome 3. The genome build used for the recombination rate was based on 1,000 Genomes November 2014 EUR data. All of the gene positions and directions of transcription are annotated on the plots.

Three of the variants were also suggestively associated with the HDR pattern, not surprisingly given that these two correlation groups share two traits (see the HDR section for detailed summaries). Variant rs10511451 (P = 1.00E-06 for DRM), is upstream of the zinc finger protein GLIS3. One of the variants suggestively associated with HDR, variant rs12379966 (P = 1.00E-06), was also located upstream to GLIS3 gene. Variant rs141429354 (P = 3.00E-06 for DRM, p = 6.00E-06 for HDR) is downstream of WDR72; Variant rs6479408 (P = 5.00E-06 for DRM and p = 1.00E-06 for HDR) is near ROR2.

Variant rs174814 (P = 9.00E-06 for association with DRM) is located ~100 kb upstream of ROBO2 which encodes a protein that acts as a cell receptor for slit2 which plays a role in cell migration (42). In a gene set enrichment analysis study, ROBO2 was listed as “potentially” associated with dental traits (43).

Dental Caries, Tooth Agenesis, Enamel Hypoplasia

There were 149 SNPs showing suggestive associations with CAH of which six were in regions with possible relevance to dental development, summarized below (regional plots shown in Figure 4).

Variant rs79577009 (P = 2.87E-07) is located intronic to SMAD2 which encodes a protein that mediate the signal of transforming growth factor (TGF)-beta, indirectly regulating multiple cellular processes, including cell proliferation, differentiation, and odontogenesis (4). In mice, smad2 plays an essential role during early stages of tooth formation (44).

Variant rs11125855 (P = 8.82E-07) is upstream of REL which encodes a protein in the “Rel homology domain/immunoglobulin-like fold plexin, transcription factor” (RHD/IPT) family, with roles in biological processes including apoptosis, inflammation, and the immune response (e.g., survival and proliferation of B lymphocytes) which produce cytokines as a response to inflammation regulating the intensity and duration of the immune response (45). In mice, REL (also known as c-rel) controls the development of the epidermis and associated appendages, such as teeth, during embryogenesis (46).

Variant rs6758898 (P = 2.93E-06) is located <100 kb from IL1A and IL1B (part of the interleukin 1 cytokine family) whose encoded proteins have multiple functions within the immune system. IL1A (Interleukin-1 alpha) has an has a role in bone resorption, and positively affects the survival and differentiation of osteoclast and odontoclast (47). IL1B (Interleukin-1 beta) has an important function as a mediator in the inflammatory response (48). A case-control study in China found that a significant association between dental caries and ILB1 (20). Interestingly, a systematic review of 27 studies showed that genetic variation in both IL-1a and IL-1b could contribute to chronic periodontitis in whites (48).

Variant rs4868444 (P = 3.09E-06) is intronic to MSX2 whose encoded protein provides the balance between survival and apoptosis of neural crest-derived cells, necessary for proper craniofacial morphogenesis. In addition, msx2 protein is part of the bone morphogenic protein (BMP) signaling pathway, that regulates various processes including odontogenesis (21). A mutation in MSX2 has been found in a family with amelogenesis imperfecta and impaired tooth eruption (49), and Msx2 knockout mice showed a tooth abnormality resembling human amelogenesis imperfecta (50).

Variant rs5850440 (P = 3.37 E-06) is located intronic to ROBO1, with an encoded protein that functions as cell receptor for slit1. In mice, both slit1 and robo1 were expressed in the primary enamel knot and during the cap stage, plus robo1 expression in tooth germ and dental papilla was noted. Moreover, it was found that even before birth both of robo1 and robo2 were localized in preodontoblast (51). As detailed earlier, note that a variant in ROBO2 was suggestively associated with the DRM pattern in this study.


In recent years there has been large number of new genomic approaches and new methods of analyzing and understanding the wealth of genomic data. The current study leveraged a novel multivariate approach to GWAS (MV-PLINK) (13) that can aid in understanding the inter-relationship between multiple traits, in this case dental anomalies, and to investigate genetic associations underlying these correlations. To do so, we utilized a large, multi-ethnic study cohort that was well-characterized for multiple dental characteristics including structural dental traits and dental caries.

Some studies have found that families with clefting have a significantly higher risk for developing dental anomalies than general population (5254). However, there are also studies that found no significant differences (11, 55, 56). In our study, subject type, i.e., unaffected relatives vs. controls, was tested for all dental anomalies’ variables, and we noticed that unaffected relatives had more rotation in their teeth (P = 0.0016) when compared to controls. In addition, a trend for an increased rate in unaffected relatives for impaction (80.49 vs. 19.51%, P = 0.01), supernumerary teeth (84.62 vs. 15.38%, P = 0.014), and for dental caries (78.40 vs. 62.58%, P = 0.0018) compared with controls. However, we do not consider these differences significant in our study after correcting for multiple testing, which is also consistent with the findings from a previous study in this same cohort (11). Rotation and displacement were the most prevalent anomalies in our study. This could be due to the different patterns of racial/ethnic admixture in our study cohort, and that potentially could induce bias (57).

Based on Spearman rank correlation coefficients, we found multivariate correlations between different structural dental anomalies and between some of these anomalies and dental caries. Note that other studies have shown similar correlations. For example, one study has found a significant association between impacted maxillary canine and the dental anomalies of tooth number, tooth size and rotations which suggest a shared common genetic background (58). Another study also discovered associations among different dental anomalies, including an association between tooth agenesis and displacement of maxillary canines, and between tooth agenesis and tooth transposition (59). We hypothesized that these correlations are due to similar genetic backgrounds, emphasizing the importance of investigating these anomalies on a genetic level. Therefore, we used a multivariate GWAS approach to investigate genetic factors underlying these patterns, as recommended when genetic correlations between traits are relatively weak, as observed in the current analyses.

This study is the first to apply multivariate GWAS to identify possible genetic loci associated with the presence of patterns of correlated dental anomalies (AIR, DRM, HDR, or CAH). Although none of our multivariate GWAS results reached the strict genome-wide significance level, there were a large number of variants with suggestive genome-wide significance relevant to odontogenesis and dental caries. The suggestive variants were annotated to help generate hypotheses and nominate these variants for further investigations using information from different bioinformatics databases. We focused our search on genes with known biological functions related to tooth development process (odontogenesis) and oral/dental health, either in human or in rodents. However, there were other suggestive variants near genes with biological roles in the body that could be of interest for further investigation in future studies.

The strongest association signal was in the CAH model (dental caries, tooth agenesis, enamel hypoplasia). The lead variant, rs79577009 (P = 2.87E-07) is located intronic to the SMAD2 gene, which has an important function in mediating the signal of the transforming growth factor (TGF)-beta and helps indirectly (TGFb) in regulating multiple cellular processes, including odontogenesis (4). In addition, smad2 was found to be involved during the early stages of tooth development in mice, so we hypothesize that variants in SMAD2 might increase the chance of having any of the correlated dental anomalies: dental caries, tooth agenesis, and/or enamel hypoplasia.

An overlap seen in results of the HDR and DRM GWAS could be due to the fact that the two patterns share two dental anomalies (D and R). The results of these overlapped regions of significance are presented in Supplementary Table 4, and include suggestive association signals near the WDR72 gene, which has an essential role in the mineralization and maturation of tooth enamel (38). Further investigations needed to replicate and understand the extent of these associations.

There are no previously reported GWAS studies directed at identifying risk loci associated with multiple correlated dental anomalies. There have been however, GWAS’s of third molar tooth agenesis in cohorts of European ethnicity (60, 61). Further, in another GWAS of tooth agenesis (excluding third molars) in European ancestry that used a sample from the POFC study for replication, several risk variants were identified, near ASCL5/CACNA1S, ARHGAP15, FOXI3, EDAR, and WNT10A (62). Sequencing studies of tooth agenesis found that nonsynonymous, nonsense, and missense mutations in WNT10A are strongly associated with tooth agenesis of 1-3 teeth and also for 4 or more teeth (63). A previous study also identified several genes that have been associated with non-syndromic forms of tooth agenesis, including MSX1, PAX9, AXIN2, and EDA (64). Each of these genes play an important role during tooth development.

We examined the candidate genes from previous sequencing and GWASs of tooth agenesis in our AIR and CAH GWAS modules results to see if they had been replicated. GWAS of tooth agenesis was the only dental anomaly that was investigated before in a single ancestry and in a sample that has no other syndromes. Only limited evidence of suggestive association was seen near the previously reported tooth agenesis genes (results shown in Supplementary Table 5).

The use of binary traits in GWAS’s is very common and has multiple advantages. However, the limitation of this approach is that the phenotype is less precise which could negatively affect the power of a GWAS study (65), that is, there might be false negative associations but not false positives. The major limition of this study was the lack of replication cohorts available for testing the variants nominated with the patterns of correlated dental characteristics. Additional studies in larger and more diverse cohorts are warranted to assess the effects of the potential variants identified in this study.

In summary, we nominate genes with known biological roles during tooth development, including ADAMTS9 and PRICKLE2 from the AIR pattern; GLIS3, WDR72, and ROR2 from HDR and DRM patterns; ROBO2 from the DRM pattern; BMP7 from the HDR pattern; and ROBO1, SMAD2, and MSX2 from the CAH pattern. In addition, genes were identified with plausible roles in tooth development such as: ARNT2 and WNT2B from the AIR pattern; OMD from the HDR pattern; and REL, IL1A, and IL1B from the CAH pattern.


Evidence of correlation between multiple different structural dental anomalies, including the correlation between dental caries and enamel hypoplasia, reflect support for the hypothesis that similar genetic background or other etiologic factors may underlie multiple dental characteristics and dental disease. This is the first study to perform multivariate GWAS for patterns of associated dental anomalies and dental caries, and we were able to identify suggestive genetic loci for four correlated dental anomaly patterns that play plausible biological roles during tooth development. Further studies are needed to replicate the analyses in independent cohorts.

Data Availability Statement

The datasets analyzed in this paper are available in the dbGaP repository,, further queries can be forwarded to the corresponding author/s.

Ethics Statement

The studies involving human participants were reviewed and approved by IRB. Written informed consent to participate in this study was provided by the participants’ legal guardian/next of kin. Written informed consent was obtained from the individual(s), and minor(s)’ legal guardian/next of kin.

Author Contributions

RA generated the data set for the analyses. RA analyzed data. RA, BH, LM, MM, JS, and SW helped define outcomes to be studied. RA wrote the first draft of the manuscript. MM, JM, JS, SW, BH, LM, and KN critically reviewed the manuscript. RA and MM designed the study. RA, MM, FD, KN, CP, IO, CB, JH, GW, SW, and LM collected and interpreted data. RA and MM generated the final draft of the manuscript. All authors have read and approved the final manuscript.


This work was supported by grants from the National Institutes of Health (NIH) including: X01-HG007485 (MM), R01-DE016148 (MM and SW), U01-DE024425 (MM), R37-DE008559 (JM and MM), R21-DE016930 (MM), R21-DE016930 (MM), R01-DE012472 (MM), R01-DE011931 (JH), U01-DD000295 (GW), and K99/R00-DE024571 (CB). Genotyping and data cleaning were provided via an NIH/NIDCR contract to the Johns Hopkins Center for Inherited Disease Research: HHSN268201200008I. Additional support provided by an intramural grant from the Research Institute of the Children’s Hospital of Colorado (FD); operating costs support in the Philippines was provided by the Institute of Human Genetics, National Institutes of Health, University of the Philippines, Manila (CP).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


Many thanks to the participating families world-wide who made this project possible. The tireless efforts by the dedicated field staff and collaborators were similarly essential to the success of this study. Special thanks to the staff and support personnel of Clinica Noel of Medellin, Colombia.

Supplementary Material

The Supplementary Material for this article can be found online at:



This article is autogenerated using RSS feeds and has not been created or edited by OA JF.

Click here for Source link (