Globally, 55 countries have more than one official language, and countless regions have a single official language alongside regional and local languages. Understandably, a great many people use more than one language frequently over long periods. Long-term bilingual experience affects multiple cognitive functions (for a review, see Costa and Sebastian-Galles, 2014). The effect of bilingualism on phonological processing has received much attention. Bilinguals rely on one brain to process two distinct phonologies. Functional imaging studies have shown that, in bilinguals, brain activation patterns during phonological processing differ between the native and the second language (L1 and L2, respectively) (see reviews by Liu and Cao, 2016; Sulpizio et al., 2020), and that the activation pattern during phonological processing of either language in bilinguals differs from that of the single language in monolinguals (Parker Jones et al., 2012; Cao et al., 2013; Ma et al., 2020). These functional changes are accompanied by anatomical changes (see reviews by Li et al., 2014; Stein et al., 2014). White matter coordinates communication between brain regions (Douglas, 2008) and interacts with cortical function (Duffau, 2015). White matter has lifelong plasticity (Gibson et al., 2014; McKenzie et al., 2014; Fields, 2015) and is likely to be modulated by environmental stimuli and behavioral experience, such as bilingualism (Li et al., 2014). Therefore, it is reasonable to hypothesize that long-term bilingual experience shapes the white matter structure of phonology-related tracts.

Diffusion tensor imaging (DTI) allows detailed, non-invasive observation of white matter structure in vivo (Basser and Özarslan, 2014). White matter characteristics can be described by DTI indices such as fractional anisotropy (FA) and the diffusivity measures, namely mean, axial, and radial diffusivity (MD, AD, and RD, respectively) (Soares et al., 2013). According to the dual-stream model of language processing, the dorsal cortical circuit, which includes the perisylvian language areas, is involved in processing phonological information in written and spoken language (Hickok and Poeppel, 2007; Schlaggar and McCandliss, 2007). Fiber dissection studies indicate that the superior longitudinal fasciculus (SLF) is the core fiber bundle connecting the dorsal language cortical regions (Sarubbo et al., 2016). The ventral cortical circuit, on the other hand, is generally considered to play an important role in semantic processing. The core tracts connecting the ventral stream are the inferior longitudinal fasciculus (ILF) and the inferior fronto–occipital fasciculus (IFOF) (Sarubbo et al., 2016). However, there is also some evidence that the ventral stream is related to phonology. Li et al. (2017) found that patients with lesions in the left IFOF performed worse in a phonological fluency task. DTI studies have also reported that the structure of the right ILF/IFOF was related to pseudoword reading and decoding (Lebel et al., 2013), and that the structure of the left IFOF was related to rapid automatized naming and non-word reading (Rollans et al., 2017). Rollans et al. (2017) proposed that the two tracts might play a role in mapping orthography to phonology.
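For readers less familiar with the four DTI indices named above, the following minimal sketch shows how they are conventionally derived from the three eigenvalues of the diffusion tensor. The function name and the example eigenvalues are illustrative assumptions; this is not the processing code used in this study.

```python
import math


def dti_indices(l1, l2, l3):
    """Return FA, MD, AD, and RD from three diffusion tensor eigenvalues.

    Standard textbook definitions; the eigenvalues are sorted so that the
    largest one is treated as the principal (axial) direction.
    """
    l1, l2, l3 = sorted((l1, l2, l3), reverse=True)
    md = (l1 + l2 + l3) / 3.0   # mean diffusivity
    ad = l1                     # axial diffusivity
    rd = (l2 + l3) / 2.0        # radial diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Fractional anisotropy: 0 for isotropic diffusion, 1 for diffusion
    # along a single axis.
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


# Hypothetical eigenvalues (in mm^2/s), chosen only for illustration.
example = dti_indices(1.7e-3, 0.4e-3, 0.3e-3)
```

Under these definitions MD depends only on the overall amount of diffusion, whereas FA depends on how unevenly it is distributed across directions, so the two can change independently.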

Accumulating DTI evidence has revealed that alphabetic–alphabetic bilinguals differ from their monolingual peers in white matter structure, especially in the SLF, ILF, and IFOF (Luk et al., 2011; Schlegel et al., 2012; Gold et al., 2013; Pliatsikas et al., 2015; Kuhl et al., 2016; Singh et al., 2017; Anderson et al., 2018a). However, the reproducibility of these studies is low. This is understandable, because bilingualism is a multifaceted construct that is sensitive to age of L2 acquisition (AoA), L2 immersion time, and language category, among other factors (Li et al., 2014). Most current DTI studies on bilingualism recruited first-generation immigrants or local L2 learners as bilingual samples, who generally learned L2 late and had used it for a short time (Schlegel et al., 2012; Cummine and Boliek, 2013; Pliatsikas et al., 2015; Kuhl et al., 2016; Rossi et al., 2017). In the SLF, short-term late bilinguals usually showed higher FA and lower RD than monolinguals (Schlegel et al., 2012; Pliatsikas et al., 2015; Rossi et al., 2017). This resembles the white matter characteristics of subjects after L2 training (Hosoda et al., 2013; Mamiya et al., 2016). Bilinguals born in bilingual societies, however, learn L2 at an early age, and their exposure to the two languages is long-term and continuous. Only a few DTI studies on bilingualism recruited long-term bilinguals who learned L2 at an early age (Luk et al., 2011; Singh et al., 2017; Anderson et al., 2018a). Singh et al. (2017) and Anderson et al. (2018a) recruited lifelong bilingual samples, Hindi–English bilinguals and bilinguals from Canada, respectively, and matched them well with monolingual samples. Both reported that the bilinguals had higher AD in the SLF than the monolinguals, and Singh et al. (2017) also reported that the bilinguals had higher MD in the SLF than the monolinguals. This is inconsistent with the results of studies focusing on late short-term bilinguals (Schlegel et al., 2012; Pliatsikas et al., 2015; Rossi et al., 2017). It has been speculated that long-term immersion in a bilingual environment from an early age might induce effects different from those of relatively short-term L2 experience (Singh et al., 2017). Globally, there are countless regions with more than one language (see footnote 1), and nearly 66% of people are raised as bilingual speakers from childhood and use more than one language for a long time (Marian and Shook, 2012). The white matter structural characteristics of such long-term bilinguals are the focus of this study.

In addition, only a few studies have focused on logographic bilinguals. Li et al. (2014) indicated that language typology could affect the changes in bilinguals’ brain structure. Unlike alphabetic languages, Chinese is best known for its logographic writing system (Deng et al., 2013). The orthography-to-phonology mapping is extremely opaque in Chinese, whereas it is relatively transparent in alphabetic languages. Thus, Chinese readers tend to adopt an addressed phonology strategy, that is, directly retrieving stored phonological representations for the whole word (Cao et al., 2017). In monolinguals, evidence from fMRI studies suggests that, compared with alphabetic languages, reading Chinese requires extra involvement of cortex in the ventral stream (for details, see Bolger et al., 2005; Tan et al., 2005; Sun et al., 2011). In bilinguals, when processed as the L2 in phonological tasks, Chinese also requires greater activation than alphabetic languages in ventral cortical regions, including the fusiform gyrus, inferior temporal lobe, and occipital regions (Kim et al., 2016; Cao et al., 2017). The ventral cortical regions are thought to play a role in mapping orthography to addressed phonology, which is essential in reading Chinese. The ILF and IFOF are the two core tracts connecting the ventral cortex (Sarubbo et al., 2016). Qi et al. (2015) found that FA and RD of the right ILF could predict the Chinese achievement of native English speakers after short-term Chinese training. In addition, Cummine and Boliek (2013) observed that Chinese–English bilinguals differed from English monolinguals in DTI indices in the bilateral IFOF and right ILF. These two studies suggest that using Chinese together with an alphabetic language might be related to the white matter tracts in the ventral pathway. However, no study has focused on the white matter characteristics of bilinguals who use two logographic languages. In addition, neither of these two studies focused on long-term bilinguals who learned L2 at an early age. Thus, we aimed to investigate how long-term bilingual experience of two logographic languages may shape white matter structure. We hypothesized that the white matter structure of the SLF, ILF, and IFOF might exhibit an effect.

Mandarin and Cantonese are two major varieties of Chinese, and both are logographic languages. In Guangdong Province, China, nearly half of the population speaks both Cantonese and Mandarin (Huiming and Zhe, 2016). For Cantonese–Mandarin bilinguals, Cantonese is the native language (L1), used mainly for daily communication and local media, while Mandarin is the L2, used in formal situations. As the official language, Mandarin is promoted nationwide throughout China. Mandarin is the language of instruction in Guangdong, and children learn to pronounce and write Mandarin from primary school or even earlier. In terms of linguistic characteristics, Cantonese and Mandarin share the same set of characters and have similar grammatical structures (Tardif et al., 2009). However, they share the same pronunciation for only 21.5% of characters (Li, 1990). Thus, although Cantonese is defined as a dialect on socioeconomic grounds, it is still regarded as a separate language in the field of psycholinguistics (Chen et al., 2004; Tardif et al., 2009). Evidence from behavioral studies suggests that Cantonese–Mandarin bilinguals differ from their Mandarin monolingual peers in phonological processing skills over a wide age span (Chen et al., 2004; Li et al., 2011). Our functional magnetic resonance imaging (fMRI) studies showed that Cantonese–Mandarin bilinguals differed from Mandarin monolinguals in brain activation patterns and functional connectivity during phonological tasks, for example in the inferior frontal gyrus and angular gyrus (Ma et al., 2020, 2022). Using resting-state MRI, we also observed differences between Cantonese–Mandarin bilinguals and Mandarin monolinguals in the functional connectivity of a phonology-related subnetwork, such as the resting-state functional connectivity of the inferior frontal gyrus and angular gyrus with temporal regions (Fan et al., 2021). Therefore, we assumed that the white matter structure related to phonology might differ between Cantonese–Mandarin bilinguals and Mandarin monolinguals, and that the structural difference might be related to phonological processing.

In this study, we used tractography and tract-based spatial statistics (TBSS) analysis to test our hypothesis. A binary mask of the tracts of interest (TOI) was constructed for the TBSS analysis to improve anatomical accuracy and the power to detect significant effects (Hamalainen et al., 2017). We measured phonological processing skills and performed a post-hoc correlation analysis between these skills and the significant regions revealed in the group-wise comparison. This post-hoc analysis could help us better understand the white matter plasticity induced by Cantonese–Mandarin phonological processing. The aims of this study were as follows: (1) to test whether the white matter structure of the SLF, ILF, and IFOF differs between Cantonese–Mandarin bilinguals and Mandarin monolinguals; and (2) to explore the relationship between phonological processing skills and the significant regions in the white matter structure.

Materials and Methods


A total of 31 Cantonese–Mandarin bilinguals and 30 native Mandarin monolinguals were recruited from Canton. All subjects underwent diffusion-weighted imaging (DWI) and T1-weighted scans. The bilinguals in this study were born in Guangdong Province of mainland China, a Cantonese–Mandarin bilingual society. Cantonese was their native language, and they learned Mandarin at the preschool or elementary school stage. Of the 30 Mandarin monolinguals, 27 were born and grew up in Mandarin-speaking regions with Mandarin as their native language. The other three grew up in families speaking Shandong, Henan, or Sichuan dialects, which are variants of Mandarin and mutually intelligible with it (Institute of Language et al., 2012). None of the monolinguals had learned or used Cantonese or any other southern Chinese dialect. One participant in the bilingual group was excluded because his DWI data were incomplete. Finally, we included 30 Cantonese–Mandarin bilinguals [6 males; age (y), mean ± standard deviation (M ± SD), 21.17 ± 1.97] and 30 Mandarin monolinguals [9 males; age (y), M ± SD, 21.40 ± 2.03] in the analysis. The bilinguals’ age of second language (L2) acquisition ranged from 3 to 7 years. Participants’ language proficiency was evaluated with the Language and Social Background Questionnaire (LSBQ). The LSBQ is a self-assessment tool that has been shown to be reliable and valid in diverse languages (Anderson et al., 2018b). In addition, since English is a compulsory course in the Chinese education system, all participants in this study had experience of learning English. We used scores on the College English Test Band 4 (CET4), a national English test in China for undergraduate and postgraduate students, to evaluate participants’ English proficiency and included CET4 scores as a nuisance variable in the group-wise comparison. The two groups were matched on non-verbal intelligence quotient, as evaluated by Raven’s Standard Progressive Matrices Test.

All participants were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971). Individuals with learning disabilities, neurological diseases, psychiatric disorders, visual or hearing difficulties, attention deficit hyperactivity disorder, or contraindications to MRI were excluded. All participants gave written informed consent before participating. The Medical Ethics Committee of Sun Yat-sen University approved this study (ethical approval number [L2016] No. 036).

Behavioral Measures of Phonological Processing Skills

We chose the rhyming judgment task, the rapid automatized naming (RAN) task, and the digit span test to assess three aspects of phonological processing skills (Wagner and Torgesen, 1987): phonological awareness, phonological lexical retrieval, and verbal working memory, respectively. Bilinguals performed the tasks in both Mandarin and Cantonese, whereas monolinguals performed them in Mandarin only.

Rhyming Judgment Task

Visual and auditory rhyming tasks were used to test phonological awareness, that is, the ability to derive sound-based representations from written and spoken words. In the visual task, participants needed to convert orthography to phonology, whereas no such conversion was needed in the auditory task. Both rhyming tasks were presented using E-Prime 2.0 and were based on previous studies of Chinese adults (Cao et al., 2017). Each task consisted of 30 trials. The design of the tasks is shown in Supplementary Figure 1. All words used in the tasks consisted of one onset and one rhyme, and the two words in each pair had the same tone. Participants were asked to judge whether the two paired words had the same rhyme. The characters used in the visual rhyming judgment tasks were common words chosen from the Modern Chinese Dictionary and the Cantonese Dictionary, respectively. The average reaction time (RT) and the accuracy rate (AR) were recorded for each task. An integrated variable, the inverse efficiency score (IES), was computed by dividing RT by AR (IES = RT/AR) (Bruyer and Brysbaert, 2011) and used as the measure of rhyming judgment task performance.
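As a worked illustration of the IES formula above, the following minimal sketch computes the score for one participant and one task. It assumes that AR is expressed as a proportion between 0 and 1; the function name and the example values are hypothetical and do not come from the study data.

```python
def inverse_efficiency_score(mean_rt, accuracy_rate):
    """Inverse efficiency score, IES = RT / AR.

    mean_rt: average reaction time (any consistent time unit).
    accuracy_rate: proportion of correct responses, assumed to lie in (0, 1].
    A higher IES means slower and/or less accurate performance.
    """
    if not 0.0 < accuracy_rate <= 1.0:
        raise ValueError("accuracy_rate must be a proportion in (0, 1]")
    return mean_rt / accuracy_rate


# Hypothetical participant: mean RT of 1200 ms with 27 of 30 trials correct.
ies = inverse_efficiency_score(1200.0, 27 / 30)
```

Dividing by accuracy penalizes fast but error-prone responding, so two participants with the same RT receive different scores if their accuracy differs.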

Rapid Automatized Naming

The classical RAN task was used to assess phonological lexical retrieval (Siddaiah et al., 2016). Digits and objects were used as materials. Forty symbols were printed on A4-sized paper, and participants were asked to read them twice, as quickly and accurately as possible. The average RT (ms) of RAN was calculated by dividing the total RT (ms) by the mean number of correct responses.

Digit Span Test

The Digit Span test from the Wechsler Adult Intelligence Scale-Revised (WAIS-R) Chinese version was used to assess verbal working memory ability. The number of correct answers was recorded as the total score.

Magnetic Resonance Imaging Acquisition and Analysis

Magnetic Resonance Imaging Acquisition

Diffusion-weighted images (DWI) were acquired on a 3.0 T Siemens scanner (Siemens Healthcare, Erlangen, Germany) at the South China Normal University in Guangzhou. A single-shot spin–echo echo-planar imaging sequence was used with the following parameters: repetition time (TR), 10,000 ms; echo time (TE), 90 ms; flip angle, 90°; matrix size, 128 × 128; field of view, 256 mm; voxel size, 2 mm × 2 mm × 2 mm; b = 1,000 s/mm²; number of averages, 1; and GRAPPA factor, 2. Diffusion gradients were applied in 64 non-collinear directions, and one image with b = 0 s/mm² was collected.

The T1-weighted 3D images were acquired using a magnetization-prepared rapid gradient–echo sequence with the following parameters: TR, 1,900 ms; TE, 2.52 ms; flip angle, 9°; matrix size, 256 × 256; field of view, 256 mm; voxel size, 1 mm × 1 mm × 1 mm; number of averages, 1.

Magnetic Resonance Imaging Preprocessing and Tractography

MRI preprocessing was performed using FSL (FMRIB Software Library, version 5.0.9). The procedures included visual inspection for quality control, eddy current correction (Andersson and Sotiropoulos, 2016), removal of non-brain tissue (Smith, 2002), and fitting of the diffusion tensor to obtain FA, MD, AD, and RD maps (Behrens et al., 2003). Additionally, DTI data quality was evaluated quantitatively from the log file generated during eddy correction, using the trac-all QA tool in FreeSurfer (Yendiki et al., 2014). The mean of average translation (mm) was 0.76 ± 0.24 for bilinguals and 0.78 ± 0.21 for monolinguals, while the mean of average rotation (×10⁻¹ °) was 0.47 ± 0.12 for bilinguals and 0.46 ± 0.10 for monolinguals. The percentage of bad slices was 0 for both groups, and the average signal drop-out score was 1. No difference in DWI quality measures was found between the two groups.

Tractography was performed using Diffusion Toolkit and TrackVis with the following parameters: interpolated streamline method; minimum FA threshold, 0.2; step length, 0.5 mm; maximum angle threshold, 35°. To minimize the subjective variation of manually defined regions of interest (ROIs), we used a semiautomatic method to define ROIs according to Su et al. (2018). We followed an established protocol to dissect the ILF and IFOF (Catani and Thiebaut de Schotten, 2008). Because the SLF subdivisions have been given multiple names, which can be confusing, we followed the classification proposed by Nakajima et al. (2019) and divided the SLF into four parts: dorsal SLF, ventral SLF, arcuate fasciculus (AF), and the temporoparietal segment of the SLF (tSLF). Since the dorsal and ventral SLF are difficult to separate using the DTI model (De Schotten et al., 2011) and their functions are highly correlated (Budisavljevic et al., 2017; Nakajima et al., 2019), we reconstructed them together following the protocol proposed by De Schotten et al. (2011) and refer to the merged tract as the frontoparietal segment of the SLF (fSLF). The AF and tSLF were reconstructed following the protocols proposed by Catani et al. (2005). The mean FA, MD, AD, and RD of each tract were extracted. We also recorded the number of streamlines per tract and in the whole brain. No difference was found between groups in the number of whole-brain streamlines (M ± SD; bilinguals, 91316.8 ± 26705.1; monolinguals, 87079.0 ± 8899.8; p = 0.416).
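To make the roles of the FA and angle thresholds concrete, the sketch below shows the kind of stopping rule that deterministic streamline tracking typically applies at each step. It is only an illustration under stated assumptions: the function names are hypothetical, the directions are assumed to be already sign-aligned, and this is not the implementation used inside Diffusion Toolkit.

```python
import math

MIN_FA = 0.2            # minimum FA threshold reported above
MAX_ANGLE_DEG = 35.0    # maximum angle threshold reported above
STEP_LENGTH_MM = 0.5    # step length reported above


def turning_angle_deg(prev_dir, next_dir):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(prev_dir, next_dir))
    norm_prev = math.sqrt(sum(a * a for a in prev_dir))
    norm_next = math.sqrt(sum(b * b for b in next_dir))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / (norm_prev * norm_next)))
    return math.degrees(math.acos(cos_angle))


def should_stop(fa, prev_dir, next_dir):
    """Stop tracking when anisotropy is too low or the turn is too sharp."""
    if fa < MIN_FA:
        return True
    return turning_angle_deg(prev_dir, next_dir) > MAX_ANGLE_DEG


def next_position(position, direction):
    """Advance one step of STEP_LENGTH_MM along the unit-normalized direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    return tuple(p + STEP_LENGTH_MM * d / norm for p, d in zip(position, direction))
```

A stricter angle threshold ends streamlines earlier in regions of high curvature or crossing fibers, which is one reason streamline counts depend on the tracking parameters as well as on anatomy.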

Magnetic Resonance Imaging Data Processing

We performed a tract-specific TBSS analysis to compare the white matter structure of the TOI between the two groups. Each subject’s binary TOI mask was generated in native space using the fslmaths tool and registered to MNI space. We then created an averaged binary TOI mask that included the voxels belonging to the TOI in 75% of participants. The TBSS analysis was performed according to the standard pipeline provided on the FSL website (Smith et al., 2006). The averaged binary mask was used in the permutation tests. Threshold-free cluster enhancement (Smith and Nichols, 2009) with 10,000 permutations was used, and the family-wise error (FWE) rate was corrected. The location of significant voxels in the tract-specific analysis was reported using the averaged binary masks of the separate tracts of the TOI. The mean values of the DTI parameters in each significant cluster were extracted using the fslmeants tool for the post hoc association analysis.
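The construction of the averaged binary mask can be sketched as a simple voxel-wise vote across subjects. The example below represents each registered mask as a flat list of 0/1 values and assumes that "75% of participants" means at least 75%; the function name and the toy data are hypothetical, and in practice the masks would be 3D images handled with FSL tools.

```python
def group_toi_mask(subject_masks, min_fraction=0.75):
    """Keep voxels that belong to the TOI in at least min_fraction of subjects.

    subject_masks: list of equal-length sequences of 0/1 values, one per
    subject, already registered to a common space.
    """
    n_subjects = len(subject_masks)
    if n_subjects == 0:
        raise ValueError("at least one subject mask is required")
    n_voxels = len(subject_masks[0])
    counts = [0] * n_voxels
    for mask in subject_masks:
        if len(mask) != n_voxels:
            raise ValueError("all masks must have the same number of voxels")
        for i, value in enumerate(mask):
            if value:
                counts[i] += 1
    return [1 if count / n_subjects >= min_fraction else 0 for count in counts]


# Toy example: four subjects, three voxels.
masks = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
]
group_mask = group_toi_mask(masks)
```

Raising the threshold yields a smaller mask that is anatomically more conservative, while lowering it includes voxels that are part of the tract in fewer individuals.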

We also performed a whole-brain-level TBSS analysis to explore the group-wise difference outside the TOI and facilitate the comparison with the previous studies. As this is beyond our scope of work, the results were provided in the Supplementary Materials.

Statistical Analysis

Statistical analysis was performed using R 4.0.2. Demographic information and behavioral measures were compared between the bilingual and monolingual groups using independent-samples t-tests, Mann–Whitney U tests, or Chi-squared tests. Wilcoxon paired-samples signed rank tests were used within the bilingual group to compare Cantonese and Mandarin proficiency and phonological processing skills. Uncorrected p-values are reported and compared with the Bonferroni-corrected α.

We used linear mixed models to test the effect of bilingualism on the white matter structure of the TOI. Linear mixed models allow random effects induced by inter-subject variability to be incorporated into the model. The DTI metrics (FA, MD, AD, and RD) and the number of streamlines were set as dependent variables. Group, tract, and the group-by-tract interaction were defined as fixed effects. Subjects and English proficiency were defined as random effects. The R package lmerTest (Kuznetsova et al., 2017) was used, and the formula was as follows: Structural indices ∼ Group × Tract + (1| Subjects) + (1| English). When a significant Group × Tract interaction was observed, we conducted a post hoc analysis of covariance (ANCOVA) to compare the structural indices between groups, with English proficiency as the covariate. To control the family-wise error (FWE), a partial Bonferroni correction was performed according to the procedure published on the SISA website. The corrected α values were 0.018, 0.020, 0.009, 0.019, and 0.007 for FA, MD, AD, RD, and the number of streamlines, respectively.
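A partial Bonferroni correction relaxes the usual Bonferroni threshold according to how strongly the tested outcomes are correlated with one another. The sketch below uses one commonly described form of this adjustment, in which the effective number of tests is k^(1 − r); this formula, the function name, and the example inputs are assumptions made for illustration, since the exact inputs used to obtain the corrected α values above are not stated here.

```python
def partial_bonferroni_alpha(alpha, n_tests, mean_correlation):
    """Adjusted per-test alpha for n_tests correlated outcomes.

    Assumed form: alpha_adj = 1 - (1 - alpha) ** (1 / n_tests ** (1 - r)),
    where r is the mean correlation among the outcomes.
    With r = 0 this reduces to the Sidak correction; with r = 1 no
    correction is applied.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    if not 0.0 <= mean_correlation <= 1.0:
        raise ValueError("mean_correlation must lie in [0, 1]")
    effective_tests = n_tests ** (1.0 - mean_correlation)
    return 1.0 - (1.0 - alpha) ** (1.0 / effective_tests)


# Hypothetical inputs: ten tract-wise tests with a mean correlation of 0.5.
alpha_adj = partial_bonferroni_alpha(0.05, 10, 0.5)
```

Because the mean correlation among tracts differs from one metric to another, a procedure of this kind produces a different corrected α for each dependent variable.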

For the post-hoc subgroup correlation analysis, Pearson correlation was used for normally distributed variables and Spearman rank correlation for non-normally distributed variables. Within the bilingual group, the variables analyzed with Spearman rank correlation were the RT of the Mandarin RAN digit task, the IES of the Cantonese auditory rhyming task, and the IES of the Cantonese visual rhyming task; no variable required it in the monolingual group. The partial Bonferroni method was used to correct for multiple comparisons. The corrected α was 0.010 for the bilingual group and 0.014 for the monolingual group. When a structural index correlated significantly with Mandarin behavioral performance, the interaction with group was tested. Correlation coefficients with uncorrected p-values are presented in Supplementary Table 1, and the correlations that survived the FWE correction are depicted in Figure 3.
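The choice between the two coefficients can be sketched as follows. The normality decision is passed in as a flag because the normality test itself is not specified above; the function names are hypothetical, and the code computes coefficients only, not p-values.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def subgroup_correlation(x, y, both_normal):
    """Use Pearson when both variables are treated as normal, else Spearman."""
    return pearson_r(x, y) if both_normal else spearman_rho(x, y)
```

Spearman correlation depends only on rank order, so it is less affected by the skewed distributions that reaction-time measures often show.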

Figure 1. Results of tractography. (A) The five TOIs: fSLF (green), AF (red), tSLF (yellow), ILF (purple), and IFOF (blue) overlaid on a standard MNI brain. (B) The boxplots showing group-wise differences in mean MD, AD, and number of streamlines from tracts. Mo, Monolingual group; Bi, Bilingual group; SN, number of streamlines.

Figure 2. The significant clusters in the tract-specific TBSS analysis. (A) The 3D view. The binary mask of the TOI is shown in copper, and the significant clusters in the comparisons of FA, MD, and AD are shown in red, green, and blue, respectively. (B) The cross-sectional view. The mean FA skeleton is shown in copper, and the significant clusters, thickened with TBSS-fill, use the same colors as in the 3D view.

Figure 3. The significant correlation between the mean different DTI indices of significant voxels in the tracts and behavioral measures, within the subgroups.


Phonological Processing Skills

Performance on the phonological processing tasks is reported in Table 1. The CC (Cantonese–Mandarin bilinguals performing the tasks in Cantonese) showed a higher IES than both the CM (Cantonese–Mandarin bilinguals performing the tasks in Mandarin) and the MM (Mandarin monolinguals performing the tasks in Mandarin) in the visual and auditory rhyming judgment tasks (p < 0.017). The CM also showed a higher IES than the MM in the visual rhyming judgment task (CM: 6.74 ± 1.93 vs. MM: 5.69 ± 2.29, p = 0.017). The higher the IES, the poorer the phonological awareness. Furthermore, the CC scored higher than the CM in the digit span task (CC: 33.17 ± 3.37 vs. CM: 33.17 ± 3.37, p = 0.005). The CM was slower than both the CC and the MM in the digit RAN task (CM: 0.30 ± 0.06 vs. CC: 0.28 ± 0.04, and MM: 0.26 ± 0.04, pCM–CC = 0.001, pCM–MM = 0.015).

Table 1. Demographic characteristics and phonological processing skills.

The Results of Tractography

Tractography of the fSLF, tSLF, ILF, and IFOF succeeded in all subjects, but the AF, which is known not to be traceable in all individuals (Catani et al., 2007; Vanderauwera et al., 2017), could not be reconstructed in some subjects. The right AF could not be reconstructed in 13 bilinguals and 3 monolinguals, while the left AF could not be reconstructed in 5 bilinguals and 2 monolinguals. The tractography reconstructions are shown in Figure 1A.

We used linear mixed models to test the effect of group on the white matter structure of the TOI. The DTI metrics (FA, MD, AD, and RD) and the number of streamlines were treated as dependent variables. Group, tract (5 TOI × 2 hemispheres), and their interaction were set as fixed factors. In the FA model, neither a significant group effect nor a significant group × tract interaction was found. In the MD model, the group effect was non-significant, but the group × tract interaction was significant (T = 2.31, p = 0.021). The post hoc ANCOVA revealed a difference in the right IFOF [M ± SD (×10⁻⁴), bilinguals: 7.77 ± 0.32, monolinguals: 7.59 ± 0.19, p = 0.013] after controlling for English proficiency. In the AD model, the interaction was non-significant, but the group effect was close to the significance threshold (T = 1.85, p = 0.065). We therefore also performed a post hoc ANCOVA to investigate the group-wise difference in each tract. After multiple comparison correction, a difference was found only in the right IFOF [M ± SD (×10⁻⁴), bilinguals: 12.56 ± 0.34, monolinguals: 12.27 ± 0.35, p = 0.002]. In the RD model, the group effect was non-significant, but the interaction was significant (T = 2.07, p = 0.039). However, the post hoc tests revealed no significant difference in any tract. In the model of the number of streamlines, there was a significant interaction effect (T = –2.11, p = 0.036) with no group effect. The post hoc ANCOVA revealed group-wise differences in the bilateral tSLF (M ± SD, left tSLF: bilinguals: 371.83 ± 158.32, monolinguals: 592.57 ± 0.34, p < 0.001; right tSLF: bilinguals: 212.13 ± 82.85, monolinguals: 292.18 ± 137.79, p = 0.007). Box plots of the significant differences in the TOI are presented in Figure 1B.

The Results of Tract-Specific Tract-Based Spatial Statistic Analysis

The tract-specific TBSS analysis revealed that, compared with the Mandarin monolinguals, the Cantonese–Mandarin bilinguals had higher MD in the right tSLF, higher FA in the left ILF, and higher AD in the left ILF and right IFOF after controlling for English proficiency and average translation (p < 0.05, FWE corrected). The results are shown in Figure 2, and the peak coordinates and the number of differing voxels in each tract are reported in Table 2. Monolinguals did not show higher FA, MD, or AD than bilinguals in any voxel. In addition, RD showed no significant group difference in the tract-specific TBSS analysis.

Table 2. Results of the tract-specific TBSS analysis.

The Results of Post-hoc Correlation Analysis

Among the structural measures, only the significant clusters identified in the TBSS analysis correlated with behavioral performance. As shown in Figure 3, the mean FA of the differing voxels in the left ILF was positively correlated with the IES of the Mandarin visual rhyming judgment task within the bilingual group (r = 0.498, p = 0.006) but not within the monolingual group (F = 6.93, pinteraction = 0.011). The mean AD of the differing voxels in the right IFOF was positively correlated with the IES of the Cantonese auditory rhyming judgment task within the bilingual group (r = 0.582, p = 0.001). The detailed correlation coefficients and p-values for all behavioral measures are provided in Supplementary Table 1.


We combined tractography and TBSS to investigate differences in white matter structure along the bilateral SLF, ILF, and IFOF between Cantonese–Mandarin bilinguals and Mandarin monolinguals. Compared with monolinguals, bilinguals had higher MD along the right tSLF and IFOF, higher AD along the left ILF and right IFOF, higher FA along the left ILF, and fewer streamlines in the bilateral tSLF. Furthermore, phonological awareness was significantly related to the mean DTI indices of the differing voxels along the ventral tracts, and the correlation between the DTI indices and behavioral performance differed between the two groups. These results support our hypothesis. To our knowledge, this is the first study to explore differences in phonology-related white matter structure between Cantonese–Mandarin bilinguals and Mandarin monolinguals.

Differences in Phonological Processing Skills

First, we observed that Cantonese–Mandarin bilinguals had poorer phonological awareness in Cantonese than in Mandarin, in both the visual and auditory rhyming tasks. This is reasonable, considering that Cantonese–Mandarin bilinguals in mainland China receive pinyin instruction in school only for Mandarin and not for Cantonese. Pinyin is the phonological coding system used in China. As both Mandarin and Cantonese are morphosyllabic languages whose written units are not divisible at the phoneme level, pinyin instruction can greatly facilitate phonological awareness among Chinese speakers (Shu et al., 2008). The lack of pinyin instruction for Cantonese may account for the bilinguals’ poorer Cantonese phonological awareness. In addition, Cantonese–Mandarin bilinguals had better digit span and RAN performance in Cantonese than in Mandarin. This is consistent with previous evidence that bilinguals have better working memory (Grundy and Timmer, 2016) and phonological lexical retrieval (Pennino, 2010; Yeung, 2016) in their native language than in their L2.

Second, we observed that the Cantonese–Mandarin bilinguals performed worse than Mandarin monolinguals in the Mandarin visual rhyming judgment task and the digit RAN task. The slower naming speed is probably related to interference from the non-target language when bilinguals face language selection (Jian et al., 2011). These differences are discussed below together with the neuroimaging results.

Dorsal White Matter Tracts

First, we found that Cantonese–Mandarin bilinguals had higher MD in the right tSLF and fewer streamlines in the bilateral tSLF than Mandarin monolinguals. MD measures the average diffusivity of water molecules across all directions (isotropic diffusivity), and higher MD reflects freer water diffusion in white matter (Soares et al., 2013). Previous work has shown that higher MD can accompany increased axonal caliber, more fiber crossings, looser packing density, fewer synapses or glial cells (Sagi et al., 2012), or even an increase in tissue water content, such as that caused by increased cerebral blood flow (Jin and Kim, 2008). Fewer streamlines are usually interpreted as lower fiber counts, corresponding to lower connectivity strength between two brain regions (Tsang et al., 2017; Young et al., 2018). However, DTI metrics and the number of streamlines are also affected by many confounding factors, such as curvature, length, branching, and fiber crossings (Jones et al., 2013). Inferences from DTI metrics about specific anatomical microstructure should therefore be made cautiously. One possible account of the higher MD and fewer streamlines is that the bilinguals had looser packing density and lower connectivity strength in the tSLF. Another possibility is that the bilinguals had higher fiber complexity arising from fiber branching and crossings.

In contrast to our finding, some intervention studies observed increased FA in the right SLF for L2 learners (Hosoda et al., 2013; Mamiya et al., 2016). For example, Hosoda et al. (2013) found that FA in the right SLF increased after 16 weeks of English training for Japanese speakers. Mamiya et al. (2016) also found that FA in the right SLF was positively correlated with the number of training days in an English immersion program for Chinese speakers. Animal models showed that increased FA was associated with learning (Gibson et al., 2014). Hosoda et al. (2013) and Mamiya et al. (2016) suggested that the increase in FA might be associated with increased myelination. However, we believe that our results are less likely to reflect remodeling of myelin, because myelination is not a necessary condition for changes in MD (Takeuchi et al., 2015) or in the number of streamlines (Jones et al., 2013). Furthermore, we found no significant difference in FA or RD in the right SLF between the two groups, and FA and RD are sensitive to myelin changes (Song et al., 2002; Basser and Pierpaoli, 2011). The L2 training in both intervention studies was short-term, whereas the Cantonese–Mandarin bilingual adults in our study have used their L2 regularly since an average age of 4.53 years (range: 3–7 years). Evidence from motor training suggested that short-term intervention increased FA (Scholz et al., 2009), whereas long-term training induced increased diffusivity and reduced fiber coherence (Giacosa et al., 2019). Accordingly, we speculate that, unlike short-term training, lifelong bilingual experience leads to greater isotropic diffusivity. Consistent with our finding, Singh et al. (2017) reported that Hindi–English bilinguals had higher MD in the bilateral SLF and higher AD in the right SLF than Italian monolinguals. The Hindi–English bilinguals also lived in a bilingual society and had used their L2 frequently since an average age of 5 years. Singh et al. (2017) likewise suggested that lifelong bilingual experience leads to greater isotropic diffusivity.

Anderson et al. (2018a) observed higher AD in the left SLF for lifelong bilinguals in Canada compared to monolinguals after carefully matching the groups on age and general cognitive abilities. However, they did not compare MD values between the groups. MD is the average diffusivity of water molecules across all directions, including the radial and axial directions (Soares et al., 2013), so AD is highly correlated with MD. Thus, it is possible that the bilinguals and the monolinguals in Anderson et al.’s study also differed in MD in the left tSLF. Besides, the participants in Anderson et al.’s study were older adults, whereas the participants in our study were young adults. The developmental trajectories of white matter differ between bilinguals and monolinguals from early childhood to young adulthood (Pliatsikas et al., 2020). It is possible that these trajectories also differ from young adulthood to old age. However, some studies reported contradictory findings in the SLF. Luk et al. (2011) also recruited older lifelong bilinguals with frequent L2 use, but they reported increased FA in the bilateral SLF. According to Anderson et al. (2018a), the group matching in Luk et al.’s study might have been insufficient, considering the high prevalence of preclinical dementia in older adults. In addition, both Pliatsikas et al. (2015) and Kuhl et al. (2016) reported differences in FA of the bilateral SLF between bilinguals and monolinguals. Most of the bilinguals in those two studies began immersion in an L2 environment in adulthood, so the FA changes might be related to this later exposure. We therefore speculate that the pattern of increased MD and AD and a decreased number of streamlines in the SLF for Cantonese–Mandarin bilinguals might be related to the long-term, frequent use of two languages from an early age.
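The dependence of MD on AD can be written out explicitly. Assuming the standard eigenvalue definitions of the DTI indices (with tensor eigenvalues \lambda_1 \ge \lambda_2 \ge \lambda_3, notation introduced here for illustration), MD is a weighted sum of AD and RD:

```latex
\mathrm{AD} = \lambda_1, \qquad
\mathrm{RD} = \frac{\lambda_2 + \lambda_3}{2}, \qquad
\mathrm{MD} = \frac{\lambda_1 + \lambda_2 + \lambda_3}{3}
            = \frac{\mathrm{AD} + 2\,\mathrm{RD}}{3}
```

Under these definitions, an increase in AD with RD unchanged raises MD by one third of that increase, which is why a group difference in AD makes a difference in MD plausible but does not guarantee it.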

Few previous studies on bilingualism reported on specific subdivisions of the SLF. In the current study, we divided the SLF into three components and observed differences in structural indices in the bilateral tSLF between Cantonese–Mandarin bilinguals and Mandarin monolinguals. The tSLF connects the parietal lobe to the temporal lobe. As the left and right tSLF differ in function, we discuss them separately. The left tSLF participates in lexical retrieval and in the conversion from orthography to phonology and semantics (Nakajima et al., 2019). Numerous studies reported that proficient bilinguals showed slower lexical retrieval than comparable monolinguals (see the review by Sullivan et al., 2018). The lower number of streamlines in the left tSLF might be related to this phenomenon. As to the right tSLF, Vaessen et al. (2016) reported that it participates in visuospatial perception. The characters of logographic languages have complex spatial structures, and character processing in logographic languages involves brain regions related to visuospatial processing, especially in the right hemisphere, more than processing in alphabetic languages does (Bolger et al., 2005; Tan et al., 2005). In addition, the right superior temporal gyrus, to which the right tSLF connects, is sensitive to tonal perception (Liang and Du, 2018). Loui et al. (2011) reported that FA of the right tSLF is correlated with tone learning performance. Both Cantonese and Mandarin are tonal languages. Using two tonal logographic languages might be associated with white matter structural changes in the right tSLF. No significant correlations were found between the mean DTI metrics of the voxels that differed between groups in the right tSLF and the phonological processing skills we measured. As mentioned above, the right tSLF is related to tonal and visuospatial perception rather than to the phonological skills we measured, which might explain the absence of significant correlations.

Ventral White Matter Tracts

We also found that Cantonese–Mandarin bilinguals exhibited higher AD and FA in the left ILF, and higher AD and MD in the right IFOF. Higher FA can accompany more myelination, more axons, higher axonal packing density, or less fiber crossing (Basser and ÖZarslan, 2014). Higher AD has been associated with more axons, increased axonal caliber, looser packing density, or more coherent orientation of axons, but AD is not sensitive to myelin changes (Solowij et al., 2017). The higher AD and FA in the left ILF might indicate that Cantonese–Mandarin bilinguals have more axons, increased myelin, or less fiber crossing. The higher AD and MD in the right IFOF might indicate increased axonal caliber or looser packing density. Consistent with our study, Luk et al. (2011) reported that older bilinguals had higher FA in the bilateral ILF. Anderson et al. (2018a) reported increased AD in the left sagittal stratum, including the ILF and IFOF, for older bilinguals compared to older monolinguals. Both Luk et al. (2011) and Anderson et al. (2018a) suggested that the increased FA and AD reflected enhanced white matter integrity in older bilinguals, and that this neural adaptation could serve as a buffer against age-related neural atrophy. However, we are cautious about interpreting the changes in DTI metrics as enhanced integrity, since our participants were young adults. Singh et al. (2017) observed decreased FA in the right IFOF for bilingual young adults compared to their monolingual peers and interpreted the FA change as lower axonal density, myelination, and coherence in the orientation of white matter. The microstructural changes that accompany a decrease in FA, such as lower axonal density, can also increase MD, so our findings in the right IFOF do not contradict their results. Across these studies, the association between lifelong bilingual experience and the ventral white matter tracts differed, possibly because of the different ages of the participants. Interestingly, the participants in the current study were also young adults, yet the results were consistent with those in older adults (Luk et al., 2011; Anderson et al., 2018a), so more research is needed to explore the possible reasons.

Importantly, consistent with our hypothesis, we found that the differences in the ventral tracts were related to phonological processing skills. First, we found that the mean FA of the significant voxels in the left ILF was positively correlated with the IES of the Mandarin visual rhyming judgment task within the bilingual group (r = 0.498, p = 0.006), but not within the monolingual group (F = 6.93, p for the interaction = 0.011). The correlation between the mean AD in the left ILF and the IES of the Mandarin visual rhyming judgment task within bilinguals was marginally significant (p = 0.058; see Supplementary Table 1). These correlations suggest that the increased FA and AD in the left ILF might be associated with Mandarin phonological processing in Cantonese–Mandarin bilinguals. As reported above, compared to Mandarin monolinguals, Cantonese–Mandarin bilinguals performed worse in the Mandarin visual rhyming judgment task. We propose two possible explanations. First, evidence from both alphabetic and logographic languages showed that the bilateral ILF is involved in semantic processing and orthography–phonology conversion (Vandermosten et al., 2012; Herbet et al., 2018; Wang et al., 2020). To complete the visual rhyming judgment task, subjects need to map orthography to phonology first and then decode the phonology (McPherson et al., 1997). The left ILF might be related to the visual rhyming task through the conversion from orthography to phonology. Cantonese and Mandarin share the same writing system, in which 70% of characters have the same orthography and meaning but different phonology across the two languages, namely cognates (Yu, 1960). Bilinguals cannot suppress the activation of phonology in the non-target language when seeing cognates, which gives rise to language competition (Rodriguez-Fornells et al., 2005). The Cantonese–Mandarin bilinguals’ worse performance in the Mandarin visual rhyming task might be related to this language interference. The activation of semantic representations was found to facilitate phonological access to cognates in L2 (Friesen et al., 2014). A previous fMRI study also reported that bilinguals who used two languages with the same orthography but different phonological associations were more likely to recruit the ventral stream than the dorsal stream for orthography–phonology conversion in both L1 and L2 (Nosarti et al., 2010). Considering the important role of the left ILF in visual semantic processing (Shin et al., 2019), the correlation between FA and IES may reflect additional recruitment of the left ventral stream by Cantonese–Mandarin bilinguals during orthography–phonology conversion. The second possible explanation is that the correlation may reflect bilinguals’ increased need for lexical retrieval. The Mandarin visual rhyming judgment task involves the retrieval of phonological word forms. The frequency–lag hypothesis claims that bilinguals divide their language use between two languages and thus use each language less frequently, so lexical retrieval is more effortful for bilinguals (Gollan et al., 2008, 2011). The left ILF was also reported to show a strong involvement in lexical retrieval. The correlation between FA in the left ILF and the IES of the visual rhyming task may reflect the extra demand for lexical retrieval in Cantonese–Mandarin bilinguals. Overall, we speculate that the increased FA and AD in the left ILF may result from this increased processing burden.
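For readers unfamiliar with the IES, the sketch below shows how such a brain–behavior correlation could be computed. It assumes the standard definition of the inverse efficiency score (mean response time on correct trials divided by the proportion of correct trials); the function names and all numbers are hypothetical and are not taken from this study.

```python
import math

def inverse_efficiency(correct_rts_ms, n_trials):
    """IES = mean RT of correct trials / proportion correct (assumed definition)."""
    if not correct_rts_ms or n_trials <= 0:
        raise ValueError("need at least one correct trial and n_trials > 0")
    mean_rt = sum(correct_rts_ms) / len(correct_rts_ms)
    accuracy = len(correct_rts_ms) / n_trials
    return mean_rt / accuracy

def pearson_r(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up example: three participants' correct-trial RTs (ms) out of 4 trials.
ies = [
    inverse_efficiency([800, 1000], 4),            # slow, 50% correct
    inverse_efficiency([700, 800, 900], 4),        # 75% correct
    inverse_efficiency([600, 700, 700, 800], 4),   # fast, 100% correct
]
mean_fa = [0.52, 0.48, 0.45]  # hypothetical tract-averaged FA values
print(ies, pearson_r(mean_fa, ies))
```

Under this definition a higher IES means slower or less accurate responding, so a positive correlation between FA and IES indicates that higher FA goes with less efficient task performance.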

Second, we also found a positive correlation between the mean AD of the voxels that differed between groups in the right IFOF and the IES of the Cantonese auditory rhyming judgment task within the bilingual group (r = 0.582, p = 0.001). In addition, the mean AD was also correlated with the IES of the Mandarin auditory rhyming judgment task within the bilingual group (r = 0.420, p = 0.023), though this correlation did not survive correction for multiple comparisons. Lebel et al. (2013) reported that FA in the right IFOF/ILF was correlated with phonological decoding ability in healthy adults. This may explain the correlation of AD in the right IFOF with the IES of the auditory rhyming tasks revealed by the current study. The Cantonese–Mandarin bilinguals use two languages frequently. Facing competition between the two languages, they may rely more on phonological decoding in auditory speech processing, which may lead to higher AD. In addition, because the IFOF connects to the frontal lobe, it also plays a role in executive function, especially cognitive flexibility (Perez-Iglesias et al., 2010; Kucukboyaci et al., 2012) and inhibitory control (Rollans and Cummine, 2018). The correlation between AD in the right IFOF and performance in the auditory rhyming tasks may also reflect the bilinguals’ extra demand on executive function. To test this hypothesis, we additionally conducted a correlation analysis, within the bilingual group, between AD in the right IFOF and the measures of executive function used in our previous study (Cai et al., 2021). A significant positive correlation was found between AD and the ability to shift, namely cognitive flexibility, within the bilingual group (r = –0.39, p = 0.04) (for the plot, see Supplementary Figure 2). Bilinguals need to switch between two languages in oral communication, which requires more cognitive flexibility (Buchweitz and Prat, 2013). The increased recruitment of the right IFOF may bring about higher AD for the Cantonese–Mandarin bilinguals.


Limitations

This study has some limitations. First, it used a cross-sectional design, which limits causal inference about the relationship between structural changes in white matter and long-term logographic–logographic bilingual experience. Second, the sample size was not large, which may affect the statistical power of the results. However, our sample is still larger than those of most previous MRI studies on bilingualism (Luk et al., 2011; Cummine and Boliek, 2013; Gold et al., 2013; Pliatsikas et al., 2015; Kuhl et al., 2016; Singh et al., 2017). Third, English is a compulsory course in mainland China, so all participants had experience with English. To minimize this confound, we controlled for English proficiency, estimated by CET4 grades, in the group-wise comparison.


Conclusion

In conclusion, compared to Mandarin monolinguals, Cantonese–Mandarin bilinguals have different white matter structure in the bilateral tSLF, right IFOF, and left ILF. The bilinguals’ white matter showed higher diffusivity, especially in the axial direction, than that of the monolinguals. The specific pattern of differences in DTI indices in the dorsal stream may reflect neuroplasticity related to long-term bilingual experience. The ventral tracts are not traditionally considered to participate in phonological processing. However, we found that the differences in ventral white matter were related to phonological processing in Cantonese–Mandarin bilinguals. Our study supports an association between Cantonese–Mandarin bilingual experience and structural adaptation in the ventral white matter tracts, as well as a relationship between this structural adaptation and logographic language phonological processing skills in the bilinguals. Our study is the first to provide evidence on the white matter characteristics of bilinguals who use two logographic languages.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Ethics Statement

The studies involving human participants were reviewed and approved by the Medical Ethics Committee, Sun Yat-sen University. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

XX, XF, ST, and XS performed the material preparation and data collection. XX and YJ performed the data analysis. NP and MC performed the data curation. JJ coordinated with the unit that helped us collect the data. XL performed the supervision and the funding acquisition. XX wrote the first draft of the manuscript. All authors contributed to the study conception and design, commented on the previous versions of the manuscript, and read and approved the final manuscript.


Funding

This work was supported by the Key Realm R&D Program of Guangdong Province (grant number 2019B030335001); the Guangdong Basic and Applied Basic Research Foundation (grant number 2021A1515011757); and the National Natural Science Foundation of China (grant number 81673197).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


Acknowledgments

We thank all the participants for their participation in the study and the Brain Imaging Center of the Institute for Brain Research and Rehabilitation at South China Normal University for their support in the data collection.

Supplementary Material

The Supplementary Material for this article can be found online at:



