Of the more than 1 million Mtb WGS currently available in the repositories analyzed, only 827 genomes had clinical information on the resistance profile and confirmation of the presence or absence of T2DM. Of these, only 74 genomes from individuals with T2DM met the sequencing depth, lineage, and coverage requirements for inclusion in the study. An additional 74 WGS from individuals with drug-resistant TB and without T2DM were randomly selected to match the two study groups. The individuals carrying the 148 Mtb strains were mostly male, 84 (57.4%), with a mean age of 47 ± 13 years at the time of sputum sample collection.
Patients came from nine countries, mainly Georgia, 36 (24%); Mexico, 34 (22.9%); Moldova, 20 (13.5%); and Belarus, 15 (10.1%). By type of resistance, four groups of isolates were identified: 40 monoresistant (MR) (27%), 18 polyresistant (POL) (12.1%), 62 multidrug-resistant (MDR) (41.8%), and 28 extensively drug-resistant (XDR) (18.9%) (Table 1).
Characterization of variants associated with resistance
Genotypic resistance was observed for several drugs. The highest proportions were for isoniazid (INH), 113 isolates (76.3%), and rifampicin (RIF), 103 (69.5%), followed by ethambutol (EMB), 64 (43.2%), streptomycin (STR), 63 (42.5%), and pyrazinamide (PZA), 51 (34.4%). Genotypic resistance to second-line drugs was less common, with 49 sequences resistant to amikacin (AMK) (33.1%) and 33 (22.3%) to fluoroquinolones (FQ). Among these drugs, only STR showed a significant difference between groups, with 21 resistant isolates (28.3%) in the T2DM group vs 42 (56.7%) in the TB group (p = 0.00048).
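As a rough check on the STR comparison, the counts can be arranged in a 2x2 table (resistant vs not resistant, T2DM vs TB, 74 isolates per group). This section does not state which test produced the p-value, so the sketch below assumes a Pearson chi-square test without continuity correction; the function name is illustrative. Under that assumption, the result is close to the reported p = 0.00048.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / product of the margins
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts from the text: 21 of 74 STR-resistant in T2DM, 42 of 74 in TB
stat, p = chi_square_2x2(21, 74 - 21, 42, 74 - 42)
print(f"chi2 = {stat:.2f}, p = {p:.5f}")
```

A continuity-corrected test or Fisher's exact test would give a somewhat different p-value, so this is only one plausible reading of how the comparison was made.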
A total of 431 SNPs confirmed to be associated with resistance were identified in the diabetic and non-diabetic groups: 207 (48%) vs 224 (52%), respectively. Only 23 (12 [2.7%] vs 11 [2.5%]) were classified as non-fixed SNPs. The mean number of SNPs per isolate was 2.79 vs 3.02 (standard deviation 2.7 vs 2.8), with no significant difference in distribution between groups (p = 0.4413). Regarding the distribution of polymorphisms by type of resistance, sequences from individuals with diabetes showed a higher frequency in monoresistant isolates, 21 (7%) vs 20 (6.9%) (p = 0.7073), and polyresistant isolates, 25 (8.8%) vs 14 (4.6%) (p = 0.4221). In contrast, lower frequencies of polymorphisms were identified in isolates classified as MDR, 136 (47.7%) vs 159 (52.0%) (p = 0.2659), and XDR, 104 (36.5%) vs 112 (36.6%) (p = 0.9281). None of these differences was statistically significant.
Fifty-six high-confidence resistance-related variants were identified, of which 16 had a frequency greater than 1% in the dataset. Genotypic resistance was attributable to variants concentrated mainly in genes known to be associated with resistance to anti-tuberculosis drugs. Isoniazid resistance was found in 58 isolates (78.3%) from the T2DM group vs 55 isolates (71.6%) from the TB-only group (p = 0.3368), rifampicin resistance in 50 (67.5%) vs 53 (71.6%) (p = 0.2874), and ethambutol resistance in 27 (36.4%) vs 37 (50%) (p = 0.2753).
Three mutations were observed at a frequency greater than 15% in both groups: katG S315T in 91 sequences (61.4%) and fabG1-inhA in 62 (41.8%), the only loci associated with isoniazid resistance observed in the dataset, followed by rpoB S450L in 58 isolates (39.1%) (Table 2). When the groups were compared, no significant differences were observed for the isoniazid-associated variants (p = 0.9445 and p = 0.6253, respectively) or for the rifampicin-associated variant (p = 0.9675).
Only two SNPs differed by more than 10 percentage points between the diabetic and non-diabetic groups: rpsL K88R (9.4% vs 20.2%) and rpsL K43R (5.41% vs 18.9%), both associated with resistance to STR. Regarding rifampicin resistance, 22 different rpoB variants were observed across the T2DM and TB groups, 13 (23.2%) vs 9 (16%), respectively. Four mutations were identified exclusively in isolates from the T2DM group: rpoB H445L (observed in four isolates) and rpoB Q432L, rpoB Q432P, and rpoB S441L (detected in one isolate each) (Table 2). The rpoB S450L variant was identified in clonal complexes C2 and C3, whereas rpoB H445D and rpoB H445L were observed in C1 and rpoB D435Y in C2.
Likewise, 20 isolates carrying the rpoB S450L variant also had the compensatory mutation rpoC V483A or rpoC V483G. Of these, eight isolates (10.8%) were from patients with T2DM (two with rpoC V483A and six with rpoC V483G), and 12 (16.2%) were from the non-diabetic group (two with rpoC V483A and 10 with rpoC V483G).
Phylogenetic analysis showed that the 148 isolates belonged to 12 sublineages of L4. The most frequent were 4.3.3 (LAM9) with 31 sequences (21%) and 4.10 (PGG3) with 29 (20%), followed by the Haarlem sublineage with 25 (16%) and 4.2.1 (Ural) with 24 (16%). The LAM9 (22%), Ural (20.9%), and Haarlem (19%) sublineages showed the highest frequencies of MDR isolates, whereas LAM9 accounted for the highest proportion of XDR tuberculosis (43%).
Regarding the phylogenetic distribution by diabetes comorbidity, differences in the prevalence of some sublineages were observed. The X3/X1 sublineage, 13% vs 3%, and the Haarlem sublineage, 24% vs 9%, were found predominantly in isolates from patients with T2DM. By comparison, sublineages 4.3.3 (LAM9), 28% vs 14%, 4.10 (PGG3), 26% vs 14%, and 4.2.1 (Ural), 23% vs 9%, were found in a higher proportion among non-diabetic individuals (Table 3).
According to the drug resistance profile, 40% of the MR isolates belonged to the 4.10 (PGG3) sublineage, and 33.3% of the POL isolates belonged to the Haarlem sublineage. Most MDR sequences were classified into three sublineages: 19.3% in Haarlem, 20.9% in 4.2.1 (Ural), and 22.5% in 4.3.3 (LAM9). XDR isolates were mainly classified in the 4.2.1 (Ural), 21.4%, and 4.3.3 (LAM9), 42.8%, sublineages.
Phylogenetic analysis of the 148 isolates, with a cut-off of 12 SNPs, identified 36 isolates (24%) grouped in four clonal complexes (C): C1, in sublineage 4.3.3 (LAM9), composed of five isolates, all from the TB group and from Belarus; C2, in sublineage 4.2.1 (Ural), comprising 17 sequences (six T2DM vs 11 TB), 88% originating from Moldova; C3, in the X3/X1 sublineage, composed of nine isolates (six T2DM vs three TB), all from Mexico; and C4, in sublineage 4.10 (PGG3), comprising five sequences from Georgia.
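The grouping step can be illustrated with a minimal sketch. This section does not describe how isolates were linked, so the example assumes single-linkage clustering in which two isolates join the same complex when their pairwise distance is at most 12 SNPs. The function name, isolate identifiers, and distances are hypothetical and are not data from the study.

```python
from itertools import combinations

def cluster_by_snp_cutoff(distances, cutoff=12):
    """Group isolates by single linkage: two isolates share a cluster when
    their pairwise SNP distance is <= cutoff (an assumed convention).

    `distances` maps frozenset({id_a, id_b}) -> SNP distance.
    Returns a sorted list of clusters (sorted id lists) with two or more members.
    """
    ids = sorted({i for pair in distances for i in pair})
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the cluster representative, shortening the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(ids, 2):
        d = distances.get(frozenset((a, b)))
        if d is not None and d <= cutoff:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) > 1)

# Hypothetical toy distances, not data from the study
toy = {
    frozenset(("iso1", "iso2")): 3,
    frozenset(("iso2", "iso3")): 11,
    frozenset(("iso1", "iso3")): 14,
    frozenset(("iso1", "iso4")): 45,
    frozenset(("iso2", "iso4")): 38,
    frozenset(("iso3", "iso4")): 40,
}
clusters = cluster_by_snp_cutoff(toy)
print(clusters)
```

With single linkage, iso1 and iso3 end up in the same complex through iso2 even though they differ by 14 SNPs; a stricter linkage rule would split them, which is why the choice of rule matters when interpreting complex sizes.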
A pattern of six mutations was observed in all C1 isolates: katG S315T, rrs (position 1,472,359), embA (position 4,243,221), embB Q497R, pncA D49G, and rpoB H445D. Similarly, 14 of the C2 sequences (82.3%) shared the mutation pattern katG S315T, fabG1-inhA, rpoB S450L, and rpsL K88R; all of these isolates came from Moldova. However, no relationship between diabetes comorbidity and the presence of these variant patterns was identified (Fig. 1).
The rifampicin resistance-associated variants identified only in patients with T2DM had a heterogeneous phylogenetic distribution, suggesting that they are not driven by geographically prevalent strains. Of the variants observed in a single isolate, rpoB S441L and rpoB Q432P belonged to sublineage 4.10 (PGG3), whereas rpoB Q432L was found in the LAM11 sublineage. The rpoB H445L variant was observed in four isolates from three different sublineages: 4.1.2 (T1;H1), 4.3.3 (LAM9), and LAM11.