Processing of data and screening of differentially expressed genes
The data were normalized and the detailed normalized expression data can be found in Volcano plots for each region (Fig. 1). In which, only the genes meeting FDR < 0.05 and |log2 fold change (FC)| > 1 were chosen as DEGs. It could be observed that the normalization process controlled the variation between the population’s replicates (Fig S1). Based on the cut-off criteria, the DEGs of 453, 597, 577, and 636 were identified from the teat cistern (Fig. 1a), gland cistern (Fig. 1b), lobuloalveolar (Fig. 1c), and Furstenberg’s rosette (Fig. 1d) regions, respectively.
Differential Express Genes screening and data processing
The DEG expressions from the four different regions (teat cistern, gland cistern, lobuloalveolar, and Furstenberg’s rosette) were represented by hierarchical clustering heatmaps, as shown in Fig S2. In which, each group was divided into two types (E. coli-infected and control). In the teat cistern, it was observed that E. coli-infected was generally upregulated compared to the control group. However, in Furstenberg’s rosette, the changes in expression pattern between E. coli-infected and control samples were lower. Also, there were 101 overlapped DEGs (ODEG; 100 upregulated and 1 downregulated the gene) (Listed in Table S1). These ODEG were not affected by different tissue regions and were all associated with mastitis (Fig. 2a). Also, the 101 ODEGs had consistent expression patterns in different regions (Fig. 2b).
Gene ontology and KEGG pathway analysis for the ODEGs
A total of 27 GO biological processes (BP) for the co-regulated DEGs were generated. In which, 23 of these genes are participating in the immune response, 19 genes with defense response, and 9 genes with inflammatory response. Also, several genes were directly expressed in response to the bacterium (like SLC11A1, IL1B, LBP, IL10) (Table S2). For instance, GO:0006955~immune response gene had the most gene count and FDR with 1.47 × 10−12 (Fig S3). According to KEGG enrichment analysis, ODEGs were significantly enriched in 8 pathways, such as NOD-like receptor signaling pathway (IL8, IL18, IL1B, PYDC1) and the chemokine signaling pathway (PTK2, IL8, NCF1, CCR1, HCK) (Table 1).
Protein-protein interaction network analysis for ODEGs
The STRING database was used to search for PPI relationships among ODEG. In which, 241 PPI pairs were identified with a combined score over than 0.4. The PPI network was constructed, involving 55 upregulated ODEGs (Fig S4). A high significant direct network was found between IL1B and IL8. However, PTK2 was indirectly linked to IL8 through its link with IL1B.
Transcription factor-miRNA-target regulatory network analysis
In total, 57 regulatory relationships between miRNAs, transcription factors (TFs), and ODEGs that had PPI relationships were identified to construct the miRNA-DEG-TF regulatory network. The miRNA-DEG-TF regulatory network contained 21 miRNA-DEG pairs and 36 TF-DEG pairs of s regulatory relationships, including 16 miRNAs obtained from miRTarBase database (Table S3) and 6 significantly enriched TFs obtained from WebGestalt (Table S4). In which, AP1 FDR was calculated 3.59 × 10−5 for the 9 counts of target DEGs: CD79A, FCGR2B, CCR1, MMP3, TREML2, NFATC2, IL10, NCF2, TGM3 (Table S4). Also, two parts of the regulatory relationships were integrated to construct the miRNA-DEG-TF regulatory network. For instance, IL18-DEG was linked with four microRNAs (bta-miR-15b, 16a, 181a, and 155); also, it was integrated with the HNF3-TF (Fig. 3).
Meanwhile, 5 miRNAs were related to bovine mastitis inflammation from the databases of cattle candidate genes and genetic markers for milk production and mastitis. These miRNAs included bta-miR-146a, bta-miR-21-5p, bta-miR-31, bta-miR-16a, and bta-miR-155. There were 4 DEG that were targeted by the bovine mastitis inflammation-related miRNAs. CTSC was regulated by bta-miR-146a, bta-miR-21-5p, bta-miR-31, as well as was modulated by 3 TF (LEF1, IL10, and IL8). Also, bta-miR-146a and bta-miR-16a, and IL 10 were regulated by 2 transcription factors (LEF1 and AP1). While IL18 was regulated by bta-miR-16a and bta-miR-155, and by HNF3 transcription factor. As a result, the 4 mentioned DEGs that are regulated by bovine mastitis inflammation and that was correlated with miRNAs and TFs were selected for further analysis.
Validation of the expression level of DEGs
In the studied dataset, GSE15441, CTSC, IL10, IL8, and IL18 were upregulated. To validate the expression of the selected 4 DEGs in the GSE15441 dataset, another dataset GSE15020 was used (Fig. 4a). In addition, CTSC, IL10, IL8, and IL18 were all expressed at higher levels (P < 0.05) in E. coli-infected samples than that in healthy samples (Fig. 4b). In which, the expression of IL8 had the highest increase (P < 0.001) compared to IL10. Also, the expression patterns in GSE15020 were the same as GSE15441; therefore, all the 4 selected DEGs (CTSC, IL10, IL8, and IL18) could be validated in GSE15020.
Epigenetic analysis, epigenetic modifications, and CpG island and active cite
Methylation is one of the most important epigenetic modifications. This process includes the addition of methyl groups to the DNA, mostly at CpG sites, to convert cytosine to 5-methylcytosine. As a result, these highly methylated areas tend to be less transcriptionally active. CpG islands were predicted which are the main areas of methylation for the interested genes (gene nucleotide sequences of CTSC, IL10, IL8, and IL18) based on genomic sequences. In this study, CpG islands were found in the promoter region in CTST (from about 500 bp to 900 bp upstream of transcription start site), while no CpG islands were found in IL10, IL8, and IL18 (Fig. 4c). In which, 3 active sites in CTSC were found (Fig. 4d). These 3 active sites were distributed at the 258 aa, 405 aa, and 527 aa positions.
Functional domain identification
We used Simple Modular Architecture Research Tool (SMART, http://smart.embl-heidelberg.de/), which allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than 500 domain families found in signaling, extracellular, and chromatin-associated proteins were detected. The results of confidently predicted domains in CTSC, IL10, IL8, and IL18 were listed in Table 3. In which, IL10 domain residues started from 38 to 174; however, CTSC started from 231 to 458. In which, protein sequences alignments from the 5 different species (Bos Taurus, Homo sapiens, Mus musculus, Canis lupus, and Sus scrofa) were compared and highlighted with yellow color to find the consensus sequences of CTSC protein (Fig. 4e). Also, a very intersecting upstream SNP (rs104894210) has been found in CTSC, and two upstream SNPs were in IL8 (rs1000003014, rs1000275946). On the other hand, IL18 had no upstream SNPs (Table S6). In addition, Table S5 listed the CNVs of each gene from the database of Genomic Variants (DGV). In which, novel sequence insertion was in CTSC and two deletion subtypes (nsv1134175, esv2745077) were in IL18.
Protein docking exploration of vibrant immune response genes and the active site prediction of interested DEGs
The 3D structure of the 4 selected proteins (CTSC, IL10, IL8, and IL18) was elucidated and the ligands for docking through protein binding active sites identity are presented in Fig. 5. The active sites of CTSC, IL10, IL8, and IL18 were identified using scanprosite. In which, 3 active sites in CTSC were found by the molecular docking (Fig. 5a, b), but no active sites in IL10, IL8, and IL18 (Fig. 5c, d, e). These 3 active sites were distributed at the 258 aa, 405 aa, and 527 aa positions (Fig. 4d). It has been reported that gene active sites are more easily to be docked by enzymes or ligands, which can regulate the genes’ expression. Therefore, this information is important on target sites for regulatory effects of gene expression based on the controlling of the target genes’ active sites.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.