Changes in the oil content, seed size, and weight at different seed developmental stages
The embryo oil content, seed size, and weight were examined at four seed developmental stages (40, 54, 68, and 81 daa) in the HO and LO lines. Images of yellowhorn seeds and embryos were produced at four seed developmental stages for comparison (Fig. 1a). Both lines showed a rapid accumulation of the embryo oil content from 40 to 68 daa followed by a very slow increase from 68 to 81 daa; the highest oil content was produced at 81 daa (Fig. 1b). In both yellowhorn lines, the oil content accumulated at the highest rate between 40 and 54 daa, indicating that substantial lipid biosynthesis mainly occurred during the early to middle stages of seed development. There was significantly higher oil content in the HO embryos than in the LO embryos at 40, 68, and 81 daa, but not at 54 daa. At 81 daa, the oil content of HO embryos was 10.69% higher than that of LO (Fig. 1b).
Additionally, the transverse diameters, longitudinal diameters, and lateral diameters of the seeds increased in both lines from 40 to 54 daa and then decreased at 68 daa followed by a slow increase from 68 to 81 daa, indicating that faster growth of developing seeds occurred mainly at the early stages (40–54 daa) (Fig. 1c–e). At 54 daa, the seed sizes of the two lines were at the highest level. Moreover, the transverse diameters and lateral diameters of the seeds from the HO line were significantly smaller than those from the LO line at 40 daa, and the lateral diameters of the seeds from the HO line were significantly smaller than those from the LO line at 68 daa (Fig. 1c, e). With seed development, the change in the weight of the two lines was similar to the change of the seed size, and the HO line had a higher hundred-grain weight than the LO line (Fig. 1f).
Overview of sRNA sequencing in yellowhorn
Sixteen sRNA libraries from eight samples (two biological replicates for each) were constructed to identify miRNAs linked to the regulation of oil accumulation and seed development. A total of 238,325,077 raw reads were obtained from these libraries by high-throughput sequencing (approximately 14.9 million raw reads per library). A total of 194,958,060 clean reads of 18–25 nucleotides (nt) and an average of 12,184,879 reads (81.80%) per library were obtained after removing the adaptor dimers, junk reads, low-complexity sequences, and sequences shorter than 18 nucleotides and longer than 25 nucleotides (Table 1).
Following a search against the Rfam and Repbase databases for sRNA sequences by the Bowtie software , an average of 3.22 and 7.17% clean reads in the HO and LO lines, respectively, were mapped to known non-coding sRNAs (rRNAs, tRNAs, snRNAs, snoRNAs, and other Rfam RNAs) and repeat sequences (Additional file 1: Table S1; Additional file 2: Table S2). Moreover, an average of 5.69 and 4.41% clean reads in the HO and LO lines, respectively, were mapped to NATs (Natural Antisense Transcripts) and phased siRNAs (phasiRNAs). Notably, about 5% of the clean reads per library, on average, were mapped to miRNAs in each line (Additional file 1: Table S1; Additional file 2: Table S2). The proportion of miRNAs declined at 81 daa compared with 40, 54, and 68 daa in the two lines, implying that a body of miRNAs were induced in the early and middle stages of seed development. The length distribution of unique sRNAs in both lines at four developmental stages were then summarized. Most sRNA reads ranged from 21 to 24 nt in length, leading to similar length distributions in both lines at the different developmental stages. Twenty-four nt sRNAs were the most abundant, accounting for 30.74% (LO81) to 75.95% (HO40) of the total (Fig. 2). Also common were 21, 22, and 23 nt sRNAs, which were more abundant than those of any other length besides 24 nt.
Identification of known and novel miRNAs
A total of 249 known miRNAs were identified in the 16 libraries by analyzing the alignment results against the miRbase 21.0 database (Additional file 3: Table S3). Among the identified known miRNA sequences, 180 were identified to belong to 46 miRNA families. And the miRNA sequences were mapped to known miRNAs from 34 other plant species, with the highest number of miRNAs mapped to known mtr-miRNAs of Medicago truncatula (54, 21.67%), followed by gma-miRNAs of Glycine max (39, 15.66%), ptc-miRNAs of Populus trichocarpa (22, 8.84%), and osa-miRNAs of Oryza sativa (17, 6.83%) (Fig. 3; Additional file 4: Table S4). Among the identified miRNA families, the MIR159 family had the largest number of members (18), followed by the MIR171_1 and MIR482 families with 14 members, and MIR156 and MIR166, with 12 and 10 members, respectively (Fig. 4). The expression profiles of the known miRNAs showed that the normalized read counts of known miRNAs varied from 100,000 to less than 10 reads, exhibiting great variation, even within the same family (Additional file 3: Table S3). At the opposite end of the spectrum, about 49 miRNAs were identified that had fewer than 10 reads in all libraries. In addition, Fig. 5a shows the distribution of miRNA first nucleotide preferences. MiRNAs of 24 nt tended to begin with 5′-A, while 21–22 nt miRNAs tended to start with 5′-U. At the 5′ end, uridine was the most common nucleotide (> 52%) for all known miRNAs (Fig. 5b).
In addition to the known miRNAs, the remaining unaligned sRNA sequences were mapped to the yellowhorn genome  and their hairpin structures were predicted to identify novel miRNAs in yellowhorn. We only considered sRNAs that exhibited the typical stem-loop structure. The novel miRNA sequences were named in the form of Xso-miRn-number. A total of 88 novel miRNAs between 18 nt to 25 nt in length were identified. Among them, 61.36% were 24 nt in length (Additional file 5: Table S5). The novel miRNA precursors ranged in length from 64 nt to 202 nt, with an average length of 140 nt. Compared to known miRNAs, most of the identified novel miRNAs had relatively low read counts (normalized). Most of the novel miRNAs had fewer than 10 reads, and only two novel miRNAs had more than 100 reads in all libraries (Additional file 5: Table S5). The minimum free energies of the hairpin structures of these miRNA precursors ranged from − 130.1 kcal·mol− 1 to − 26.9 kcal·mol− 1, with an average of − 59.6 kcal·mol− 1, similar to those of other plant miRNA precursors . The secondary structures of the four most abundant novel miRNAs (Xso-miRn24, Xso-miRn44, Xso-miRn45, and Xso-miRn84) were predicted (Fig. 6), indicating they can form typical stem-loop hairpins, and their folding free energies were − 53.4, − 60.9, − 60.9, and − 60.8 kcal·mol− 1, respectively. Additionally, to evaluate the consistency of the biological replicates, Principal Components Analysis (PCA) was performed on the 16 samples using the expression profiles of all known and novel miRNAs. As shown in Fig. 7, every two replicates were relatively clustered together, indicating the reliability as well as operational stability of the experimental results.
Differential expression of miRNAs in embryos during seed development
Pairwise comparison between different developmental stages in each line was conducted using the normalized expression levels of miRNAs to identify differentially expressed miRNAs (P-values < 0.05). In total, 53 known miRNAs and 11 novel miRNAs were found to be differentially expressed in pairwise comparisons (HO40 vs. HO54, HO54 vs. HO68, HO68 vs. HO81, LO40 vs. LO54, LO54 vs. LO68, and LO68 vs. LO81). Among them, only five known miRNAs (miR390, miR156i-3p_1, miR482b-3p_1, miR166a_1, and miR156e-3p_1) expressed at high levels (Additional file 6: Table S6; Additional file 7: Table S7).
In the HO line, 11 miRNAs were significantly differentially expressed between 40 and 54 daa (HO40 vs. HO54), including 5 and 6 miRNAs with upregulated and downregulated expression in HO54, respectively (Additional file 6: Table S6a). And 8 differentially expressed miRNAs were identified between 54 and 68 daa (HO54 vs. HO68), which included 3 upregulated miRNAs and 5 downregulated miRNAs in HO68 (Additional file 6: Table S6b). Furthermore, four miRNAs were upregulated and another four miRNAs were downregulated at the fully mature stage (81 daa) of the HO line compared with 68 daa (Additional file 6: Table S6c). In the LO line, there were 17 significantly differentially expressed miRNAs between 40 and 54 daa (LO40 vs. LO54). When compared to LO40, 6 and 11 miRNAs were upregulated and downregulated in LO54, respectively (Additional file 7: Table S7a). In the LO54 vs. LO68 comparison, 7 miRNAs were upregulated and 8 miRNAs were downregulated in LO68 (Additional file 7: Table S7b). Moreover, 20 miRNAs showed significant differential expression between 68 and 81 daa (LO68 vs. LO81), of which 10 miRNAs were upregulated and another 10 miRNAs were downregulated in LO81 (Additional file 7: Table S7c). Additionally, we found two differential miRNAs (upregulated miR156e-p3_1 and downregulated miR172b) in both lines at 40 vs. 54 daa and 54 vs. 68 daa, respectively.
Differentially expressed miRNAs between the high-oil-content and low-oil-content lines
The differentially expressed miRNAs between the two lines were compared. The significantly differentially expressed miRNAs between two samples with a P-value < 0.05 are presented in Fig. 8. At 40 daa, 5 miRNAs were significantly upregulated and 10 miRNAs (e.g., miR156e-3p_1 and miR172b) were downregulated in the HO line compared with the LO line. At 54 daa, six miRNAs were upregulated in the HO line, while the other five miRNAs (miR172b, miR390_1, miR397-5p, miR172c-5p_1, and miR156e-p3_1) were downregulated in the HO line compared with the LO line. At 68 daa, six upregulated miRNAs (e.g., miR171i-p5_1) were found in the HO line, whereas the other four miRNAs (miR169a-3p_1, miR172b, Xso-miRn62, and Xso-miRn27) were preferentially expressed in the LO line. We also identified seven downregulated miRNAs (e.g., miR319p_1, miR5655-p3, and miR7760-p3_1) between fully mature embryos of the HO and LO lines. Interestingly, one co-expressed differential miRNA (miR172b) was detected that was downregulated at the early and middle stages of seed development (40, 54, and 68 daa) in the HO line relative to the LO line at the corresponding developmental stages. This indicated that this known miRNA might play an important role in the early and middle stages of yellowhorn seed development.
Identification of differentially expressed miRNA target genes and gene enrichment analysis
Target mRNA is regulated by miRNA through translational repression or mRNA degradation. A total of 141 differentially expressed miRNAs (120 known and 21 novel miRNAs) were putatively targeted to 2654 genes to identify any correlation between the expression of differentially expressed miRNAs and mRNAs (Additional file 8: Table S8). Some miRNA target genes were annotated as transcription factor-coding genes, including ARF2, which was targeted by miR172b; growth-regulating factor 5 (GRF5), which was targeted by miR171i-p5_1; ethylene-responsive transcription factor 3 (ERF3), which was targeted by miR171k-5p_1; MADS domain protein AGAMOUS-LIKE 61 (AGL61), which was targeted by miR7760-p3_1; and WRKY transcription factor 41 (WRKY41), which was targeted by Xso-miRn80 (Additional file 9: Table S9). MiR7760-p3_1 was also predicted to target the KAR gene; miR319p_1 targets the FAD2–2 gene; miR1536-p5_2 targets the lysophosphatidyl acyltransferase 5 (LPAT5) gene; miR5647-p3_1 targets the DGAT1 gene and miR7760-p5_1 was predicted to target the Mediator subunit 15a (MED15A) gene (Additional file 9: Table S9). These miRNA-target regulatory modules are reported here in yellowhorn for the first time.
The differentially expressed miRNA targets between two lines were subjected to Gene Ontology (GO) enrichment analysis. One oil accumulation-related GO term (FA biosynthetic process (GO:0006633) was found to be significantly enriched by conducting GO annotation of these target genes. Notably, the most enriched GO terms were the nucleus, DNA binding, transcription, regulation of transcription, positive regulation of transcription, zinc ion binding, response to auxin, and cell wall modification (Fig. 9). The target genes of the differentially expressed miRNAs (miR7760-p3_1) in HO81 and LO81 were involved in the regulation of FA biosynthetic process, positive regulation of transcription, zinc ion binding, and NAD binding (Additional file 8: Table S8). The functions of the predicted target genes were diverse, suggesting that these differentially expressed miRNAs may play important roles during seed development and lipid biosynthesis in yellowhorn.
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of all differentially expressed miRNA targets showed that 268 target genes targeted by 52 of differentially expressed miRNAs mapped to 145 pathways (Additional file 10: Table S10), including those correlated with amino acid metabolism and lipid biosynthesis and metabolism, such as histidine metabolism, biosynthesis of unsaturated fatty acids, fatty acid biosynthesis, glycosphingolipid biosynthesis-ganglio series, glycerolipid metabolism, and fatty acid metabolism. Notably, the miRNAs miR319p_1, miR7760-p3_1, miR7760-p5_1, and miR5655-p3 were associated with biosynthesis of unsaturated fatty acids and fatty acid biosynthesis through targeting the FAD2–2, KAR, short-chain type dehydrogenase/reductase (SDR1), and pentatricopeptide repeat-containing protein (PCMP-H42) genes, respectively (Additional file 9: Table S9; Additional file 10: Table S10).
Features and the evolutionary conservation of the target gene encoding MED15A
With respect to the target gene (cDNA sequence) encoding MED15A, which was designated as XsMED15A, it had 3467 nucleotide residues containing a complete 2463-bp open reading frame (ORF), which was predicted to code for 820 amino acids-polypeptide with a deduced molecular weight of 89.02 kDa and isoelectric point of 8.08 (Additional file 11: Fig. S1). XsMED15A amino acid sequence was submitted online and the results showed that it contained a coiled coil region and two internal repeats, without signal peptide. Multiple sequence alignments indicated that XsMED15A mature peptide was highly conserved with Mediator subunit 15a found in other plants. It shared 81% identity with the MED15A protein of Pistacia vera (XP_031251638.1), 80% with Citrus clementina (XP_024036133.1), 78% with Juglans regia (XP_035548095.1), 77% with Malus domestica (XP_028955279.1), 76% with Gossypium hirsutum (XP_040969330.1), and 74% with G. max (XP_003547623.1) (Additional file 12: Fig. S2).
Quantitative real-time PCR validation of miRNA and corresponding target genes
To validate the expression patterns of the miRNAs derived from the high-throughput sequencing, five miRNAs (miR172b, miR171i-p5_1, miR7760-p3_1, miR319p_1, and Xso-miRn80) and nine target genes (ARF2, GRF5, AGL61, KAR, FAD2–2, WRKY41, LPAT5, DGAT1, and MED15A) were selected and qRT-PCR analysis was performed. The resulting qRT-PCR data for the differentially expressed miRNAs were nearly consistent with the sequencing results, and the expression changes of the target genes showed an inverse correlation with differentially expressed miRNAs (Fig. 10). MiR7760-p3_1, however, showed relatively low expression in both lines at 40–68 daa compared with 81 daa, and showed exclusive expression at 81 daa in the sequencing data.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.