FirmiData is a set of 40 public annotated genomes of Firmicutes (information on the strains is available in Table 1, ‘Data file 41’, [9]). These genomes were extracted from the RefSeq database, and ICE and IME annotations were added to them using the standard annotation features and qualifiers of the GenBank format [9].

The search for ICEs and IMEs relied on data from the literature and on a semi-automated procedure described in [1] and [2]. All annotations were carefully checked and corrected when necessary, and almost all of these elements were delineated at the nucleotide level.

Expert annotation of ICEs is based on the identification of genes carried by their conjugation module that encode three proteins needed for their transfer (relaxase, coupling protein and VirB4). ICE identification also relies on the search for three types of integrases (tyrosine integrases, serine integrases, and DDE transposases of the ISLre2 family). IMEs were characterized in a similar fashion, except that we searched for genes carried by their mobilization modules that encode a relaxase and, in some cases, a coupling protein.
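The signature-based rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the authors' procedure: the label strings, the function name `classify_element`, the requirement that an IME candidate also carries an integrase, and the handling of incomplete sets are all hypothetical choices made for the example.

```python
# Hypothetical labels for signature proteins; the names are illustrative only.
CONJUGATION = {"relaxase", "coupling_protein", "VirB4"}
INTEGRASES = {"tyrosine_integrase", "serine_integrase", "DDE_transposase"}


def classify_element(signatures):
    """Return 'ICE', 'IME', or None for a set of co-localized signature labels.

    Assumption: both element types need at least one integrase, and any
    combination not covered below is left undecided (None) for manual review.
    """
    found = set(signatures)
    if not (found & INTEGRASES):
        return None
    # Full conjugation module: relaxase, coupling protein and VirB4.
    if CONJUGATION <= found:
        return "ICE"
    # Mobilization module: relaxase, optionally with a coupling protein.
    if "relaxase" in found and "VirB4" not in found:
        return "IME"
    # e.g. relaxase + VirB4 without coupling protein: needs closer inspection.
    return None


if __name__ == "__main__":
    print(classify_element({"relaxase", "coupling_protein", "VirB4", "serine_integrase"}))
    print(classify_element({"relaxase", "tyrosine_integrase"}))
```

Returning `None` for incomplete combinations mirrors the idea that such regions call for a closer manual look rather than an automatic decision.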

The co-localization of the genes encoding the signature proteins attests to the presence of an ICE or an IME. Their boundaries were then searched for. One relevant piece of information for ICE/IME delineation is the insertion site targeted by the integrase. To obtain this information, phylogenetic analyses were performed to identify the closest integrase for which the integration site had already been identified. ICE/IME boundaries were identified manually by searching for the direct repeats (DRs) flanking the elements. When the DRs were missing, too short or too degenerate to be detected, the region containing the element was compared with (i) target regions or genes lacking elements, or (ii) already known and well-delineated elements.

When an expected signature gene was not identified within an element, a thorough examination of all its genes and sequences was done to identify those encoding the missing signature proteins.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
