User interface

The portal website of Vcorn SARS-CoV-2 mainly provides three kinds of retrieval methods to acquire information on COVID-19 and SARS-CoV-2 S protein mutations: 1) global data on COVID-19, 2) domestic data on COVID-19, and 3) data on S protein mutations. With one or a few clicks, a user can access a web page that contains information of interest.

Global data

In the “Global” section, there is a global heatmap in which each nation (or region, territory or district) can be clicked to access a web page with information on COVID-19 in that nation. The colors of the heatmap from yellow to black represent the cumulative number of COVID-19 cases in ascending order.

In the “Global cases” section, just three steps are required to access a web page with six types of global maps: 1) weekly cases, 2) weekly deaths, 3) the change in cases (the ratio of cases in the present week to cases two weeks ago) (Fig. 2), 4) cumulative cases, 5) cumulative cases per million persons, and 6) the monthly mortality ratio (the ratio of monthly deaths to monthly cases).
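The two ratios defined for maps 3 and 6 can be sketched in a few lines of Python. The function names and the choice to return None when the denominator is zero are assumptions made for illustration; the source does not say how the database handles that case.

```python
from typing import Optional


def change_in_cases(cases_this_week: int, cases_two_weeks_ago: int) -> Optional[float]:
    """Ratio of cases in the present week to cases two weeks ago."""
    if cases_two_weeks_ago == 0:
        return None  # assumption: the ratio is undefined without a baseline
    return cases_this_week / cases_two_weeks_ago


def monthly_mortality_ratio(monthly_deaths: int, monthly_cases: int) -> Optional[float]:
    """Ratio of monthly deaths to monthly cases."""
    if monthly_cases == 0:
        return None  # assumption: no cases means no meaningful ratio
    return monthly_deaths / monthly_cases
```

A value above 1 from `change_in_cases` means that cases have grown over the two-week window, and a value below 1 means that they have fallen.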

Fig. 2

A global map showing the change in the number of COVID-19 cases. The change is based on the ratio of cases in the present week to those two weeks ago. In the map, the color for each nation depends upon the ratio; i.e., black, dark red, red, tomato, orange, and yellow represent ratios of 8 or larger, 4 or larger, 2 or larger, 1.5 or larger, 1 or larger, and lower than 1, respectively. Nations in white have no reported data. This global map is based on a file obtained from Wikipedia (https://en.wikipedia.org/wiki/Wikipedia:Blank_maps), according to its Terms of Use
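The color legend of Fig. 2 amounts to a threshold lookup, which might be written as follows. This is a minimal sketch: the function name, the use of None for unreported data, and the plain color strings are assumptions, not the portal's actual implementation.

```python
from typing import Optional

# Thresholds taken from the Fig. 2 legend, checked from the largest downward.
COLOR_BINS = [
    (8.0, "black"),
    (4.0, "dark red"),
    (2.0, "red"),
    (1.5, "tomato"),
    (1.0, "orange"),
]


def change_color(ratio: Optional[float]) -> str:
    """Map a change-in-cases ratio to its legend color."""
    if ratio is None:
        return "white"  # assumption: None stands for unreported data
    for threshold, color in COLOR_BINS:
        if ratio >= threshold:
            return color
    return "yellow"  # ratio lower than 1
```

Checking the thresholds in descending order means each ratio falls into the highest bin it qualifies for, which matches the "or larger" wording of the legend.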

Domestic data

In the “Domestic cases” section, just three steps are required to access two kinds of web pages with daily and weekly data on COVID-19 in the selected nation. On the web page for daily cases and deaths, line charts of daily cases and deaths are depicted in blue and red, respectively. The scale of the axis for cases (the left vertical axis) is ten times larger than that for deaths (the right vertical axis). The line charts cover the period from the beginning of 2020 to the end of 2022. The data table includes not only the daily cases and deaths but also the S protein mutations, which were assigned to collection dates according to the metadata of the specimens, with hyperlinks to web pages for the specific mutations.

On the web page for weekly cases and deaths, the line charts are similar to those for the daily data. Cases divided by age group are depicted in line charts for France, Japan, South Korea, the United Kingdom, and the United States as of February 2022 (Fig. 3).

Fig. 3

Line charts of COVID-19 cases by age group in England. The horizontal axis represents months from 2020 to 2022. Left and right vertical axes represent the numbers of cases and deaths, respectively. The former axis is ten times larger than the latter. Line charts in green, blue, and yellow represent monthly cases in the young (0–29 years old), middle (30–59 years old), and elderly (60 years old or older) age groups, respectively. The line chart in red represents monthly deaths
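The three age groups in Fig. 3 can be expressed as a simple binning rule. The sketch below assumes that ages are available as whole years; the function names are hypothetical and the source does not describe how the underlying age data are stored.

```python
from collections import Counter
from typing import Iterable


def age_group(age: int) -> str:
    """Assign an age in years to the young, middle, or elderly group."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age <= 29:
        return "young"    # 0-29 years old
    if age <= 59:
        return "middle"   # 30-59 years old
    return "elderly"      # 60 years old or older


def count_by_age_group(ages: Iterable[int]) -> Counter:
    """Count cases per age group from a sequence of patient ages."""
    return Counter(age_group(age) for age in ages)
```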

Correlation network among major mutations

In the “Search for single S protein mutation” section, a user can click on a mutation in a correlation network of major mutations (Fig. 4) or select/input a mutation to access a web page for the mutation of interest. The web pages of S protein mutations contain 1) a list of variants with WHO labels containing the mutation of interest, 2) a correlation network among major mutations in which mutations related to the mutation of interest are labeled in red, 3) bar charts of the ratio of specimens sharing major mutations, 4) a monthly line chart of the temporal change in the global frequency of the mutation, and 5) the geographical pattern of specimens containing the mutation. In the correlation network and the bar graph illustrating mutation coexistence, each mutation is hyperlinked to the web pages relevant to that mutation.

Fig. 4

A correlation network composed of major mutations. When the frequency of a mutation is 1% or larger among the specimens studied, the mutation is classified as a major mutation in the present research. In the network, a node (Mutation A) represents a major mutation and is connected to another node (Mutation B) based on the recall index from Mutation A to Mutation B (\(R_{AB} \ge 0.5\)). The coloration of the nodes depends upon the network modules, in which nodes are tightly connected to each other. Purple, orange, and red nodes represent mutations contained in Alpha, Delta, and Omicron variants, respectively. The direction of the arrow is based on the recall index from one node (Mutation B) to another node (Mutation A), in the opposite direction of the recall index
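The two thresholds in this caption (a frequency of 1% for major mutations and a recall index of 0.5 for connections) can be illustrated with a short Python sketch. The recall index is not defined in this section, so the code assumes that R_AB is the share of specimens carrying Mutation A that also carry Mutation B. That definition, the function names, and the representation of each specimen as a set of mutation names are all assumptions. The drawing direction of the arrows is left out.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple


def major_mutations(specimens: List[Set[str]], min_freq: float = 0.01) -> Set[str]:
    """Mutations found in at least min_freq (1%) of the specimens."""
    counts: Dict[str, int] = {}
    for mutations in specimens:
        for mutation in mutations:
            counts[mutation] = counts.get(mutation, 0) + 1
    total = len(specimens)
    return {m for m, c in counts.items() if c / total >= min_freq}


def recall_index(specimens: List[Set[str]], a: str, b: str) -> float:
    """Assumed definition: share of specimens with mutation a that also have b."""
    with_a = [s for s in specimens if a in s]
    if not with_a:
        return 0.0
    return sum(1 for s in with_a if b in s) / len(with_a)


def network_edges(
    specimens: List[Set[str]], min_recall: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Ordered pairs (A, B, R_AB) of major mutations with R_AB >= min_recall."""
    edges = []
    for a, b in permutations(sorted(major_mutations(specimens)), 2):
        r = recall_index(specimens, a, b)
        if r >= min_recall:
            edges.append((a, b, r))
    return edges
```

Under this assumed definition the index is asymmetric: a rare mutation that almost always occurs together with a common one has a high recall toward the common mutation, but not the reverse.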

In the “Search for multiple S protein mutations” section, a user selects a pair of mutations of interest to access a web page for that pair. Although the layout of the web page for the pair is identical to that for a single mutation, the correlation network of major mutations emphasizes mutations related to the pair; i.e., mutations that coexist (with a recall index of 0.5 or higher) in specimens that contain the mutation pair are labeled in red. When a user selects a pair of mutations of interest from different network modules, the specimens selected by this retrieval method may differ from variants of concern (VOCs) such as the Alpha and Delta variants but be similar to the Mu variant.
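The pair-based highlighting could be approximated as follows. This sketch interprets the recall index for a pair as the share of specimens containing both selected mutations that also contain a third mutation. That interpretation, along with the function name and the set-based specimen representation, is an assumption rather than the documented method.

```python
from typing import Dict, List, Set, Tuple


def mutations_related_to_pair(
    specimens: List[Set[str]], pair: Tuple[str, str], min_share: float = 0.5
) -> Dict[str, float]:
    """Mutations that coexist with both members of the pair in >= min_share of carriers."""
    first, second = pair
    carriers = [s for s in specimens if first in s and second in s]
    if not carriers:
        return {}
    counts: Dict[str, int] = {}
    for mutations in carriers:
        for mutation in mutations - {first, second}:
            counts[mutation] = counts.get(mutation, 0) + 1
    return {
        m: c / len(carriers)
        for m, c in counts.items()
        if c / len(carriers) >= min_share
    }
```

The mutations returned here correspond to the ones that would be labeled in red on the page for the pair.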

Variants with WHO labels

In the “Search for WHO label” section, a user selects a WHO label of a SARS-CoV-2 variant to access a web page for the variant. The page shows a correlation network in which nodes represent major mutations and are connected to other mutations when the recall index between them is 0.5 or larger. The colors of the nodes depend on the network modules detected by the Louvain method of the Pajek tool, and purple and blue nodes represent alpha and delta variants, respectively. Mutations with red or gray labels are present or not present in the variant, respectively.

Case study

Two kinds of retrieval methods are exemplified in this section. One is a general strategy for obtaining domestic data on COVID-19, and the other is a method for searching for information on the Mu variant.

In the Domestic cases section of the portal page, “United Kingdom” is selected in the first step, “Weekly” is selected in the second step, and the “Submit” button is clicked to access a web page with weekly COVID-19 data for the United Kingdom (England only) (Fig. 3). On the web page, line charts in green, blue, and yellow represent the weekly numbers of cases in people aged 0 to 29 years (young age group), 30 to 59 years (middle age group), and 60 years or older (elderly age group), respectively, and a line chart in red represents the weekly numbers of deaths. According to the line charts, the United Kingdom has shown three peaks of COVID-19: from March to June 2020, from September 2020 to March 2021, and from May 2021 to the present. In the first peak, the number of deaths was quite high; e.g., in the week from April 8 to 14, 2020, the ratio of deaths to cases was 32.9%. In the second peak, the ratio was lower than that in the first peak; e.g., in the week from January 13 to 19, 2021, it was 3.4%. In the third peak, the ratio has been markedly lower than that in the second peak; e.g., in the week from September 8 to 14, 2021, it was 0.5%. This low ratio of deaths to cases may be due to intensive vaccination. The numbers of cases in the three age groups differed among these peaks. During the first peak, case numbers in the middle and elderly age groups were similarly high, and the number in the young age group was lower. During the second and third peaks, case numbers in the young and middle age groups were larger than those in the elderly age group. These line charts make the trends in cases by age group easy to follow.

A way to retrieve information on the Mu variant is illustrated with an example in this paragraph. In the Search for single S protein mutation section of the portal site, “D614G” is selected in the first step, and the “Submit” button is then clicked. This mutation is included in all variants designated by the WHO, including the Mu variant. On the web page for D614G, a hyperlink to the Mu variant is clicked to show a web page for the variant (Fig. 5). The network shows that the mutations of the variant are located in different network modules; e.g., N501Y and P681H are located in a purple module (i.e., Alpha variant), D950N is in an orange module (i.e., Delta variant), and E484K is in a blue module. This suggests that the mutations of the Mu variant potentially originated from multiple variants. The unusual combination of the mutations contained in the variant is explicitly depicted in such a correlation network.

Fig. 5

Correlation network emphasizing mutations contained in the Mu variant. The legends of the network are similar to those of Fig. 4, except for the coloration of the labels of mutations; i.e., a mutation with a red or gray label is present or not present in the variant, respectively. According to the network, the Mu variant has six major mutations, among which there are two nodes in purple (alpha variant), one node in orange (delta variant), and one node in blue
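The observation that the mutations of the Mu variant are spread over several modules can be reproduced with a small grouping step. In the sketch below, the module assignments for N501Y, P681H, D950N, and E484K follow the case study text. The function names and the dictionary-based module lookup are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Module colors as stated in the case study for four Mu variant mutations.
MODULE_OF: Dict[str, str] = {
    "N501Y": "purple",  # module associated with the Alpha variant
    "P681H": "purple",
    "D950N": "orange",  # module associated with the Delta variant
    "E484K": "blue",
}


def group_by_module(
    variant_mutations: Iterable[str], module_of: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group a variant's mutations by the network module they belong to."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for mutation in variant_mutations:
        groups[module_of.get(mutation, "unassigned")].append(mutation)
    return dict(groups)


def label_color(mutation: str, variant_mutations: Iterable[str]) -> str:
    """Red label if the mutation is present in the variant, gray otherwise."""
    return "red" if mutation in set(variant_mutations) else "gray"
```

A variant whose mutations fall into more than one group is a candidate for the kind of unusual combination that the network is meant to expose.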

Discussion

The Vcorn SARS-CoV-2 database provides information not only on daily and weekly COVID-19 data by nation/region/territory/district but also on mutations in the S protein of SARS-CoV-2 based on correlation network analysis. A correlation network that emphasizes the mutations related to a given mutation helps us to understand the evolutionary traits of S protein mutations in a series of variants, particularly variants containing mutations found in multiple variants, as is the case for the Mu variant. In a correlation network that contains major mutations, several mutations (nodes) like D614G are the bases of many arrows; i.e., such mutations may precede other mutations. Mutations included in the same network module, in which mutations are tightly connected to each other, are detected in similar sets of sampled genomes, suggesting that they arose at almost the same time. Some variants with WHO labels, such as Alpha, Delta, and Omicron, tend to form network modules.
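Finding the nodes that are the bases of many arrows is an out-degree count on the directed network. The sketch below assumes that each arrow is stored as a (base, tip) pair of mutation names; the function name and this edge format are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def arrow_bases(arrows: Iterable[Tuple[str, str]], top: int = 5) -> List[Tuple[str, int]]:
    """Rank mutations by the number of arrows that start from them."""
    out_degree = Counter(base for base, _tip in arrows)
    return out_degree.most_common(top)
```

Mutations at the top of this ranking would be the candidates for early mutations that precede others, in the sense described above for D614G.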

GenBank [4], Nextstrain [5], and SARS-CoV-2 MAT [8] also provide information on mutations of the virus and refined phylogenetic trees to help trace the lineage of the mutations and the variants. The outbreak.info database [11] provides well-organized and clear illustrations and charts to show the global and temporal situation of a variant. Forster et al. [23] emphasized that a network approach for drawing a phylogenetic tree was useful to trace routes of infection in COVID-19 cases during the first global outbreak and provided an example of a phylogenetic tree for tracing such cases and classifying three types of variants. Gupta et al. [24] applied a network approach to trace a lineage of variants and to understand gene functions in specimens of the virus. Sekizuka et al. [16] depicted a phylogenetic network of sampled virus genomes to identify potential infection routes. These databases and depictions are useful for discussing the relationships between sampled virus genomes. However, they have no explicit approach for addressing an unusual combination of mutations contained in a variant. Vcorn SARS-CoV-2 is suitable for visualizing such unusual combinations.

Although the mutations assigned to a given variant should be present in similar specimens, they tend to occur in different specimens to some extent. This may be caused by errors in sequence alignment during the BLASTN search and by biological and evolutionary events. The latter type of events may result from the combination of (or cross-talk between) mutations contained in multiple variants in a specimen and from differences in the frequency of mutation based on the S protein structure according to the positions of base sequences. When a base position shows a high frequency of mutation, the mutation at that position may occur in different cases. The Mu variant contains such mutations, as described in the previous paragraph [10]. However, its infectivity remains largely unknown [25]. The rational tracing of mutations contained in the variant is useful for understanding the evolutionary traits of possible mutations.

Future developments

The Vcorn SARS-CoV-2 database is updated weekly with information on COVID-19 and monthly with information on mutations in the S protein of SARS-CoV-2. The Vcorn project aims to elucidate evolutionary traits in viruses, including not only SARS-CoV-2 but also other viruses, such as influenza virus, using correlation network approaches. In the near future, the project will perform correlation network analyses of all coronaviruses to shed light on the missing link between SARS-CoV-2 and its ancestral virus.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
