REV ISS WEB EVA 12138 7 3 403 University of Ottawa

Transcription:

Global influenza transmission networks (Aris-Brosou)
© 2013 The Authors. Evolutionary Applications published by John Wiley & Sons Ltd 7 (2014) 403–412

[...] evolution is expected to reflect the interaction with its human host (Russell et al. 2008). As a consequence, HA is the most sequenced gene among the eight single-strand negative RNA segments constituting the influenza A genome. As of October 28, 2013, the Influenza Virus Resource (IVR; Bao et al. 2008) contains 34,540 sequences in humans worldwide (all subtypes confounded), while the neuraminidase (NA) gene, which encodes the second surface antigen of the influenza virus, comes as a distant second in the database with 22,817 sequences (34% fewer sequences). These HA sequences were then analyzed for both H1N1 and H3N2 subtypes. For this, I describe a novel combination of phylogenetic, time series and network analyses under a stratified design. This approach reveals the existence of waves of genetic diversity that continuously circulate around the world seasonally, and unveils a potential shift in the transmission pattern of H1N1 at the emergence of the 2009 pandemic. Regardless of this shift, the results show that the proportion of infected people in a given population (prevalence) is at its peak when influenza diversity is at its lowest, at least in the US population.

Materials and methods

Data retrieval

Human HA nucleotide sequences of influenza A viruses of subtypes H1N1 and H3N2 were downloaded from the IVR for five World Health Organization (WHO) geographic regions (Asia, Europe, North America, Oceania and South America; too few sequences [H1: 258; H3: 277] were deposited for Africa), for viruses collected between January 1, 1996 and April 1, 2011. These data were split by quarter for each of the WHO regions. The length of this time window (quarter) was chosen to ensure that most alignments had more than 2 sequences. All data were downloaded in May 2011, resulting in at most 570 data sets (4 quarters for 14.25 years in 5 regions and 2 subtypes). Sequences without information about collection month were discarded. This resulted in 8748 H1 and 6587 H3 sequences. Figure S1 shows the countries from which data were retrieved. Data distribution is depicted in Figure S2, with a more detailed breakdown by region in Figure S3.

In parallel, weekly prevalence data for the USA were collected from the Centers for Disease Control and Prevention (CDC, at www.cdc.gov/flu/weekly) from the last quarter of 1997 to April 2011. Weekly data were averaged by quarters as summarized in Figure S4.

Phylogenetic analyses

Each of the 570 data sets was aligned with Muscle (Edgar 2004) with default parameters. Sequences were not trimmed (Talavera and Castresana 2007; Capella-Gutiérrez et al. 2009) to conserve as many variable sites as possible upstream and downstream of the coding sequence. Because of the potential presence of noncoding sequences in the data, and of out-of-frame data, alignments were performed directly on DNA instead of protein sequences (Aris-Brosou 2010; Abdussamad and Aris-Brosou 2011). This was not problematic here, as sequences within a quarter and a given region showed high levels of similarity. Alignments were visually inspected with JalView (Waterhouse et al. 2009); misaligned sequences were removed (H1: HM625636, CY083655; H3: FJ769860, EU835537, EU642547, EU642548), and gaps were adjusted manually. Phylogenetic trees for each of these data sets were then estimated using maximum likelihood with FastTree version 2.1.3 (Price et al. 2010) under the GTR substitution model, which is general enough to accommodate substitution patterns in those closely related sequences. Only data sets with more than two sequences were analyzed. One hundred bootstrap replicates were generated for each data set with seqboot (Felsenstein 2005) and analyzed with FastTree as above.

Tree lengths were computed for each estimated tree by taking the sum of their branch lengths and were standardized by dividing each of them by the number of sequences. This standardized tree length, used as a measure of genetic diversity, is henceforth denoted μ.

As this measure can be sensitive to phylogenetic uncertainty, nucleotide diversity (Nei and Li 1979), π, was also computed. It is the sum of pairwise distances of n aligned sequences, normalized by the number of comparisons, n(n − 1)/2; the distance used is the raw distance, that is, the uncorrected number of pairwise differences. The R package pegas (Paradis 2010) was modified to take care of sums with missing data, as pairwise deletion was used to handle gaps.

Time series and network analyses

Time series analysis was performed at the level of diversity data (μ and π). To extract seasonality patterns, a simple additive decomposition was performed with the following model, here described for μ, but a similar equation was used for π:

\[ \mu_t = m_t + s_t + \epsilon_t \qquad (1) \]

where μ_t represents the estimated quarterly values for μ such that, for Q quarters, {μ_t : t = 1, 2, ..., Q} = {μ_1, μ_2, ..., μ_Q}. The terms in the right-hand side of equation (1) represent the trend m_t, the seasonal effect s_t and an error term ε_t that is generally a sequence of uncorrelated random variables with a mean of zero. The trend m_t was estimated with a moving average centered on μ_t. The quarterly additive effect was then estimated as

\[ s_t = \mu_t - m_t \qquad (2) \]

The error term is then calculated as in equation (1); for more details, see Cowpertwait and Metcalfe (2009, pp. 19–22). All time series analyses were performed in R and were based on the stats package (R Development Core Team 2011). The seasonality component of these time series data was extracted as detailed in the electronic supplementary material. The complete design, summarized in Fig. 1, leads to the reconstruction of connectivity networks in terms of either genetic diversity μ or nucleotide diversity π. The complete analyses were based on both summary statistics.

To compare the reconstructed networks, PageRank eigenvalues (Brin and Page 1998) were computed with R's igraph library (Csardi and Nepusz 2006). All three lags (Q0, Q1 and Q2) were included. Sampling variances of both their maximum and variance across all five WHO regions were obtained by rewiring the networks 10^5 times.

Figure 1. The analytic framework adopted in this study. (A) HA sequences were downloaded from the Influenza Virus Resource (IVR) for both H1 and H3 viruses and stratified by quarters. For each quarter, a phylogenetic tree was estimated by maximum likelihood and used to derive the estimator μ of genetic diversity. (B) The detrended seasonal component of μ was then extracted for each subtype within each region (e.g., region R1 in red and R2 in blue). (C) Phase shifts were measured and tested through an autocorrelation analysis. (D) The results were mapped in terms of absolute connectivities and in terms of relative connectivities through time.
[Figure 1 panels: (A) IVR, 1996–2011, by quarter Q1–Q4; (B) seasonal component against time in quarters; (C) connectivity; (D) relative connectivity through time.]

Results and discussion

Seasonality patterns of genetic diversity

HA sequences were split (or stratified) by quarter, region and subtype to compute an estimate of genetic diversity at these different levels, hereby leading to the analysis of 135 H1 and 201 H3 data sets. The estimate of genetic diversity first evaluated here, denoted μ, is the average tree length of the phylogeny reconstructed for each data set, scaled by the number of sequences entering each data set. This scaling ensures that μ is independent of sample size (t = 0.12, P = 0.9052). As such, μ represents the average number of substitutions per site per sequence during a given year in a given WHO region. The stratified design has the advantage of being simple to implement, and μ is a simple statistic to compute with minimal assumptions. Critically, μ does not assume that all sequences in a given quarter in a given WHO region come from the same population. Different populations will be separated by long branch lengths, leading to high estimates of μ, while no population structure will lead to small estimates of μ. Because the complete analysis only relies on slices in time and in space, it may be expected to be more robust than a full modeling approach based on spatial processes (Lemey et al. 2009) that are potentially time-inhomogeneous (Aris-Brosou and Rodrigue 2012). The test of this robustness hypothesis is left to future work.

Because the general approach relies on phylogenetic trees (Fig. 1), the estimates of μ might be sensitive to phylogenetic uncertainty. Two additional analyses were taken to address this point. First, average confidence in the phylogenetic trees was estimated by computing mean bootstrap values B_p over all the bipartitions (all trees, regions and subtypes confounded). This mean was very close to 70% (B_p = 0.6996), a value shown by Hillis and Bull (1993) to correspond to a 95% probability of bipartitions to be true, and B_p was close to the mode of the distribution of individual bootstrap values (Figure S5).

Alternatively, nucleotide diversity π, a measure independent of phylogenetic information, was also computed. In spite of losing some information when compared to μ, in the same way that distance methods are less efficient than maximum likelihood at reconstructing trees (e.g., Hasegawa et al. 1991), nucleotide diversity π was highly correlated with μ (Figure S6). Furthermore, depending on the data sets analyzed, correlations vary between 45% and 96%, and are consequently quite high, while the part of the total variance explained by the model fit varies between 39% and 63% (Figure S6). While the analyses below were performed
on both μ and π, results show that μ or π can be used almost interchangeably without affecting the conclusions.

As with any phylogenetic analysis, the approach described here may be sensitive to sampling biases, where a number of sequences deposited in the database come from a small number of places. This sampling bias is expected to be mitigated by pooling individual locations into WHO regions. Indeed, it is clear from Figure S7 that the five regions are not homogeneously sampled. For instance, data from South America are relatively sparse (see also Figures S2–S3). In spite of this uneven sampling, however, the diversity peaks are uncorrelated at the α = 1% level at two levels: (i) among the five regions and (ii) between the two subtypes (Table S1). Regardless of this lack of exact spatiotemporal correlation, can one find patterns in each region, for each subtype, that might be correlated after a time phase shift (or lag)?

To address this question, I extracted the seasonal signal through time, first from the diversity μ data. The actual decompositions, shown in Figures S8–S12, reveal that a seasonal signal exists in each region and for each subtype. Figure S13 summarizes these results. A number of patterns emerge from this seasonal decomposition. First, only in Asia are the H1 and H3 diversities peaking at the same time, during Q4. In the other four regions, the peak diversities are asynchronous among subtypes, with H3 diversity generally peaking before H1s, except in Europe, where it is the opposite pattern. This general result is consistent with previous evidence showing that H3 viruses exhibit highest diversity early during the epidemic period in the state of New York (Creanza et al. 2010). This first point also suggests that each subtype has its own global dynamics, which may become synchronized in the tropics, probably because tropical regions have more sustained (biphasic) epidemics per year than other regions (Tamerius et al. 2011). Second, [...] reminiscent of the biphasic pandemic diversity observed in the USA (Nelson et al. 2011) or in Scotland (Lycett et al. 2012).

To assess the general impact of the pandemic H1N1/09 sequences on the results, the seasonality decompositions were conservatively limited to 1996–Q4 2008 (fourth quarter of 2008). While pandemic sequences could have diverged as early as 2006 (Abdussamad and Aris-Brosou 2011), the first casualties were recorded back in March 2009 (Smith et al. 2009). Figure S14 confirms the patterns described above for the whole 14.25-year period. Yet, this similarity does not mean that the detailed correlation structure in terms of genetic diversity μ, and hence global transmission patterns, did not change.

Structure and stability of global transmission patterns

To better understand the relationship between these peak diversities across both geographic regions and subtypes, a spatiotemporal autocorrelation analysis was performed. In the first step, the autocorrelation functions were computed for H1 and H3 viruses among regions (Figures S15–S17). Then, significant autocorrelations (at the 99% level) were plotted on a map of the world for three lags: 0, 1 and 2 quarters (Figure S18). For instance, a lag of 0 means that the diversities are synchronized between two regions, while a lag of two quarters indicates a correlation with a 6-month shift, which corresponds to the time difference between seasons in each hemisphere. In the context of global circulation of influenza viruses (Rambaut et al. 2008; Russell et al. 2008), this map represents the temporal connectivity, in terms of genetic diversity, of these viruses among different regions. Thus, these results naturally lend themselves to graph theory (see Kaiser 2011).

H3N2 viruses show two highly connected subnetworks [...]
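
As a rough illustration of the lag analysis described above, the following Python sketch computes the sample cross-correlation between two quarterly seasonal series at lags of 0, 1 and 2 quarters. It is not the R code used in the study: the function names, the toy series for two hypothetical regions, and the white-noise bound 2.576/sqrt(n), used here as an approximate 99% significance threshold, are all assumptions made for the example.

```python
import math


def cross_correlation(x, y, lag):
    """Sample correlation between x[t + lag] and y[t] for a lag >= 0.

    Means and variances are taken over the full series and the lagged
    sum is divided by n (an assumed, commonly used convention).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("series must have the same length")
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and n - 1")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((x[t + lag] - mean_x) * (y[t] - mean_y) for t in range(n - lag)) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    return cov / (sd_x * sd_y)


def significant_lags(x, y, lags=(0, 1, 2), z=2.576):
    """Return {lag: correlation} for lags where |correlation| > z / sqrt(n)."""
    bound = z / math.sqrt(len(x))  # assumed approximate 99% bound
    result = {}
    for lag in lags:
        r = cross_correlation(x, y, lag)
        if abs(r) > bound:
            result[lag] = r
    return result


# Toy seasonal components for two hypothetical regions over 16 quarters:
# region_b is region_a shifted by two quarters (6 months).
region_a = [1.0, 0.0, -1.0, 0.0] * 4
region_b = [-1.0, 0.0, 1.0, 0.0] * 4

correlations = {lag: cross_correlation(region_a, region_b, lag) for lag in (0, 1, 2)}
flagged = significant_lags(region_a, region_b)
print(correlations)
print(flagged)
```

In this toy case, lags 0 and 2 exceed the bound (a strong negative correlation at lag 0 and a positive one at lag 2), whereas lag 1 does not. Retaining only the lags that pass such a threshold is one simple way to turn pairwise correlations into the edges of a connectivity network.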
ORIGINAL ARTICLE: Inferring influenza global transmission networks without complete phylogenetic information. Stephane Aris-Brosou (1, 2). (1) Department of Biology, Center
