Complete Mitochondrial Genome of the Eurasian Oystercatcher Haematopus ostralegus and Comparative Genomic Analyses in Charadriiformes

Chaochao Hu1,2, Xue Xu1, Wenjia Yao1, Wei Liu3, Deyun Tai1, Wan Chen4 and Qing Chang1* 1Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, Jiangsu, China. 2Analytical and Testing Center, Nanjing Normal University, Nanjing 210023, Jiangsu, China. 3Nanjing Institute of Environmental Sciences, Ministry of Environmental Protection, Nanjing 210042, Jiangsu, China. 4College of Environment and Ecology, Jiangsu Open University (The City Vocational College of Jiangsu), Nanjing 210036, Jiangsu, China. Article Information Received 21 Februaary 2020 Revised 02 May 2020 Accepted 20 May 2020 Available online 10 December 2020


INTRODUCTION
T he resources of mitogenome have rapidly accumulated in recent years thanks to the advanced genomic sequencing, while the mitogenome has not been well studied in Charadriiformes (Baker et al., 2007;Gibson and Baker, 2012;Friesen, 2015;Hu et al., 2017). The Eurasian oystercatcher, Haematopus ostralegus (Charadriiformes, Haematopodidae), is a relatively large wader with an extremely large distribution ranging from Europe to Siberia. This species has been categorized as near threatened in red list of threatened species. Eurasian oystercatcher feeds on mussels, crabs, earth worms and all kinds of invertebrates, rarely on fish. Recent studies of H. ostralegus pay more attention on habitat ecology,

O n l i n e F i r s t A r t i c l e
mitogenome of H. ostralegus, which may provide a valuable resource facilitating further study of population genetics and biogeography of H. ostralegus, and provide useful information for understanding the evolutionary and taxonomic research within Charadriiformes.

Ethics statement
Our experimental procedures complied with the current laws on animal welfare and research in China, and were specifically approved by Nanjing Normal University's Animal Care and Use Committee (Permit #2011-04-008).

Sample and DNA extraction
The specimen of H. ostralegus was collected from a derelict and abandoned mist net in Dongtai, Yancheng City, Jiangsu Province, China (32°46'23.12" N, 120°57'41.68" E), in July 2017. After collection, the tissue was initially preserved in 95% ethanol in the field, and transferred to −80°C in our laboratory for long-term storage at Nanjing Normal University (specimen voucher number: NJNU-Host07). Total genomic DNA was extracted using standard phenol-chloroform methods (Sambrook and Russell, 1989). The quality of DNA was assessed through electrophoresis in a 1% agarose gel and staining with Gold View. The complete mitogenome of H. ostralegus was generated by amplification of overlapping polymerase chain reaction (PCR) fragments (Hu et al., 2017). Then the PCR products were purified using a gel extraction kit (Promega) and sequenced with each of the PCR primers on an ABI 377 sequencer. Sequences obtained were aligned and edited using the software SeqMan (DNAStar, Inc.) to generate complete mitochondrial DNA sequences.

PCR amplification and sequencing
All primers used in this study were taken from Hu et al. (2017). PCR was performed in a 30 μL system, which contained 2 × Taq PCR SuperMix (Tiangen, China) 15 μL, 10 μM of each primer (forward and reverse) 1 μL, 1 μL template DNA and 12 μL sterile double-distilled water (ddH2O). PCR profile was 5 min initial denaturation at 95 °C; 35 cycles of 30 s denaturation at 95 °C, 30 s annealing at 50-55 °C, and 1-2 min extension at 72 °C; and a final extension at 72 °C for 8 min and 4 °C hold. The PCR products were electrophoresed by 1% agarose gel and then purified and sequenced with each of the PCR primers on a DNA sequencer (ABI 3731XL).

Genome annotation and bioinformatics analysis
Sequences obtained were assembled and edited manually using the software SeqMan (DNAStar, Inc.) to generate complete mitochondrial DNA sequences. The protein-coding genes (PCGs) were determined by Open Reading Frame Finder implemented at the NCBI website with the vertebrate mitochondrial genetic code, and then finally confirmed by sequence comparisons with the reported Charadriiformes mitogenomes. The tRNAscan-SE 1.21 (Lowe and Eddy, 1997), MITOS (Bernt et al., 2013), and ARWEN (Laslett and Canbäck, 2008) were used to confirm tRNA annotation results. The skewing of the nucleotide composition was calculated according to the following formulas: AT skew = (A -T) / (A + T) and GC skew = (G − C) / (G + C) (Perna and Kocher, 1995;Lobry, 1996). The tandem repeats were searched in the CR using the Tandem Repeats Finder program (Benson, 1999). Codon usage and nucleotide composition statistics were estimated using DnaSP 5.1 (Librado and Rozas, 2009) and Microsoft Excel 2016. The number of variable sites, the parsimony informative sites, the singleton, and the average uncorrected pairwise distances for each PCG were calculated by MEGA 6.0 (Tamura et al., 2013). The rates of non-synonymous substitutions (Ka, π modified), synonymous substitutions (Ks, π modified), the effective number of codons (ENC) and the codon bias index (CBI) for each PCG was determined with DnaSP 5.0 (Librado and Rozas, 2009).

Phylogenetic analysis
For phylogenetic analysis, PCGs of mitochondrial genomes of 32 Charadriiformes species were used (Table II). Each mitochondrial gene was aligned individually using Muscle in MEGA X (Kumar et al., 2018), and subsequently edited and trimmed. Based on 15 mitochondrial genes (13 PCGs, 12S and 16S) of Charadriiformes species, phylogenetic analysis was performed using Bayesian Inference (BI) and Maximum likelihood analysis (ML), with Columba livia (KP168712) and Gallus gallusa (KM096864) as outgroups. To determine the optimal partitioning of the data, the best-fit partitioning scheme and the most appropriate nucleotide evolution model for each partition were implemented in Partition Finder 1.1.1 (Lanfear et al., 2012). BI method was performed using MrBayes 3.1.2 (Ronquist and Huelsenbeck, 2003). Four Markov Chains Monte Carlo (MCMC) chains were run for 1.0 × 10 6 generations. Two independent runs were performed to allow additional confirmation of the convergence of MCMC runs. ML analysis was performed with RAxML 8.0.0 (Stamatakis, 2014). The node support was calculated with a GTRGAMMA model via rapid bootstrapping (-f a -x option) with ten runs and 1,000 replications to estimate the best topology.

Tests of selection
The effect of natural selection on the evolution of the mtDNA PCGs was assessed by comparing the number of nonsynonymous changes per nonsynonymous sites (dN) with that of synonymous changes per synonymous site (dS) (Yang et al., 2000). We further estimated the impact of selection along the mtDNA phylogeny of seabirds using codon models to assess the rates of synonymous  (Delport et al., 2010), by choosing the vertebrate mitochondrial DNA genetic code. The phylogeny estimated in this work, unrooted and excluding the outgroup, was used in all analyses of selection.

Genome organization
The circular mitogenome of H. ostralegus is 16,798 bp in length with 13 PCGs, 2 ribosomal RNAs (12S rRNA and 16S rRNA), 22 transfer RNA genes, and a noncoding region ( Table I). The annotated mitogenome of H. ostralegus has been deposited in GenBank (accession number: MH727533). The graphical mitogenome map was visualized using the software OGDRAW (Lohse et al., 2013), and the map was shown in Figure 1. The overall nucleotide composition was A: 31.45%, T: 23.46%, C: 31.30%, G: 13.79%. Among the 37 genes, nine genes (tRNA Gln , tRNA Ala , tRNA Asn , tRNA Cys , tRNA Tyr , tRNA Ser , ND6, tRNA Pro and tRNA Glu ) were encoded on the light strand, and the remaining 28 genes were encoded on the heavy strand. Gene overlaps have been found at 7 gene junctions, spanning 1-10 nucleotides, for a total of 25 nucleotides. The longest overlap (10 bp) exists between ATP8 and ATP6. The intergenic spacer regions occurred 19 times, spanning 1-14 bp, for a total of 81 bp. The gene order of the H. ostralegus mitogenome is identical to that of Charadriiformes mitogenomes, without showing any structural rearrangement.

Nucleotide composition
The 32 mitogenomes within Charadriiformes were summarized and compared (Table II). The differences in length are almost due to the size variation of the noncoding region. The nucleotide composition showed highly similar nucleotide composition biases towards AT rich (mean= 55.74, SD= 0.86) (Table II), which is consistent with previous avian mitogenomes (Hu et al., 2017). AT and GC skews are a measure of compositional asymmetry. In Charadriiformes mitogenomes, AT skew values were positive, while the values of GC skew were negative. The AT and GC skew values observed were 0.12 ± 0.02 (mean ± SD) and −0.38 ± 0.01, respectively. In general, AT and GC skews in Charadriiformes mitogenomes are similar to patterns typically found in most animal mitogenomes, which positive AT skew and negative GC skew are found for H-strand, implying the specific bias toward A and C in nucleotide composition (Hassanin et al., 2005;Hassanin, 2006).

Protein-coding genes (PCGs)
The total length of 13 PCGs is 11,392bp accounting for 67.82% of the complete genome. The nucleotide composition of PCGs can be found (Table II). All PCGs began with the ATG start codon, except ND3 (started with ATT) and ND5 (started with GTG). Three types of termination codons are used in this mitogenome, two PCGs (ND1and COI) use the complete stop codon AGG, seven PCGs (COII, ATP8, ATP6, ND3, ND4L, ND5 and Cyt b) stop with TAA, but the remaining four PCGs (ND2, COIII, ND4, and ND6) terminate in the incomplete stop codons T-.
Nucleotide composition bias is also reflected in the codon usage pattern. Among 62 amino acid encoding codons, CUA-Leu (L), AUC-Ile (I), and UUC-Phe (F) were the most frequently used codons. The least frequent codons were CCG-Pro (P), ACG-Thr (T) and CGG-Arg (R). Relative synonymous codon frequencies (RSCU) values were summarized in Table III, revealed that the degenerate codon usage at the third codon positions is generally biased to use more As and Ts than Gs and Cs. The most common amino acids were Leu, Ile and Phe, which are invariably rich in mitochondrial proteins of other birds (Li et al., 2014). However, Charadriiformes species used more codon GGA for amino acid Gly than other three degenerate codons (GGG, GGC, and GGU) (Hu et al., 2017).

O n l i n e F i r s t A r t i c l e
Complete Mitochondrial Genome of Haematopus ostralegus  In order to study the codon usage bias among Charadriiformes, we also analyzed the correlations between ENC (effective number of codons), CBI (codon bias index), the G + C content of all codons (G + Cc), and the G + C content of the third codon position (G + C3s). We found a significant negative correlation between CBI and ENC (R = 0.94, P < 0.05), and a significant positive correlation was found between CBI and G + Cc (R = 0.50, P < 0.05) and G + C3s (R = 0.58, P < 0.05) (Fig. 2). However, other pairs were not correlated with each other pairs. These results were consistent with the neutral mutational theories, in which the G + C content of mitochondrial genome was reported to be the most significant factor in determining codon bias among organisms (Plotkin and Kudla, 2011).

Phylogenetic analysis
The phylogenetic analysis resolved a well-supported clade of Charadriiformes (Fig. 3), which showed great mitochondrial divergence within the Charadriiformes. Relationships of the phylogeny strongly support monophyly O n l i n e

F i r s t A r t i c l e
Complete Mitochondrial Genome of Haematopus ostralegus of the order Charadriiformes, which has a long evolutionary history dating back at least to the late Cretaceous (Baker et al., 2007). This phylogeny was in agreement with previous family hypotheses for shorebirds (Paton and Baker, 2006;Hu et al., 2017)

Test of selection
We compared Ka/Ks ratios for all 13 PCGs in 32 Charadriiformes species. The average value of Ka and Ks ranged from 0.01 of COI to 0.09 of ATP8 and from 0.53 of ATP8 to 0.88 of ND1, respectively (Table IV). The ratio of Ka/Ks of all PCGs was far lower than one (≤ 0.17), indicating that all these genes evolved under purifying selection.  We then applied several methods to detect evidence of positive selection affecting mtDNA PCGs throughout the phylogenetic tree of seabird. Several positive selection sites were found based on site-specific analyses (Table IV). There are 12 different codons in eight genes were suggested to have evolved under positive selection by REL, FEL, or SLAC. Only three codons (in ND2, ND4 and ND6) were concordant in two of the methods (Table IV). MEME, which is particularly sensitive to events of episodic selection, suggested that 27 sites of 11 genes evolved under positive selection, which may have affected several lineages along the evolution of seabirds.

O n l i n e F i r s t A r t i c l e
Complete Mitochondrial Genome of Haematopus ostralegus