Insight into the Phylogenetic Position of Capreolus capreolus , the European Roe Deer, (Mammalia: Cervidae) from Turkey using Mitochondrial Genes

For contributing to a better understanding of the phylogenetic position and evolutionary history of Turkish roe deer, mitochondrial DNA sequence data belonging to a population from Artvin, Turkey were introduced by this study. The cytochrome b (cyt b ) and cytochrome c oxidase subunit I (COI) genes of mitochondrial DNA belonging to the four samples of Capreolus capreolus were partially sequenced and analyzed alongside the sequences obtained from GenBank. The cyt b and COI data sets included sequences that were 1059 and 646 bp long, respectively. The Turkish roe deer population in Artvin had a relatively low nucleotide diversity and high haplotype diversity. A total of two haplotypes were identified for cyt b , while three haplotypes were identified for COI. Of the haplotypes detected, two cyt b and two COI haplotypes were specific to Turkey. Molecular phylogenetic analyses; Maximum Likelihood (ML), Bayesian Inference (BI), and Network revealed topologies and evolutionary networks which were not definitely resolved. Consequently, Turkish roe deer haplotypes were intertwined with the roe deer haplotypes from Central Europe instead of forming an independent evolutionary clade, and were also, found genetically close to Central European haplotypes. According to the evolutionary dating analysis, the divergence of C. capreolus populations in Europe started at the beginning of the Pleistocene (approximately 2.72 mya). This analysis also demonstrated that the divergence of Turkish roe deer began at approximately 0.58 mya. Given the existence of, shared haplotypes, the relative positions of Turkish roe deer haplotypes in the topologies of the ML and BI phylogenetic trees and evolutionary networks, and the times of evolutionary divergence events within this species; the following results may be inferred: (1) roe deer individuals with the same genotype as those in Europe can also be found in areas further east, such as Turkey; (2) these could be considered to be representatives, which may have contributed to the formation and shaping processes of the known admixed genetic structure of European roe deer populations in Europe; and finally, (3) the roe deer populations may have probably entered Turkey from Europe when evolutionary divergence times based on BI analysis are considered.


INTRODUCTION
T he genus Capreolus of the Cervidae family includes deer species known as roe deer. Two species of roe deer are Capreolus capreolus (European roe deer or Western European roe deer) and Capreolus pygargus (Siberian roe deer). These two species differ from each other in body size and morphometric and karyological features (Danilkin, 1996). C. capreolus, known as European roe deer, has a wide distribution in the Palearctic region. This species is commonly distributed in the Marmara, northwest Aegean, Black Sea, northeast Anatolia and Mediterranean regions in Turkey (Demirsoy, 1996). The Siberian roe deer, C. pygargus is mostly found in northeastern Asia (in Siberia, Mongolia, Kazakhstan, Kyrgyzstan, the Korean Peninsula and north-eastern China) (Danilkin, 1996). While hunting and habitat fragmentation have had a major impact on roe deer populations, it is a widespread species and is listed as being of least concern (LC) on the red list of threatened species, according to the IUCN (Lovari et al., 2016). Previous studies on the taxonomy, population genetics and phylogeny of this species have generally been based on morphology and molecular data (Lister et al., 1998;Randi et al., 1998;Hewison and Danilkin,

O n l i n e F i r s t A r t i c l e
2001; Vernesi et al., 2002). Recent molecular studies using mitochondrial and nuclear gene sequences have shown that this species includes populations in Europe with a high degree of genetic diversity in Europe and that these populations generate significant genetic lineages (Randi et al., 2004;Lorenzini and Lovari, 2006;Amiri et al., 2021). According to these results, the genetically differentiated roe deer lineages are localized to a certain degree in southwest Europe (the Iberian Peninsula), south-east Europe, central and southern Italy, northern Portugal and Greece. Contrary to this, however, there is genetic homogeneity and complexity in roe deer populations found in northcentral Europe (Randi et al., 2004;Lorenzini and Lovari, 2006;Royo et al., 2007). Molecular-based studies on the phylogeny and phylogeography of this species are mostly based on control region (CR), cytochrome b (cyt b), and NADH dehydrogenase 2 (ND2) sequence variations of mitochondrial DNA (Randi et al., 1998;Wiehler and Tiedemann, 1998;Vernesi et al., 2002;Randi et al., 2004;Royo et al., 2007;Matosiuk et al., 2014;Amiri et al., 2021). Although cytochrome c oxidase subunit I gene sequences (COI) are available in GenBank, there has been no phylogenetic inference based on the sequence of this gene region of mitochondrial DNA. In addition, samples from Turkey have not been included in the past studies, based on mitochondrial DNA sequence variations, mentioned. This study aimed to contribute to the molecular phylogeny of C. capreolus by sequencing the cyt b and COI gene regions of the samples obtained from Artvin province in the northeast of Turkey, analyzed together with the sequences currently available in GenBank. Thus, it is thought that it will be even more possible to reveal the relative phylogenetic position and evolutionary history of the roe deer samples from Turkey, which has a rich biological diversity, in the roe deer phylogeny.

Sampling, extraction and amplification of genomic DNA
In this study, tissues belonging to four roe deer individuals who died as a result of traffic accidents or naturally from the Artvin province of Turkey were used. The experimental studies were approved by the Ethics Committee of Artvin Çoruh University (Relevant permit date and number: 05.04.2022-45536). In the scope of molecular studies, partial sequences of cyt b and COI genes were analyzed. Total genomic DNA was extracted from muscle tissues according to the CTAB isolation method (Doyle and Doyle, 1990). Amplification of both gene regions was performed by polymerase chain reaction employing the following primer pairs: H15915R and L14724 a for cyt b (Irwin et al., 1991) and LCO1490 and HC02198 for COI (Folmer et al., 1994) gene regions. For the amplification reactions, a master mix in 25 µl volume was prepared with the following contents:10.6 μl sterile water, 2.5 μl buffer (750 mM Tris-HCl pH 8.8 at 25 °C) 200mM (NH 4 ) 2 SO 4 , 0.1 % Tween 20, Fermentas), 4 μl dNTP mix (A, C, G, T 200 µM), 2.5 µl MgCl 2 (25 mM), 2 μl forward and reverse primers (20 pmol, Fermentas), 0.4 μl Taq polymerase (500 U, Fermentas) and 1 μl DNA (at least 500 ng/μl for each sample). PCR reactions were performed in an Applied Biosystems® Veriti® 96-Well Thermal Cycler, according to the procedure maintained by Irwin et al. (1991) for cyt b and by Folmer et al. (1994) for COI genes. In the next step, PCR products were loaded on 0.8% agarose gel and run by applying 70 V for 90 minutes in 1XTAE. The thermo scientific GeneRuler 100 bp DNA ladder was employed as a marker to estimate the size of each DNA sample on agarose. Afterwards, all gels were washed with ethidium bromide (EtBr) for 30 minutes and then viewed using the KODAK Gel Logic 100 System. The sequencing procedure was accomplished by Macrogen Europe (Macrogen Inc., Amsterdam, Netherlands).

Phylogenetic and evolutionary dating analyses
Alignment of forward and reverse sequences of each sample and obtaining of consensus sequences were performed by BioEdit 7.0.5.3 (Hall, 1999). The cyt b data set included sequences which were 1059 bp long, while the COI data set comprised 646 bp of sequences. The cyt b and COI sequences belonging to C. capreolus and Cervus elaphus downloaded from GenBank in both data files were used as the outgroup. A model-based phylogenetic analysis method, maximum likelihood (ML), implemented in MEGA X, was employed to determine the evolutionary relations (Kumar et al., 2018). For the construction of ML phylogeny, the TN93 for cyt b data set and the K2 for COI data set were chosen using MEGA X as the most suitable nucleotide substitution models (Kumar et al., 2018). The robustness of the ML phylogenetic tree was determined by using a nonparametric bootstrap with 1000 replicates. In addition, a median-joining network was constructed using Network 10.2.0.0 to exhibit relationships among subpopulations (Bendelt et al., 1999). DNA polymorphism, number of haplotypes, haplotype and nucleotide diversity, and their standard deviations were determined using DNAsp v6.12.03 (Rozas et al., 2017). To evaluate the demographic past of populations Fu's Fs (Fu and Li, 1993;Fu, 1997) and Tajima's D neutrality test (Tajima, 1989) were calculated by DNAsp v6.12.03 (Rozas et al., 2017). The genetic distance estimation between populations was determined under the Kimura-2 parameter model (K2P) with 1000 bootstrap replicates in MEGA X (Kumar et al., 2018).

O n l i n e F i r s t A r t i c l e
The BEAST v1.8.0 software was used to date evolutionary divergence using cyt b sequences (Drummond et al., 2012). The cyt b sequences of Cervus elaphus, Muntiacus reevesi, Muntiacus crinifrons, Alces alces, Odocoileus virginianus, and Mazama americana were downloaded from GenBank and they were employed as outgroups in this analysis. In the Bayesian Inference analysis, the Yule tree prior was selected and a strict molecular clock approach was used. For the strict clock approach, the nucleotide substitution rate was set to be the half of the 2% (0.01) per nucleotide per million years for the cyt b sequences (Taberlet et al., 1998;Johns and Avise, 1998). TN93+I was selected as the best nucleotide substitution model by MEGA X (Kumar et al., 2018). Two calibration points were used; one was the divergence time of Cervini/Muntacini (9±1 million years ago, mya) and another one was the age of extant Odocoileini (5±1 mya) (Hassanin et al., 2012 and references therein). One independent MCMC (Markov Chain Monte Carlo) for 10,000,000 generations with a sampling frequency of every 1000 generations was executed. Tracer v1.6 (part of the BEAST package) was used to check whether the ESS >200 (the lower bound of effective sample size) was valid or not in the independent run. The acquired tree file containing 10,000 trees was summarized using TreAnnotator v1.8.0 (part of the BEAST package) by removing the initial 10% (1000 trees) of the sampled trees as burn-in, and a 50% majority-rule consensus tree was constructed to calculate the BPPs and divergence times of the tree nodes. The tree constructed from the BI analysis was visualized and modified by FigTree v1.4.0 (part of the BEAST package).

RESULTS
The cyt b sequences of the roe deer population in Turkey included three variable regions, no parsimonyinformative sites, and three singleton variable sites. Both nucleotide and haplotype diversity (π=0.00142±0.00075, h=0.500±0.265) were relatively low in the cyt b gene. The number of detected haplotypes was two in cyt b. Both of the two cyt b haplotypes (Haplotype 1 and 2) were specific to Turkey. The nucleotide frequencies were A=30.2%, T(U)=29.8%, C=27%, G=13%. On the COI side, the sequences of Turkish specimens consisted of two variable regions, one parsimony-informative and one singleton variable. This population had relatively low nucleotide diversity and high haplotype diversity (π=0.00181±0.00056, h=0.833±0.222). The three COI haplotypes were determined in a total of four roe deer samples. Two of them (Haplotype 1 and 3) were specific to Turkey. One haplotype (Haplotype 2) was shared with the haplotypes from France and Poland. The nucleotide frequencies were A=28.1%, T(U)=33.2%, C=22.1%, G=16.7%. Tables I and II show the haplotypes of the roe deer samples used in the study and the detail of the haplotypes downloaded from GenBank and used in the phylogenetic analyses.    The ML tree reconstructed by the cyt b sequences revealed a complex phylogeny. The resulting topology was evaluated as six phylogenetic groups (clades C1 to C6) in which all haplotypes from different countries were nested. Haplotypes from Turkey were included in the C2 clade, which also consisted of haplotypes from Ukraine, France, Russia, and Iran (Fig. 1). According to the ML phylogenetic tree based on the COI sequences using the Kimura-2 nucleotide substitution model, two main clades appeared, consisting of haplotypes from Central Europe and Korea, with low bootstrap support. Turkish haplotypes from Artvin and one haplotype from China obtained from GenBank were involved in the Central Europe clade, which also comprised haplotypes from France, Italy, Austria and Poland (Fig. 2). The results of phylogenetic trees were approved by the median-joining haplotype networks. The haplotype network generated by the cyt b haplotypes also revealed six complex phylogenetic groups within C. capreolus, similar to the ML topology. The haplotype network based on the COI sequences revealed two distinct haplogroups, one in Central Europe and another in Korea, as revealed by the ML tree ( Figs. 1 and 2). The topology of the BI tree was similar to the ML tree based on cyt b. According to that, six complex phylogenetic groups, not fully resolved, were determined within C. capreolus. The BI phylogenetic tree, which was calibrated using two different divergence times (1. Cervini/Muntacini= 9±1 mya and 2. the age of extant Odocoileini: 5±1 mya), showed that European roe deer populations began to differentiate genetically within themselves about 2.72 million years ago before the present (Fig. 3). The Turkish roe deer population had the lowest nucleotide and haplotype diversity among all populations in both COI and the cyt b sequences. Tajima's D and Fu's Fs statistics were not statistically significant for the Turkish population. On the other hand, Fu's Fs statistic had a high negative value for the populations from Poland in the cyt b sequences, indicating a recent population expansion. However, this statistic was positive in the COI sequences of the same population (Table III). Considering the genetic distance values calculated based on the Kimura-2 parameter, the Turkish population was genetically closest to France in COI sequences and Italy in cyt b sequences (Table IV).

DISCUSSION
The genetic diversity and phylogeography of European roe deer, C. capreolus, were broadly investigated by using mitochondrial DNA sequences in past studies. The molecular marker primarily utilized in most of those studies was the sequence of the control region (or D-loop) (Randi et al., 1998;Wiehler and Tiedemann, 1998;Vernesi et al., 2002;Randi et al., 2004;Zachos et al., 2006;Baker and Hoelzel, 2013;Matosiuk et al., 2014;Horcajada et al., 2018;Liu et al., 2020). The sequences of NADH dehydrogenase 2 and cyt b have also been used in a small

O n l i n e F i r s t A r t i c l e
P.S. Seker number of studies for the aforementioned purposes (Amiri et al., 2021;Matosiuk et al., 2014;Kashinina et al., 2018). Although the cyt b and COI sequences of roe deer living in diverse countries have been available in GenBank, the relevant gene sequences of the Turkish samples have not been present. Among the past studies on the roe deer phylogeny and phylogeography, the samples from Turkey have not been included in phylogenetic analyses in any study, except for one study based on RFLP (Lorenzini and Lovari, 2006). Also, there has not yet been any study on the European roe deer phylogeny using the COI sequences.
In this study, the mitochondrial DNA cyt b and COI genes of the roe deer population living in Artvin, Turkey were partially sequenced, and thus genetic data related to this population were provided. The data obtained by this study and the cyt b and COI sequence data available in GenBank were analyzed together in order to try to contribute to the molecular phylogeny and evolutionary history of the European roe deer. In the genetic diversity and phylogenetic analyses based on both gene sequences, not only were the haplotypes specific to Turkey determined, but haplotypes shared with Central European populations (from France and Poland) were also detected. Moreover, according to the topologies of the phylogenetic trees reconstructed by the ML and BI methods in this study, it was identified that the Turkish haplotypes were intertwined with the haplotypes from Central Europe instead of forming an independent group, and they were genetically close to Central European haplotypes. Supporting this, the estimates of genetic distance using the Kimura 2-parameter model between roe deer populations showed that the Turkish population was genetically close to populations from Austria, Italy, Iran, and France. These results were in line with the findings of the only study including a sample from Turkey, that of Lorenzini and Lovari (2006). In terms of the phylogeographic scenario revealed by past studies on roe deer, it has been thought that the genetic structure of roe deer in Northern and Central Europe is an admixture shaped by the resulting adaptive radiation of roe deer populations found in Mediterranean refuges (the Iberian Peninsula, southern Italy, and Greece) during the Pleistocene (Randi et al., 2004;Royo et al., 2007). The role of the populations living in refuges further east, which may have contributed to the formation of this complex genetic structure, has not been fully known. It has often been emphasized that refuge areas in the southeast of Europe, such as Turkey, played an important role in the formation and shaping of today's European biodiversity during the glacial periods (Hewitt, 1996;Taberlet et al., 1998;Bilgin, 2011), and this has also been proved by numerous phylogenetic and phylogeographic studies performed on different taxa (Bilgin, 2011 and references therein). The results of this study indicated that individuals with the same genotypes in Europe can also be found in areas further east, such as Turkey, and these can be considered as representative in terms of having contributed to the shaping of the mentioned admixed genetic structure of European roe deer populations. The current study therefore makes a substantial contribution to a better understanding of the processes shaping the genetic diversity and phylogeny of European roe deer, and the role of Turkey in this, by presenting new genetic data from Turkey for this species.
Evolutionary divergence time analysis based on cyt b sequences demonstrated that European roe deer populations began to differ genetically within themselves at about 2.72 mya before the present. This time period, which approximately corresponds to the Pliocene-Pleistocene transition, is compatible with the proposal that the genus Capreolus first appeared and began to speciated across Europe between 3 and 2.5 or 2 mya (Randi et al., 1998;Valli, 2010). The genetic differentiation of Turkish samples along with the samples from Europe and Iran corresponds to approximately 0.58 mya. This time period is earlier than the evolutionary divergence times (between O n l i n e

F i r s t A r t i c l e
Insight into the Phylogenetic Position of Capreolus capreolus 9 115 and 11.7 ky) and the occurrence of haplotype diversity (370 and 150 ky) identified in previous studies for European roe deer populations (Randi et al., 1998;Vernesi et al., 2002;Matosiuk et al., 2014). This situation can be explained by suggesting that the estimated divergence time could reflect a much earlier date than any radiation event, because the evolutionary divergence events might have lasted longer than the duration of the glacial periods (Taberlet et al., 1998;Lorenzini and Lovari, 2006). Given the presence of common haplotypes, the relative positions of roe deer haplotypes from Turkey in the ML and BI phylogenetic tree topologies, and the times of evolutionary divergence events within this species, it is conceivable that roe deer populations in Turkey may have entered from Europe. The statistically significant negative values of Fu's Fs test calculated by the current study indicated a population expansion (Fu and Li, 1993;Fu, 1997) for the European roe deer populations, and, in particular, for the Polish population, supporting this idea. In addition, a recent study in Iran by Amiri et al. (2021) revealed similar findings, which serve to strengthen this contention. It is clear that more comprehensive analyses involving different markers are needed, along with more sampling, in order to make more precise inferences about the phylogeny and phylogeography of Turkish roe deer populations. In this sense, this study can be considered a preliminary study for the determination of genetic peculiarities and inference of phylogeny and evolutionary history of C. capreolus from Turkey.

ACKNOWLEDGMENTS
I would like to thank Dr. Gökçe Ali Keleş from Artvin Çoruh University and Dr. Engin Selvi from Ankara University for their help in sample supplying and experimental studies.

Funding
No funding was received for conducting this study.

IRB approval
This study has been approved by the Ethics Committee of Artvin Çoruh University at which the study was conducted (Relevant permit date and number: 05.04.2022-45536).

Ethical statement
All stages of this study (sampling, experiments, phylogenetic and evolutionary analysis) were carried out in accordance with the rules of scientific ethics. Hereby, the author declared that it is the author's own original work, it has not been previously published elsewhere, it reflects the author's own research and analysis in a truthful and complete manner.

Statement of conflict of interest
The author has declared no conflict of interest.