Cloning and Bioinformatics of CTSD Gene and its Expression at the Onset of Puberty in Duolang Sheep

Recent studies have demonstrated that the CTSD gene plays a role in the regulation of reproduction in mammals. However, its role in the onset of puberty in sheep remains unknown. In this study, the cDNA sequence of CTSD from Duolang sheep was cloned and sequenced, and the expression levels of CTSD were measured in the hypothalamus, pituitary, ovary, uterus and oviduct at three developmental periods: prepuberty, puberty, and postpuberty. The coding region of the CTSD gene of Duolang sheep had a full length of 1,239 bp, encoding 412 amino acids. The sequence of Duolang sheep CTSD was similar to those of other mammals and was in line with their evolutionary relationships. CTSD mRNA was expressed in all the tissues examined at prepuberty, puberty, and postpuberty. The expression level of CTSD in the hypothalamus and uterus was relatively low, whereas its expression level in the ovary at puberty was significantly higher than that in other tissues. These results suggest that CTSD may play an important role in the onset of puberty in sheep by regulating the development and ovulation of follicles in the ovary.


INTRODUCTION
Xinjiang is an area where ethnic minorities gather. There is a large demand for beef and lamb in Xinjiang, and supply and demand are in a tight balance. In recent years, the high-frequency breeding technique of three litters every two years has become popular in southern Xinjiang to reduce production costs and improve the fecundity and production efficiency of beef cattle and mutton sheep (Luo and Liu, 2019; Ma and Zhang, 2019). However, most sheep breeds in Xinjiang are sexually late-maturing breeds with late puberty and a low reproduction rate, which greatly limits sheep numbers and mutton production. Many studies have shown that advancing puberty and the mating age of ewes has no obvious adverse effects on their growth and development, but is very beneficial to production and breeding, because it shortens the generation interval and increases the final number of offspring produced by ewes (Nonneman et al., 2016).
Cathepsin D (CTSD) has different functional roles in various tissues of different species, and it plays a key role in the regulation of reproduction in mammals and aquatic animals. It is closely related to the occurrence and progression of human malignant tumors and is a marker of cancer development (Zhang et al., 2014). In insects, CTSD is an essential proteolytic enzyme involved in metamorphosis (Yuanping et al., 2006). CTSD also affects meat quality and has a significant impact on beef marbling (Guixing et al., 2011). The bovine CTSD gene is located on chromosome 29 and contains eight introns and nine exons, with a total length of 9,443 bp. There are four single-nucleotide polymorphism (SNP) sites in its exons (Balbín, 1994; Zimin et al., 2009). The chicken CTSD gene is located on chromosome 5, and a synonymous mutation site in its exon 3 has a significant impact on egg weight and yolk weight (Liu et al., 2021). The CTSD gene plays a key role in cell apoptosis and in ovarian growth and development (Cocchiaro, 2016; Morais et al., 2016), and CTSD is involved in vitellogenic deposition and hydrolysis during ovary development of Chinese sturgeon (Lihong et al., 2018). Moreover, it is widely expressed in the forelimb, hindlimb, fat, longissimus dorsi muscle, heart, liver, spleen, kidney, lung, and stomach of pigs (Mei, 2007). Pan et al. (2011) cloned the cDNA of the CTSD gene of Pinctada maxima and found that it had a full length of 1,742 bp and was expressed in all tissues, with the highest expression in the gonads, followed by the hepatopancreas, indicating that CTSD had an effect on gonadal development. Zhou et al. (2020) cloned the CTSD gene of the Qianbei Ma goat and found that its expression was highest in the ovary. Feng et al. (2018) found that the expression level of CTSD in the ovary of high-prolificacy Hu sheep was significantly higher than that in low-prolificacy ewes.
Puberty is the time when an animal first shows estrus and ovulates, and when reproduction begins. The hypothalamic-pituitary-gonadal axis is the key pathway that regulates puberty in sheep. The Duolang sheep is an excellent local breed in southern Xinjiang that is characterized by year-round estrus and early sexual maturity; it reaches puberty at 3-4 months of age and can then be bred. In this study, the cDNA sequence of CTSD from Duolang sheep was cloned and sequenced, and the expression levels of CTSD were measured in the hypothalamus, pituitary, ovary, uterus and oviduct at three developmental periods: prepuberty (juvenile), puberty, and postpuberty. The aim of the study was to provide a scientific basis for further studies on the relationship between CTSD gene function and puberty in sheep.

MATERIALS AND METHODS
This work was conducted in accordance with the specifications of the Ethics Committee of Tarim University of Science and Technology. Animal research was conducted in compliance with the guidelines of the Animal Ethics Committee (SYXK 2020-009).

Animals and sample collection
A total of thirty female Duolang sheep of similar age (2-3 months), health status, feeding conditions and body weight (10-15 kg) from Xinjiang Wuzheng Green Agricultural Development Co., Ltd. were selected as the experimental animals. The ewes were evaluated for puberty after 90 days of age. Rams were placed with the ewes at 9 am and 6 pm for 1-2 h to test for estrus. The criteria for estrus were that the ewes were agitated and very sensitive to external stimuli, urinated frequently, had a red and swollen vulva with mucus, and stood still to accept mating (Dantas, 2016).
Ten Duolang sheep were slaughtered in each of the prepubertal period (90 days), the pubertal period, and the postpubertal period. Hypothalamus, pituitary, uterus, ovaries, oviduct, and other tissues were rapidly collected. Tissues were cut into small pieces using sterile surgical scissors, placed in 5-ml cryopreservation tubes, and stored in liquid nitrogen for later use. This process requires rapid manipulation to prevent RNA degradation inside the tissues.

Design and synthesis of primers
Based on the mRNA sequence of the sheep CTSD gene published in the NCBI GenBank database (accession number: AF164143.1), the primers for fluorescence quantitative PCR and coding sequence (CDS) cloning were designed using Primer 5.0 software (Table I). The β-actin gene (ACTB) was used as the internal reference gene. The primers were synthesized by Sangon Biotech (Shanghai) Co., Ltd.

Extraction of total RNA from tissues and cDNA synthesis
Total RNA from the hypothalamus, pituitary, uterus, oviducts, and ovaries at the prepubertal, pubertal, and postpubertal periods was extracted using TRIzol reagent (Thermo Fisher Scientific, Waltham, MA, United States). The concentration and purity of total RNA were determined by ultramicro spectrophotometry, and the samples were qualified according to the optical density ratio (OD260/OD280). Total RNA extracted from the tissues was reverse-transcribed to synthesize first-strand cDNA using a reverse transcription kit (TaKaRa). The total reverse transcription system was 20 μL. First, 2 μL 5× gDNA buffer, 1 μL gDNA Eraser, 5 μL RNase-free ddH2O, and 2 μL RNA were incubated at 42 °C for 2 min. Then 1 μL PrimeScript RT Enzyme Mix, 1 μL RT Primer Mix, 4 μL 5× PrimeScript buffer, and 4 μL RNase-free ddH2O were added, and the reaction was run at 37 °C for 15 min and 85 °C for 5 s. The cDNA was stored in a freezer at -20 °C.
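The component volumes listed above can be checked for internal consistency with a few lines of arithmetic. The following is only a sanity-check sketch: the dictionary names and helper functions are hypothetical and are not part of the kit protocol.

```python
# Component volumes (in microlitres) exactly as listed in the text.
# Dictionary and function names are illustrative assumptions.
GDNA_REMOVAL_STEP = {
    "5x gDNA buffer": 2.0,
    "gDNA Eraser": 1.0,
    "RNase-free ddH2O": 5.0,
    "RNA": 2.0,
}
RT_STEP = {
    "PrimeScript RT Enzyme Mix": 1.0,
    "RT Primer Mix": 1.0,
    "5x PrimeScript buffer": 4.0,
    "RNase-free ddH2O": 4.0,
}


def total_volume(*steps):
    """Sum the volumes of all components across one or more steps."""
    return sum(volume for step in steps for volume in step.values())


def final_buffer_fold(stock_fold, buffer_volume, final_volume):
    """Return the working concentration (in 'x') of a concentrated buffer."""
    return stock_fold * buffer_volume / final_volume


if __name__ == "__main__":
    print(total_volume(GDNA_REMOVAL_STEP))            # volume after step 1
    print(total_volume(GDNA_REMOVAL_STEP, RT_STEP))   # final reaction volume
    print(final_buffer_fold(5, 4.0, 20.0))            # PrimeScript buffer
```

Under these figures the two steps add up to the stated 20 μL system, and each 5× buffer is diluted to 1× in the volume present at its step.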

PCR amplification and cloning
The CTSD CDS was amplified by PCR using cDNA as the template. The PCR system was 25 μL, including 12.5 μL 2× PCR Master Mix, 1 μL each of the upstream and downstream primers, 1 μL cDNA, and 9.5 μL ddH2O. The PCR amplification program was predenaturation at 95 °C for 5 min; 40 cycles of denaturation at 95 °C for 20 s, annealing at 58 °C for 30 s, and extension at 72 °C for 30 s; and a final extension at 72 °C for 15 min. Five microliters of PCR product was analyzed by 1.5% agarose gel electrophoresis at 120 V and 110A for 20-30 min to observe the length of the target gene fragment.

Bioinformatic analysis of the CTSD gene
After the sequencing results were spliced using DNAMAN software, the cloned sequence was searched for homology with the reference sequences of the CTSD gene in other species published on NCBI using the Basic Local Alignment Search Tool. The nucleotide sequences of the CTSD gene were then used for homology alignment and to construct a phylogenetic tree using Mega 5.0 software. The open reading frame (ORF) was identified using the NCBI online program (http://www.ncbi.nlm.nih.gov/orffinder/), and the CDS region was translated into an amino acid sequence. The amino acid sequence of the CDS region was submitted to the ProtParam program on the ExPASy website (http://www.expasy.org/resources) to analyze the physicochemical properties of the CTSD protein. PSORT II software (http://www.psort.hgc.jp/form2.html) was used to analyze the localization of the CTSD protein, and the signal peptide region and cleavage site within the amino acid sequence were predicted using the online program of the SignalP 4.0 server (http://www.cbs.dtu.dk/services/SignalP-4.1). The SOPMA and Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) online programs were used to predict the secondary and tertiary structures of the protein.
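To make the relationship between CDS length and protein length concrete, the translation step can be sketched as follows. This is a minimal illustration using the standard genetic code, not the ORFfinder or DNAMAN implementation; the function names and the toy input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a complete CDS (start codon through stop codon)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if not cds.startswith("ATG"):
        raise ValueError("CDS must begin with the ATG start codon")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            if i != len(cds) - 3:
                raise ValueError("premature stop codon")
            return "".join(protein)
        protein.append(amino_acid)
    raise ValueError("CDS does not end with a stop codon")


def expected_protein_length(cds_length_bp):
    """Number of residues encoded by a CDS that includes its stop codon."""
    return cds_length_bp // 3 - 1


if __name__ == "__main__":
    print(translate_cds("ATGGCCAAATGA"))   # toy sequence, not CTSD
    print(expected_protein_length(1239))
```

A CDS of 1,239 bp contains 413 codons; the last is the stop codon, which leaves the 412 amino acids reported for Duolang sheep CTSD.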

Real-time fluorescence PCR (qRT-PCR)
The expression of the Duolang sheep CTSD gene was detected in the hypothalamus, pituitary, uterus, oviduct, and ovary by real-time PCR (qPCR). The qPCR system was 15 μL, including 1 μL of cDNA template, 7.5 μL of 2× Transtant qPCR Mix, 0.5 μL each of the upstream and downstream primers (100 μmol/L), and 5.5 μL of ddH2O. The cDNA was diluted threefold with ddH2O. The ACTB gene was the internal reference gene. The CTSD gene was detected by qPCR in triplicate. The reaction program was as follows: predenaturation at 95 °C for 10 min; denaturation at 95 °C for 20 s; annealing at 60 °C for 30 s; and 40 cycles of 95 °C for 15 s, 60 °C for 30 s, and 95 °C for 15 s. The melting curve was generated automatically by the instrument (starting at 65 °C and increasing by 0.5 °C every 5 s up to 95 °C). The cycle threshold (CT) values of the internal reference gene and the target gene were read. The relative expression level of the CTSD gene was calculated using the 2^-ΔΔCT method. The results were compared by one-way analysis of variance using IBM SPSS Statistics 26 software.
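The 2^-ΔΔCT calculation can be sketched as below. This is a minimal illustration: the choice of calibrator sample is not stated in the text and is an assumption here, the function names are hypothetical, and the CT values in the usage example are invented for demonstration only.

```python
def mean(values):
    """Arithmetic mean, used here to average technical triplicates."""
    return sum(values) / len(values)


def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^-ddCT method.

    ct_target / ct_reference: CT of CTSD and ACTB in the sample of interest.
    ct_*_calibrator: the same two CT values in the calibrator sample
    (which sample serves as calibrator is an assumption of this sketch).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Invented CT triplicates, for illustration only.
    sample_ctsd = mean([22.1, 21.9, 22.0])
    sample_actb = mean([18.0, 18.1, 17.9])
    calib_ctsd = mean([24.0, 24.1, 23.9])
    calib_actb = mean([18.0, 18.0, 18.0])
    print(relative_expression(sample_ctsd, sample_actb, calib_ctsd, calib_actb))
```

A sample whose ΔCT is two cycles lower than that of the calibrator is therefore reported as roughly fourfold higher in expression, which assumes near-100% amplification efficiency for both primer pairs.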

Western blot analysis
For total protein extraction, the tissue was frozen in liquid nitrogen, and 100 mg of tissue was added to 500 μL of RIPA lysis buffer and 5 μL of PMSF. The mixture was placed on ice for 30 min and homogenized, then centrifuged at 12,000 rpm and 4 °C for 15 min, and the supernatant was collected. A BCA assay kit was used to determine total protein concentrations, and the protein samples were boiled with loading buffer at 100 °C for 10 min. Protein samples (20-40 μg of protein per lane) were separated by electrophoresis on 10% SDS-polyacrylamide gels and transferred onto polyvinylidene fluoride (PVDF) membranes. The membranes were incubated overnight at 4 °C with the primary antibodies: anti-CTSD antibody (1:1,000, ab46020, Abcam, Boston, MA, USA) or anti-GAPDH antibody (1:1,000, ab7291, Abcam). After washing with TBST, the blots were incubated with an alkaline phosphatase-conjugated secondary antibody (1:5,000 dilution in TBST) for 1 h at 37 °C. The reactive proteins were visualised using chemiluminescence (ECL) western blot reagents and quantified using ImageJ software. The levels of the target protein were normalised to those of GAPDH, which was used as an internal control.
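The normalisation of band intensities to the internal control can be illustrated with a short sketch. The function names are hypothetical, the densitometry values are invented, and expressing results as fold change relative to one stage is an assumption rather than a step described in the text.

```python
def normalised_level(target_intensity, control_intensity):
    """Ratio of the CTSD band intensity to the GAPDH band intensity."""
    if control_intensity <= 0:
        raise ValueError("control intensity must be positive")
    return target_intensity / control_intensity


def fold_change(levels, baseline):
    """Express each normalised level relative to a chosen baseline stage."""
    base = levels[baseline]
    return {stage: level / base for stage, level in levels.items()}


if __name__ == "__main__":
    # Invented densitometry readings (arbitrary units), illustration only.
    readings = {
        "prepuberty": (500.0, 1000.0),
        "puberty": (1200.0, 1000.0),
        "postpuberty": (900.0, 1000.0),
    }
    levels = {stage: normalised_level(ctsd, gapdh)
              for stage, (ctsd, gapdh) in readings.items()}
    print(fold_change(levels, "prepuberty"))
```

Dividing by the GAPDH signal from the same lane corrects for differences in the amount of protein loaded, so only the normalised ratios are compared between stages.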

RESULTS
PCR amplification and sequence analysis of the CTSD gene
The ovarian cDNA of Duolang sheep was used as the template for PCR amplification. Five microliters of the product was subjected to 1.5% agarose gel electrophoresis. The electrophoresis showed that the obtained target fragments were consistent with the expected results (Fig. 1). After cloning and sequencing, the CDS region of the CTSD gene was 1,239 bp, encoding 412 amino acids.

Gene homology and phylogenetic relationship of CTSD gene in Duolang sheep
DNAMAN was used to translate the nucleotide sequence of Duolang sheep CTSD into an amino acid sequence for homology alignment with the CTSD amino acid sequences of other species (Fig. 2). The homologies with Ovis aries, Capra hircus, Bos taurus, and Sus scrofa were high, at 100%, 98.5%, 97.33%, and 86.17%, respectively, and the homologies with Homo sapiens, Mus musculus, Anas platyrhynchos, and Gallus gallus were 84.06%, 78.6%, 65.29%, and 64.56%, respectively. This indicates that the CTSD gene has been relatively stable during species evolution and is in line with evolutionary law, so the data can be used for further analysis.
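A percentage of this kind can be computed from a pairwise alignment roughly as follows. This sketch uses one common definition of identity (identical residues divided by aligned columns, ignoring columns where both sequences have a gap); the exact formula used by DNAMAN is not given in the text, so this is an assumption, and the toy sequences are not real CTSD fragments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two sequences from the same alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap and residue_b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, for illustration only.
    print(percent_identity("MKTALV", "MKTALV"))
    print(round(percent_identity("MKT-LV", "MKTALI"), 2))
```

Because different tools treat gap columns differently, identities computed this way may differ slightly from the values reported by alignment software.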

Prediction of physicochemical properties and hydrophilicity/hydrophobicity
The CDS region of the Duolang sheep CTSD gene encoded 412 amino acids, with a relative molecular mass of 31159.69 and an isoelectric point of 9.72 (Table II). The protein had 41 glycines and 40 leucines, which accounted for the highest proportions of amino acids (10% and 9.7%, respectively); tryptophan was the rarest amino acid (only 1%). The number of negatively charged amino acid residues (Asp+Glu) was 35, and the number of positively charged amino acid residues [...], and the mean hydrophilicity was -0.008. A protein with a negative mean value is hydrophilic and a protein with a positive mean value is hydrophobic, so CTSD was predicted to be a hydrophilic protein. Further analysis using ProtScale showed that the isoleucine at position 24 was the most hydrophobic position of CTSD, with a score of 4.5, and that the lysine at position 129 was the most hydrophilic position, with a score of -3.9. As shown in Figure 5, the protein had more amino acids with negative scores than with positive scores. These data suggested that CTSD was a hydrophilic protein.
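The mean hydrophilicity value can be illustrated with a short calculation. The scores of 4.5 for isoleucine and -3.9 for lysine match the Kyte-Doolittle hydropathy scale, so this sketch assumes that scale; the function names and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def grand_average_hydropathy(sequence):
    """Mean hydropathy over all residues (often abbreviated GRAVY)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def classify(mean_score):
    """Apply the sign rule from the text: negative means hydrophilic."""
    return "hydrophilic" if mean_score < 0 else "hydrophobic"


if __name__ == "__main__":
    toy_peptide = "MKDELGS"  # illustration only, not a CTSD fragment
    score = grand_average_hydropathy(toy_peptide)
    print(round(score, 3), classify(score))
    print(classify(-0.008))  # mean value reported for CTSD
```

A mean of -0.008 lies very close to zero, so under the sign rule the protein is classed as hydrophilic, but only marginally.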

Prediction and subcellular localization of CTSD gene signal peptides in Duolang sheep
The online program of the SignalP 4.0 server was used to predict the presence of signal peptide cleavage sites in the CDS of the CTSD gene. The results of the neural network method involved three scores (Fig. 6): the C score (cleavage site score), the S score, and the Y score. Each amino acid has a C score and an S score. The C score is highest at the cleavage site, the S score is highest in the signal peptide region, and the Y score peaks at the most likely signal peptide cleavage site. The prediction showed that the C, S, and Y scores reached their peaks between the 22nd and 23rd amino acids, where a signal peptidase cleavage site was predicted. The subcellular localization of CTSD predicted by the PSORT online program showed that 44.4% of the CTSD protein of Duolang sheep was present in the extracellular space (including the cell wall), 22.2% was present in the vacuole, and 11.1% each was present in the cytoplasm, mitochondria, and cytoplasmic membrane.

Prediction of secondary structure and tertiary structure
The secondary and tertiary structures of the protein were predicted using the SOPMA and Phyre2 online platforms. The results showed that the CTSD protein was composed of 41.5% random coils, 31.80% extended chains, 19.17% α-helices and 7.52% β-turns. The tertiary structure was based on the template D3PSGA model (Fig. 8). A total of 365 residues (89% of the sequence) were modeled with a confidence of 100.0% using the single highest-scoring template.

Analysis of the mRNA and protein expression levels of the CTSD gene in Duolang sheep
The expression of the CTSD gene was detected in the hypothalamus, pituitary, uterus, ovary and oviduct of Duolang sheep at the three developmental periods (Fig. 9). During prepuberty, the expression of CTSD in the ovary and oviduct was higher than that in the hypothalamus, pituitary and uterus (P<0.05), and the expression was lowest in the hypothalamus (P<0.05); the difference within the same tissue between stages was not significant (P>0.05). During prepuberty, the expression levels of CTSD in the different tissues were in the following order: oviduct > ovary > pituitary > uterus > hypothalamus. From prepuberty to puberty, the expression level of CTSD in the ovary, hypothalamus, pituitary, and oviduct was upregulated, while the expression of CTSD was lowest in the uterus, and it was significantly higher in the ovary than in the hypothalamus, pituitary, and uterus (P<0.05). The order of CTSD expression levels at puberty was ovary > oviduct > pituitary > hypothalamus > uterus. From puberty to postpuberty, the expression of CTSD in the hypothalamus, pituitary, ovary, and oviduct showed a decreasing trend, with the lowest expression in the hypothalamus and oviduct; the expression in the ovary was still higher than that in other tissues, but the differences were not significant (P>0.05). The CTSD expression levels at postpuberty were in the following order: ovary > pituitary > uterus > hypothalamus > oviduct. Across the three developmental stages, CTSD was highly expressed in the ovary of Duolang sheep at puberty and postpuberty, and its expression in the ovary was significantly higher than that in other tissues at puberty (P<0.05).
To confirm the protein expression pattern of CTSD, its levels in the hypothalamus, pituitary and ovary at the three pubertal stages were detected using western blotting (Fig. 10). As shown in Figure 10, CTSD protein was highly expressed in the ovary and pituitary, and its level in the ovary increased significantly from prepuberty to puberty (P<0.01). Expression of the CTSD protein in the pituitary and hypothalamus showed some changes among the pubertal stages, but these did not reach significance (P>0.05).

DISCUSSION
In mammals, CTSD was first identified as a ubiquitous lysosomal enzyme that participates in many biological processes (Benes et al., 2008). Recent studies have focused on the genomic structure of CTSD and its role in pathology (Sheng et al., 2013). However, the function of CTSD in the onset of puberty in sheep is still unclear. The onset of puberty is closely associated with changes in the transcription and expression levels of related genes. In this study, we successfully cloned the cDNA sequence of the CTSD gene in Duolang sheep. The prediction of its physicochemical properties and hydrophilicity indicated that CTSD might be a stable hydrophilic protein.
The signal peptide and subcellular localization analysis showed that CTSD was located in the extracellular space and cell wall (44.4%), in the vacuole (22.2%), and in the cytoplasm, mitochondria, and cytoplasmic membrane (11.1% each), with a signal peptide cleavage site between positions 22 and 23, which is consistent with the analysis of the CTSD gene in Tianfu sheep by Wang (2014). CTSD is an indicator of the degree of malignancy of serous cystadenocarcinoma of the ovary, and its expression in malignant serous cystadenocarcinoma of the ovary is higher than that in benign tumors (Chai et al., 2012). CTSD plays an important role in the hydrolysis and transformation of various proteins in cellular processes (Benes et al., 2008; Minarowska et al., 2007), and also plays a role in protein degradation (Lihong et al., 2018; Sheng et al., 2013). In fish, CTSD mediates the processing of vitelloprotein in oocytes, is stable during sexual maturation (Bourin et al., 2012), and is very abundant in oocytes during the vitellogenesis phase. CTSD is a key enzyme in the process of vitellogenesis (De Stasio et al., 1999) and plays an important role in the growth and development of follicles. Therefore, it is essential for egg production (Lihong et al., 2018). During ovarian development, the highest expression level of CTSD mRNA appeared in the early stage of vitellogenesis and then decreased gradually. Therefore, it has been suggested that CTSD contributes to the overall growth and development of ovoid tissues and plays an important role in determining animal reproductive traits (De Stasio et al., 1999). Brooks et al. (1997) showed that CTSD messenger RNA (mRNA) is expressed in both ovarian and non-ovarian tissues (including liver, muscle, spleen, and testis).
In this study, qRT-PCR showed that CTSD was expressed in the reproductive tissues of Duolang sheep at prepuberty, puberty, and postpuberty, and that the expression level of CTSD in the ovary was significantly higher than that in other tissues during puberty (P<0.05). This is consistent with the finding of Zhou et al. (2020) that CTSD expression was highest in the ovary and lowest in the uterus of the Qianbei Ma goat. Feng et al. (2018) [...] cells before ovulation. Aboelenain et al. (2015) showed that the upregulation of lysosomal enzymes was positively correlated with luteal regression. CTSD is also involved in the degeneration of the corpus luteum after estrus, which leads to the start of the next estrus. The above results indicate that CTSD is closely involved in ovarian development, maturation and ovulation in Duolang sheep, suggesting that CTSD may be involved in the regulation of the onset of puberty in Duolang sheep. These results may provide a new theoretical basis for exploring the regulatory mechanism of sheep reproductive traits.

CONCLUSION
We successfully cloned the cDNA sequence of Duolang sheep CTSD. Bioinformatic analysis showed that the CDS region of CTSD is 1,239 bp, encoding 412 amino acids. In the homology alignment with eight species (Ovis aries, Capra hircus, Bos taurus, Sus scrofa, Homo sapiens, Mus musculus, Anas platyrhynchos, and Gallus gallus), the average amino acid homology with Duolang sheep CTSD was 84.13%, indicating that the gene is highly conserved and functionally stable across species, in accordance with evolutionary law. CTSD is a stable hydrophilic protein with a signal peptide site. CTSD was expressed in different tissues of Duolang sheep during the juvenile, pubertal, and postpubertal periods, with the highest expression in the ovary during puberty, where its expression was significantly higher than in other tissues (P<0.05). The expression level of CTSD in the uterus and hypothalamus was relatively low. These findings suggest that the CTSD gene may play an important role in the onset of puberty by regulating the development and ovulation of follicles in the ovary.