t i l e Preliminary Exploration of Transcriptome Comparative Analysis of Muscle Tissues from Asymptomatic and Diseased Epinephelus fuscoguttatus Infected with Nervous Necrosis Virus

Through the study of differential gene expression in the muscle tissue from asymptomatic and diseased Epinephelus fuscoguttatus infected with nervous necrosis virus (NNV), this research initially explored the main genes and key pathways related to anti-NNV. This study used the Illumina HiSeq 4000 sequencing platform to sequence the transcriptome of E. fuscoguttatus , the transcriptome data clustered by TGICL to obtain Unigene, and bioinformatics methods were then adopted to perform functional annotation classification, metabolic pathways and SSR feature analysis. The results showed that a total of 42.40 Gb data were obtained from transcriptome sequencing, and 38,817 Unigenes were obtained after assembly and de-redundancy; Using BLAST to annotate the assembled unigenes with seven functional databases (NR, NT, GO, COG, KEGG, Swissprot and Interpro), a total of 23,417 unigenes were annotated (accounting for 61.14%); A total of 253 differentially expressed genes (DEGs) were screened, with 199 up-regulated genes and 54 down-regulated genes; GO analysis showed that DEGs was mainly enriched in cellular processes, biological regulation, cell parts, membrane parts, binding and catalytic activity; COG analysis showed that DEGs was mainly related to posttranslational modification, protein turnover, chaperones, intracellular trafficking, secretion, vesicular transport, transcription, amino acid transport and metabolism; KEGG analysis showed that DEGs was mainly related to cardiac muscle contraction, renin-angiotensin system, adrenergic signaling in cardiomyocytes, oxidative phosphorylation; The distribution types of SSR sites and their characteristic analysis results showed that the most important repeat type was mononucleotide repeats (7,840, accounting for 51.31%), followed by dinucleotide repeats (4,007, accounting for 26.22%), while the number of tetranucleotide, pentanucleotide and hexanucleotide repeat was relatively small. Through the above analysis, it could be concluded that the difference in gene expression of the transcriptome between asymptomatic group and the infected group was significant. PHGDH, PASAT1, RPL38, RPS2, ATP6V1D, ATP6V11, etc. may be the key genes that affect the expression of anti-NNV for E. fuscoguttatus , which mainly related to the inhibition of the capsid protein encoded by NNV’s RNA2. This research not only provided a theoretical basis for the breeding of anti-NVV, but also made up for the gap in the research field of anti-NVV of E. fuscoguttatus.


INTRODUCTION
E pinephelus fuseoguttatus, also known as tiger grouper, commonly known as brown-spotted grouper, is classified as perciformes, perciidae, epinephelinae, Epinephelus (Ma et al., 2020). It was a warm water island reef fish and mainly distributed in the tropical and subtropical regions of the Indian Ocean and Pacific Ocean (Yang and Tang, 2016). In Southeast Asia, it was mainly distributed in the southeast coast of China, Japan,

O n l i n e F i r s t A r t i c l e
the Philippines, etc., where the resources were scarce. In recent years, because of its delicious meat, fast growth, and large market potential, the promising breeding prospects (Fu et al., 2013), E. fuseoguttatus has become one of the important and high-quality species in mariculture in Fujian, Guangdong, Hainan and other regions in China (Yan et al., 2012). At present, the research of E. fuseoguttatus was mainly concentrated in the fields of breeding (Tian et al., 2019;Wu et al., 2021), physiology (Zhou et al., 2017;Jiang et al., 2018), biochemistry (Qiu et al., 2017;Zhang et al., 2019), nutrition (Guo et al., 2009;, etc. With the continuous expansion of the scale of aquaculture, the continuous update of the aquaculture model and the continuous increase of the aquaculture density, resulting in the outbreak of fish diseases, the research on fish disease prevention and control has been paid more and more attention. As the most common disease, the nervous necrosis virus (NNV) is a matter of great concern because of its highly contagious and high fatality rate. The nervous necrosis virus (NNV) is a member of the Nodaviridae family. Nodavirus family could be divided into Alphanodavirus and Betanodavirus, the former infects mainly insects, and the latter mainly infects fish (Kim, 2020). In the late 1980s, NNV was found and reported in the dead juveniles of Lates calcarifer from Australia (Giazebrook et al., 1990) and Oplegnathus fasciatus from Japan (Yoshikoshi and Inoue, 1990). The NNV mainly harmed the central nervous system of diseased fish, especially the brain and retina (Munday et al., 1992). The outbreak of NNV mainly occurred in the early life history stage of fish, and it has also been reported in adult fish (Chen, 2006). The host range of NNV infection extends from mariculture fish to freshwater fish, and gradually spread to marine invertebrates, most of which belong to bivalves, cephalopods, and gastropods (Bitchava et al., 2019;Gou et al., 2004). At present, the research on the pathogenic mechanism of NNV was still in an exploratory state. Chang and Chi (2015) used immunofluorescence staining, flow cytometry, and immunoprecipitation analysis to show the presence of heat shock cognate protein 70 (HSC70) on the cell surface, which interacts with the NNV capsid protein. Iwamoto et al. (2004) constructed a chimeric virus of NNV between Pseudocaranx dentex (SJNNV) and Hyponhodus septemfasciatus NNV (SGNNV). Ito et al. (2008) designed further experiments with chimeric viruses based on different nucleotide regions of RNA2 from NNV. Therefore, with the increasing frequency of NNV disease outbreaks during breeding and culturing of E. fuscoguttatus, it's urgent to find new technologies or methods to further breed disease resistant species and find out its disease resistance principle.
In recent years, with the rapid development of transcriptome sequencing technology, the determination of transcriptome cDNA sequence through high-throughput sequencing technology reveals that all genes expressed in specific cells or tissues have been widely used in many species (Xiong et al., 2018). The application of transcriptome technology in the fields of disease prevention and control analysis (Wang et al., 2020), germplasm genetic diversity analysis (Wang et al., 2008), genetic relationship identification (Zhao et al., 2018), genetic map construction (Sun et al., 2008) and molecular marker-assisted breeding (Xie et al., 2013;He et al., 2017) has also gradually become a research hotspot. The application of transcriptome technology to analyze the expression of differential genes has also been widely used in the research of aquatic animals, such as Fugu rubripes (Furukawa et al., 2004), Epinephelus quernus (Rivera et al., 2003), Salmo salar (King et al., 2005), Thunnus thunnus (Clark et al., 2004), Parargyrops edita (Yang et al., 2015), Scophthalmus maximus (Pardo et al., 2005), Macrobrachium rosenbergii , Nibea albiflora (Gong et al., 2016), etc. Therefore, using transcriptome technology to analyze the differential gene expression in asymptomatic and diseased E. fuscoguttatus infected with nervous necrosis virus (NNV) can be a new access to help solving the urgent issues.
This study mainly focused on the comparative analysis of transcriptomes about anti-NNV of E. fuscoguttatus. This study used the Illumina HiSeq 4000 sequencing platform to sequence the transcriptome of muscle tissue from asymptomatic and diseased E. fuscoguttatus infected with NNV. After using TGICL to cluster the transcriptome data to remove redundancy, we analyzed the differential genes through bioinformatics methods, such as functional annotation classification, metabolic pathways and SSR feature analysis, etc. Based on the above research, we could not only obtain the key genes related to disease resistance-related immune pathways, but also provided theoretical references for establishing a basic population of anti-NNV about E. fuscoguttatus breeding. This research further fills the gap in the research field of anti-NNV about E. fuscoguttatus. It is of great significance to promote the process of anti-NNV breeding of E. fuscoguttatus.

Ethics statement
This study was carried out in accordance with the animal care and use committee at the South China Sea Fisheries Research Institute. Efforts were taken to minimize fish suffering and included administering anesthesia. The study did not involve endangered or protected species.

Fish material
The fish materials needed for the experiment were O n l i n e with good growth conditions were cultured in the square cement pool (10m×5m×2m). All fish were injected with NNV (virus titer 10 5 TCID 50 ml -1 ) by intraperitoneal injection at a dose of 0.5 ml/fish. The nervous necrosis virus was provided by Sanya Tropical Fisheries Research Institute. After being infected with the virus, all fish were temporarily cultured for 10 days, and then 3 asymptomatic and 3 diseased E. fuscoguttatus were randomly selected. The water exchange rate was 100~200% per day and water quality indicators were as follows, water temperature was 28~31℃, salinity was 32‰~35‰, pH is 7.5~8.2, dissolved oxygen ≥7.2 mg·L -1 , ammonia nitrogen ≤0.2 mg·L -1 , nitrite ≤0.02 mg · L -1 . In order to avoid the spread of NNV virus, our culturing wastewater was discharged after disinfection and sterilization treatment.

F i r s t A r t i c l e
The collection and pretreatment of sample 100.0mg of back muscle tissue were sampled respectively from all 6 E. fuseoguttatus, quickly frozen in liquid nitrogen for 0.5h, and then stored at -80℃ for further analysis. All muscle tissue samples were taken from the same site, behind the dorsal fin side. Finally, the muscle tissue samples were kept in the low temperature by dry ice and then sent to the Shanghai Majorbio Bio-pharm Technology Co., Ltd for transcriptome sequencing and other testing items.

Transcriptome sequencing of E. fuseoguttatus
Shanghai Majorbio Bio-pharm Technology Co., Ltd was commissioned to extract total RNA from various tissues of E. fuseoguttatus, and sequence using Illumina HiSeq 4000 to obtain the raw reads. The raw reads was filtered to obtain high-quality clean reads (All clean reads were submitted to the SRA database and the accession number is PRJNA799649), which were then assembled with Trinity (http://trinityrnaseq.sourceforge. net), and then TGICL (Pertea et al., 2003) was used to cluster the transcripts to obtain unigene. MISA (http:// pgrc.ipk-gatersleben.de/misa/misa.html) was used to detect unigene's SSR site, mononucleotide repeat greater than 12 times, dinucleotide repeat more than 6 times, trinucleotide and tetra-nucleotide repeat more than 5 times, pentanucleotide and hexanucleotide more than 4 times, which was judged as SSR site.

Gene ontology (GO) terms and kyoto encyclopedia of genes and genomes (KEGG) pathways enrichment analyses
Using the DESeq2 to carry out the difference analysis to get the preliminary difference analysis result. Setting the criteria for screening differential genes as |log2 (fold change)| ≥1 and P < 0.05. Perform GO and KEGG on the calculated significantly differentially expressed genes. The GO analysis includes: Cellular component (CC), molecular function (MF), biological process (BP).

Protein-protein interaction network (PPI) construction of differentially expressed genes
Based on the differentially expressed genes, PPI was constructed using the STRING database (https://stringdb.org). The database stores the genetic information of downstream regulatory relationship, in which the association between proteins was based on previous experimental verification, literature mining, gene fusion, co-expression analysis, and calculation prediction. Genome meta-analysis and other technical methods, this analysis was mainly used to find and explore the mutual regulation of genes in the network, and then used the cytoHubba algorithm in Cytoscape software to weight the genes in the regulatory network to obtain the core position of the network Hub genes.

De novo assembly and analysis of the Epinephelus fuscoguttatus muscle tissues
A total of 6 libraries were constructed using total RNAs from muscle tissues of asymptomatic and sick E. fuscoguttatus and sequenced. The transcriptome sequencing obtained a total of 42.4 Gb clean data, and the clean data of each sample reached more than 6.68 Gb, and the percentage of Q30 bases was more than 94.12%. After assembly and de-redundancy, 38,817 Unigenes were obtained, with a total length of 46,667,869bp and an average length of 1,202bp. When the length of the transcript was accumulated to half of the total length, the length of the corresponding transcript) of 2,335bp; GC percentage of 47.31%. The length distribution of unigenes was shown in Figure 1. Among them, the minimum fragment length was 201bp and maximum was 17,051bp; the maximum fragment length was 25,404bp, and there were 1,583 unigenes with fragment lengths greater than 4,500; Busco score was above 90%.

Annotation of assembled unigenes
Unigenes were compared against the protein databases including the GO, KEGG, COG, NR, Swissprot, and Pfam database by Blastx. The results of the annotation are shown in Table I

Differential gene expression analysis of transcriptome data
The differential gene results obtained were shown in Figure 3. A total of 253 differentially expressed genes were obtained, of which 199 up-regulated genes and 54 down-regulated genes. Thus, a heat map of differential expression of genes against NNV of E. fuscoguttatus was constructed (Fig. 4).

The function analysis of GO, COG, KEGG
GO was an international standard gene function classification system. The using of GO database for comparison analysis could divide unigenes into three functional categories: Biological process, cellular components, and molecular functions. It can be seen from Figure 5. The biological process could be subdivided into 8 categories, among which the number of unigenes related to cellular processes were the largest (58), followed by biological regulation (37), and localization with the least quantity (2); Cellular components can be subdivided into 8 categories. Among them, there were more unigenes related to cell parts (56) and membrane parts (43), while the number of unigenes related to extracellular region part was the least (7); Molecular functions could be subdivided into 4 categories. Among them, the number of unigenes related to binding was the largest (82), followed by catalytic activity (57), while the transcription regulator activity was the least (14). COG was an orthologous family protein database. Function annotation, classification and protein evolution analysis could be performed through COG database comparison. There were 212 unigenes annotated by COG function in the E. fuscoguttatus transcriptome, and these unigenes were divided into 3 major paths for cellular processes and signaling, metabolism and information storage and processing, and further subdivided them into 20 categories (Fig. 6). Among them, the number of unigenes related to posttranslational modification, protein turnover, chaperones was the largest (20); followed by intracellular trafficking, secretion, and vesicular transport (17), Transcription (16), amino acid transport and metabolism (8), while related to RNA processing and modification was the least (1).  KEGG was an integrated database dealing with the connection between genomes, biological pathways, diseases, drugs and chemicals. According to different functions, this study divided the KEGG database into six major paths for metabolism, genetic information O n l i n e

F i r s t A r t i c l e
W. Fang et al.
processing, environmental information processing, cellular processes, organismal systems, human diseases. KEGG functional enrichment analysis was performed on 263 differential genes, and a total of 204 signaling pathways were significantly enriched. As shown in Figure  7, the first 20 significantly enriched pathways were listed. Significantly enriched pathways were cardiac muscle contraction, renin-angiotensin system, adrenergic signaling in cardiomyocytes, oxidative phosphorylation and so on.

Identification of SSRs
The SSR search of unigenes of E. fuscoguttatus was carried out through MISA, and a total of 15,281 SSR loci were detected in 9,362 unigenes. Among them, the main repeat type was mononucleotide repeats (7840, accounting for 51.31%), followed by dinucleotide repeats (4007, accounting for 26.22%) and trinucleotide repeats (3025, accounting for19.80%), while the number of tetranucleotide, pentanucleotide and hexanucleotide repeats were relatively small (Fig. 8). Mononucleotide and dinucleotide repeats were abundant, 2 and 4 kinds respectively. The largest number of repeats in the mononucleotide repeat was A/T (7,603), and the counterpart in the dinucleotide repeat was AT/CG (3,189).
Predecessors have accumulated rich experience in the selection of SSR primers for E. fuscoguttatus and NNV. Zhang et al. (2019) developed a polymorphic SSR marker based on the transcriptome sequence of artificial gynogenesis E. fuscoguttatus, which was applied to the genetic diversity analysis of brown-spotted grouper and its relatives and molecular marker-assisted breeding research. Chen et al. (2004) used TaqMan probe technology to design and established SSR primers for the detection of grouper nerve necrosis virus (NNV). In present study, by referring to the results of previous studies and gene sequence of similar fish on NCBI database, we screened out the 57 optimal SSR primers, among which the amplification results of dinucleotide and trinucleotide repeats were mostly, as shown in Table II. However, the stability and polymorphism of these SSR primers still need further study.

Screening of key genes against neural necrosis virus
Using the method of network modeling to construct a protein interaction network. Starting from the network level and analyzing the topological attributes of the network, it was possible to dig out the interaction relationships between important proteins from the complex biological data.
PPI of differentially expressed genes was shown in Figure 9. Through the interaction analysis of differentially expressed proteins, we looked for the hub-gene, among which PHGDH, PASAT1, RPL38, RPS2, ATP6V1D, ATP6V11, EEF2B, RPS2 and other genes were at the core.

DISCUSSION
E. fuscoguttatus was one of the main economic fish in Southeast Asia. However, in recent years, due to pollution of the culturing environment, degradation of water quality, and increasing breeding density, NNV disease occurred frequently, especially in the seedling stage. Therefore, there was an urgently need to carry out related research on grouper's resistance to NNV disease. This study used NNV to infect E. fuscoguttatus, and randomly selected individuals with clinical symptoms and asymptomatic. Transcriptome sequencing was performed by taking the muscles from the same part of the back to obtain their differentially expressed genes and carried out correlation analysis.
In this study, high-throughput sequencing technology was used to mine the genetic resources of the E. fuscoguttatus transcriptome, and a total of 42.40 Gb data was obtained. After assembly and de-redundancy, 38817 unigenes were obtained, with a total length of 46667869 bp and an average length of 1202 bp. The N50 was 2335 bp and the GC content was 47.31%, indicating that the transcriptome data was of high quality and could provide support for the development of the basic population selection and breeding of E. fuscoguttatus against NNV disease. Using BLAST to annotate seven functional databases (NR, NT, GO, COG, KEGG, Swissprot and Interpro) for the assembled unigenes, a total of 23,417 unigenes have been annotated (61.14%), which can provide reference sequences for the discovery of the anti-NNV functional genes of E. fuscoguttatus. Especially through the analysis of GO, COG and KEGG functional classification and metabolic pathways, not only can provide basic data for the cloning and functional analysis of E. fuscoguttatus anti-NNV genes, but also provide a large amount of usable genetic resources for its pathogenetics research.
Transcriptome sequencing was an effective method for the development of SSR molecular markers. At present, it has been successfully developed a large number of SSR molecular markers from the Hynobius amjiensis (Wang et al., 2017), Acipenser schrenckii (Kong et al., 2015), Apiscerana cerana (Xiong et al., 2017), Solenaia oleivor (Xu, 2014) and Siniperca chuatsi (Yuan, 2015). In this study, transcriptome sequencing of the muscle tissue from E. fuscoguttatus was performed, and a total of 15,281 SSR loci were detected in 9362 differential gene unigenes. The results of high-throughput sequencing showed that these SSR loci were diverse, including mononucleotide, dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide repeat, and the number of different repeat types was significantly different. Among them, the most repeat type was mononucleotide, followed by dinucleotide repeat. Among the dinucleotide repeat, the most repeated motif was AT/GC, which accounts for 79.58%, that is different from other species such as Siniperca chuatsi (Yuan, 2015), Megalobrama amblycephala (Zeng et al., 2013). This contradiction indicated that there were species O n l i n e

F i r s t A r t i c l e
Preliminary Exploration of Transcriptome Comparative Analysis of Muscle Tissues 9 differences in the base repeat motifs of different species. This study screened out 57 optimal SSR primers, but failed to complete its validity verification and polymorphism detection analysis due to other reasons. Therefore, the relevant analysis needed further in-depth research.
In this study, RNA-seq was used to detect the gene expression profile of muscle tissue from asymptomatic and diseased E. fuscoguttatus infected with NNV. The results showed that compared with the control group, there were 253 differentially expressed genes (DEGs) in the asymptomatic group, of which 199 were up-regulated and 54 were down-regulated. GO analysis showed that DEGs were mainly enriched in cellular processes, biological regulation, cell parts, membrane parts, binding, and catalytic activity. COG analysis showed that DEGs are mainly related to posttranslational modification, protein turnover, chaperones, intracellular trafficking, secretion, vesicular transport, transcription, amino acid transport, and metabolism. KEGG analysis showed that DEGs are mainly related to cardiac muscle contraction, renin-angiotensin system, adrenergic signaling in cardiomyocytes, oxidative phosphorylation. It can be speculated that the asymptomatic group can tolerate the invasion of the NNV virus may be related to the activation of the adrenaline signaling pathway, which further enhances immune regulation to increase the body's ability to resist the virus. Based on PPI analysis, the selected core DEGs were PHGDH, PASAT1, RPL38, RPS2, ATP6V1D, ATP6V11, EEF2B, RPS2. These genes may affect the capsid protein encoded by NNV's RNA2, thereby making E. fuscoguttatus resistant to NNV infection. In the study of the pathogenic mechanism of NNV, Zhu et al. (2021) found that RNA2 can encode the capsid protein of the virus, and the capsid protein was the only structural protein of NNV, which were closely related to the pathogenic mechanism of the virus. Therefore, it can be made clear that the asymptomatic individuals have showed great specific defense to NNV disease based on this research basic. However, the screening of key genes against NNV in this study had certain limitations, and further research was needed in the future.

CONCLUSION
In summary, this study made a preliminary study on grouper's anti-NNV at the genome level, and the differentially expressed genes were successfully obtained. Through functional analysis of GO, COG, KEGG, etc., we have made relevant explorations of its key genes and SSR molecular markers. This study laid the foundation and provided a theoretical basis for the prevention and treatment of NNV diseases. In the future, we need more in-depth research in the selection and breeding of diseaseresistant basic groups. By screening parents with key genes resistant to NNV, we could breed disease resistant offspring, and ultimately achieve the purpose of disease prevention and control, and increase yield and benefit.