Identification of Novel Biomarkers MCM2 and GINS2 for Cervical Cancer

Cervical cancer (CC) is one of the most common malignant tumors in women, and its prognosis is poor. The key genes and pathways of CC need to be further explored. GEO2R was used to identify differentially expressed genes (DEGs), and GO and KEGG enrichment analyses were performed with DAVID. A PPI network was then constructed with STRING, and the hub genes and key module of the DEGs were obtained with Cytoscape. Finally, GEPIA was used to analyze the differential expression and survival associations of the key genes. A total of 234 DEGs were extracted from GSE9750. The uterus was the fourth organ highlighted in the enrichment analysis. The functional changes of the DEGs were mainly related to cell cycle progression, cell cycle, helicase activity, DNA helicase activity, exosome and the p53 signaling pathway. In addition, five hub genes and one key module were identified. Survival analysis showed that MCM2 and GINS2 were significantly correlated with overall survival. Expression analysis showed that MCM2 and GINS2 were highly expressed in cancer tissues but expressed at low levels in normal tissues, which was consistent with the results of the GEO analysis. Correlation analysis showed a significant positive correlation between MCM2 and GINS2. This study suggests that MCM2 and GINS2 may be new biomarkers for predicting the prognosis of CC.


INTRODUCTION
Cervical cancer (CC) is a common gynecological malignant tumor. It is the fourth leading cause of cancer-related death in women worldwide and ranks eighth among the most common cancers (Cosper et al., 2020). In recent decades, the incidence of CC in young women has been rising (Lin et al., 2019). There were an estimated 527,600 new cases and 265,700 deaths worldwide in 2012, and the global death toll in 2018 was 311,000. Although the morbidity and mortality of CC have decreased in recent years owing to improvements in diagnosis and treatment, the prognosis of patients with secondary metastasis or tumor recurrence remains very poor (Nambaru et al., 2009).
Although human papillomavirus (HPV) infection is a prerequisite for CC (Yuan et al., 2018), only a few women infected with the virus develop cancer (Huijsmans et al., 2016). Therefore, other risk factors should be considered as cofactors contributing to the progression of CC (Luyten et al., 2014). Abnormally regulated genes play an important role in the occurrence and development of cervical squamous cell carcinoma (Alifu et al., 2018). Many studies have compared cervical squamous cell carcinoma with normal cervical tissue using gene expression profiling, and a large number of differentially expressed genes (DEGs) have been detected (Shah et al., 2020). However, the DEGs reported in different studies vary greatly, and only some of them are consistently detected. Thus, it is urgent to find new and effective targets for anti-CC therapy.
In this study, we selected the microarray dataset GSE9750 from the Gene Expression Omnibus (GEO) database to identify DEGs. The DEGs were analyzed with the Kyoto Encyclopedia of Genes and Genomes (KEGG). A protein-protein interaction (PPI) network was constructed to clarify the relationships among the DEGs and to find hub genes. In addition, differential expression and survival analyses of the hub genes were carried out with Gene Expression Profiling Interactive Analysis (GEPIA). The purpose of this study is to better understand the characteristics of the genes and signal transduction pathways related to CC through bioinformatics analysis.

MATERIALS AND METHODS
Microarray data
The gene expression profile GSE9750, which is based on the GPL96 platform (Affymetrix Human Genome U133A Array), was downloaded from the GEO database. The dataset contains 33 tumor samples and 24 normal cervical samples.

Identification of DEGs
The online tool GEO2R was applied to identify DEGs between normal cervical tissues and CC tissues. P-values were adjusted with the Benjamini and Hochberg false discovery rate method, the GEO2R default, to reduce the false positive rate. Adjusted P≤0.05 and |log fold change (FC)|≥1.5 were set as cut-off values. A total of 234 DEGs were identified, including 55 up-regulated and 179 down-regulated genes. Finally, the top 5 genes ranked by the Degree method in cytoHubba, a plugin for Cytoscape 3.6.0, were taken as hub genes.
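The adjustment and cut-off step described above can be sketched as follows. This is a minimal illustration, not the GEO2R implementation: GEO2R computes its statistics with limma, whereas the sketch assumes that raw P-values and log fold changes are already available. The function names and the toy gene symbols are hypothetical.

```python
from typing import List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    genes: List[Tuple[str, float, float]],
    adj_p_cutoff: float = 0.05,
    logfc_cutoff: float = 1.5,
) -> Tuple[List[str], List[str]]:
    """Split (symbol, logFC, raw P) records into up- and down-regulated DEGs."""
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    up, down = [], []
    for (symbol, log_fc, _), adj_p in zip(genes, adjusted):
        if adj_p <= adj_p_cutoff and abs(log_fc) >= logfc_cutoff:
            (up if log_fc > 0 else down).append(symbol)
    return up, down


# Toy input with hypothetical gene symbols (not data from GSE9750).
toy = [
    ("GENE_A", 2.1, 0.001),
    ("GENE_B", -1.8, 0.004),
    ("GENE_C", 0.4, 0.002),
    ("GENE_D", 1.7, 0.6),
]
up_genes, down_genes = select_degs(toy)
```

In the toy input, GENE_C fails the fold-change cut-off and GENE_D fails the adjusted P cut-off, so only GENE_A and GENE_B are retained.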

GO and KEGG enrichment analysis
The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a public online bioinformatics resource that helps to identify the most significantly enriched gene functions and biological pathways. To further analyze the DEGs, GO and KEGG enrichment analyses were performed with DAVID. GO analysis was used to annotate the biological process (BP), cellular component (CC), and molecular function (MF) of the genes, and KEGG enrichment analysis was used to identify the relevant signaling pathways. P < 0.05 was considered statistically significant.
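The idea behind an over-representation test of this kind can be sketched with a plain hypergeometric tail probability. This is a simplified stand-in, not DAVID's own statistic (DAVID reports a modified Fisher's exact P-value, the EASE score). The function name and all counts in the usage line are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    n_background: int, n_term: int, n_degs: int, n_overlap: int
) -> float:
    """Probability of seeing at least n_overlap annotated genes by chance.

    n_background: number of genes in the background list
    n_term:       background genes annotated with the term or pathway
    n_degs:       number of DEGs submitted
    n_overlap:    DEGs annotated with the term or pathway
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    # Sum the hypergeometric probabilities for k = n_overlap .. upper.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 12 of 234 DEGs fall in a 120-gene term, 20,000-gene background.
p_term = hypergeom_enrichment_p(20000, 120, 234, 12)
```

A term would then be reported as enriched when this probability falls below the chosen threshold, here P < 0.05.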

PPI network and key module analysis
We constructed the protein-protein interaction (PPI) network of the DEGs with the Search Tool for the Retrieval of Interacting Genes (STRING) database, based on its confidence scores, and visualized the network in Cytoscape. The Molecular Complex Detection (MCODE) plugin in Cytoscape was used to extract the key modules in the network with degree cutoff = 2, node score cutoff = 0.2, k-core = 2, and max. depth = 100. Modules were retained if they had an MCODE score >3 and more than 4 nodes.
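The k-core = 2 setting can be illustrated with a short pruning routine. This sketch covers only the k-core idea (repeatedly dropping nodes with fewer than k remaining neighbors). It is not the MCODE algorithm, which additionally weights vertices and grows modules from seed nodes. The function name and the toy edge list are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def k_core(edges: Iterable[Tuple[str, str]], k: int = 2) -> Set[str]:
    """Return the nodes that remain after pruning all nodes of degree < k."""
    neighbors: Dict[str, Set[str]] = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    changed = True
    while changed:
        changed = False
        # Collect nodes that currently have too few neighbors, then remove them.
        for node in [n for n, nbrs in neighbors.items() if len(nbrs) < k]:
            for nbr in neighbors.pop(node):
                neighbors[nbr].discard(node)
            changed = True
    return set(neighbors)


# Toy network: a triangle (A, B, C) with a two-node tail (D, E) hanging off A.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("D", "E")]
core_nodes = k_core(toy_edges, k=2)
```

In the toy network the tail is pruned one node at a time, leaving only the triangle.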

Key genes screening and analysis
Genes with degree ≥10 in the network were identified as key genes. Gene Expression Profiling Interactive Analysis (GEPIA) is an interactive web application for gene expression analysis. We visualized the expression of the key genes in CC tissues and normal tissues with box plots in GEPIA, and also performed overall survival analysis and correlation analysis of the key genes.
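The degree-based screening can be sketched as below, assuming the PPI network is available as a list of undirected edges. The sketch counts distinct interaction partners per gene and keeps those meeting the degree threshold. It is an illustration rather than the cytoHubba code, and the function names and toy network are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct interaction partners of each node."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def hub_genes(
    edges: Iterable[Tuple[str, str]], min_degree: int = 10, top_n: int = 5
) -> List[Tuple[str, int]]:
    """Return up to top_n (gene, degree) pairs with degree >= min_degree."""
    degrees = node_degrees(edges)
    # Sort by descending degree, breaking ties alphabetically.
    ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, deg) for gene, deg in ranked if deg >= min_degree][:top_n]


# Toy network: one hypothetical hub linked to ten partners, plus one extra edge.
toy_edges = [("HUB_X", f"P{i}") for i in range(10)] + [("P0", "P1")]
hubs = hub_genes(toy_edges)
```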

RESULTS
Identification of DEGs
A total of 33 CC tissue samples and 24 normal cervical tissue samples were analyzed in this study. First, the GEO2R tool was used to identify DEGs with the following cut-off values: adjusted P≤0.05 and |logFC|≥1.5. As a result, 234 DEGs were identified, including 55 up-regulated and 179 down-regulated genes (Fig. 1).

GO enrichment and KEGG pathway enrichment
We uploaded all DEGs to DAVID to identify overrepresented GO categories and KEGG pathways. In the tissue expression analysis, the screened DEGs were enriched in tissues including foreskin, keratinocyte, skin, tongue, esophagus, bone marrow, plasma, uterus, placenta and epithelium (Table I).
In the GO and KEGG analyses, the screened DEGs were mainly related to cell cycle process, cell cycle, helicase activity, DNA helicase activity, exosome and the p53 signaling pathway (Table II).

PPI network construction and analysis
The PPI network of the 234 DEGs was constructed in STRING and visualized in Cytoscape. The MCODE plug-in identified one key module, which was composed of up-regulated genes (Fig. 2). Red nodes represent up-regulated genes. Using degree ≥ 10 as the screening criterion, 5 key genes were identified; their names are shown in Table III.

Key gene analysis
The prognostic value of all 5 hub genes for overall survival (OS) and disease-free survival (DFS) was analyzed on the GEPIA website. GINS2 and MCM2 were significantly associated with OS (log-rank P = 0.035 and 0.0092, respectively; Fig. 3A, B). For both genes, low expression was associated with better survival. The other hub genes did not reach statistical significance. GINS2 and MCM2 were therefore subjected to further analysis. The expression levels of these two genes are displayed in Figure 4A, B. Both GINS2 and MCM2 were highly expressed in CC tissues but expressed at low levels in normal cervical tissues. In addition, the Pearson correlation between the two genes is presented in Figure 4C. GINS2 expression was positively correlated with MCM2 expression (R = 0.52, P < 0.001).
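For reference, the Pearson coefficient reported above measures the linear association between two expression vectors. A minimal sketch of the calculation follows; the function name and the toy expression values are hypothetical and are not taken from GEPIA.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Toy expression values for two hypothetical genes across five samples.
gene_1 = [2.0, 3.5, 4.1, 5.0, 6.2]
gene_2 = [1.8, 2.9, 4.4, 4.6, 6.0]
r = pearson_r(gene_1, gene_2)
```

A coefficient close to 1 indicates that the two genes tend to rise and fall together across samples, while a value near 0 indicates no linear relationship.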

DISCUSSION
In this study, we investigated the potential prognostic association between CC and the DEGs in GSE9750. The results showed 234 DEGs between 24 normal cervical tissues and 33 CC tissues, including 55 up-regulated and 179 down-regulated genes. Both up-regulated and down-regulated genes were enriched in multiple organs. Notably, the uterus was the fourth organ highlighted in the enrichment analysis. Five hub genes were screened and one module was identified. GINS2 and MCM2 have potential prognostic value in patients with CC.
Although the morbidity and mortality of CC have declined in recent years owing to improvements in diagnosis and treatment, the clinical outcome of advanced disease is still very poor (Cuschieri et al., 2014). In addition, lymph node metastasis can lead to higher mortality and recurrence rates (Liu et al., 2020), even in patients with early CC (Li et al., 2016), and exact information on lymph node status is essential for tailored adjuvant therapy. However, so far there are no sensitive biomarkers that specifically indicate lymph node metastasis or support the early detection and prognosis of CC.
GINS complex subunit 2 (GINS2), also known as PSF2, encodes a protein with a molecular weight of about 21 kDa (Ye et al., 2019). GINS2 belongs to the GINS complex family, which also includes GINS1, GINS3 and GINS4. The GINS complex plays an important role in the initiation of DNA replication and in the cell cycle (Chi et al., 2020). It maintains a stable interaction with the minichromosome maintenance (MCM) 2-7 complex and Cdc45, which is needed to correctly establish and maintain DNA replication forks (Peng et al., 2016). In addition, the GINS components may play a role in cell division or, more precisely, in chromosome segregation. Ouyang et al. (2017) found that GINS2 gene knockout inhibits the proliferation, tumorigenicity, migration and invasion of CC cells. This is consistent with the results of this study: GINS2 was highly expressed in CC and expressed at low levels in normal cervical tissues, and low GINS2 expression was associated with better survival. These results suggest that GINS2 may be a new index for identifying high-risk patients and may be used as a clinical biomarker to predict the prognosis of patients with early CC.
In addition, GINS2 is reported to be associated with several other types of cancer. For example, genome-wide gene expression profile analysis shows that GINS2 is highly expressed in lung cancer. Zheng et al. (2014) suggested that GINS2 is related to the invasiveness of breast cancer and speculated that it is related to lung metastasis. Moreover, increased expression of GINS2 can promote the proliferation of leukemic cells and reduce their sensitivity to apoptosis (Gao et al., 2013). These findings suggest that GINS2 plays an important role in cancer progression.
A series of events during HPV infection can drive host cells into an unscheduled cell cycle (Hernadi et al., 2003). This leads to uncontrolled cell division, promotes cell proliferation, and ultimately leads to cancer. In HPV-related CC, cancer cells up-regulate specific genes that control several steps of DNA replication. The minichromosome maintenance (MCM) complex is essential for DNA replication and cell division. Kaur et al. (2019) showed that the expression of MCM genes was up-regulated during the carcinogenesis of CC, which supports MCM as a proliferation marker in the DNA replication pathway that reflects the increasingly uncontrolled proliferation of dysplastic and cancer cells. MCMs are candidate markers of cell proliferation, and an increase in MCM levels indicates the proliferation of malignant cells. This is consistent with the results of this study, in which the expression of MCM2 was high in CC tissue and low in normal cervical tissue.
In addition, there is some evidence that MCM can predict tumor progression. Studies have shown that MCM proteins are highly expressed in several other types of cancer (Pruitt et al., 2007), such as lung cancer (Cheung et al., 2017), breast cancer (Yousef et al., 2017; Issac et al., 2019), colon cancer (Burger, 2009), and other cancers (Deng et al., 2019). MCM proteins may also serve as potential diagnostic or prognostic markers for these cancers.

CONCLUSION
In this study, 234 DEGs and 5 hub genes were identified in CC patients. Of the hub genes, only MCM2 and GINS2 had prognostic value in patients with CC. This study suggests that MCM2 and GINS2 may be potential prognostic biomarkers of CC. In addition, MCM2 and GINS2 may play a carcinogenic role in cervical tumors. These genes may act through cell cycle process, cell cycle, helicase activity, DNA helicase activity, exosome and the p53 signaling pathway. Further research is needed to explore the functional role of these genes, especially in metastasis and cancer progression, in order to guide clinical practice. The purpose of this study is to better understand the characteristics of cervical cancer-related genes and signal transduction pathways by means of bioinformatics, and to provide further research ideas for discovering new pathogenic mechanisms, more prognostic factors and potential therapeutic targets of CC.
