Emergence of Novel Unique Recombinant Forms and Multiple Subtypes in Gag-Pol Region of HIV-1 in Punjab, Pakistan

1Department of Biochemistry, University of Central Punjab, Lahore 54782, Pakistan 2Ghazi National Institute of Engineering and Sciences (GNIES), Taunsa by Pass Road, Pakistan Chowk DG Khan, Pakistan 3Advanced Diagnostic Laboratory, Punjab AIDS Control Programme, Government of Punjab, Lahore 54000, Pakistan 4Department of Biochemistry and Molecular biology, University of Gujrat, Gujrat 50700, Pakistan 5School of Biological Sciences, University of the Punjab, Quaid-I-Azam Campus, Lahore 54590, Pakistan Article Information Received 24 August 2021 Revised 29 September 2021 Accepted 07 October 2021 Available online 13 December 2021 (early access)


INTRODUCTION
A cquired immunodeficiency syndrome (AIDS) is caused by the human immunodeficiency virus (HIV) which is a lentivirus and belongs to the family retroviridae. It is an RNA virus with a 9.2-kb genome that encodes O n l i n e

F i r s t A r t i c l e
ficiency virus. On the other hand, HIV-2 seems to have its roots in Sooty Mangabey (Cercocebusa tys), Gabon, and Cameroon (Reeves and Doms, 2002). HIV-1 possesses a high genetic variation and the deficiency of proof reading in the viral transcriptase provokes an increased rate of recombination and genetic mutations frequently. These mutations have given rise to emerging escape mutants with the passage of time (Abram et al., 2010). However, the increasing number of mutations in HIV-1 in the current situation has evolved to multiple genetic recombinants mainly such as circulating recombinant forms (CRFs) and unique recombinant forms (URFs) (Tongo et al., 2016). The universal Los Alamos HIV database does however, displays many CRFs circulating around the world. Asian countries have a predominancy of Subtypes A, B, C, D and G with multiple CRFs (Junqueira et al., 2016;Sanders-Buell et al., 2010;Sarker et al., 2008). In a developing country like Pakistan with a high burden of socio-economic conditions, HIV poses a significant health concern with the increasing number of cases and outbreaks. The trend has shifted from low to high rise in HIV infection in the country. Like any other developing country due to resource limited settings, AIDS is a major public health burden in Pakistan due to increase in HIV-1 positive cases in the near past. Recent reports of NACP ranked Punjab with the highest number of HIV cases of 75000, Sindh with 60,000, Balochistan as 5275 followed by Khyber Pakhtunkhwa with 16322 cases (Khanani et al., 1988). Urban development, migration from smaller cities for education, employment and medical facilities contributed to population rise mainly in the Punjab region. Thus migrational influx however could be one of the possible reasons of the increased number of HIV-1 cases. In low income nations like Pakistan, monitoring of HIV drug resistance and population-based surveillances are highly the need of the hour, to ensure the efficacy of continuing the treatment regimen for HIV-1 (Bennett et al., 2006;Jordan et al., 2008;Gilks et al., 2006). Increased urbanization in Punjab contributes as a major facilitating factor for exploration of HIV-1 subtypes, drug resistance mutations and genetic variations. Pakistan, however, demonstrates paucity in molecular and computational HIV related studies to date. This study focusses on investigating the drug-resistance mechanisms of the circulating HIV-1 subtypes in Punjab and highlighting the phylogenetic relationship among the isolates. The recombinant strains and URFs discussed in this study are however reported for the first time in both Gag and Pol genes in HIV-1 patients in Punjab, Pakistan. The data analyzed in the current study, unveils for the first time the major targeted areas of outbreak in Punjab for both Gag and Pol genes molecularly and computationally. Hence, such studies on this deadly virus will provoke necessary health implementations for treatment and control of HIV-1 in Pakistan in the near future.

MATERIALS AND METHODS
HIV-1 positive samples were directed from different treatment centers, to Punjab AIDS Control Programme (PACP) Lahore between October 2018 to March 2020. Among them only 122 HIV-1 positive samples were selected with increased viral load for the current study. Only 30 HIV-1 (0.1%) positive isolates fulfilled the quality processing criteria for genotype analysis. Informed consent was taken from all the selected subjects. RNA was extracted from 200 μl of plasma by QIAamp Viral RNA Mini kit (Qiagen, Valencia, CA, United States) according to the manufacturer's instructions. cDNA was prepared by the Revert Aid First Strand method. Both Gag (HXB2:781-1861) and Pol fragments were amplified (HXB2: 2147-3462) in a nested polymerase chain reaction (Chen et al., 2014). The successfully amplified samples for HIV-1 Gag and Pol were used for genotyping and phylogenetic analysis with their mentioned sociodemographics (Table I). Amplified samples were sequenced by First Base Laboratories Sdn Bhd, Seri Kembangan, and Selangor, Malaysia on an ABI 3730xl automated DNA analyzer (Applied Biosystems, Foster City, CA). Edited sequences were evaluated for drug resistance mutations using the surveillance drug resistance mutations (SDRM) software and the Stanford University Algorithm (http://hivdb.stanford.edu). REGA HIV subtyping tool was used for subtyping of Gag and Pol (http://www.bioafrica. net/subtypetool/html/). Intersubtype mosaicism in the potential sequences was investigated by the Recombination Identification Program (RIP version 3.5.1). The neighborjoining method using MEGA (Molecular Evolutionary Genetics Analysis, version 5.1) was used for phylogenetic analyses (Tamura et al., 2007). For non synonymoussynonymous substitution (dn-ds) analysis of sequences SNAP programme was used. DNA sequences were aligned using CLUSTAL W (Huang et al., 2012;Wilkinson et al., 2014). The HIV-1 isolates of pol gene, were analyzed for RT and PR regions by Calibrated Population Resistance (CPR) Tool (Version 6.0) (Chen et al., 2012). HIV-1 sequences were then submitted to GenBank under the following accession numbers: MT462197-MT462205, MT436255-MT436257, MT489302-MT489309 and MT561289-MT561297. The Gag and Pol sequences were edited using Bio-Edit 7.0 software.

Demographic characteristics of study isolates
The isolates selected in the current study were located O n l i n e

O n l i n e F i r s t A r t i c l e
present for the very first time molecular and computational analysis of Gag-Pol from Punjab region including three regions of outbreak Kot Imrana Sargodha, Faisalabad and Gujrat. The major transmission routes among the detected subjects were heterosexual transmission accounting of 16.6% as Immigrant workers, 10% as FSW, 6.66% as infection through spouse and 6.66% as transgenders. Intravenous drug transmission ranked the second highest route with 26.6% while the other transmission routes i.e. 10% as homosexual transmission and unknown routes accounted for 6.66% of infections (Table I).
HIV genotyping analysis Prediction of genetic variability reveals all the possible viral evolutionary information. Therefore, HIV-1 Gag and Pol sequences were analyzed by REGA HIV subtyping tool for subtype determination (Jeannot et al., 2009). HIV subtype A1 was found to be the most predominant subtype found in Pakistani population with 30% ratio among the isolates. CRF02_AG accounted for 10%, Subtype C a potential recombinant had 3.33% and the Unique Recombinant forms (URFs) accounted for 13.3% in the HIV Gag sequences (Table I). Boot scanning analysis of the Gag sequences however revealed four unique recombinant forms as shown in Table I. All the isolate sequences were subjected with bootscan cluster support 1.0 and 0.87 with a confidence level of 99.9%. The prevalence of URFs 13.3 % and CRF02_AG 10% was synchronous with each other which strongly strikes towards the notion that there might have been frequent recombination among the subtype A1 main strains itself ( Table I) Figure 1. Among the pol sequences, bootscanning analysis revealed that maximum number of subtype determination was accounted for unique recombinant forms (URFs) among the 13 isolates (Table  II, Fig. 3). Intersubtype recombination was demonstrated strongly among the pol genotyped isolates. All Pol URFs demonstrated 40% of ratio among the isolates, and subtype D with 6.66% (Table II). The subtype genetic variability and diversity varied with different routes of transmission in infection (Table II). HIV infection caused due to quack treatment also demonstrated subtype A1 prevalence as it is caused by common syringe usage. Some IDUs and 2 isolates MBBL06 and MBBL07 were quack infected and demonstrated genotype A1 (Table I). However, two isolates MBBL05 and MBBL27 did show the genotype CRF02_AG which does suggest the notion that there might be a strong heterosexual transmission route from expats returning from abroad depicting this genotype ( Fig. 2A-C). This kind of genotypic pattern shows an association with multiple transmission routes (Tables I  and II). The proportion of URFs among the pol genotyped isolates was significantly higher than Gag. This increased prevalence strikes towards an increased vulnerable hypervirulency in the HIV virus (Zahra et al., 2021).

Recombination and substitution analysis
Genetic diversity initiates intersubtype mosaicism among the candidate genes. There was a high rate of recombination among the isolates identified with Gag gene in our study (Supplementary Fig. 1). Window size= 200, Confidence threshold = 0.9, Gap option= 3, and Multistate characters = yes, were used for the recombination analysis of the isolates. The RIP anlaysis of these four isolates MBBL01, MBBL05, MBBL14 and MBBL10 demonstrated a similarity score of 0.87 ( Supplementary Fig. 1A), 0.91 ( Supplementary Fig. 1B), 0.92 ( Supplementary Fig. 1C), and 0.89 ( Supplementary  Fig. 1D). This increased level of mosaicism however denotes to a high rate of intersubtype mosaicism occurring among the region. However previous studies in Pakistan did demonstrate this fact about co-circulating viruses due to increased genetic diversity (Abidi et al., 2014). To measure the rate of natural selection, 30 isolate sequences O n l i n e

F i r s t A r t i c l e
Genetic Diversity and Emergence of HIV-1 Unique Circulating Recombinant forms in Pakistan were subjected to SNAP programme for computing dn/ds ratio. The SNAP results however revealed higher values for Subtype A1 and Subtype C denoting a positive selection pressure and sequence conservation within the Pakistani population. Cumulative behavior by SNAP analysis was shown in Supplementary Figure S2 and Supplementary  Table SI.

Genotypic analysis of drug resistance
Anti HIV drugs such as nucleoside reverse transcriptase inhibitors, protease inhibitors and nonnucleoside analog reverse transcriptase inhibitors are generally the recommended combination of drugs used by HIV-1 infected individuals in Punjab, Pakistan as provided at PACP treatment centers. Substitution mutations with a high degree of drug resistance poses a challenge on the treatment and efficacy of HIV-1. Hence, to combat the hinderances evaluation of the RT and PR region in Pol gene would pave pathways in exploring the associated drug resistance in circulating HIV-1 subtypes. Our analysis demonstrated an accessory resistance mutation V32I only in one isolate MBBL17 originating from Lahore. These findings however exhibit a synchronization with previous studies done in Sindh-Pakistan where a single primary mutation (L90M) was detected in a patient with PI usage history (Shah et al., 2011). The drug resistance analysis in our study neither showed major nor minor mutations significantly suggesting the fact that HIV-1 strains are sensitive to NNRTIS and NRTIS being used in Pakistan. Our investigations also shed light on the notion that the circulating viruses reflect sensitivity especially towards protease inhibitors like lopinavir, darunavir and atazanavir. Some non-significant substitutions were observed in the whole length of the RT region. The substitutions were: V35T, D121Y, V60I, K49R, I135T, K122E, I142L, K166R, Q207G, R211K, A272P, K275R, K277R, Q278H,

O n l i n e F i r s t A r t i c l e
A. Zahra et al. The RT region exhibited minor mutations leaving an insignificant impact on drug resistance to date (Rhee et al., 2003;Lai et al., 2014;Gatanaga et al., 2010 Similarly, absence of substantial changes in the RT region denoted a conferred resistance to reverse transcriptase inhibitors (RTIs). Only one isolate had a mutation V32I which conferred a low level resistance to NNRTIs (Table  II). Medium and low resistance levels to six PIs were detected against atazanavir (ATV), indinavir (IDV), lopinavir (LPV), nelfinavir (NFV), saquinavir (SQV) and tipranavir (TPV). High-level and moderate resistances were also revealed among six NRTIs such as lamivudine (3TC), zidovudine (AZT), abacavir (ABC), stavudine (D4T), didanosine (DDI) and entricitabine (FTC)) and four NNRTIs [efavirenz (EFV), etravirine (ETR), nevirapine (NVP) and rilpivirine (RPV)].

Phylogenetic analysis
Phylogenetic analysis demonstrated that 17 Gag sequences from this study clustered into three different clades. Among the isolates, 10 clustered with subtype A originating from Pakistan, Kenya, Uganda and Rawanda. Of these, 3 clustered with sequences corresponding subsubtype 02_AG, having its descent from Kenya, Ghana and South Africa. Three isolates demonstrated URFs and 1 sequence clustered within a clade representing subtype A and G clustered within a clade representing sub-type G from Kenya, Cameroon and Ghana (Fig. 4A, B), showing an evolutionary origin from African countries. The isolate's travel histories were insufficient to have reasonably connected probable ancestral links. As a result, clustering of subtypes 02 AG, A, and G in African countries, therefore brought to evidence an evolutionary origin of HIV-1 from Africa (Huet et al., 1990). However, 13 Pol sequences were identified in our study, among them the highest ratio was of URFs. Forty percent of the sequences clustered with unique recombinant forms and only one isolate MBBL26 clustered with subtype D. This nature of clustering isolates with a wide range of reported databases therefore, indicates the genetic diversity, heterogeneity and potential evolutionary dynamics of HIV-1. These analyses give the notion of presence of many subtypes and intersubtype recombinant forms at the population level (Song et al., 2018).

DISCUSSION
Investigations based on HIV-1 Gag and Pol analysis among the sequences targeting the circulating genotypes in the Punjab Province was the main aim of the current study. The sociodemographic analysis of this study however, revealed that Lahore due to its urbanization and economic development does have the highest number of HIV-1 patients in the Punjab including three major regions of outbreak i.e., Kot Imrana (Sargodha), Gujrat (Jalalpur Jattan) and Faisalabad. In the regions of outbreak common syringe usage by the local quacks, and rural medical facilities was the main culprit of the outbreak (Zaid and Afzal 2018;Wahid, 2019). These findings do support the fact that indirect facilitation and fast population mobility could possibly be the major routes in spreading of HIV-1 and transmitting the viruses to the community as well. In this study, HIV-1 genotypes prevailing predominantly in the heterosexually infected population in Punjab were investigated based on the analysis of HIV-1 Gag and Pol sequences. Lahore being the capital of Punjab province could possibly lead to increased genetic diversity of HIV-1 due to its growing population. Lack of proper biosafety measures in hospitals, frequent migration in the suburb from nearby regions for education and health has contributed to the rise in infection (Rana and Bhatti, 2018). To add to the fact, previously reported researches however have brought to light the phenomenon of HIV-1 infection among people migrating frequently resulting in emergence of novel subtypes, inter-subtype recombinants, and CRFs (Zhong et al., 2007;Kalichman et al., 2017;Lebedev et al., 2019). The subtype, SNAP and RIP analysis of this study however reveals major molecular revelations in HIV-1 sequences from Punjab, Pakistan. Nonetheless, phylogenetic and genotypic analyses of drug resistance in HIV-1 sequences brings to light the major significant and non-significant mutations especially in the Pol region. Such novel molecular analysis from Punjab, Pakistan, reported in our study could possibly pave pathways in understanding of the underlying mechanisms for HIV-1 in resource limited countries like Pakistan. HIV-1 subtype O n l i n e

F i r s t A r t i c l e
A. Zahra et al. distribution appears to be quite diverse in other Asian countries too. In Iran previous studies have reported HIV-1 subtypes A and B mainly in the IDUs (Sarrami-Forooshani et al., 2006). Similarly, studies reported from countries like Uzbekistan, Kazakhstan and Yemen also demonstrated a prevalence of subtypes A, B and CRF02_AG which share synchrony with the present data indicating the commonly circulating prevalent subtypes in Asia (Kurbanov et al., 2003;Carr et al., 2005;Bobkov et al., 2004;. HIV molecular epidemiological research in Pakistan is still in the infant stages, and yet well affirmed HIV prevention policies need to be designed. Previously reported data in Pakistan in HIV-1 genotypes evidently demonstrates genetic diversity with existing co-circulating genotypes such as: CRF_ AG, A, B and CRF_AE which are synchronous with our findings also (Mujeeb and Altaf, 2003;Altaf and Mujeeb, 2002;Hyder and Khan, 1998). However, our data demonstrates that apart from Subtype A1, the URFs were becoming more prevalent than the previously reported subtypes A, G and CRF_02 AG (Khan et al., 2006;Yaqub et al., 2019). This changing trend of introduction of novel and new recombinant subtypes is because of returning migrant workers and expatriates in Pakistan. This unchecked migrational influx of expatriates, ultimately provokes increased incidence of HIV-1 infection. It has also contributed in altering the pattern of different HIV-1 subtypes and increased drug resistances turning it to a more challengeable situation for the concerned health authorities. Absence of major resistance-associated substitutions in our findings likewise agree with another study conducted in Sindh, Pakistan, where no major resistances were recognized (Baqi et al., 1998;Khanani et al., 2011;Khan et al., 2018). Our study does come up with the notion that circulating viruses are highly sensitive to all protease inhibitors, implicating lopinavir, atazanavir and darunavir. The 30 isolates in this investigation had no significant major or minor mutations in the RT area that conferred resistance to reverse transcriptase inhibitors (RTIs). This strongly denotes the fact, that the prevalent HIV-1 strains in Pakistan are sensitive to the NRTIs and NNRTIs. PR and RT regions display down regulation in prevalence of drug-resistance associated mutations in the infected individuals. Combination drug regimens of PIs in combination with RTIs like lopinavir, ritonavir, nevirapine, lamivudine, tenofovir and zidovudine, could be continued to treat patients. Furthermore, continuation of these studies pave ways for essential modifications in drug regimens which could further be applicable for effective therapeutics and subsequent control in the future. Hence, the current study unveils novel information on circulating subtypes and drug resistance mutations in Gag-Pol genes for HIV-1 patients for the first time in Pakistan's Punjab province. The phylogenetic and computational revelations made by this study will be extremely beneficial for clinicians, physicians and researchers in determining the best possible therapeutic strategies for controlling the transmission of this virus. It also uncovers primitive information about the evolutionary links and drug resistance patterns of the recent existing HIV strains in a developing country like Pakistan. Future studies are therefore highly the need of the hour to control the spread of this virus and monitor drug resistance mechanisms for implementation of appropriate therapeutic options.

ACKNOWLEDGMENTS
We are highly grateful to Dr. Asim Altaf and Dr. Munir Ahmad Malik, Punjab AIDS Control Programme for providing assistance and facilities at PACP. We are also very grateful to the laboratory staff at PACP for providing the opportunity to collect the samples and experimental work as recommended by WHO. This study was undertaken in collaboration and support by the Punjab AIDS Control Programme (PACP) and Institute of Public Health (IPH) Lahore, Pakistan.

Funding
The research study was funded by the University of Central Punjab Pakistan and Punjab AIDS Control Programme.

Supplementary material
There is supplementary material associated with this article. Access the material online at: https://dx.doi. org/10.17582/journal.pjz/20210824080807

Statement of conflict of interest
The authors have declared no conflict of interest.