Quantitative Studies in Upland Cotton (Gossypium hirsutum L.) using Multivariate Techniques

Sustainable cotton production is contingent upon the development of genotypes with better yield, tolerance to abiotic and biotic stresses and enhanced fiber quality. Forty-five elite lines of cotton were used to evaluate genetic variability for fifteen parameters, viz., plant height (cm), days to 50% flowering, monopodia plant⁻¹, sympodia plant⁻¹, boll weight (g), number of bolls plant⁻¹, ginning out turn (%), lint index (g), seed index (g), 2.5 percent span length (mm), bundle strength (g/tex), micronaire (µg/in), fibre elongation (%), uniformity ratio and yield plant⁻¹ (g). Using Mahalanobis D² analysis, the genotypes were grouped into seven clusters. Clusters I and VII were the largest, with nine and eight genotypes respectively, followed by cluster VI with seven genotypes. Hierarchical cluster analysis also grouped the genotypes into seven clusters, with 11 genotypes in cluster VI followed by cluster II with 9 genotypes. The random distribution of genotypes among clusters showed that no parallelism exists between genetic and geographical diversity. In principal component analysis (PCA), the first seven components had eigenvalues greater than 1 and accounted for 91.131% of the cumulative variance, while PC-1 alone accounted for 32.47% of the variance. Hierarchical cluster analysis and PCA provided an opportunity to identify subgroups of clusters at different stages, so that every subgroup may be analyzed critically, which will be helpful for the incorporation of desirable characters in future breeding programmes.


Introduction
Cotton (Gossypium hirsutum L.), grown for its fiber in more than eighty countries, is considered the chief cash-earning commodity and the backbone of Pakistan's economy, contributing 0.8% to GDP and 4.1% to value added in agriculture (Anonymous, 2019). Owing to the revolution in textile technology, the cotton industry needs a higher quantity and quality of raw cotton. It is therefore necessary to develop high-yielding cotton genotypes with superior fiber quality. The development of genotypes with genetically superior qualitative and quantitative traits is essential to combat biotic and abiotic stresses (Bakhtavar et al., 2015). Exploitation of genetic diversity is very useful for identifying desirable genotypes in existing germplasm for cotton improvement (Asha et al., 2013). Genetic diversity is regarded as a major prerequisite for a breeding program to overcome unexpected effects on crop plants caused by frequent changes in climatic conditions (Rathinavel, 2017; Jarwar et al., 2019). The selection of genotypes with wide genetic diversity for numerous yield and fiber quality parameters is vital for future cotton breeding strategies (Shabbir et al., 2016).
Multivariate analysis is used as a major tool to explore genetic divergence among genotypes (Jarwar et al., 2019). Normally, hierarchical cluster analysis, Mahalanobis D² statistics and principal component analysis are used to explore genetic divergence in multivariate studies. Cluster analysis is important because it allows in-depth analysis by splitting clusters into sub-clusters. PCA is a multivariate tool that extracts the most valuable information from a data array into principal components (Sharma, 2006). Because it partitions the total variation, PCA is an appropriate statistical tool for identifying suitable parents for effective breeding strategies (Akter et al., 2009; Nazir et al., 2013). Multivariate analyses of cotton genotypes have enabled scientists to categorize existing germplasm into distinct clusters based on fibre quality and yield traits (Shakeel et al., 2015). The present experiment aimed to investigate genetic divergence and association among forty-five cotton genotypes and to categorize them into different classes using multivariate techniques, so that this information can be utilized in further breeding studies for heterosis, ultimately improving yield and quality parameters through the selection of highly divergent parents.

Plant materials and location properties
Forty-five cotton strains (Table 1)

Management and design of trial
The genotypes were arranged in a randomized complete block design with 3 replicates. The plot size of each entry was 4.54 m × 3.03 m, comprising four rows with a plant-to-plant spacing of 30 cm and a row-to-row spacing of 75 cm. Delinted seed of each entry was treated with fungicide and insecticide before sowing on beds. Gap filling was carried out one week after sowing to ensure the plant population. A pre-emergence weedicide was applied before sowing, and manual weeding was practiced after germination. Thinning was done 25 days after sowing. The recommended dose of fertilizer was applied, viz., N:P:K at 80:35:30 kg/ha. Twelve irrigations were applied during the season, and plant protection measures were adopted as required.

Traits measurement
Ten representative and undamaged plants from each plot were randomly marked for identification and data collection for the following parameters: plant height (cm), days to 50% flowering, monopodia plant⁻¹, sympodia plant⁻¹, boll weight (g), number of bolls plant⁻¹, ginning out turn (%), lint index (g), seed index (g), 2.5 percent span length (mm), bundle strength (g/tex), micronaire (µg/in), fibre elongation (%), uniformity ratio and yield plant⁻¹ (g). Plant height (cm) was recorded with a measuring rod from the base to the tip of the plant. Days to 50% flowering was obtained by counting the days from the sowing date to flower appearance on 50% of the plants. Monopodia plant⁻¹ and sympodia plant⁻¹ were determined by counting the number of indirect and direct fruit-bearing branches, respectively. Boll weight (g) was calculated by picking 50 bolls from the top, middle and base of each guarded plant and dividing the total weight by the number of bolls. Number of bolls plant⁻¹ was obtained by counting the total number of bolls on the guarded plants. Yield plant⁻¹ (g) was obtained on a plot basis. The seed cotton of each entry was ginned with a single roller ginning machine, and the lint obtained from the samples was weighed to calculate GOT % with the formula given below:
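Ginning out turn is conventionally defined as the lint fraction of seed cotton by weight. The expression below states that standard definition; it is given here as an assumption about the calculation used, and the symbols are illustrative.

```latex
\mathrm{GOT}\,(\%) = \frac{\text{weight of lint (g)}}{\text{weight of seed cotton (g)}} \times 100
```

For example, a sample of 100 g of seed cotton yielding 38 g of lint would have a GOT of 38%.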

Fiber characteristics
At full maturity, the seed cotton was picked carefully and ginned after sun drying. The fiber quality traits were assessed with Uster-1000 High Volume Instrumentation (HVI) (Sasser, 1981).

Statistical analysis
Tocher's method was used for Mahalanobis D² analysis as given by Rao (1952). Agglomerative hierarchical clustering was used for cluster analysis following the method demonstrated by Anderberg (1993). Principal component analysis (PCA) was performed as given by Jackson (1991).
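As a rough illustration of the D² statistic that underlies Tocher's grouping, the sketch below computes pairwise Mahalanobis D² values between genotype mean vectors. It is not the authors' computation: the data are synthetic, the names `mahalanobis_d2`, `means` and `cov` are hypothetical, and the covariance of the synthetic means is used only as a stand-in for the pooled error covariance matrix that a real analysis would take from the analysis of variance.

```python
import numpy as np


def mahalanobis_d2(means, cov):
    """Return the matrix of pairwise D^2 values between the rows of `means`.

    D^2_ij = (x_i - x_j)' S^-1 (x_i - x_j), where S is a common covariance matrix.
    """
    inv_cov = np.linalg.inv(cov)
    # diff[i, j] is the trait-wise difference between genotype i and genotype j
    diff = means[:, None, :] - means[None, :, :]
    # Quadratic form evaluated for every pair (i, j) at once
    return np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)


# Synthetic stand-in data: 45 genotypes x 15 traits (assumption, not the study data)
rng = np.random.default_rng(0)
means = rng.normal(size=(45, 15))
cov = np.cov(means, rowvar=False)  # placeholder for the pooled error covariance

d2 = mahalanobis_d2(means, cov)
```

The resulting 45 × 45 matrix is symmetric with a zero diagonal, and it is this kind of matrix that a grouping procedure such as Tocher's method would take as input.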

Results and Discussion
Analysis of variance showed highly significant differences among the 45 cotton genotypes for the 15 quantitative parameters (Table 2). The contribution of each of the 15 traits to the total genetic divergence is presented in Table 3 and Figure 1. All forty-five cotton strains were grouped into seven clusters based on D² statistics using Tocher's method, which rests on the principle that inter-cluster D² values must be higher than the average intra-cluster D² values (Table 4). The forty-five strains were randomly distributed among the seven clusters, with the maximum number of strains in cluster I (9 strains), followed by cluster VII (8 strains) and cluster VI (7 strains). Clusters III and IV contained six strains each, cluster II contained 5 strains and cluster V contained 4 strains. The diagrammatic relationship among clusters, based on the mean intra- and inter-cluster D² values, is illustrated in Table 5. Similar findings were reported by Singh et al. (2012), Singh and Dubey (2011), Srinivasulu et al. (2010), Eswara et al. (2009) and Gopinath et al. (2009). Cluster V showed the maximum intra-cluster distance, whereas clusters VI and VII showed the minimum, which indicates that the strains in cluster V were of diverse genetic makeup and might belong to different gene pools, whereas the opposite was true for clusters VI and VII (Table 5, Figure 2). The strains of clusters IV and VI showed the maximum inter-cluster distance, which indicates divergence in the genetic makeup of the strains in those clusters. The lowest inter-cluster distance was observed between clusters I and II, demonstrating the similarity of the strains in these clusters across all parameters. The grouping pattern of the strains and the inter-cluster distances indicated that very little domestication had occurred and, ultimately, that there was no parallelism between the geographical distribution and the genetic divergence of the studied strains.
Similar observations were described by Singh et al. (2012), Mohanty and Prusti (2002), Mohanty (2001) and Lokhande et al. (1987). Considering the inter-cluster distances among the groups, the most desirable segregants may be obtained by crossing strains of clusters IV and VI after confirming their general combining ability. These outcomes are in line with the findings of Asha et al. (2013). According to hierarchical clustering (Ward's minimum variance method), the forty-five strains were assembled into 7 clusters (Table 6). Cluster VI was the largest, comprising 11 strains, followed by cluster II with 9 strains, cluster V with 7 strains, cluster I with 6 strains, clusters IV and VII with 5 strains each, and cluster III with 2 strains. The average intra- and inter-cluster squared Euclidean distances (D²) were computed following Ward's minimum variance technique and are presented in Table 7. Cluster II exhibited the maximum intra-cluster squared Euclidean distance (259.54), followed by clusters I (181.65), III (151.93), IV (149.78), VII (115.33), VI (107.84) and V (0.00), demonstrating that cluster II contained the greatest variability relative to the other clusters. The inter-cluster squared Euclidean distance (D²) ranged from 169.13 (between clusters VI and VII) to 769.72 (between clusters VI and V). Clusters V and VI are therefore the most divergent, and the strains from these two clusters may be exploited further in future breeding strategies for heterosis studies. Similar conclusions were reported by Asha et al. (2013), Lakshmi et al. (2009), Srinivasulu et al. (2010) and Altaher and Singh (2003).
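The Ward's minimum variance grouping and the intra- and inter-cluster distance table described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the data are synthetic, the traits are standardized before clustering (an assumption, since the traits are on different scales), the names `trait_means`, `mean_distance` and `table` are hypothetical, and a single-member cluster is assigned an intra-cluster distance of 0 by convention.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Synthetic stand-in data: 45 genotypes x 15 traits (assumption, not the study data)
rng = np.random.default_rng(1)
trait_means = rng.normal(size=(45, 15))

# Standardize each trait so that no single unit of measurement dominates
z = (trait_means - trait_means.mean(axis=0)) / trait_means.std(axis=0, ddof=1)

# Ward's minimum variance linkage on Euclidean distances, cut into at most 7 clusters
tree = linkage(z, method="ward")
labels = fcluster(tree, t=7, criterion="maxclust")
sizes = np.bincount(labels)[1:]  # number of genotypes in each cluster

# Squared Euclidean distance between every pair of genotypes
sq_dist = squareform(pdist(z, metric="sqeuclidean"))


def mean_distance(a, b):
    """Average squared distance between members of clusters a and b."""
    block = sq_dist[np.ix_(labels == a, labels == b)]
    if a != b:
        return block.mean()
    n = block.shape[0]
    # Within a cluster, average over distinct pairs only
    return block[np.triu_indices(n, k=1)].mean() if n > 1 else 0.0


ids = np.unique(labels)
table = np.array([[mean_distance(a, b) for b in ids] for a in ids])
```

The diagonal of `table` holds the average intra-cluster distances and the off-diagonal entries hold the average inter-cluster distances, so the most divergent pair of clusters is the one with the largest off-diagonal value.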
Principal component analysis (PCA) revealed that the eigenvalues of the first seven components were greater than 1 and that these components together accounted for 91.131% of the cumulative variance, with PC-1 alone accounting for 32.47% (Table 8). It can therefore be inferred that the most valuable information in the data set was contained in the first seven principal components. Similar observations were previously described by Kumari et al. (2019), Shah et al. (2018), Latif et al. (2015), Kaleri et al. (2015) and Saeed et al. (2014).
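The retention rule used above (keep components whose eigenvalue exceeds 1) can be illustrated with a short sketch. It assumes that PCA is run on the correlation matrix of the traits, which is the usual setting for this rule but is not stated in the text; the data are synthetic and all variable names are hypothetical.

```python
import numpy as np

# Synthetic stand-in data: 45 genotypes x 15 traits (assumption, not the study data)
rng = np.random.default_rng(2)
data = rng.normal(size=(45, 15))

# PCA on the correlation matrix: eigenvalues sum to the number of traits
corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)

# eigh returns eigenvalues in ascending order; sort them in descending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()   # proportion of variance per component
cumulative = np.cumsum(explained)     # cumulative proportion of variance
retained = int((eigvals > 1).sum())   # components with eigenvalue greater than 1

# Scores of each genotype on the retained components
standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
scores = standardized @ eigvecs[:, :retained]
```

With real trait data, `explained[0]` would correspond to the share of variance attributed to PC-1 and `cumulative[retained - 1]` to the cumulative variance of the retained components.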

Conclusions and Recommendations
PCA and hierarchical cluster analysis confirmed each other's findings. The lack of correspondence between geographic diversity and genetic divergence was confirmed by all three methods of grouping. Hierarchical cluster analysis provided an opportunity to identify subgroups of clusters at different stages, so that every subgroup may be analyzed critically, which will be helpful for the incorporation of desirable characters in future breeding programs.

Novelty Statement
Hierarchical cluster analysis and principal component analysis provided an opportunity to identify subgroups of clusters at different stages, so that every subgroup may be analyzed critically, which will be helpful for the incorporation of desirable characters in future breeding programmes.