Assessment of Advanced Artificial Intelligence Techniques for Streamflow Forecasting in Jhelum River Basin
Research Article
Muhammad Waqas1*, Muhammad Shoaib2, Muhammad Saifullah1, Adila Naseem4, Sarfraz Hashim1, Farrukh Ehsan1, Irfan Ali3 and Alamgir Khan1
1Department of Agricultural Engineering, MNS-University of Agriculture Multan, Pakistan; 2Department of Agricultural Engineering, Bahauddin Zakariya University, Multan, Pakistan; 3Natural Resources Division, Pakistan Agricultural Research Council (PARC), Islamabad; 4Institute of Food Science and Nutrition, Bahauddin Zakariya University, Multan, Pakistan.
Abstract | Streamflow is a crucial hydrological variable to forecast. In the current study, the artificial intelligence (AI) based techniques Tree Boost (TB), Decision Tree Forest (DTF), and Single Decision Tree (SDT), together with conventional Multilayer Perceptron Neural Networks (MLPNN), are used to predict the streamflow of the Jhelum River basin. The dataset was divided into two sections: a training dataset (1971-2000) and a testing dataset (2001-12). Trend analysis was performed with Sen’s slope and the Mann-Kendall (MK) test. Decreasing annual and seasonal trends were found with both the MK and Sen’s slope tests. The highest decreasing trend of -2.23 was observed in autumn at Naran station, while the lowest annual change of -0.09 was observed at Garhi Habibullah station at the 95% significance level. The flow duration curves (FDCs) of all basin stations showed that DTF performed better and was more effective than the other AI techniques. Performance was evaluated with R2, RMSE, and NSE. DTF was the most efficient AI technique, with average R2, NSE, and RMSE of 0.998, 0.992, and 382 m3/sec, respectively. The assessment revealed that DTF has potential and may be considered as an alternative method for streamflow forecasting.
Received | February 04, 2020; Accepted | June 10, 2021; Published | June 27, 2021
*Correspondence | Muhammad Waqas, Department of Agricultural Engineering, MNS-University of Agriculture Multan, Pakistan; Email: [email protected]
Citation | Waqas, M., M. Shoaib, M. Saifullah, A. Naseem, F. Ehsan, S. Hashim and I. Ali. 2021. Assessment of advanced artificial intelligence techniques for streamflow forecasting in Jhelum River Basin. Pakistan Journal of Agricultural Research, 33(x): 580-598.
DOI | https://dx.doi.org/10.17582/journal.pjar/2021/34.3.580.598
Keywords | Data-driven models, Forecasting, Hydrological cycle, Modelling, Indus basin
Introduction
The use of freshwater in the agricultural, industrial and domestic sectors is increasing daily due to the growing world population and intensive farming practices (Maja et al., 2021). Surface water is the most commonly available freshwater resource, and agriculture is its primary user. The efficient use of open surface water is becoming a real challenge for researchers (Huang et al., 2021; Lehner et al., 2021). Globally, the planning and management of freshwater resources play a crucial part in the sustainability of the economy and agriculture at national and regional scales.
Rapid urbanization and irregular precipitation depths and frequencies affect water management practices and storage capacity (Francipane et al., 2021; Jha et al., 2007; Singh et al., 2013; Sofia and Tarolli, 2017). These problems motivate the search for up-to-date short-term procedures that improve streamflow forecasting (Makkeasorn et al., 2008). Hydrological systems, and streamflow in particular, are very complicated to forecast because of the non-linear behavior between inputs and outputs. In past decades, investigators have used numerous forecasting methods to address the issue of missing runoff data (Tayyab et al., 2019).
Typically, researchers divide these techniques into data-driven and process-driven approaches (Barge and Sharif, 2016). Data-driven techniques are used for short-term streamflow forecasting (STF). Two types of techniques are used for STF: intelligent algorithms, and simple mathematical equations that rely on the statistical features of the series and neglect the physical processes involved (Box and Jenkins, 1976). Process-driven methods are implemented for long-term forecasting (LTF), such as rainfall-runoff forecasting, by applying the physical characteristics of the basin involved in the process. LTF includes weekly, monthly, and yearly forecasts, while STF consists of hourly and daily forecasts. LTF is mainly used in sediment transport, water management, hydropower, and reservoir strategy and planning, while STF is essential in flood mitigation (Sudheer et al., 2002).
For LTF and STF, the following techniques are used: MLR (multiple linear regression); LR (linear regression); ARIMA (autoregressive integrated moving average, with exogenous input); ARMA (autoregressive moving average); and AR (autoregressive) (Box, 1970; Salas, 1980; Valipour, 2015; Valipour et al., 2013; Valipour and Montazar, 2012; Wu et al., 2009). These techniques have been used since 1970 to predict streamflow. However, these models are unable to capture the non-stationary and non-linear relationships in hydrological processes (Meng et al., 2019).
At the end of the twentieth century, artificial neural networks (ANNs) gained recognition in hydrological modeling and became a benchmark (Abdellatif et al., 2015; Dariane et al., 2018). During the last two decades, AI-based techniques have received significant attention in streamflow forecasting from researchers and hydrologists (Jothiprakash and Magar, 2012; Kentel, 2009; Terzi and Ergin, 2014; Valipour and Montazar, 2012; Yaseen et al., 2016). Several AI-based approaches have been successfully applied to predict the streamflow of rivers. These approaches comprise ANNs, SVM, SOM, ACO, PSO, GA and GEP (Babovic et al., 2000; Mehr, 2018; Robert et al., 2020).
ANNs have emerged as significant black-box approaches (Coulibaly et al., 2000). Tiwari and Chatterjee (2011) used a hybrid wavelet bootstrapped (HWB) approach, which is an extension of ANN, to forecast daily discharge. Chokmani et al. (2008) reported that the ANN model developed using HWB techniques was superior to others in predicting streamflow. They compared ANN and regression models for estimating river flow affected by ice conditions and found that ANN techniques produce better results in winter streamflow estimation.
Many studies have revealed that ANNs have limitations and drawbacks in predicting streamflow. These include stopping criteria, overfitting, low learning speed, backpropagation issues, and the need for human intervention in settings such as the learning rate and the number of learning epochs (Yaseen et al., 2015). Thus, there is a need to develop approaches that overcome these problems and generate better results than ANNs.
In this study, three AI-based techniques are used: Single Decision Tree (SDT) (Quinlan, 1986), a logical technique that uses sets of predictor variables to predict the target value; Tree Boost (TB) (Freund and Schapire, 1996), which generates a series of trees in which the output of one tree is fed into the next tree in the series; and Decision Tree Forest (DTF) (Breiman, 1996), an ensemble of decision trees whose predictions are combined to form the overall prediction.
Different algorithms, namely CART, ID3, association rules, and out-of-bag, are used in these techniques. Bagging and boosting are also investigated to construct strong predictors for streamflow forecasting. The bagging method (Breiman, 1996) was proposed to improve the predictive ability of weak predictors. Boosting is also a well-known ensemble technique with a similar viewpoint, which generates a linear combination of models (Hancock et al., 2005).
These AI-based techniques have not been widely implemented in hydrological investigation, particularly in streamflow forecasting. A few researchers have recently applied these techniques in hydrological analysis. Vezza et al. (2010) determined that CART performed better than three other classification methods in terms of variance. Erdal and Karakurt (2013) found that both stochastic GBRTs and BRTs achieved better streamflow prediction performance than the CART and SVR models, and that the predictive accuracy of CART for streamflow is remarkably improved within ensemble learning paradigms. In forecasting hydrological components, the application of CART is limited. Four classification approaches, namely SIs (seasonality indices), RPA (residual pattern approach), WCA (weighted cluster analysis), and CART, were used with morpho-climatic basin characteristics (Vezza et al., 2010).
As mentioned above, many AI-based techniques were employed for streamflow forecasting, but still, some techniques have not yet been evaluated, such as Decision Trees. These trees are an AI-based approach to extract some valuable data from the complete dataset (Deo et al., 2017; Etemad-Shahidi and Mahjoobi, 2009; Keshtegar and Kisi, 2017; McGarry et al., 1999; Preis, 2008; Quinlan, 1987; Xu et al., 2005).
Sen’s slope (Şen, 2011) has been implemented in many studies (Ali et al., 2019; Mallick et al., 2021; Wang et al., 2020) to detect the magnitude of change, and the Mann-Kendall (Mann, 1945) trend test was used to confirm the significance of trends. Finally, some statistical assessment parameters are used in the current research, namely RMSE (root mean square error), R2 (coefficient of determination) (Levinson and Physics, 1946), and Nash-Sutcliffe efficiency (NSE) (Nash and Sutcliffe, 1970), to compare the results of the developed techniques.
Objectives of this study are:
- Evaluation of advanced artificial intelligence models for the Upper Jhelum basin.
- Comparison with conventional methods.
- To examine the potential of AI-based methods for streamflow forecasting.
Materials and Methods
As mentioned earlier, the aim of this investigation is to assess and compare AI-based streamflow modeling results. Daily data from the streamflow stations situated in the basin were collected for 1971-2012. The streamflow dataset was employed to train and test the applied techniques. Q(i-30) was set as the target variable during the training and testing of models, whereas Q(i-1), Q(i-2), Q(i-3) …… Q(i-29), the daily flows lagged by 1 day, 2 days, 3 days, and so on, were set as the predictors, as shown in Table 3. Sheikh (2001) reported that the southern parts of Pakistan, especially southern Punjab, Balochistan, and Sindh, lie south of 30° N. Pakistan comprises semi-arid, arid, and hyper-arid areas. It has a varied climate because of its long latitudinal extent from 24-37°N. There are therefore four seasons, including two major rainy seasons: summer (June-September of each year) and winter (December to March of each year). Based on the rainy seasons in Pakistan, the dataset of the whole basin was divided into four seasons: autumn, summer, spring, and winter. The results of this study are evaluated with the performance evaluation criteria shown in Figure 2. The procedure can be elaborated in steps that include trend analysis, statistical analysis, and flow duration curves. Sen’s slope estimator and the Mann-Kendall trend test were used for trend analysis. The root mean square error (RMSE), coefficient of determination (R2), and Nash-Sutcliffe efficiency (NSE) were selected as evaluation criteria to assess the forecasting competence of all applied AI techniques. Flow duration curves were used to compare observed and predicted flows (Da Silva et al., 2015; Saifullah et al., 2016; Shoaib et al., 2018; Tayyab et al., 2018).
Study area
The current investigation concerns the prediction of streamflow in the Jhelum River basin, Western Himalayas, which is located between longitude 73° 07ʹ and 75° 40ʹ E and latitude 33° 00ʹ and 35° 12ʹ N in the Himalayan mountain range, with a total area of 33,425 km2 up to the Mangla reservoir situated in Azad Kashmir, Pakistan (Mahmood et al., 2015). The major tributaries of the Jhelum River basin are the Neelum, Kunhar, and Jhelum rivers. Two more rivers, the Poonch and Kanshi, join at Mangla Dam.
Eight streamflow gauge stations of the Jhelum River basin were used in this study. Two hydrological regimes contribute to the flow of the Jhelum River basin: a rainfall regime and a nival regime, which depend on concurrent rainfall and the melting of winter snow, respectively. The monsoon season affects the lower portion of the Jhelum River basin. The Kohala and Azad Pattan stations have the highest average streamflows of 757.31 m3/sec and 793.49 m3/sec. Both stations lie in the rainfall regime (Azmat et al., 2016). During the summer, mountain snow melts and snow cover falls to about 20%, while the maximum reported cover of about 45% occurs in January. The Kanshi sub-catchment has zero snow cover, whereas the Poonch sub-catchment, despite its low elevation, is snow-covered. Furthermore, the Neelum and Kunhar sub-catchments have high elevations with the largest snow-covered areas. Significant characteristics of the Jhelum River basin and its sub-basins are shown in Table 1.
Table 1: Characteristics of Jhelum River basin and its sub-catchments.
Basin/Sub-Basin | Mean Annual Rainfall (mm) | Discharge (m3/sec) | Elevation (m) | Area (km2)
Kanshi | 898 | 6 | 310-867 | 1298
Poonch | 973 | 128 | 329-4698 | 4270
Neelum | 1509 | 325 | 671-6285 | 7414
Kunhar | 1696 | 103 | 634-5106 | 2631
Mangla | 1046 | 967 | 300-6528 | 33435
Data acquisition
The Water and Power Development Authority and Surface Water Hydrology provided daily streamflow datasets from 1971 to 2012. Eight streamflow gauge stations of the Jhelum River basin were used in this study. The 42-year datasets of all stations were distributed by season and then divided into training and testing sets. In the current investigation, data from 1971 to 2000 were used for training, and data from 2001-12 were used to run and test the model.
Table 2: Summary of basic statistics of streamflow of individual stations in the Jhelum River basin and zoning of the basin.
Zones/Sub-Basins | Station Name | Lat. (dd) | Long. (dd) | El (m) | Std. | Median | Mean | Mode | Cv | Cs
Zone I <500m | Kotli | 33.2 | 73.4 | 400 | 185.02 | 85 | 128.16 | 20 | 144.37 | 0.7
Zone I <500m | Azad Pattan | 33.7 | 73.6 | 485 | 668.6 | 552.07 | 793.49 | 0 | 84.26 | 1.08
Zone II <1000m | Kohala | 34.1 | 73.5 | 560 | 637.78 | 515.6 | 757.31 | 0 | 84.22 | 1.14
Zone II <1000m | Muzaffarabad | 34.4 | 73.5 | 670 | 318.12 | 178.48 | 324.18 | 103 | 98.13 | 1.37
Zone II <1000m | Domel | 34.4 | 73.5 | 701 | 271.51 | 169.8 | 268.32 | 0 | 101.19 | 1.09
Zone II <1000m | Garhi Habibullah | 34.4 | 73.4 | 820 | 96.55 | 55.99 | 99.41 | 0 | 97.13 | 1.35
Zone III >1000 | Chinari | 34.2 | 73.8 | 1070 | 244.24 | 206.52 | 294.77 | 0 | 82.86 | 1.08
Zone III >1000 | Naran | 34.9 | 73.7 | 2400 | 52.39 | 20.76 | 45.88 | 9.5 | 114.2 | 1.44
The highest mean annual rainfall of 1696 mm was found at the Kunhar sub-basin, with a discharge of 103 m3/sec. The highest discharge of 967 m3/sec was observed at Mangla because of the contributions of the other sub-basins. It has the largest area of 33435 km2 and the lowest minimum elevation of all the sub-basins. The Neelum sub-basin has a discharge of 325 m3/sec. The Kanshi sub-basin makes the lowest discharge contribution to the catchment, 6 m3/sec, due to its elevation and area, as elaborated in Table 1.
Table 3: Input for AI techniques.
Input for AI techniques | Q(i-1), Q(i-2), Q(i-3) ………. Q(i-29), Q(i-30)
Target Variable | Q(i-30)
Predictors | Q(i-1), Q(i-2), Q(i-3) …… Q(i-29)
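The input layout of Table 3 can be illustrated with a minimal sketch. It takes the table at face value (Q(i-1) to Q(i-29) as predictors, Q(i-30) as the target); the function name `build_lagged_rows` and the sample flows are hypothetical and are not taken from the study.

```python
def build_lagged_rows(flow, max_lag=30):
    """Build predictor rows and targets from a daily flow series.

    For each day i, the predictors are Q(i-1) ... Q(i-29) and the target
    is Q(i-30), following Table 3 literally (an assumption of this sketch).
    """
    predictors, targets = [], []
    for i in range(max_lag, len(flow)):
        lags = [flow[i - k] for k in range(1, max_lag + 1)]  # Q(i-1) .. Q(i-30)
        predictors.append(lags[:-1])  # Q(i-1) .. Q(i-29)
        targets.append(lags[-1])      # Q(i-30)
    return predictors, targets


# Hypothetical daily flows in m3/sec, used only to show the resulting shapes.
flow = [float(q) for q in range(100, 140)]
X, y = build_lagged_rows(flow)
print(len(X), len(X[0]), y[0])
```

Each row of `X` holds 29 lagged flows, so a 40-day series yields 10 usable rows.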
In the current investigation, the Jhelum River basin is divided into three zones by elevation: Zone I (< 500 m), Zone II (< 1000 m), and Zone III (> 1000 m and < 3000 m), as further elaborated in Table 2. Zone I consists of two streamflow stations, Kotli and Azad Pattan, whose mean streamflow contributions to the Jhelum River basin are 128.16 m3/sec and 793.49 m3/sec. Zone II consists of four stations, namely Kohala, Muzaffarabad, Domel, and Garhi Habibullah, with mean streamflows of 757.31 m3/sec, 324.18 m3/sec, 268.32 m3/sec, and 99.41 m3/sec, respectively, of which Garhi Habibullah makes the lowest contribution to the catchment. Zone III, the highest zone in the Jhelum River basin, has two streamflow stations, Chinari and Naran, which contribute 294.77 m3/sec and 45.88 m3/sec. The average streamflow statistics of the stations in each zone were employed to interpret the results and reduce model input dataset requirements.
Trend analysis techniques
Mann-Kendall trend test (MK test): In climatology and hydrology, trends are widely detected with the non-parametric Mann-Kendall (MK) test (Mann, 1945; Saifullah et al., 2016). The MK statistic S is determined by the following formulas:
\[ S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i) \tag{1} \]
\[ \operatorname{sgn}(x_j - x_i) = \begin{cases} +1 & \text{if } x_j - x_i > 0 \\ 0 & \text{if } x_j - x_i = 0 \\ -1 & \text{if } x_j - x_i < 0 \end{cases} \tag{2} \]
where n is the length of the data set, and xi and xj are the values at times i and j. Positive and negative values of S indicate an increasing and a decreasing trend in the data set, respectively. The following expression for the variance of S is used in this study, where the data set length (n) is greater than 10:
\[ \mathrm{VAR}(S) = \frac{1}{18}\left[ n(n-1)(2n+5) - \sum_{i=1}^{m} t_i (t_i - 1)(2 t_i + 5) \right] \tag{3} \]
where ti is the number of data values in the i-th tied group and m is the number of tied groups. After the variance of the time series data is determined, the Z value is calculated with the following equation:
\[ Z = \begin{cases} \dfrac{S-1}{\sqrt{\mathrm{VAR}(S)}} & \text{if } S > 0 \\ 0 & \text{if } S = 0 \\ \dfrac{S+1}{\sqrt{\mathrm{VAR}(S)}} & \text{if } S < 0 \end{cases} \tag{4} \]
The Z value is then compared with the standard normal distribution table at significance levels of 𝛼=1%, 𝛼=5%, and 𝛼=10% (Ali et al., 2019). The null hypothesis (H0) of no trend is rejected if |Z| > Z1−𝛼/2, in which case the trend is significant. Otherwise, H0 is accepted (Saifullah et al., 2016).
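The MK procedure described above can be sketched in a few lines of Python. This is a minimal illustration assuming the standard form of the test with a tie correction; the function name `mann_kendall` and the sample series are hypothetical and do not reproduce the study's data or software.

```python
import math
from statistics import NormalDist


def mann_kendall(x):
    """Return (S, VAR(S), Z) for a series x using the standard MK formulas."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sign of x_j - x_i
    # Count tied groups for the variance correction.
    counts = {}
    for value in x:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z


# Hypothetical annual mean flows (m3/sec) with a steady decline.
series = [30 - i for i in range(12)]
s, var_s, z = mann_kendall(series)
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
significant = abs(z) > critical
print(s, round(var_s, 2), round(z, 2), significant)
```

A negative Z with |Z| above the critical value corresponds to a significant decreasing trend at the chosen α.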
Sen’s slope estimation
Sen’s non-parametric technique (Şen, 2011) is mainly used in hydrology and water resources to estimate the true slope of an existing trend (change per year) (Da Silva et al., 2015). It is used in cases where the trend is assumed to be linear. Mathematically,
\[ f(t) = Qt + B \tag{5} \]
In Equation 5, B is a constant and Q is the slope. To get Q, the slopes of all data pairs must first be determined:
\[ Q_i = \frac{x_j - x_k}{j - k} \tag{6} \]
In Equation 6, j>k. If the time series contains n values xj, we get N=n(n-1)/2 slope estimates Qi.
Sen’s estimator of the slope is the median of these N values of Qi, which are ordered from smallest to largest. Mathematically, Sen’s estimator is:
If N is odd,
\[ Q = Q_{[(N+1)/2]} \tag{7} \]
If N is even,
\[ Q = \frac{1}{2}\left( Q_{[N/2]} + Q_{[(N+2)/2]} \right) \tag{8} \]
A two-sided 100(1-α)% confidence interval about the slope is obtained by a non-parametric method based on the normal distribution. It is valid if n>10. In this study, the confidence interval is computed at two different levels: (1) α=0.01 and (2) α=0.05. To get these confidence intervals, we first compute
\[ C_\alpha = Z_{1-\alpha/2} \sqrt{\mathrm{VAR}(S)} \tag{9} \]
where Z1-α/2 is obtained from the standard normal distribution and VAR(S) is described in Equation 3. Next, M1= (N-Cα)/2 and M2= (N+Cα)/2 are computed. The lower and upper limits of the confidence interval, Qmin and Qmax, are the M1-th largest and the (M2+1)-th largest of the N ordered slope estimates Qi. If M1 and M2 are not whole numbers, the lower and upper limits are interpolated. To get the B value, the n values of differences xi - Qti are calculated, and their median gives an estimate of B (Salmi, 2002).
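As an illustration of the slope and intercept estimates described above, the following minimal sketch computes the median of all pairwise slopes and the median of the differences xi - Qti. The function name `sens_slope` and the sample series are hypothetical; the confidence interval step is omitted for brevity.

```python
import statistics


def sens_slope(x, t=None):
    """Return (Q, B, N): Sen's slope, intercept estimate, and number of pairs."""
    n = len(x)
    if t is None:
        t = list(range(n))  # assume equally spaced observations
    # All pairwise slopes with j > k.
    slopes = [
        (x[j] - x[k]) / (t[j] - t[k])
        for k in range(n - 1)
        for j in range(k + 1, n)
    ]
    q = statistics.median(slopes)  # handles both odd and even N
    b = statistics.median([xi - q * ti for xi, ti in zip(x, t)])
    return q, b, len(slopes)


# Hypothetical series that falls by 2 units per time step.
series = [50 - 2 * i for i in range(12)]
q, b, n_pairs = sens_slope(series)
print(q, b, n_pairs)
```

`statistics.median` averages the two middle values when N is even, which matches the even-N case of the estimator.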
AI-based techniques
Single decision tree (SDT): An SDT consists of a root node, interior nodes, child nodes, and terminal nodes, as shown in Figure 1. Interior nodes connect nodes with other nodes. Child nodes are further split into terminal nodes. Terminal nodes give the output value. Overall, the method consists of two stages, known as the tree building and tree pruning stages. During the first stage, the dataset is arranged from top to bottom; during tree pruning, data with a high entropy level are removed or altered. After pruning, the tree is constructed. In the second level, a relationship is built between the predictor and target variables. A weight variable is assigned to each interior node, which is further split into child and terminal nodes. Furthermore, if no weight variable is set for the interior and child nodes, a fixed variable is allocated to the dataset (Sherrod, 2003).
Splitting nodes formula: In the AI techniques mentioned, the following equation is employed to split on the predictor variable.
Equation 10
K represents the number of categories of the predictor variable.
Appropriate and efficient outcomes require evaluating candidate splits and comparing their quality. The main purpose of the SDT is to reduce heterogeneity and increase homogeneity within each node while the regression tree is created (Sherrod, 2003). Ideally, no predictor values would be missing, but in hydrological records data are often missing for numerous reasons, such as personnel, instrument, or weather problems. The decision tree uses surrogate splitters to approximate the values of predictors with missing values. A surrogate splitter mimics the primary splitter in deciding which rows go to the left or right child node. The association between the primary and surrogate splitters is calculated from how many rows the surrogate sends to the same left and right child nodes as the primary splitter.
However, the actual computation behind this procedure is complicated. The surrogate predictors are ranked in decreasing order of association (association rule) (Zhang and Zhang, 2002). Furthermore, the association between surrogate and primary splitters is most often used to determine the importance of predictor variables. The model learns how the variables are associated with the predictor variables (Wan et al., 2007).
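The idea of choosing a split that makes the child nodes as homogeneous as possible can be sketched as follows. This is a generic CART-style least-squares split on one continuous predictor, given only as an assumption-laden illustration; it is not the splitting formula of Equation 10 or the software used in the study, and the names `sse` and `best_split` and the data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (node heterogeneity)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) of the best binary split on predictor x."""
    pairs = sorted(zip(x, y))
    best_threshold, best_total = None, sse(y)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_total:
            best_threshold, best_total = threshold, total
    return best_threshold, best_total


# Hypothetical lagged flow Q(i-1) as predictor and flow to predict as target.
lagged = [10, 20, 30, 200, 220, 240]
target = [12, 18, 33, 210, 215, 250]
threshold, total = best_split(lagged, target)
print(threshold, total)
```

Applying the same search recursively to each child node would grow a full regression tree.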
Decision tree forest (DTF)
DTF is an ensemble of decision trees whose predictions are combined to obtain the overall prediction. The algorithm behind DTF is Random Forest (Breiman, 2001). The accuracy of DTF cannot be attained by SDT or the other AI methods employed in the current investigation. The out-of-bag method is used in DTF to validate the model. It provides an independent assessment of the model without requiring a separate dataset for validation (Sherrod, 2003). DTF can handle thousands of predictor variables, and missing values in predictor variables are handled with an appropriate splitting method (Lewis, 2000). The Jhelum River basin dataset contains 10928 (N) streamflow observations at the stations mentioned. Through bagging, a random sample of N observations is drawn with replacement from each gauging station’s data, as shown in Figure 4. Roughly two thirds of the data are selected in the sample, and the remaining third is known as out-of-bag data (Breiman, 1996). This process is repeated for every tree that is built. In the first stage, a decision tree is built from the selected rows, and the tree is not pruned before it is complete. During the construction of the tree, only a subset of the whole set of predictor variables is considered as possible splitters for each node created in the forest.
Some predictor variables may not be considered for a given split, but variables left out of one split may be included in a subsequent split in the forest. DTF has two stochastic elements: (1) the selection of rows used as input for each tree, and (2) the set of predictor variables considered as candidates for each node split. For reasons that are not yet well understood, this stochastic behavior makes DTF more efficient and accurate (Yadav and Pal, 2012). To estimate the generalization error, the out-of-bag rows of each tree in the forest are run through that tree and the prediction error rate is measured. The overall generalization error rate is obtained by averaging the errors of all trees in the forest. All rows are used during modeling, and none are held back as a separate test set. The testing procedure is very fast because only a single forest is built, in contrast to V-fold cross-validation, in which additional trees must be built (Sherrod, 2003). Starting from the root node, each partition is split recursively according to the decision tree forest learning algorithm (DTFLA). DTFLA is widely employed among practical methods for inductive inference (Mitchell, 1997).
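The split between in-bag and out-of-bag rows can be checked with a short simulation. The sketch below draws N = 10928 row indices with replacement, as described above; the function name `bootstrap_split` and the fixed seed are assumptions for illustration. In expectation the out-of-bag share is about 1/e, or roughly 36.8%, which is the "about one third" mentioned in the text.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw n_rows indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in chosen]
    return in_bag, out_of_bag


n_rows = 10928               # number of streamflow observations stated in the text
rng = random.Random(42)      # fixed seed so the run is repeatable
in_bag, out_of_bag = bootstrap_split(n_rows, rng)
oob_fraction = len(out_of_bag) / n_rows
print(round(oob_fraction, 3))
```

In a forest, each tree would be fitted on its own `in_bag` rows and scored on its own `out_of_bag` rows, and the per-tree errors would then be averaged.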
Tree boost (TB)
The TB model can be illustrated graphically as follows. The first tree is fitted to the data, and the residuals from the first tree are then fed into the second tree, which reduces the error. This process is repeated over the successive sequence of trees, and the complete procedure is shown in Figure 5. The final prediction is formed by adding the weighted contribution of each tree. AdaBoost (Freund and Schapire, 1996) and other composite-tree methods based on bagging and boosting do not give better outcomes than TB. It is highly robust because it employs the Huber M-regression loss function. It handles missing values with a very precise technique. To avoid overfitting, TB uses cross-validation and random row sampling (Sherrod, 2003). Boosting (Freund and Schapire, 1996) and bagging (Breiman, 1996) are recent methods for improving the predictive power of classifier learning systems. Both form a set of classifiers that are combined by voting, bagging by generating replicated bootstrap samples of the data and boosting by adjusting the weights of instances. Studies show that TB performs well for some applications and DTF for others. It is therefore advisable to try both methods and compare the outcomes (Sherrod, 2003).
In the current study, TB was used to predict streamflow. TB was trained and tested to improve streamflow prediction. Boosting improves the accuracy of a predictive function by applying the function repeatedly in a series and combining the output of each function with weighting, so that the forecasting error is minimized. In many cases, the predictive accuracy of such a series significantly exceeds the accuracy of the base function. The TB algorithm was developed by Jerome H. Friedman (Friedman, 1999) to improve the accuracy of techniques based on decision trees. Earlier investigations report that models built using TB are among the most accurate of any known modeling method. TB is also known as “Multiple Additive Regression Trees” (MART) and “Stochastic Gradient Boosting”. Functionally, the TB algorithm is similar to decision tree forests in that it creates an ensemble of trees. However, a TB model consists of a series of trees, whereas a decision tree forest consists of a collection of trees grown in parallel (Sherrod, 2003). Mathematically,
\[ F(x) = F_0 + B_1 T_1(x) + B_2 T_2(x) + \dots + B_M T_M(x) \tag{11} \]
where F(x) is the predicted target value, F0 is the starting value of the series (the median target value for a regression model), x is the vector of “pseudo-residual” values remaining at this point in the series, T1(x), T2(x), etc. are trees fitted to the pseudo-residuals, and B1, B2, etc. are coefficients of the tree node predicted values that are computed by the Tree Boost algorithm.
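The additive structure of a boosted series can be sketched with one-split trees (stumps). This is a simplified illustration, not the Tree Boost implementation used in the study: it assumes a squared-error loss instead of the Huber loss mentioned above, a single predictor, and a constant coefficient `rate` for every tree. All names and data are hypothetical.

```python
import statistics


def fit_stump(x, residuals):
    """Least-squares stump: return (threshold, left_mean, right_mean)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for cut in range(1, len(order)):
        lo, hi = x[order[cut - 1]], x[order[cut]]
        if lo == hi:
            continue
        left = [residuals[i] for i in order[:cut]]
        right = [residuals[i] for i in order[cut:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = sum((r - left_mean) ** 2 for r in left)
        err += sum((r - right_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, (lo + hi) / 2, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_trees=50, rate=0.1):
    """Fit F0 plus a series of stumps to the residuals of the running prediction."""
    f0 = statistics.median(y)  # starting value of the series
    pred = [f0] * len(y)
    trees = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, left_mean, right_mean = fit_stump(x, residuals)
        trees.append((thr, left_mean, right_mean))
        pred = [p + rate * (left_mean if xi <= thr else right_mean)
                for p, xi in zip(pred, x)]
    return f0, rate, trees


def predict(model, xi):
    f0, rate, trees = model
    return f0 + sum(rate * (lm if xi <= thr else rm) for thr, lm, rm in trees)


def mse(y, pred):
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


# Hypothetical predictor (lagged flow) and target (flow) values.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 6, 7, 9, 20, 22, 25, 30]
model = boost(x, y)
mse_start = mse(y, [model[0]] * len(y))
mse_boosted = mse(y, [predict(model, xi) for xi in x])
print(mse_start > mse_boosted)
```

Each added stump lowers the training error a little, which is why random row sampling or cross-validation is needed in practice to decide when to stop.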
Conventional multilayer perceptron neural networks (MLPNN)
ANNs are well suited to hydrological processes, where the relevant relationships are complex and not easily understood (Kasabov, 1996). In water resources engineering and hydrology, MLPNN and radial basis function neural networks (RBFNN), which are the primary forms of ANNs, are widely employed (McGarry et al., 1999). An MLPNN contains three layers, namely the input, hidden, and output layers, as illustrated in Figure 6. Every layer comprises numerous neurons, which are connected to other layers through weights. In the first layer, each neuron receives an input array. It produces an output through an identity function, which becomes the input of the hidden layer; the same holds for the output layer, which receives the output of the hidden layer as its input. The output of a neuron is produced by a neuron transfer function, which is a mathematical function. Neurons in the three layers are interconnected through adjacent layers, but there is no direct connection between neurons within a layer (Shoaib et al., 2014).
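A forward pass through a three-layer network of this kind can be sketched as below. The layer sizes, the tanh transfer function in the hidden layer, the linear output neuron, and the weight values are all assumptions chosen for illustration; the study's actual network configuration is not stated here.

```python
import math


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: identity input layer, tanh hidden layer, linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical network with 3 inputs (scaled lagged flows) and 2 hidden neurons.
inputs = [0.2, 0.4, 0.1]
w_hidden = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.4]]
b_hidden = [0.0, 0.1]
w_out = [1.2, -0.7]
b_out = 0.05
output = forward(inputs, w_hidden, b_hidden, w_out, b_out)
print(round(output, 4))
```

Training would adjust the weights and biases, typically by backpropagation, so that the output matches the observed flow.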
Model evaluation criteria
The efficiency and fitness of the advanced techniques can be measured by statistical parameters (Shoaib et al., 2014). The performance of the AI-based techniques developed for streamflow prediction was evaluated with statistical parameters that quantify the errors associated with each approach. In the current investigation, three statistical model validation parameters were used to assess the performance of the models: (1) coefficient of determination (R2) (Menard, 2000); (2) root mean square error (RMSE) (Levinson and Physics, 1946); and (3) NSE (Nash and Sutcliffe, 1970), which are described as:
Equation 12
Equation 13
Equation 14
Qpre and Qobs are the predicted and observed flows, and Qmean is the mean of the observed flows. The coefficient of determination (R2) expresses how well the regression line fits the actual data; a value of 1 demonstrates that the line fits the real data perfectly. RMSE is used to measure the accuracy of the evaluated output. It ranges from 0 to infinity, where 0 indicates a perfect match between predicted and observed outputs. The Nash-Sutcliffe efficiency (NSE) value ranges from -∞ to 1. It is widely used for the assessment of hydrological techniques; in this investigation, NSE is expressed as a percentage. The NSE measures the ability of the technique to forecast the observed output. A low RMSE and a high NSE percentage indicate a good model. The RMSE reflects the goodness of fit for higher discharges, whereas the NSE generally indicates a technique’s capability to forecast observed discharge values (Shoaib et al., 2018).
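The three criteria can be computed as in the following minimal sketch. RMSE and NSE follow their standard definitions; R2 is computed here as the squared Pearson correlation between observed and predicted flows, which is an assumption because the exact form of Equation 12 is not shown. The flow values are hypothetical.

```python
import math


def rmse(q_obs, q_pre):
    """Root mean square error between observed and predicted flows."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(q_obs, q_pre)) / len(q_obs))


def nse(q_obs, q_pre):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    q_mean = sum(q_obs) / len(q_obs)
    numerator = sum((o - p) ** 2 for o, p in zip(q_obs, q_pre))
    denominator = sum((o - q_mean) ** 2 for o in q_obs)
    return 1.0 - numerator / denominator


def r_squared(q_obs, q_pre):
    """Squared Pearson correlation (assumed definition of R2 for this sketch)."""
    mean_o = sum(q_obs) / len(q_obs)
    mean_p = sum(q_pre) / len(q_pre)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(q_obs, q_pre))
    var_o = sum((o - mean_o) ** 2 for o in q_obs)
    var_p = sum((p - mean_p) ** 2 for p in q_pre)
    return cov ** 2 / (var_o * var_p)


# Hypothetical observed and predicted flows in m3/sec.
q_obs = [10.0, 20.0, 30.0, 40.0]
q_pre = [12.0, 18.0, 33.0, 39.0]
print(round(rmse(q_obs, q_pre), 3), round(nse(q_obs, q_pre), 3),
      round(r_squared(q_obs, q_pre), 3))
```

Multiplying the NSE by 100 gives the percentage form used in this investigation.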
Flow duration curves (FDC)
An FDC (Searcy, 1959) is a graphical representation of the percentage of time that a given streamflow is equaled or exceeded (Oeurng et al., 2019). It is constructed by plotting the observed flow data, collected from various organizations and departments, against the exceedance probability. The general equation for exceedance probability is:
Equation 15
N = number of streamflow observations (no units); M = rank position in the ordered listing (no units).
The most important parts of an FDC for assessing catchment characteristics are its high-flow and low-flow ends. The low-flow portion of the FDC illustrates the capability of the catchment to sustain flow in dry and hot seasons, whereas the high-flow portion illustrates the kind of flood regime the catchment is likely to have. Steep curves in FDCs indicate floods caused by rain, typically in small catchments, whereas much flatter curves near the upper portion are due to floods caused by snowmelt. Flat curves in the low-flow portion reflect natural or non-natural streamflow (Shoaib et al., 2014).
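The construction of an FDC can be sketched as follows. The sketch assumes the commonly used plotting position P = 100 × M / (N + 1), which may differ from the exact form of Equation 15; the function name `flow_duration_curve` and the flow values are hypothetical.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability in %, flow) pairs, highest flow first."""
    ranked = sorted(flows, reverse=True)  # rank M = 1 is the largest flow
    n = len(ranked)
    return [(100.0 * m / (n + 1), q) for m, q in enumerate(ranked, start=1)]


# Hypothetical daily flows in m3/sec.
flows = [5.0, 20.0, 10.0, 40.0]
for probability, flow in flow_duration_curve(flows):
    print(round(probability, 1), flow)
```

Computing one curve from the observed flows and one from each model's predictions allows the high-flow and low-flow ends to be compared directly.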
Results and Discussion
This study’s objective was to evaluate advanced AI-based techniques for streamflow forecasting in Jhelum River Basin in Western Himalayas, Pakistan. During spring and winter seasons, rainfall occurred in the mountain areas, mainly in Kunhar, Neelum, and Jhelum, which are sub-basins of the catchment in the form of snow. Therefore, snow melting is occurred due to temperature changes. The outputs of the MK test in annual and seasonal streamflow trend series are shown in Table 4. During the winter season, streamflow Naran and Chinari showed decreasing trend, whereas other places exhibited a cumulative trend due to precipitation increased in winter. The decreasing trend in streamflow also showed by Naran, Muzaffarabad, Chinari, Kohala stations during the spring season except for Garhi Habibullah, Azad Pattan, and Kotli stations. It is worth to note that the temperature is increased during the summer and autumn seasons in the Jhelum river basin, but all streamflow stations showed decreasing trends except Domel. These results show that the annual streamflow in all areas of the Jhelum river basin like Naran, Garhi Habibullah, Muzaffarabad, Chinari, Kohala, Azad Pattan, and Kotli showed the decreasing trend except Domel station because most stations lie in snow feed areas. Domel station showed an increasing trend due to high rainfall. These results are found accurate and precise, which can also be justified. Overall, in seasonal and annual trend analysis, the significance level (α) is blank, which means the α is greater than
From Tables 4 and 6, it can be seen that the values of Z were negative for the Naran, Garhi Habibullah, Muzaffarabad, and Chinari streamflow stations, which indicates a negative trend. Domel station has a positive Z value, which shows an upward trend in the dataset, although the annual trend in the catchment as a whole was downward. The annual α is greater than 0.1 at all stations except Chinari, where α is 0.05. In the winter season, the Z value is positive at all stations except Naran and Chinari, showing an upward trend in the dataset. In winter, α is 0.05 at the Naran, Garhi Habibullah, and Muzaffarabad stations, whereas Chinari, Kohala, Azad Pattan, and Kotli have α greater than 0.1; only Domel has α of 0.1. In the spring season, the Z value is negative at Naran, Muzaffarabad, Chinari, and Kohala, which indicates a downward trend in the dataset, whereas the Z value is positive at the remaining stations, which indicates an upward trend.
In spring, α is greater than 0.1 at all stations, so the spring trends are not significant at the 0.1 level. In the summer season, all stations except Domel have a negative Z value, which indicates a descending tendency in the dataset. All streamflow gauge stations are reported with α greater than 0.1 in the summer season. During autumn, the Z value was negative, showing a downward trend in the dataset, at all stations except Naran, Garhi Habibullah, and Domel. The α at Naran station is 0.1 and at Chinari is 0.05; all other stations have a significance level greater than 0.1. Nevertheless, they illustrate a monotonic trend, and the
Table 4: Seasonally and Annually Mann-Kendall test for Jhelum River basin.
Stream flow stations | Naran | | Garhi Habibullah | | M. Abad | | Chinari | | Domel | | Kohala | | Azad Pattan | | Kotli | |
Time series | Z | α | Z | α | Z | α | Z | α | Z | α | Z | α | Z | α | Z | α |
Winter | -2.23 | * | 2.12 | * | 2.15 | * | -1.5 | | 1.73 | + | 0.28 | | 0.8 | | 0.28 | |
Spring | -1.0 | | 1.17 | | -0.8 | | -1.47 | | 0.97 | | -0.35 | | 0.04 | | 0.13 | |
Summer | -1 | | -0.63 | | -1.54 | | -1.78 | + | 1.1 | | -1.39 | | -0.91 | | -1.58 | |
Autumn | 2.47 | * | 0.67 | | -0.76 | | -1.86 | + | 1.31 | | -0.63 | | -0.11 | | -0.2 | |
Annual | -1.17 | | -0.09 | | -0.98 | | -2.1 | * | 1.01 | | -1.13 | | -0.35 | | -0.93 | |
suitable assessment method is the Mann-Kendall test. Annually, the linear trend in the dataset was negative at all stations except Domel, which indicates an overall downward trend in the dataset. The annual Sen’s slope (Q) values are likewise negative except at Domel, which confirms the overall downward trend. The seasonal Sen’s slope values of Q are as follows. In the winter season, Naran and Chinari stations have a negative value of Q, while all other stations have a positive value of Q, which indicates a positive trend in the winter dataset. During the spring season, the Q values are negative at the Naran, Muzaffarabad, Chinari, and Kohala stations, which points to a downward trend in the dataset, whereas at Garhi Habibullah, Domel, Azad Pattan, and Kotli the values of Q are positive, which shows an upward trend in the spring dataset. In the summer season, the trend is downward at all stations except Domel; therefore, the overall summer trend is downward. During the autumn season, the values of Q are positive only at Naran, Garhi Habibullah, and Domel; all remaining stations have a downward trend in the dataset.
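The Z values and significance flags of the kind reported in Table 4 can be reproduced with a short routine. The following is a minimal sketch of the Mann-Kendall statistic, assuming no tied values (the tie correction to the variance is omitted) and assuming that '*' marks α = 0.05 and '+' marks α = 0.1; that reading of the symbols is consistent with the tabulated Z values but is not defined in the table note. The function names are illustrative only.

```python
import math


def mann_kendall_z(series):
    """Return (S, Z) for the Mann-Kendall trend test, ignoring ties."""
    n = len(series)
    # S sums the signs of all pairwise differences x_j - x_i with i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis when no values are tied.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def significance_flag(z):
    """Assumed flag convention: '*' for alpha = 0.05, '+' for alpha = 0.1."""
    magnitude = abs(z)
    if magnitude >= 1.96:    # two-sided critical value at alpha = 0.05
        return "*"
    if magnitude >= 1.645:   # two-sided critical value at alpha = 0.1
        return "+"
    return ""                # blank: alpha greater than 0.1
```

A negative Z indicates a downward trend and a positive Z an upward trend; a blank flag means the trend is not significant at the 0.1 level under the stated assumption.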
Sen’s slope estimator was used to estimate the true slope of the linear trend. The seasonal and annual Sen’s slope values of Q are given in Tables 6 and 7, and the MK test results for the different zones are given in Table 5. The significance level was greater than 0.1 in Zones I and II and 0.1 in Zone III. During the winter and spring seasons, the MK test detected a positive (upward) trend in Zone I and Zone II, but a negative trend in Zone III. In the summer season, the trend was negative overall, with a significance level greater than 0.1 except in Zone III, where the significance was 0.1. During autumn and annually, the trend was negative, with a significance level greater than 0.1 except in Zone III, where the significance was 0.1. In general, where a negative trend was detected, its significance level was found to be 0.1 or lower. The results of Sen’s slope estimator (Q) are given in Table 7. During the winter and spring seasons, Q showed a negative trend in Zone II and Zone III but a positive trend in Zone I. In contrast, in summer, autumn, and annually, there was a negative trend in all catchment zones. Many studies have detected significant streamflow trends using the Mann-Kendall, Sen’s slope, Mann-Whitney U, and Student’s t tests for the upper Jhelum basin, western Himalaya, and its sub-basins, i.e., Kanshi, Neelum, Poonch, and Kunhar (Azmat et al., 2016; Khan et al., 2015; Mahmood et al., 2015; Tahir et al., 2015; Yaseen et al., 2014).
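The Q values in Tables 6 and 7 are Sen’s slope estimates. As a minimal sketch, assuming equally spaced observations unless time stamps are supplied, Q can be computed as the median of all pairwise slopes; the function name and arguments below are illustrative and are not taken from the software used in the study.

```python
from statistics import median


def sens_slope(values, times=None):
    """Return Sen's slope: the median of all pairwise slopes.

    `times` defaults to 0, 1, 2, ... (equally spaced observations).
    At least two observations are required.
    """
    if times is None:
        times = list(range(len(values)))
    slopes = [
        (values[j] - values[i]) / (times[j] - times[i])
        for i in range(len(values) - 1)
        for j in range(i + 1, len(values))
        if times[j] != times[i]
    ]
    return median(slopes)
```

Because the median is used, a single extreme year has little influence on Q; for example, the series 1, 2, 3, 100, 5 still yields a slope of 1.0.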
Table 5: Seasonally and Annually Mann-Kendall test for Zones in Jhelum River basin.
Stream flow stations | Zone I | | Zone II | | Zone III | |
Time series | Z | α | Z | α | Z | α |
Winter | 0.69 | | 1.95 | + | -1.50 | |
Spring | 0.20 | | 0.56 | | -1.58 | |
Summer | -1.17 | | -0.76 | | -1.89 | + |
Autumn | -0.17 | | 0.33 | | -1.58 | |
Annual | -0.43 | | 0.02 | | -2.04 | * |
Note: Significance level (α) of MK test.
Performance evaluation criteria revealed that the applied AI techniques, including SDT, DTF, and TB, performed well in training and testing for each zone (Z1, Z2, and Z3) and for the whole upper Jhelum River basin. The average values of R2 and NSE were found to be approximately 1 for all applied techniques during training and testing of the models, as shown in Tables 8 and 9, respectively.
The evaluation criteria R2, RMSE, and NSE showed that DTF is superior to SDT and TB. Figure 7 clearly shows that the R2 values of the annual forecast are approximately 1.00 for all AI techniques at all stations except Kotli, and that the forecasts agree well with the original data. At Kotli station, the DTF showed a good forecast with results of 0.91 and
Table 6: Seasonally and Annually Sen’s Slope Estimate (Q) for Jhelum River Basin.
Stream flow stations | Naran | Garhi Habibullah | M. Abad | Chinari | Domel | Kohala | Azad Pattan | Kotli |
Time Series | Q | Q | Q | Q | Q | Q | Q | Q |
Winter | -0.28 | 0.64 | 70.82 | -5.96 | 320.48 | 74.34 | 164.48 | 26.13 |
Spring | -0.59 | 1.05 | -66.99 | -6.75 | 252.07 | -113.11 | 38.49 | 4.5 |
Summer | -0.87 | -1.52 | -464.67 | -14.75 | 410.04 | -902.88 | -749.87 | -186.85 |
Autumn | 0.28 | 0.09 | -13.38 | -1.85 | 66.56 | -37.64 | -13.38 | -3.22 |
Annual | -1.31 | -0.3 | -357.97 | -32.16 | 806.04 | -890.6 | -346.27 | -159.23 |
Table 7: Seasonally and Annually Sen’s Slope Estimate (Q) for Zones of Jhelum River Basin.
Zones | Zone I | Zone II | Zone III |
Time Series | Q | Q | Q |
Winter | 3.051 | -2.187 | -2.988 |
Spring | 0.742 | -10.360 | -3.852 |
Summer | -12.955 | -28.700 | -7.981 |
Autumn | -0.256 | -1.820 | -0.915 |
Annual | -12.478 | -38.666 | -17.145 |
0.93 in training and testing, respectively, for R2. In contrast, the other three techniques (SDT, TB, and MLPNN) were less accurate, with results of 0.41, 0.42, and 0.40 in training and 0.46, 0.54, and 0.45 in testing, respectively. By comparison, DTF was found to be more accurate than MLPNN, SDT, and TB, with results close to 1 (Sharma et al., 2013; Tayyab et al., 2018). R2 is among the model assessment measures most frequently cited and employed by hydrologists for predicting and estimating the various components of the hydrological cycle (Kisi and Cimen, 2011). In Figure 7, the NSE results for both training and testing revealed that TB and SDT gave the most efficient results, with values of 1.00 during training at all stations except Kotli. At Kotli station, DTF performed well, with average results of 0.85 and 0.65 during training and testing, respectively, while SDT, TB, and MLPNN were less efficient, with results of 0.42, 0.43, and 0.40 in training and 0.47, 0.54, and 0.46 in testing, respectively. High values of NSE indicate an efficient model. Thus, according to the NSE results, SDT is the most effective method for predicting streamflow.
In addition, DTF and TB have good potential for prediction because their results compare reasonably with those of MLPNN. In Figure 8, the RMSE results show that DTF gives better annual streamflow predictions than SDT and TB at all gauges, and DTF is also more efficient than the traditional MLPNN. A lower RMSE value indicates a better model fit (Sharma et al., 2013; Shoaib et al., 2014). The average RMSE values for DTF in testing and training are 382 m3/sec and 2846 m3/sec, respectively. Therefore, according to the RMSE results, DTF is the most effective technique for predicting annual streamflow in the upper Jhelum River basin.
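The three evaluation criteria discussed above can be sketched as follows. This is an illustrative implementation only: it assumes that R2 is computed as the squared Pearson correlation between observed and simulated flows, which is a common convention but is not confirmed in this section, and the function names are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean square error, in the units of the flow data."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def r2(observed, simulated):
    """Coefficient of determination, assumed here to be squared Pearson r."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)
```

Under this assumption, a forecast that is offset from the observations by a constant amount keeps R2 at 1 while NSE drops and RMSE rises, which is one reason the three criteria are read together.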
The data for the whole upper Jhelum River basin, including each streamflow gauge station, were divided into seasons: June to September (summer), October to November (autumn), December to March (winter),
Table 8: Evaluation training results of Zones (Z1), (Z2) and (Z3) and Overall Jhelum River Basin.
Techniques (Training) | Zone I R2 | Zone I NSE | Zone I RMSE | Zone II R2 | Zone II NSE | Zone II RMSE | Zone III R2 | Zone III NSE | Zone III RMSE | Overall Basin R2 | Overall Basin NSE | Overall Basin RMSE |
SDT | 1.00 | 1.00 | 4326.61 | 1.00 | 1.00 | 890.61 | 1.00 | 1.00 | 92.99 | 1.00 | 1.00 | 120737.30 |
DTF | 1.00 | 1.00 | 861.43 | 1.00 | 1.00 | 181.68 | 1.00 | 1.00 | 16.24 | 1.00 | 1.00 | 27203.45 |
TB | 1.00 | 1.00 | 4821.75 | 1.00 | 1.00 | 837.20 | 1.00 | 1.00 | 84.08 | 1.00 | 1.00 | 130277.90 |
MLPNN | 1.00 | 1.00 | 6148.18 | 1.00 | 1.00 | 1533.72 | 1.00 | 1.00 | 91.24 | 1.00 | 1.00 | 130277.90 |
Table 9: Evaluation testing results of Zones (Z1), (Z2) and (Z3) and Overall Jhelum River Basin.
Techniques (Testing) | Zone I R2 | Zone I NSE | Zone I RMSE | Zone II R2 | Zone II NSE | Zone II RMSE | Zone III R2 | Zone III NSE | Zone III RMSE | Overall Basin R2 | Overall Basin NSE | Overall Basin RMSE |
SDT | 0.99 | 1.00 | 1929.18 | 1.00 | 1.00 | 488.33 | 1.00 | 1.00 | 54.85 | 1.00 | 1.00 | 70682.47 |
DTF | 1.00 | 1.00 | 257.53 | 1.00 | 1.00 | 83.04 | 1.00 | 1.00 | 10.46 | 1.00 | 1.00 | 9971.38 |
TB | 1.00 | 1.00 | 1235.14 | 1.00 | 1.00 | 397.12 | 1.00 | 1.00 | 43.51 | 1.00 | 1.00 | 49114.65 |
MLPNN | 1.00 | 1.00 | 1952.96 | 1.00 | 1.00 | 580.82 | 1.00 | 1.00 | 54.10 | 1.00 | 1.00 | 65877.67 |
and April to May (spring), based on the rainy seasons in Pakistan (Sheikh, 2001). The AI-based techniques (SDT, DTF, TB, and MLPNN) were then applied to the training and testing datasets. After training and testing the models, the performance evaluation criteria were applied to the predicted seasonal streamflow values. The R2 results are presented in Figures 9 and 10, from which it can be readily seen that DTF is the most efficient model in all seasons (winter, summer, spring, and autumn), with average results of 0.98, 0.95, 0.99, and 0.98, respectively. However, there was some irregularity in the streamflow contribution from the Naran gauge station. The other two techniques, SDT and TB, were less efficient than DTF, which performed most effectively in all four seasons.
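The seasonal split described above can be expressed as a simple month-to-season lookup. The sketch below assumes that each record is a (month, flow) pair with months numbered 1 to 12; the record layout and the names are illustrative and not taken from the study.

```python
# Season definitions as stated in the text (month numbers 1-12).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter", 3: "winter",
    4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer", 9: "summer",
    10: "autumn", 11: "autumn",
}


def group_by_season(records):
    """Group (month, flow) records into the four seasonal subsets."""
    groups = {"winter": [], "spring": [], "summer": [], "autumn": []}
    for month, flow in records:
        groups[SEASON_BY_MONTH[month]].append(flow)
    return groups
```

Each seasonal subset can then be passed separately to the trend tests or to model training and testing.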
Finally, when these results were compared with the conventional MLPNN, DTF again gave better results. The RMSE results for the training and testing of the AI-based techniques are presented in Figures 10 and 11. RMSE is used as a measure of output precision, and it can be seen that SDT gives good results compared with DTF and TB. The results clearly show that SDT has good potential to forecast monthly streamflow. When the outputs of the AI techniques are compared with those of the conventional MLPNN, the RMSE values are approximately the same. In both training and testing, the NSE results of the AI techniques, presented in Figures 13 and 14, are satisfactory and agree well with the original data. In this study, DTF and SDT were found to be the most efficient AI methods for seasonal streamflow forecasting compared with other techniques such as TB. Compared with MLPNN, DTF is the most effective and accurate approach and shows a better range of values than TB, SDT, and MLPNN. In Figures 13 and 14, the DTF results are close to 1, which is the maximum value of NSE. A higher NSE value indicates a more efficient model (Nash and Sutcliffe, 1970; Shoaib et al., 2018; Zaman et al., 2018). Thus, according to the NSE results, DTF is the most appropriate method for predicting streamflow in the Jhelum River basin, while TB and SDT also have great potential for prediction.
Figures 15 and 22 show the hydrographs of low, medium, and high flows produced by the artificial intelligence techniques (SDT, MLPNN, DTF, and TB), which were used to analyse their capability.
An FDC plots exceedance probability against observed and simulated flows and shows the percentage of time for which a given discharge was equalled or exceeded during the period of record, as used in numerous previous investigations (Archer and Fowler, 2008; Babur et al., 2016; Hayat et al., 2019). Flows that are equalled or exceeded 1 to 10 percent of the time are considered high flows. Likewise, flows from the 11th to the 89th percentile are taken as medium flows, and flows from the 90th to the 100th percentile are taken as low flows. Within the medium range, percentile flows from 11 to 49 are considered medium-high flows, and those from 50 to 89 are considered medium-low flows. DTF is the better AI technique for medium-high and high percentile flows, and its FDC agrees more closely with the FDC of the observed flow than the FDCs of SDT, TB, and MLPNN. The TB FDC agrees better with the observed-flow FDC for low and medium-low percentile flows than the other FDCs. The MLPNN FDCs for high and medium percentile flows also show good potential and agree with the observed-flow FDCs more closely than the SDT FDCs. The observed and predicted FDCs for all stations of the upper Jhelum River basin revealed that DTF performed better than the other methods. In particular, the capability of DTF in predicting medium-high and high discharges is superior to that of the other AI techniques, while TB performed very well and is suitable for long-term prediction of low and medium-low discharges. SDT also showed good potential to forecast high flows.
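The construction of an FDC and the flow classes defined above can be sketched as follows. The Weibull plotting position, rank / (n + 1), is an assumption made for illustration, since the plotting formula is not stated in this section; the class boundaries follow the percentile ranges given in the text, and the function names are hypothetical.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability in percent, flow) pairs, largest flow first."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    # Assumed Weibull plotting position: rank / (n + 1).
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def flow_class(exceedance_percent):
    """Classify a point on the FDC using the percentile ranges in the text."""
    if exceedance_percent <= 10:
        return "high"
    if exceedance_percent < 50:
        return "medium-high"
    if exceedance_percent < 90:
        return "medium-low"
    return "low"
```

Computing the curve separately for the observed series and for each model output allows the observed and simulated flows to be compared within each flow class.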
Conclusions and Recommendations
Timely and effective forecasting of streamflow magnitude, peaks, and duration can save many lives, large sums of money, and infrastructure, although complete safety is difficult to achieve. NSE (Nash-Sutcliffe efficiency), RMSE (root mean square error), and R2 (coefficient of determination) were the three indices used as performance evaluation criteria. Sen’s slope and MK analyses were used to identify the trend at every station. A seasonal analysis of the upper Jhelum River basin was also performed, and the applied AI techniques were trained and tested for each streamflow gauge station. The results were also presented as flow duration curves (FDCs) of forecasted and observed data for each station. The applied AI techniques (SDT, DTF, TB, and MLPNN), which use Q(i-1), Q(i-2), Q(i-3), …, Q(i-29), Q(i-30) as inputs, performed with great effectiveness and accuracy.
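The lagged input structure Q(i-1), …, Q(i-30) can be illustrated with a small helper that turns a flow series into input/target pairs. This is a sketch under the assumption that the target is the flow at step i (a one-step-ahead forecast); the function name and the default of 30 lags are illustrative.

```python
def make_lagged_samples(flows, n_lags=30):
    """Build (inputs, target) pairs where inputs = [Q(i-1), ..., Q(i-n_lags)].

    The target Q(i) is an assumption (one-step-ahead forecasting).
    """
    samples = []
    for i in range(n_lags, len(flows)):
        # Most recent lag first: Q(i-1), Q(i-2), ..., Q(i-n_lags).
        inputs = [flows[i - k] for k in range(1, n_lags + 1)]
        samples.append((inputs, flows[i]))
    return samples
```

The first `n_lags` values of the series produce no sample because their full lag history is unavailable, so a series of length N yields N - n_lags training pairs.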
DTF was evaluated as the most effective of the applied AI techniques, based on the performance evaluation criteria for the different zones and for the whole upper Jhelum River basin. SDT and TB also performed well in annual streamflow forecasting. Of all the techniques, SDT was considered the best approach for the entire catchment. For annual streamflow forecasting over the entire catchment, the performance evaluation criteria were satisfied for each technique individually. The average values of the evaluation parameters R2 and NSE for DTF are 0.998 and 0.992, respectively.
The average RMSE values for DTF in testing and training are 382 m3/sec and 2846 m3/sec, respectively. Trend analysis of the whole catchment was carried out effectively with both the Sen’s slope and Mann-Kendall methods. The highest decreasing tendency of -2.23 m3/sec was observed in autumn at Naran station, while the lowest change of -0.09 m3/sec was found at Garhi Habibullah station. Streamflow showed a significant decrease seasonally and annually at the 95% confidence level. Based on these results, it was concluded that the Sen’s slope and Mann-Kendall approaches are effective for trend analysis in the upper Jhelum River basin. Flow duration curves (FDCs) were used to assess the agreement between observed and predicted streamflow in the catchment. The FDCs revealed that DTF was the better AI method for high and medium-high percentile flows and agreed well with the FDC of the observed flow compared with the others. The FDC of TB agrees better for low and medium-low percentile flows, and the FDCs of MLPNN also agree well for high and medium percentile flows. However, the ability of DTF to forecast high and medium-high discharges is better than that of the other AI techniques. It could also be applied to predict other hydrological processes such as reference evapotranspiration, rainfall-runoff, or sediment transport.
Novelty Statement
These artificial intelligence techniques (AITs) have not previously been used for forecasting streamflow in the Mangla catchment in Pakistan.
Author’s Contribution
Muhammad Waqas: Conceived the data, applied the methodology, performed all trend analyses, and wrote up the manuscript.
Muhammad Shoaib: Wrote the abstract and gave technical input during the research.
Muhammad Saifullah: Wrote the conclusion, performed the statistical analysis, and improved the Results and Discussion.
Adila Naseem: Performed data entry and wrote the Results and Discussion.
Sarfraz Hashim: Read and edited the overall manuscript.
Farrukh Ehsan: Prepared the references and images.
Irfan Ali: Data collection.
Alamgir Khan: Wrote introduction.
Conflict of interest
The authors have declared no conflict of interest.
References
Abdellatif, M. E., Osman, Y. Z., & Elkhidir, A. M. 2015. Comparison of artificial neural networks and autoregressive model for inflows forecasting of Roseires Reservoir for better prediction of irrigation water supply in Sudan. International Journal of River Basin Management, 13(2), 203-214. https://doi.org/10.1080/15715124.2014.1003381
Ali, R., A. Ismael, A. Heryansyah and N. Nawaz. 2019. Long term historic changes in the flow of lesser Zab River, Iraq. Hydrology, 6: 22. https://doi.org/10.3390/hydrology6010022
Ali, R., A. Kuriqi, S. Abubaker and O. Kisi. 2019. Long-term trends and seasonality detection of the observed flow in Yangtze River using Mann-Kendall and Sen’s innovative trend method. Water, 11(9): 1855. https://doi.org/10.3390/w11091855
Archer, D. R., & Fowler, H. J. (2008). Using meteorological data to forecast seasonal runoff on the River Jhelum, Pakistan. Journal of Hydrology, 361(1-2), 10-23. https://doi.org/10.1016/j.jhydrol.2008.07.017
Azmat, M., Choi, M., Kim, T. W., & Liaqat, U. W. (2016). Hydrological modeling to simulate streamflow under changing climate in a scarcely gauged cryosphere catchment. Environmental Earth Sciences, 75(3), 186. https://doi.org/10.1007/s12665-015-5059-2
Babovic, V., M. Keijzer and M. Bundzel. 2000. From global to local modelling: A case study in error correction of deterministic models. Paper presented at the Proceedings of the Fourth International Conference on Hydro Informatics, Iowa city.
Babur, M., M. Babel, S. Shrestha, A. Kawasaki and N. Tripathi. 2016. Assessment of climate change impact on reservoir inflows using multi climate-models under RCPs. The case of Mangla Dam in Pakistan. Water, 8(9): 389. https://doi.org/10.3390/w8090389
Barge, J. and H. Sharif. 2016. An ensemble empirical mode decomposition, self-organizing map, and linear genetic programming approach for forecasting river streamflow. Water, 8(6): 247. https://doi.org/10.3390/w8060247
Box, G.E.P. and G.M. Jenkins. 1970. Time series analysis: Forecasting and control.
Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control. John Wiley & Sons.
Breiman, L. 1996. Out of bag estimation. Bagging predictors. 24(2): 123-140. https://doi.org/10.1007/BF00058655
Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
Chokmani, K., T.B. Ouarda, S. Hamilton, M.H. Ghedira and H. Gingras. 2008. Comparison of ice-affected streamflow estimates computed using artificial neural networks and multiple regression techniques. J. Hydrol., 349(3-4): 383-396. https://doi.org/10.1016/j.jhydrol.2007.11.024
Coulibaly, P., Anctil, F., & Bobée, B. (2000). Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. Journal of Hydrology, 230(3-4), 244-257. https://doi.org/10.1016/S0022-1694(00)00214-6
Da Silva, R.M., C.A. Santos, M. Moreira, J. Corte-Real, V.C. Silva and I.C. Medeiros. 2015. Rainfall and river flow trends using Mann–Kendall and Sen’s slope estimator statistical tests in the Cobres River basin. Nat. Hazards, 77(2): 1205-1221. https://doi.org/10.1007/s11069-015-1644-7
Dariane, A., M. Farhani and S. Azimi. 2018. Long term streamflow forecasting using a hybrid entropy model. Water Resour. Manage., 32(4): 1439-1451. https://doi.org/10.1007/s11269-017-1878-0
Deo, R.C., O. Kisi and V.P. Singh. 2017. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model. Atmos. Res., 184: 149-175. https://doi.org/10.1016/j.atmosres.2016.10.004
Erdal, H.I. and O. Karakurt. 2013. Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. J. Hydrol., 477: 119-128. https://doi.org/10.1016/j.jhydrol.2012.11.015
Etemad-Shahidi, A. and J. Mahjoobi. 2009. Comparison between M5′ model tree and neural networks for prediction of significant wave height in Lake Superior. Ocean Eng., 36(15-16): 1175-1181. https://doi.org/10.1016/j.oceaneng.2009.08.008
Francipane, A., D. Pumo, M. Sinagra, G. La Loggia and L.V. Noto. 2021. A paradigm of extreme rainfall pluvial floods in complex urban areas: the flood event of 15 July 2020 in Palermo (Italy). Nat. Hazards Earth Syst. Sci. Discuss., pp. 1-32. https://doi.org/10.5194/nhess-2021-61
Freund, Y., & Schapire, R. E. (1996, July). Experiments with a new boosting algorithm. In icml (Vol. 96, pp. 148-156).
Friedman, J., 1999. Greedy function approximation: A gradient boosting machine. Greedy Func Approx SS. pdf.
Hancock, T., R. Put, D. Coomans, Y. Vander Heyden and Y. Everingham. 2005. A performance comparison of modern statistical techniques for molecular descriptor selection and retention prediction in chromatographic QSRR studies. Chemometr. Intell. Lab. Syst., 76(2): 185-196. https://doi.org/10.1016/j.chemolab.2004.11.001
Hayat, H., T.A. Akbar, A.A. Tahir, Q.K. Hassan, A. Dewan and M. Irshad. 2019. Simulating current and future river-flows in the Karakoram and Himalayan regions of Pakistan using snowmelt-runoff model and RCP Scenarios. Water, 11(4): 761. https://doi.org/10.3390/w11040761
Huang, W., W. Duan, D. Nover, N. Sahu and Y. Chen. 2021. An integrated assessment of surface water dynamics in the Irtysh River Basin during 1990–2019 and exploratory factor analyses. J. Hydrol., 593: 125905. https://doi.org/10.1016/j.jhydrol.2020.125905
Jha, M.K., A. Chowdhury, V. Chowdary and S. Peiffer. 2007. Groundwater management and development by integrated remote sensing and geographic information systems: Prospects and constraints. Water Resour. Manage., 21(2): 427-467. https://doi.org/10.1007/s11269-006-9024-4
Jothiprakash, V., and R. Magar. 2012. Multi-time-step ahead daily and hourly intermittent reservoir inflow prediction by artificial intelligent techniques using lumped and distributed data. J. Hydrol., 450: 293-307. https://doi.org/10.1016/j.jhydrol.2012.04.045
Kasabov, N.K., 1996. Foundations of neural networks, fuzzy systems, and knowledge engineering. Marcel Alenca.
Kentel, E., 2009. Estimation of river flow by artificial neural networks and identification of input vectors susceptible to producing unreliable flow estimates. J. Hydrol., 375(3-4): 481-488. https://doi.org/10.1016/j.jhydrol.2009.06.051
Keshtegar, B., & Kisi, O. (2017). M5 model tree and Monte Carlo simulation for efficient structural reliability analysis. Applied Mathematical Modelling, 48, 899-910. https://doi.org/10.1016/j.apm.2017.02.047
Khan, K., M. Yaseen, Y. Latif and G. Nabi. 2015. Detection of river flow trends and variability analysis of Upper Indus Basin, Pakistan. Sci. Int., 27(2).
Kisi, O. and M. Cimen. 2011. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J. Hydrol., 399. https://doi.org/10.1016/j.jhydrol.2010.12.041
Lehner, B., L. Katiyo, F. Chivava, H.M. Sichingabula, E. Nyirenda and N.A. Rivers-Moore. 2021. Identifying priority areas for surface water protection in data scarce regions: An integrated spatial analysis for Zambia. Aquat. Conserv. Mar. Freshw. Ecosyst. https://doi.org/10.1002/aqc.3606
Levinson, N. 1946. The Wiener (root mean square) error criterion in filter design and prediction. J. Math. Phys., 25(1-4): 261-278. https://doi.org/10.1002/sapm1946251261
Lewis, R.J., 2000. An introduction to classification and regression tree (CART) analysis. Paper presented at the Annual meeting of the society for academic emergency medicine in San Francisco, California.
Mahmood, R., M.S. Babel and J. Shaofeng. 2015. Assessment of temporal and spatial changes of future climate in the Jhelum river basin, Pakistan and India. Weather Clim. Extremes, 10: 40-55. https://doi.org/10.1016/j.wace.2015.07.002
Maja, M.M. and S.F. Ayano. 2021. The impact of population growth on natural resources and farmers’ capacity to adapt to climate change in low-income countries. Earth Syst. Environ., pp. 1-13. https://doi.org/10.1007/s41748-021-00209-6
Makkeasorn, A., N.B. Chang and X. Zhou. 2008. Short-term streamflow forecasting with global climate change implications–A comparative study between genetic programming and neural network models. J. Hydrol., 352(3-4): 336-354. https://doi.org/10.1016/j.jhydrol.2008.01.023
Mallick, J., S. Talukdar, M. Alsubih, R. Salam, M. Ahmed and N.B. Kahla. 2021. Analysing the trend of rainfall in Asir region of Saudi Arabia using the family of Mann-Kendall tests, innovative trend analysis, and detrended fluctuation analysis. Theor. Appl. Climatol., 143(1): 823-841. https://doi.org/10.1007/s00704-020-03448-1
Mann, H. B. (1945). Nonparametric tests against trend. Econometrica: Journal of the econometric society, 245-259. https://doi.org/10.2307/1907187
McGarry, K., S. Wermter and J. MacIntyre. 1999. Hybrid neural systems: from simple coupling to fully integrated neural networks. Neural Comput. Surv., 2(1): 62-93.
McGarry, K.J., S. Wermter and J. MacIntyre. 1999. Knowledge extraction from radial basis function networks and multilayer perceptrons. Pap. Present. Int. Joint Conf. 4: 2494-2497. IEEE.
Mehr, A. D. (2018). An improved gene expression programming model for streamflow forecasting in intermittent streams. Journal of hydrology, 563, 669-678. https://doi.org/10.1016/j.jhydrol.2018.06.049
Menard. 2000. Coefficients of determination for multiple logistic regression analysis. Am. Stat., 54(1): 17-24. https://doi.org/10.1080/00031305.2000.10474502
Meng, E., S. Huang, Q. Huang, W. Fang, L. Wu and L. Wang. 2019. A robust method for non-stationary streamflow prediction based on improved EMD-SVM model. J. Hydrol., 568: 462-478. https://doi.org/10.1016/j.jhydrol.2018.11.015
Mitchell, T.M., 1997. Machine learning. 1997. Burr Ridge, IL: McGraw Hill, 45(37): 870-877.
Nash, J. E., & Sutcliffe, J. V. (1970). River flow forecasting through conceptual models part I—A discussion of principles. Journal of hydrology, 10(3), 282-290. https://doi.org/10.1016/0022-1694(70)90255-6
Oeurng, C., T.A. Cochrane, S. Chung, M.G. Kondolf, T. Piman and M.E. Arias. 2019. Assessing climate change impacts on river flows in the Tonle Sap Lake Basin, Cambodia. Water, 11(3): 618. https://doi.org/10.3390/w11030618
Quinlan, J. R. (1987). Simplifying decision trees. International journal of man-machine studies, 27(3), 221-234. https://doi.org/10.1016/S0020-7373(87)80053-6
Quinlan, J.R. 1986. Induction of decision trees. Mach. Learn., 1(1): 81-106. https://doi.org/10.1007/BF00116251
Robert, L.P., C. Pierce, L. Marquis, S. Kim and R. Alahmad. 2020. Designing fair AI for managing employees in organizations: A review, critique, and design agenda. Hum. Comput. Interact., 35(5-6): 545-575. https://doi.org/10.1080/07370024.2020.1735391
Saifullah, M., Z. Li, Q. Li, M. Zaman and S. Hashim. 2016. Quantitative estimation of the impact of precipitation and land surface change on hydrological processes through statistical modeling. Adv. Meteorol., 2016. https://doi.org/10.1155/2016/6130179
Salas, J.D., 1980. Applied modeling of hydrologic time series: Water resources publication. https://doi.org/10.1016/0309-1708(80)90028-7
Salmi, T., 2002. Detecting trends of annual values of atmospheric pollutants by the Mann-Kendall test and Sen’s slope estimates-the Excel template application MAKESENS: Ilmatieteen laitos.
Sayyad, R., K. Dakhore and S.J.E. Phad. 2019. Analysis of rainfall trend of Parbhani, Maharshtra using Mann–Kendall test. 13: 245-259.
Searcy, J.K., 1959. Flow-duration curves.
Şen, Z. (2012). Innovative trend analysis methodology. Journal of Hydrologic Engineering, 17(9), 1042-1046. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000556
Sharma, V., Mishra, V. D., & Joshi, P. K. (2013). Implications of climate change on streamflow of a snow-fed river system of the Northwest Himalaya. Journal of Mountain Science, 10(4), 574-587. https://doi.org/10.1007/s11629-013-2667-8
Sheikh, M.M., 2001. Drought management and prevention in Pakistan. Paper presented at the COMSATS 1st meeting on water resources in the south: Present scenario and future prospects, Islamabad.
Shoaib, M., Shamseldin, A. Y., Khan, S., Khan, M. M., Khan, Z. M., Sultan, T., & Melville, B. W. (2018). A comparative study of various hybrid wavelet feedforward neural network models for runoff forecasting. Water resources management, 32(1), 83-103. https://doi.org/10.1007/s11269-017-1796-1
Shoaib, M., A.Y. Shamseldin and B.W. Melville. 2014. Comparative study of different wavelet based neural network models for rainfall–runoff modeling. J. Hydrol., pp. 47-58. https://doi.org/10.1016/j.jhydrol.2014.04.055
Singh, P., J.K. Thakur and U. Singh. 2013. Morphometric analysis of Morar River Basin, Madhya Pradesh, India, using remote sensing and GIS techniques. Environ. Earth Sci., 68(7): 1967-1977. https://doi.org/10.1007/s12665-012-1884-8
Sofia, G. and P. Tarolli. 2017. Hydrological response to ~30 years of agricultural surface water management. Land, 6(1): 3. https://doi.org/10.3390/land6010003
Sudheer, K., A. Gosain, D.M. Rangan and S. Saheb. 2002. Modelling evaporation using an artificial neural network algorithm. Hydrol. Process., 16(16): 3189-3202. https://doi.org/10.1002/hyp.1096
Tahir, A.A., P. Chevallier, Y. Arnaud, M. Ashraf and M.T. Bhatti. 2015. Snow cover trend and hydrological characteristics of the Astore River basin (Western Himalayas) and its comparison to the Hunza basin (Karakoram region). Sci. Total Environ., 505: 748-761. https://doi.org/10.1016/j.scitotenv.2014.10.065
Tayyab, M., I. Ahmad, N. Sun, J. Zhou and X. Dong. 2018. Application of Integrated Artificial Neural Networks Based on Decomposition Methods to Predict Streamflow at Upper Indus Basin, Pakistan. Atmosphere, 9(12): 494. https://doi.org/10.3390/atmos9120494
Tayyab, M., J. Zhou, X. Dong, I. Ahmad and N. Sun. 2019. Rainfall-runoff modeling at Jinsha River basin by integrated neural network with discrete wavelet transform. Meteorol. Atmos. Phys., 131(1): 115-125. https://doi.org/10.1007/s00703-017-0546-5
Terzi, Ö., and G. Ergin. 2014. Forecasting of monthly river flow with autoregressive modeling and data-driven techniques. Neural Comput. Appl., 25(1): 179-188. https://doi.org/10.1007/s00521-013-1469-9
Tiwari, M.K. and C. Chatterjee. 2011. A new wavelet–bootstrap–ANN hybrid model for daily discharge forecasting. J. Hydroinf., 13(3): 500-519. https://doi.org/10.2166/hydro.2010.142
Valipour, M., 2015. Long-term runoff study using SARIMA and ARIMA models in the United States. Meteorol. Appl., 22(3): 592-598. https://doi.org/10.1002/met.1491
Valipour, M., M.E. Banihabib and S.M.R. Behbahani. 2013. Comparison of the ARMA, ARIMA, and the autoregressive artificial neural network models in forecasting the monthly inflow of Dez dam reservoir. J. Hydrol., 476: 433-441. https://doi.org/10.1016/j.jhydrol.2012.11.017
Valipour, M. and A.A. Montazar. 2012. Optimize of all effective infiltration parameters in furrow irrigation using visual basic and genetic algorithm programming. Aust. J. Basic Appl. Sci., 6(6): 132-137.
Vezza, P., C. Comoglio, M. Rosso and A. Viglione. 2010. Low flows regionalization in north-western Italy. Water Resour. Manage., 24(14): 4049-4074. https://doi.org/10.1007/s11269-010-9647-3
Wan, D., Y. Zhang and S. Li. 2007. Discovery association rules in time series of hydrology. Paper presented at the 2007 IEEE International Conference on Integration Technology. https://doi.org/10.1109/ICITECHNOLOGY.2007.4290400
Wang, Y., Y. Xu, H. Tabari, J. Wang, Q. Wang, S. Song and Z. Hu. 2020. Innovative trend analysis of annual and seasonal rainfall in the Yangtze River Delta, eastern China. Atmos. Res., 231: 104673. https://doi.org/10.1016/j.atmosres.2019.104673
Wu, C., K. Chau and Y. Li. 2009. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res., 45(8). https://doi.org/10.1029/2007WR006737
Xu, M., P. Watanachaturaporn, P.K. Varshney and M.K. Arora. 2005. Decision tree regression for soft classification of remote sensing data. Remote Sens. Environ., 97(3): 322-336. https://doi.org/10.1016/j.rse.2005.05.008
Yadav, S.K. and S. Pal. 2012. Data mining: A prediction for performance improvement of engineering students using classification. arXiv preprint arXiv:1203.3832.
Yaseen, M., T. Rientjes, G. Nabi and M. Latif. 2014. Assessment of recent temperature trends in Mangla watershed. 47(1).
Yaseen, Z.M., A. El-Shafie, H.A. Afan, M. Hameed, W.H.M.W. Mohtar and A. Hussain. 2016. RBFNN versus FFNN for daily river flow forecasting at Johor River, Malaysia. Neural Comput. Appl., 27(6): 1533-1542. https://doi.org/10.1007/s00521-015-1952-6
Yaseen, Z.M., A. El-Shafie, O. Jaafar, H.A. Afan and K.N. Sayl. 2015. Artificial intelligence based models for stream-flow forecasting: 2000–2015. J. Hydrol., 530: 829-844. https://doi.org/10.1016/j.jhydrol.2015.10.038
Zaman, M., S. Yuan, J. Liu, I. Ahmad, M. Sultan, M.U. Qamar and I. Ali. 2018. Investigating hydrological responses and adaptive operation of a hydropower station under a climate change scenario. Pol. J. Environ. Stud., 27(5): 2337-2348. https://doi.org/10.15244/pjoes/78678
Zhang, C. and S. Zhang. 2002. Association rule mining: Models and algorithms. Springer-Verlag. https://doi.org/10.1007/3-540-46027-6