Distribution-Based Fuzzy Time Series Markov Chain Models for Forecasting Inflation in Bandung

This study discusses the application of the Fuzzy Time Series Markov Chain method, extended by determining the interval length with the distribution-based method. In fuzzy forecasting, the choice of interval length is important because it affects the accuracy of the forecasting results. The model is developed with the aim of improving forecasting accuracy. The study uses general inflation data for the city of Bandung for the period January 2016 to June 2021. The data are divided into two groups, in-sample data and out-sample data, with a ratio of 90:10. The data are processed using the Python programming language. Based on the accuracy test using the MAPE method, it can be concluded that this method provides better forecasting results than the FTS Markov Chain method, with a MAPE value of 1.16%.


Introduction
According to Makridakis et al. [1], time series forecasting is a forecasting process based on observations of historical data. Time series forecasting methods fall into two groups: techniques based on statistical and mathematical models, and techniques based on artificial intelligence [2]. Conventional methods such as ARIMA require many conditions to be met before they can be applied properly, and not all data meet these conditions. Fuzzy forecasting techniques offer a solution to this problem [3].
Fuzzy Time Series (FTS) is a forecasting concept first proposed by Song and Chissom in 1993 [4]. The development of fuzzy forecasting models generally focuses on achieving high forecasting accuracy by improving the three main stages, namely fuzzification, fuzzy inference, and defuzzification [5]. Several researchers have developed the fuzzy method further, including by adding fuzzy logic relation tables [6], by adding weights to the fuzzy relation process and to repeated relations [7] [8], and by incorporating Markov chains [9]. Determining the interval length is crucial in the fuzzy method because it affects the accuracy of the forecasting results. Methods for determining the interval length include intervals based on the mean and on the distribution, ratio-based intervals, granular computing approaches, and entropy methods [1] [10] [11] [12] [13]. Several studies that compared the Markov Chain method with the Chen and Weighted Markov Chain methods found that the Markov Chain method gave the best results [14] [15].
Fuzzy time series forecasting is widely used for solving forecasting problems. However, the fuzzy method, and the Markov Chain method in particular, has the shortcoming that the interval length is determined arbitrarily. Some studies address this with the Sturges formula [15] [16], the average-based method [17], and the distribution-based method. Based on this explanation, this study focuses on the application of the Distribution-Based Fuzzy Time Series Markov Chain (DFTSMC) method in forecasting inflation.
In a fuzzy logical relationship $A_i \rightarrow A_j$, $A_i$ is called the current state and $A_j$ is the next state.

Distribution Based
The distribution-based method is an algorithm in FTS used to determine the length of the intervals. The distribution-based length can be determined by the following algorithm [10]:
1) Calculate all the absolute differences between consecutive observations $Y_{t+1}$ and $Y_t$ ($t = 1, \dots, n-1$) as the first differences, and then the average of the first differences.
2) According to the average, determine the base for the length of intervals by following Table 1.
3) Plot the cumulative distribution of the first differences. The base determined in step 2 is used as the interval on the plot.
4) Choose as the interval length the largest length that is smaller than at least half of the first differences.
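The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Table 1 is not reproduced in this text, so the hypothetical helper `interval_base` assumes the commonly cited mapping for this method (base 0.1 for averages up to 1.0, base 1 for averages up to 10, base 10 up to 100, and so on). The candidate lengths are assumed to be multiples of the base, "smaller than" is read as a strict inequality, and the base itself is returned as a fallback when no larger candidate qualifies. The input values are illustrative only and are not the Bandung inflation data.

```python
import math
from typing import Sequence


def interval_base(avg_diff: float) -> float:
    """Assumed Table 1 mapping: (0.1, 1] -> 0.1, (1, 10] -> 1, (10, 100] -> 10, ..."""
    if avg_diff <= 0:
        raise ValueError("average first difference must be positive")
    return 10.0 ** (math.ceil(math.log10(avg_diff)) - 1)


def distribution_based_length(series: Sequence[float]) -> float:
    """Return the distribution-based interval length for a series."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Step 1: absolute first differences and their average.
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    avg = sum(diffs) / len(diffs)
    # Step 2: base for the interval length.
    base = interval_base(avg)
    # Steps 3-4: walk through multiples of the base and keep the largest
    # one that is still smaller than at least half of the first differences.
    half = len(diffs) / 2
    k = 1  # fallback: the base itself
    while sum(d > (k + 1) * base for d in diffs) >= half:
        k += 1
    return k * base


# Illustrative values only (hypothetical, not the inflation series).
example = [10, 14, 11, 19, 13, 15]
print(distribution_based_length(example))  # 3.0
```

In this example the first differences are 4, 3, 8, 6, and 2, with an average of 4.6, so the assumed base is 1. Three of the five differences exceed 3 but only two exceed 4, so the selected length is 3.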

Forecast accuracy
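Forecasting accuracy is measured using the mean absolute percentage error (MAPE), given by the following formula:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{Y_t - F_t}{Y_t} \right| \times 100\% \tag{4}$$
where $F_t$ is the forecast value at time $t$, $Y_t$ is the actual data at time $t$, and $n$ is the number of data.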
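As a minimal sketch, the MAPE calculation can be written in Python as below. The function name `mape` and the input values are hypothetical and serve only to illustrate the arithmetic. The check for zero actual values is an added assumption: the percentage error is undefined when an actual value is zero.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |actual - forecast| / |actual| over all time points.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only: errors of 10% and 5% average to about 7.5%.
print(round(mape([100.0, 200.0], [110.0, 190.0]), 2))  # 7.5
```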

Research Design
This study uses secondary data on general inflation in Bandung from January 2016 to June 2021. The data are divided into two sets: in-sample data, used to build the model, and out-sample data, used to calculate forecasting accuracy. The data are processed using the Python programming language. The flow of the study is shown in Figure 1.
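The split can be illustrated with a short Python sketch. It assumes monthly observations, so that January 2016 to June 2021 gives 66 data points, of which the first 60 (January 2016 to December 2020) form the in-sample set and the last 6 the out-sample set. The helper `split_in_out` and the placeholder series are hypothetical; the placeholder values are simply the time indices 1 to 66, not inflation figures.

```python
from typing import List, Sequence, Tuple


def split_in_out(series: Sequence[float], n_in: int) -> Tuple[List[float], List[float]]:
    """Split a series chronologically into in-sample and out-sample parts."""
    if not 0 < n_in < len(series):
        raise ValueError("n_in must lie strictly between 0 and len(series)")
    return list(series[:n_in]), list(series[n_in:])


# Placeholder series: time indices standing in for 66 monthly observations.
placeholder = [float(t) for t in range(1, 67)]
in_sample, out_sample = split_in_out(placeholder, 60)
print(len(in_sample), len(out_sample))  # 60 6
```

With 60 of 66 observations in the in-sample set, the in-sample share is about 91%, which corresponds to the approximate 90:10 ratio.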

Results and Discussion
In this study, general inflation data for Bandung from January 2016 to December 2020 were used as the in-sample data. The data are presented in Table 2. Based on the data in Table 2, the universe of discourse is $U = [D_{\min} - D_1;\ D_{\max} + D_2] = [-0.5;\ 1.1]$, where $D_{\min}$ and $D_{\max}$ are the smallest and largest in-sample values and $D_1$ and $D_2$ are positive margins. Next, the interval length is determined with the distribution-based algorithm. The initial forecast, the forecast adjustment value, and the final forecast value are then calculated according to the rules of the DFTSMC method. The results of the in-sample forecasting are presented in Table 3.
After the in-sample data have been forecast up to t = 60, the out-sample forecasts are produced iteratively: the last forecast result is substituted as the data value for t = 61, and the process is repeated as many times as there are out-sample data. The forecasting results are presented in Table 4.
The test result is the MAPE value, calculated with equation 4, of the DFTSMC method compared with that of the FTS Markov Chain method. Over all data, the MAPE value of the DFTSMC method is 1.16%, while that of the FTS Markov Chain method is 1.24%. On the out-sample data, the MAPE value of the DFTSMC method is 2.2%, while that of the FTS Markov Chain method is 1.84%. The DFTSMC method is therefore more accurate over all data but less accurate on the out-sample data. The comparison of the two methods is presented in Figure 4. As Figure 4 shows, forecasting with the DFTSMC method results in a data pattern that follows the actual data. However, the DFTSMC method produces forecast values that tend to be constant, while the original data have a fluctuating pattern.
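The looping step for the out-sample period can be sketched in Python as follows. This is an illustration of the recursion only, under stated assumptions: `one_step` stands for any fitted one-step-ahead forecaster, and the stand-in `mean_of_last_two` is a hypothetical placeholder, not the DFTSMC forecasting rule. The input values are illustrative.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    one_step: Callable[[Sequence[float]], float],
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead by feeding each forecast back as data."""
    series = list(history)
    forecasts: List[float] = []
    for _ in range(horizon):
        next_value = one_step(series)  # forecast for the next time point
        forecasts.append(next_value)
        series.append(next_value)      # substitute the forecast as data
    return forecasts


def mean_of_last_two(series: Sequence[float]) -> float:
    """Hypothetical stand-in forecaster: mean of the last two values."""
    return (series[-1] + series[-2]) / 2


# Illustrative values only.
forecasts = recursive_forecast([1.0, 2.0], mean_of_last_two, horizon=2)
print(forecasts)  # [1.5, 1.75]
```

Because each forecast after the first is computed from earlier forecasts rather than from new observations, no fresh information enters the series during the out-sample period.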

Conclusions
Based on the results and discussion, the application of the DFTSMC method to forecasting inflation in the city of Bandung produces a data pattern that is close to the actual data pattern. Over all data, the DFTSMC method is more accurate than the FTS Markov Chain method, with a MAPE value of 1.16% compared with 1.24%. The inflation forecasts for Bandung obtained with the DFTSMC method follow the same pattern as the preceding data.