Applications of three distinct regression models in GDP predication

. This paper introduces the basic theory and formula of linear regression, multiple linear regression, and nonlinear regression. Linear regression is one of the commonly used analysis methods in statistical analysis, which can predict the trend of model data change to a certain extent. Multiple linear regression involves more variables to predict and analyze the change trend of data, and can predict the change of data more accurately. Nonlinear regression can predict the model of arbitrary relationship between variables, thus obtaining more accurate prediction data. In the selection of regression analysis method, data characteristics and problem background should be considered, and model assumptions and validation should be paid attention to ensure accuracy and reliability. In the applications, the paper discusses the application of simple linear regression to Okun’s law and delves into the complex relationship between multiple variables and gross domestic product (GDP). Finally, it uses nonlinear regression equations to analyze the global inflation rate and the annual data, and proves that there is a nonlinear relationship between the two and a downward trend, which is supported by analyzing the data of Australia and Canada.


Introduction
Regression model is a mathematical model that quantitatively describes statistical relationships.It can be divided into multiple linear regression models, univariate scale models, and nonlinear regression models.They are often used for various data analysis.Regression model is a predictive modeling technique that studies the relationship between the dependent variable and the independent variable.This technique is commonly used for predictive analysis, time series modeling, and discovering causal relationships between variables.The important foundation or method of regression models is regression analysis, which is a calculation method and theory that studies the specific dependency relationship between one variable and another.It is an important tool for modeling and analyzing data.There are many benefits to use regression analysis.Specifically, it indicates a significant relationship between the independent and dependent variables.It indicates the strength of the influence of multiple independent variables on a dependent variable [1].Regression analysis also allows people to compare the interrelationships between variables that measure different scales, such as the relationship between price changes and the quantity of promotional activities.These are beneficial for market researchers, data analysts, and data scientists to exclude and estimate the best set of variables for constructing predictive models.
The main problem of the paper is the methods and theory of the linear regression model, multiple linear regression model and nonlinear regression model, and the application of these models in gross domestic product (GDP) prediction.Three distinct examples are included in this paper [2].Firstly, the researchers collected data on GDP growth and unemployment rates in the United States and China to test the applicability of Okun's law.But not all countries conform to this law, and Indonesia's data changes do not conform to Okun's law.It follows from this that although Okun's law is valid in most cases, there is more specific information related to this data that needs to be considered.Next, through the construction of multiple linear regression model, the effects of total import and export volume, total energy consumption, total population, and total retail sales of consumer goods on GDP are analyzed, and empirical support is provided.This paper not only provides a valuable reference for policy makers and economists, but also provides entrepreneurs and market analysts with tools for in-depth understanding of economic phenomena.Finally, through covariance analysis, it is proved that although there are some abnormal changes in some data, this data still has a real prediction effect.
The layout of the paper is following.Firstly, this paper will describe the methods and theory of these three models sequentially, which is the second part of this article.Secondly, this paper will use the three models predicting Chinese GDP with some factors.this paper will research relationship between the unemployment growth rate and GDP growth rate with linear regression model, the relationship between GDP and four factors with multiple linear regression model, and analyze the global inflation rate and the annual data with nonlinear regression model.Lastly, the paper will get the conclusion.

Linear regression model
In data analysis, regression analysis is an important way of analyzing result.It is an effective way to measure the degree of correlation between two continuous variables while making certain predictions about patterns and trends in the data.In this section, the focus is linear regression model.It is a kind of simple linear regression model that is a powerful tool in statistical analysis and is widely applied in virous field, including economics, marketing, sociology, and medicine.
Supposed that the data in a dataset is collected in pairs, ( 1 ,  1 ), ( 2 ,  2 ), ( 3 ,  3 ) … … (  ,   ).Here, let  1 denotes explanatory variable or independent variable and let  1 denotes response variable or dependent variable.When explanatory variable can be thought of as a potential predictor of the response variable, and the two variables satisfy [3]  ̂= β 0 + β 1 . (1) Thus, the two variables could be defined as having a simple linear regression relationship.
According to this equation, the least square regression line, or "the line of best fit", can be fitted.In the function of the regression line, the two parameters,  0 and  1 , represent the intercept and the slope of the linear regression function, respectively.The least square regression line can, to a large extend, accurately describe, and predict the trend of the data.

Multiple linear regression theory
In regression analysis, if there are two or more independent variables, it can be called multiple regression.In fact, a phenomenon is usually associated with many factors, so using the optimal combination of multiple independent variables to predict or estimate the dependent variable is more realistic than using only one independent variable for prediction or estimation.Therefore, multiple linear regression has greater practical significance than univariate linear regression.This article will use multiple linear regression to estimate and predict the GDP in Chinese with the data from 2014 to 2023 from National Bureau of Standards.
The model is defined by and it can also be defined as [4] This is K-ary linear regression model, and it can be abbreviated as (Y, Xβ, δ 2 I n ).In this equation, There are three main problems in the process of multiple linear regression.The first is making point estimation with β and δ 2 ,and making quantitative relationship between  and  1 ,  2 , … … ,   .The second is examining the parameters of the model and the results of the model.The third is predicting the numerical value of , which is making point (interval) estimation with .
The first step is parameter estimation, estimating the parameter of  0 ,  2 , … … ,   by least square method, which is . One can properly select the  1 , … … ,   to make the quantity of Q least.Then, solving the equation to obtain the solution as β ̂ = (X T X) −1 (X T Y) and substituting the soluted  ̂ ,  = 0,2, … … ,  to y ̂= β ̂0 + β ̂1x k + ⋯ … + β ̂kx k .It is named as empirical regression plane equation, and  ̂ are named as empirical regression coefficient.
The second step is the examining of multiple linear regression models and regression coefficients.The first method is -test.When condition  0 holds, then In this equation,  0 means  ̂ = 0 ,  means the number of variables,  means the total number of samples, U means regression square sum, and   means residual sum of squares.In addition, , then the condition  0 cannot be thought established, which means there is a strong linear relationship between  and  1 ,  2 , … … ,   .If not, then the condition  0 can be thought established, which means there is not a strong linear relationship between  and other variables  1 ,  2 , … … ,   .The second method is , 0 < R ≤ 1, and R is multiple correlation coefficients between  and  1 ,  2 , … … ,   . can also be expressed by  as [5]

Nonlinear regression equation theory
Nonlinear regression is a method of modeling the nonlinear relationship between a dependent variable and a set of independent variables.Unlike linear regression, nonlinear regression is not limited to estimating linear models, but can estimate models with arbitrary relationships between independent variables and dependent variables.This estimation is usually achieved with iterative estimation algorithms.Common nonlinear regression models include polynomial regression, exponential regression, logarithmic regression, etc.These models can better capture the complex relationships between data, thereby improving prediction accuracy.The formula is given by [6] where  is the dependent variable,  is the independent variable,  is the parameter to be estimated f is the nonlinear function, and ε is the random error term.The purpose of a nonlinear regression model is to estimate the parameter β and find the best fitting function f, which minimizes the error between the predicted value and the actual value.
The intricate process of nonlinear regression often entails multiple stages, each playing a crucial role in arriving at accurate and meaningful predictions.The initial stage is model definition, where the researcher must carefully select the dependent and independent variables based on the context of the problem.This selection is crucial as it determines the nature of the nonlinear relationship being explored.For instance, in economic modeling, the dependent variable might be GDP growth, while the independent variables could include interest rates, inflation, and other macroeconomic indicators.
Once the variables are chosen, the next step is parameter estimation.This involves the application of iterative estimation algorithms to determine the optimal values for the model's parameters.These algorithms, such as the least squares method or gradient descent, iteratively adjust the parameter values to minimize the difference between the predicted and actual outcomes.This process can be computationally intensive, especially for complex nonlinear models, but it's crucial to obtaining accurate predictions.
After parameter estimation, the model undergoes rigorous testing to assess its goodness of fit and prediction ability.Goodness of fit measures how well the model explains the observed data, while prediction ability evaluates the model's performance on new, unseen data.To assess these metrics, a range of statistical tests and diagnostic tools are employed, such as the R-squared value, adjusted Rsquared, and residual analysis.If the model demonstrates satisfactory goodness of fit and prediction ability, it can be deemed suitable for addressing the practical problems at hand.However, if the results are not satisfactory, the model may require refinement or re-specification, possibly through the inclusion of additional variables or the adjustment of the functional form.In conclusion, the process of nonlinear regression is a sophisticated and iterative one, requiring careful consideration of variable selection, parameter estimation, and model testing.By following this rigorous framework, researchers can develop models that provide accurate and actionable insights into complex nonlinear relationships.

Application of simple linear regression to Okun's law
The data is collected from website Macrotrends, a research platform that record statistics about economics, stock, commodities.The statistics are record in Excel spreadsheet.In this section the researcher will discuss the application of simple linear regression to Okun's law.Second-hand data were collected, including the GDP growth and unemployment rate of the United States and China, to investigate if Okun's Law is suitable for these countries [7].
As labor force is a critical factor of economic growth, intuitively, there is a certain kind of relationship between economic growth and unemployment rate.Arthur Melvin Okun proposed Okun's law, clarifying the relationship.The law indicates that for every 1% increase in the unemployment rate, a country's GDP (an effective way to measure economic growth) will roughly decrease 2%.This law seems to be an effective measurement of relationship between GDP and unemployment rate.However, for some countries with different and complicated situations, this law may not be a perfect indicator.
As for the United States, according to statistical data from 1992 to 2022, using data visualization tools, the linear regression scatter plot and the least square regression line could be shown in Figure 1.

Figure 1. The relationship between the unemployment growth rate and GDP growth rate in the United States (in percent)
As the graph and the curve show, the two parameters of the function,  1 and  0 , are 2.38 and -1.1, respectively, indicating that in the United States, for every 1% of increase in unemployment rate, there are roughly about 1.1% decrease in GDP.This least square regression function shows that, in the United States, the relationship between unemployment rate and GDP roughly conforms to Okun's law.Moreover, as for China, according to statistical data from 1992 to 2022, using data visualization tools, the linear regression scatter plot and the least square regression line could be shown in Figure 2. As the graph and the curve show, the two parameters of the function,  1 and  0 , are 9.26 and -2.39, respectively, indicating that in China, for every 1% of increase in unemployment rate, there are roughly about 2.39% decrease in GDP.This least square regression function shows that, in China, the relationship between unemployment rate and GDP roughly conforms to Okun's law.However, due to different labor force structures and different reemployment policies for the unemployed in different countries, Okun's law does not necessarily conform to the situation in all countries.Indonesia is a typical example.According to statistical data from 1992 to 2022, using data visualization tools, the linear regression scatter plot and the least square regression line could be shown in Figure 3.

Figure 3. The relationship between the unemployment growth rate and GDP growth rate in Indonesia (in percent)
As the graph and the curve show, the two parameters of the function,  1 and  0 , are -0.45 and -0.03, respectively, indicating that in Indonesia, for every 1% of increase in unemployment rate, there are roughly about 0.45% decrease in GDP.This least square regression function shows that, in Indonesia, the relationship between unemployment rate and GDP doesn't conform to Okun's law.There are several possible reasons why Okun's law does not apply to certain countries.One of the reasons could be that due to differences between the ways countries measure their unemployment rate, the statistic of unemployment rate could have different standard.Another reason is that different countries, especially those less developed ones, have different labor force structure [8].

Application of multiple linear regression
This article will explore the relationship between GDP and total export-Import volume, total energy consumption, total population, and total retail sales of consumer goods, see Table 1 [9].
Therefore, this section explored the relationship between GDP and total export-Import volume, total energy consumption, total population, and total retail sales of consumer goods with multiple linear regression model.From the equation y ̂= −5.2755 × 10 6 + 0.8297 1 + 1.9537 2 + 34.6123 3 + 0.4389 4 , all the four factors have a positive impact on GDP.Besides, the total population has the greatest impact on GDP.This may be helpful for simulating and predicting GDP.

Analysis and discussion of empirical results
Through the analysis of variance, the authors obtained the value of Df-Residuals.Due to the large amount of data, the sum sq of Residuals reached 325517.Therefore, the degree of freedom of data residuals 41 indicates that this set of data has a relatively ideal prediction effect and can verify the accuracy of the data.As can be seen from the data analysis chart and trend line shown in Figure 4, the global inflation rate presents a nonlinear regression downward trend during 1980-2024.Although some data show abnormal increases and decreases, the global inflation rate has dropped from 7.5% to nearly 0% overall.According to  value and (> ) in covariance analysis, the year has a significant impact on the inflation rate, and there is a nonlinear relationship between the two [10].By conducting a thorough analysis of inflation rate and year in Canada and Australia, this paper confirms the existence of a non-linear relationship between the two variables, see Figure 5.The data reveals that, over time, the inflation rate exhibits a gradual downward trend.To ensure the accuracy of this observation, the authors employed the Mean Square Error (MSE) as a metric to assess the model's fitting quality and predictive capabilities.
However, this data only focuses on the impact of the year on the inflation rate, without considering other potential influencing factors.Therefore, there may be some uncertainty in the experimental results.To further improve the accuracy of the prediction, future research can consider incorporating more factors that may affect the inflation rate, such as economic growth rate, monetary policy, and international trade conditions.By considering these factors comprehensively, it can help the experiment to establish a more comprehensive and accurate prediction model, providing strong support for countries to formulate more scientific and reasonable economic policies and plans.In addition, the change of inflation rate is a complex and dynamic process, which is affected and restricted by many factors.Therefore, in the process of prediction and analysis, the experiment needs to constantly adjust and improve the prediction model to adapt to the constantly changing economic environment.Through deep research and analysis of the relationship between inflation rate and years, this experiment can provide an important basis for predicting the trend of future inflation rate changes and contribute to the stable growth of the global economy.
Finally, this experiment aims to provide a basis for predicting the future trend of inflation rate and provide a reference for countries to formulate future and price forecasts, in order to promote the stable growth of the global economy.In today's increasingly close global economy, the trend of inflation rate has a profound impact on the economic development, price stability, and consumer purchasing power of countries.Therefore, accurately predicting the trend of inflation rate is crucial for governments and enterprises in various countries.

Conclusion
Regression analysis is an important analytical method in statistics, which aims to explore the relationship between variables and predict future trends.Among them, linear regression is the most used form, and its theoretical basis mainly relies on the least square method, which describes the linear relationship between two variables by fitting a straight line.Linear regression can predict the trend of model data change to a certain extent, and provide strong data support for decision makers.However, many phenomena in the real world often involve multiple factors, which requires the use of multiple linear regression.By introducing more independent variables to predict the change of dependent variables, multiple linear regression can more accurately describe the relationship between data.For example, when analyzing GDP, one can consider several factors such as total import and export volume, total energy consumption, total population, and total retail sales of consumer goods, and construct a multiple linear regression model to understand the impact of these variables more fully on GDP.In addition to linear regression, nonlinear regression is also an important part of regression analysis.Nonlinear regression can predict models of arbitrary relationships between variables and therefore can provide more accurate predictive data.For example, when analyzing the relationship between the global inflation rate and the year, the nonlinear regression equation can help the model find that there is a nonlinear relationship between the global inflation rate and the year with a downward trend, so as to help the decision maker make more accurate calculations.When performing regression analysis, choosing the right method is crucial.Different data types and problem backgrounds of models require different regression equations to ensure the accuracy and reliability of data analysis.In addition, when the data model is validated, it is necessary to check for anomalies and possible effects on the predicted results.
Although representative data and rigorous calculation are selected as far as possible in this paper, the experimental results may be biased from the actual results because other factors that may affect the prediction results are not considered.For example, in the experimental analysis of linear regression equation, in addition to the impact of the unemployment rate on the economic growth rate, the impact of urban development rate, import and export trade tax rate, technology iteration and other factors were not considered.So, the experiment needs to collect more data to make more accurate predictions.In the future, with the development of data science, the application of regression analysis will be more extensive, and more new methods and advanced technologies will be applied to regression analysis.The author of this paper hopes that more scholars can make more accurate and realistic interpretation of regression models and forecast data.This will become the basis for more people to make decisions.

Figure 2 .
Figure 2. The relationship between the unemployment growth rate and GDP growth rate in China (in percent)

Figure 4 .
Figure 4.The distribution of global inflation rates for the period 1980-2024.

Figure 5 .
Figure 5. Left and Right: The inflation rates of specific regions of Australia and Canada over the period 1980-2024 and their trend lines.