Will the US-China Trade War Change Our Lives: A General Analysis of the Effects on the US Economy

Motivation

When I first arrived here as an international student about two years ago, I was shocked to see that the shelves for many steel products in Ikea were empty. After some research, I found out that the US government had imposed tariffs on steel products coming from China as part of the Trade War. Since then, I have wanted to know how the Trade War would affect our lives and the US economy. It is a very broad topic. In addition, most of the relevant data have not been published by the government, and people have recently been more concerned about the coronavirus than anything else, so I needed to break the topic down and focus only on what is important. After several rounds of research, I settled on GDP (also known as total output in economics) as the main indicator of an economy's performance. Some of the most relevant elements affected by the Trade War are imports from China, exports to China, and the unemployment rate. The unemployment rate matters because, with restrictions and tariffs on goods imported from China, US businesses are likely to create more job vacancies locally, driving down the unemployment rate as a result. It is hard to reach a conclusion on how the Trade War has affected the US economy given the limited data available online and my limited time and knowledge, but it is worthwhile to see how these three elements are correlated with GDP.

Data Collection and Cleaning

Most of the data for this research were obtained from the FRED Economic Research website. The data cleaning process was more complicated than expected because all of the series are time series. Some were reported monthly while others were reported quarterly, and the tables used different notations and units. To make further analysis easier, I combined the data and created several versions for the different models.

This is an example of what a cleaned data set looks like. The logarithm was taken of both imports and exports because it shows the trend more clearly, with fewer fluctuations. Because the data on US trade with China used here begin in 1985, all of the following analysis and modeling involving imports and exports are based on data since then.

Otherwise, analysis and modeling are based on data since 1949.
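The alignment and log steps described above can be sketched in Python. This is only a minimal illustration, not the code used for the project: the function names, the rule of dropping incomplete quarters, and all of the monthly figures are assumptions made for the example. Flows such as imports are summed within a quarter, while a rate such as unemployment is averaged.

```python
import math
import statistics
from collections import defaultdict


def monthly_to_quarterly(monthly, aggregate=sum):
    """Collapse a {(year, month): value} series into {(year, quarter): value}.

    Quarters with fewer than three monthly observations are dropped
    (an assumption made for this sketch).
    """
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1
        buckets[(year, quarter)].append(value)
    return {
        key: aggregate(values)
        for key, values in sorted(buckets.items())
        if len(values) == 3
    }


def log_series(series):
    """Take the natural logarithm of every value (values must be positive)."""
    return {key: math.log(value) for key, value in series.items()}


# Made-up monthly figures, for illustration only.
imports_monthly = {
    (1985, 1): 290.0, (1985, 2): 270.0, (1985, 3): 300.0,
    (1985, 4): 310.0, (1985, 5): 320.0, (1985, 6): 330.0,
    (1985, 7): 350.0,  # incomplete quarter, dropped below
}
unemployment_monthly = {(1985, 1): 7.3, (1985, 2): 7.2, (1985, 3): 7.2}

# Flows are summed within a quarter; rates are averaged.
imports_quarterly = monthly_to_quarterly(imports_monthly)
unemployment_quarterly = monthly_to_quarterly(unemployment_monthly, statistics.mean)
log_imports = log_series(imports_quarterly)

print(imports_quarterly)
print(unemployment_quarterly)
print(log_imports)
```

Whether a quarterly figure should be a sum, a mean, or an end-of-quarter value depends on the series, so the aggregate argument is left open.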

Modelling Process and Visualizations

Background Information about the Three Elements

GDP (also known as total output) refers to gross domestic product. It is the monetary value of all finished goods and services made within a country during a specific period.
Potential GDP (also known as potential output) is the highest level of real GDP that can be sustained over the long term.
Imports and Exports here refer to the values of total imported goods and services coming to the US from China and to China from the US.
Unemployment rate is the share of the labor force that is jobless, expressed as a percentage.
Natural rate of unemployment is the proportion of the workforce that is unemployed when an economy is in a steady state of full employment.
GDP gap is the difference between GDP and potential GDP.
Unemployment gap is the difference between the unemployment rate and the natural rate of unemployment.
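The two gap definitions above amount to an element-wise subtraction. The following Python sketch shows the arithmetic on made-up numbers; the helper name gap and every value in it are assumptions for illustration, not figures from the data set.

```python
def gap(actual, benchmark):
    """Element-wise difference actual - benchmark for two equally long series."""
    if len(actual) != len(benchmark):
        raise ValueError("series must have the same length")
    return [a - b for a, b in zip(actual, benchmark)]


# Hypothetical values (GDP in billions of dollars, rates in percent).
real_gdp = [18000.0, 18150.0, 18300.0]
potential_gdp = [18100.0, 18200.0, 18250.0]
unemployment_rate = [5.0, 4.8, 4.3]
natural_rate = [4.6, 4.6, 4.5]

gdp_gap = gap(real_gdp, potential_gdp)
unemployment_gap = gap(unemployment_rate, natural_rate)

print(gdp_gap)
print(unemployment_gap)
```

In this toy series the GDP gap is negative in the periods where the unemployment gap is positive, and the signs flip together in the last period.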

Why Focus on Them

In the graph above, the blue dots represent the trend of imports from China to the US since 1985 and the orange dots represent the trend of exports from the US to China since 1985. It is clear that both showed an increasing trend until around 2018, when they both decreased a little due to the Trade War.

Here is another graph showing the trend of net imports from China since 1985. This was calculated by subtracting exports to China from imports from China. The increasing trend shows that the US has become very reliant on Chinese goods and services.

Since GDP = C (consumption) + I (investment) + G (government spending) + NX (exports - imports), imports and exports enter total output directly, so a strong relationship between them and total output is expected.

The graph on the left shows the increasing real GDP in blue and potential GDP in orange. The graph on the right shows the unemployment rate in blue and the natural rate of unemployment in orange. I would also like to know the relationship between them.

The unemployment gap is shown in blue and the output gap is shown in orange. There is a clear negative relationship between the two. In economics, a model named Okun's law explains this relationship.

Okun’s Law

In economics, Okun's law is an empirically observed relationship between unemployment and losses in a country's production. In the notation used here, U is the unemployment rate, Un is the natural rate of unemployment, Y is real GDP, Yp is potential output, and c is a constant that varies across countries. The "gap version" states that for every 1 percentage point increase in the unemployment rate, a country's GDP will be roughly an additional 2% lower than its potential GDP.

The simplest way to estimate the constant c is to divide the difference between U and Un by the difference between Y and Yp for each period. The resulting ratios should be negative. The ratio for 2018 is a clearly negative number.

The mean of these ratios is indeed also negative.

But after plotting both sets of data, some outliers show up, indicating that the previous estimate of c could be heavily affected by them. A better model is needed to investigate the relationship between output gap and unemployment gap.
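To make the outlier problem concrete, here is a small Python sketch of the ratio approach. The data are invented so that four periods give a ratio of -0.5 while one period with an output gap close to zero gives a very large ratio. The function name and the comparison with the median are additions made for this illustration and were not part of the original analysis.

```python
import statistics


def okun_ratios(unemployment_gap, output_gap):
    """Period-by-period ratio (U - Un) / (Y - Yp).

    Periods with an output gap of exactly zero are skipped to avoid
    dividing by zero.
    """
    return [u / y for u, y in zip(unemployment_gap, output_gap) if y != 0]


# Invented gaps: the last period has an output gap very close to zero.
unemployment_gap = [1.0, 2.0, -0.5, -1.0, 0.5]
output_gap = [-2.0, -4.0, 1.0, 2.0, -0.01]

ratios = okun_ratios(unemployment_gap, output_gap)
mean_ratio = statistics.mean(ratios)
median_ratio = statistics.median(ratios)

print(ratios)
print("mean:", mean_ratio)
print("median:", median_ratio)
```

A single period with a tiny denominator pulls the mean far away from the typical ratio, while the median barely moves, which is one reason a regression on all points is a more stable way to estimate the relationship.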

Linear Regression Model 1 (output gap vs. unemployment gap)

The fitted line in this plot shows a roughly linear relationship between output gap and unemployment gap. More tests are needed to see whether a linear regression model is applicable.

Two density plots are drawn, one for each set of data. Both are approximately normally distributed, which is consistent with, although it does not by itself establish, a linear relationship between them.

A very strong correlation of -0.869 reinforces the linear model assumption.
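For reference, a correlation coefficient of this kind can be computed by hand as follows. This Python sketch uses a small invented sample rather than the actual gap series, so its result will not match the -0.869 reported here; the function name is also an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented sample: output gap falls as unemployment gap rises.
unemployment_gap = [-1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
output_gap = [1.2, 0.3, -0.1, -1.0, -1.4, -2.9]

r = pearson(unemployment_gap, output_gap)
print(round(r, 4))
```

A value close to -1 means the points lie near a downward-sloping straight line; it says nothing about how steep that line is, which is what the regression coefficient measures.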

Using the lm function in R, a linear function of

Output Gap = -1.2937 * Unemployment Gap - 0.2039

is modelled.
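For a single predictor, the least-squares line that R's lm returns can also be written in closed form. The Python sketch below fits a line to a few invented points that lie exactly on a known line, then applies the coefficients reported above to a hypothetical unemployment gap. The function names and the sample points are assumptions for illustration.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_output_gap(unemployment_gap, slope=-1.2937, intercept=-0.2039):
    """Apply a fitted line; the defaults are the coefficients reported in the text."""
    return slope * unemployment_gap + intercept


# Invented points lying exactly on y = -1.5 * x - 0.25.
x = [-1.0, 0.0, 1.0, 2.0]
y = [1.25, -0.25, -1.75, -3.25]
slope, intercept = fit_line(x, y)
print(slope, intercept)

# Predicted output gap for a hypothetical unemployment gap of 1 point.
print(predict_output_gap(1.0))
```

With the reported coefficients, an unemployment gap of one point corresponds to a predicted output gap of about -1.5 in the units used for the output gap.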

This model appears to fit well. The p-value for the intercept is statistically significant in the t-test at the 0.001 level, and the standard errors for both the intercept and the coefficient are very small. The p-value for the F-test is statistically significant at the 0.05 level. The R-squared value of 0.7563 is quite high, indicating that the regression fits the data well.
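The quantities in this summary come from a few standard formulas. The Python sketch below computes R-squared, the standard errors, and the t-statistics for a simple regression on an invented sample. It stops short of p-values, which require the t distribution, and none of the numbers it prints are the ones from the actual model.

```python
import math


def regression_summary(x, y):
    """Slope, intercept, standard errors, t-statistics and R-squared
    for the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    sse = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)

    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    residual_variance = sse / (n - 2)
    slope_se = math.sqrt(residual_variance / sxx)
    intercept_se = math.sqrt(residual_variance * (1.0 / n + mean_x ** 2 / sxx))

    return {
        "slope": slope,
        "intercept": intercept,
        "slope_se": slope_se,
        "intercept_se": intercept_se,
        "slope_t": slope / slope_se,
        "intercept_t": intercept / intercept_se,
        "r_squared": 1.0 - sse / sst,
    }


# Invented sample, for illustration only.
unemployment_gap = [-1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
output_gap = [1.2, 0.3, -0.1, -1.0, -1.4, -2.9]

summary = regression_summary(unemployment_gap, output_gap)
for name, value in summary.items():
    print(name, round(value, 4))
```

For a regression with one predictor, R-squared equals the square of the correlation coefficient, which is why a correlation of -0.869 and an R-squared of 0.7563 go together.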

To further examine the prediction accuracy of the model, I separated the data into training and test samples, and generated the following diagnostic measures.

All the measures have values very similar to those of the original regression model and the result shows that it has a correlation accuracy of about 88.2%, which is very high. This also indicates that the actuals and predicted values have similar directional movement. Hence, the regression model is quite accurate.
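The train and test procedure can be sketched as follows. Everything here is an assumption for illustration: the data are synthetic (generated from a known line plus noise), the 80/20 split and the seeds are arbitrary choices, and the function names are hypothetical. "Correlation accuracy" is taken to mean the correlation between actual and predicted values on the test sample.

```python
import math
import random


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle the row indices and cut them into training and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]


# Synthetic data: a known line plus Gaussian noise (not the real series).
rng = random.Random(42)
unemployment_gap = [rng.uniform(-2.0, 4.0) for _ in range(100)]
output_gap = [-1.3 * u - 0.2 + rng.gauss(0.0, 0.5) for u in unemployment_gap]

train, test = split_indices(len(unemployment_gap))
slope, intercept = fit_line(
    [unemployment_gap[i] for i in train],
    [output_gap[i] for i in train],
)

# Evaluate only on rows the model has not seen.
actual = [output_gap[i] for i in test]
predicted = [slope * unemployment_gap[i] + intercept for i in test]
correlation_accuracy = pearson(actual, predicted)

print("slope:", slope, "intercept:", intercept)
print("correlation accuracy:", correlation_accuracy)
```

Because the model is fitted on the training rows only, a high correlation on the test rows is evidence that the fit is not just memorizing the sample.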

K-fold cross-validation also provides some insight into the accuracy of the prediction model. The data are split into k mutually exclusive random portions. Keeping each portion in turn as test data, I built the model on the remaining k-1 portions and calculated the mean squared error of the predictions on the held-out portion. Finally, the average of these k mean squared errors was computed. This metric can be used to compare different linear models.

All the lines on the graph are very close to one another and parallel. Also, the symbols of the same color are not overly dispersed. This validation further supports the validity of the prediction model.
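The procedure just described can be written out in a few lines. This Python sketch is a minimal illustration under stated assumptions: the data are synthetic, k = 5 and the seeds are arbitrary, and the function names are hypothetical. It is not the routine that produced the plot discussed in this section.

```python
import random


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(x, y, k=5, seed=0):
    """Return (average MSE, list of per-fold MSEs) from k-fold cross-validation."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so folds are disjoint.
    folds = [indices[i::k] for i in range(k)]

    fold_mse = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in fold]
        fold_mse.append(sum(squared_errors) / len(squared_errors))
    return sum(fold_mse) / k, fold_mse


# Synthetic data: a known line plus Gaussian noise (not the real series).
rng = random.Random(7)
unemployment_gap = [rng.uniform(-2.0, 4.0) for _ in range(60)]
output_gap = [-1.3 * u - 0.2 + rng.gauss(0.0, 0.5) for u in unemployment_gap]

average_mse, per_fold = k_fold_mse(unemployment_gap, output_gap, k=5)
print("per-fold MSE:", per_fold)
print("average MSE:", average_mse)
```

Per-fold errors that are close to one another suggest the fit does not depend heavily on which rows were held out.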

Linear Regression Model 2 (total output vs. unemployment rate and net import from China)

total output = -232 * unemployment rate + 0.2659 * import from China - 0.2656 * export to China + 11186.4007
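To see how the fitted equation behaves, it can be evaluated directly. In the Python sketch below the coefficients are the ones reported above, while the input values are hypothetical and the units are whatever the underlying data use, which the text does not state.

```python
def predict_total_output(unemployment_rate, imports_from_china, exports_to_china):
    """Evaluate the model 2 equation with the coefficients reported in the text."""
    return (
        -232.0 * unemployment_rate
        + 0.2659 * imports_from_china
        - 0.2656 * exports_to_china
        + 11186.4007
    )


# Hypothetical inputs, for illustration only.
baseline = predict_total_output(4.0, 500.0, 100.0)
higher_unemployment = predict_total_output(5.0, 500.0, 100.0)

print("baseline prediction:", baseline)
print("effect of one more point of unemployment:", higher_unemployment - baseline)

# The import and export coefficients are almost equal and opposite, so the
# trade terms are close to 0.266 times net imports (imports minus exports).
trade_terms = 0.2659 * 500.0 - 0.2656 * 100.0
print("trade terms:", trade_terms, "vs 0.266 * net imports:", 0.266 * (500.0 - 100.0))
```

The near cancellation of the two trade coefficients means that, in this fitted equation, it is essentially net imports rather than imports and exports separately that move the prediction.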

The p-values for the coefficients and the intercept show very little statistical significance in the t-test, and their standard errors vary a lot. However, the p-value for the F-test is statistically significant at the 0.05 level, and the R-squared value of 0.9179 is quite high.

To further examine the prediction accuracy of the model, I separated the data into training and test samples, and generated the following diagnostic measures.

Similar to the result in model 1, all the measures have values very similar to those of the original regression model and the result shows that it has a correlation accuracy of about 96.3%, which is even higher than that of the previous model. But the accuracy of this model still remains unclear.

The k-fold cross validation result shows that all the symbols with different colors vary a lot around the lines, indicating that although this model has some significance in statistics, it may not be a very accurate one.

Comparison

The R-squared value increased from model 1 to model 2, but this may be because model 2 has more predictors. The adjusted R-squared value, which penalizes additional predictors, may therefore be a better basis for comparison.
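The adjustment can be computed from R-squared, the number of observations, and the number of predictors. In the Python sketch below the R-squared values are the ones reported for the two models, but the sample size of 140 is a placeholder, since the number of observations is not stated here, so the printed results are illustrative only.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_observations - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / degrees_of_freedom


# Placeholder sample size; the true n for each model is not given in the text.
n = 140

model_1 = adjusted_r_squared(0.7563, n, n_predictors=1)
model_2 = adjusted_r_squared(0.9179, n, n_predictors=3)

print("model 1 adjusted R-squared:", round(model_1, 4))
print("model 2 adjusted R-squared:", round(model_2, 4))
```

With a sample of this size the penalty for two extra predictors is small, so the adjustment alone would not be expected to reverse the ranking of the two R-squared values.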

Model 2 predicts the relationship between total output and the unemployment rate, imports from China, and exports to China, while model 1 only predicts the relationship between output gap and unemployment gap. Both the t-tests and the k-fold cross-validation show that model 1 is the better prediction model, even though in practice model 2 may be more useful and easier to compute. Model 2 seems to be less accurate because there is some degree of correlation among its predictors (unemployment rate, exports to China, and imports from China). This multicollinearity makes the t-test results unreliable.
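One quick way to look for this kind of problem is to check the pairwise correlations between predictors. The Python sketch below does this on invented series in which imports and exports trend upward together. The 0.8 threshold is a rule-of-thumb assumption, the function names are hypothetical, and a fuller check such as variance inflation factors is not shown.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(predictors, threshold=0.8):
    """Return {(name_a, name_b): r} for predictor pairs with |r| >= threshold."""
    flagged = {}
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged[(name_a, name_b)] = r
    return flagged


# Invented series: imports and exports both trend upward over time.
predictors = {
    "unemployment_rate": [5.0, 6.1, 4.9, 5.8, 5.1, 5.6],
    "imports_from_china": [100.0, 150.0, 210.0, 260.0, 330.0, 400.0],
    "exports_to_china": [20.0, 32.0, 41.0, 55.0, 63.0, 82.0],
}

flagged = correlated_pairs(predictors)
for pair, r in flagged.items():
    print(pair, round(r, 3))
```

When two predictors move together this closely, the regression cannot reliably separate their individual effects, which shows up as large standard errors and insignificant t-tests even when the overall fit is good.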

Conclusion

Although model 2 has more relevant predictors and measures a more direct relationship between total GDP and the factors, model 1 has stronger statistical evidence supporting its accuracy. Therefore, the prediction model of Output Gap = -1.2937 * Unemployment Gap - 0.2039 better fits the data.

As estimated by the models, there is a negative relationship between the unemployment rate and output, a positive relationship between imports and total output, and a negative relationship between exports and GDP. Although total trade between the US and China decreased in quantity and value, the US economy has not been hugely affected, since the US also trades with many other countries that can supply substitutes. The unemployment rate in the US also declined, indicating that more job vacancies have been created domestically since the start of the trade war, which stimulated economic performance.

References

Prabhakaran, Selva. “Linear Regression”.

http://r-statistics.co/Linear-Regression.html

Office of the United States Trade Representative. “The People’s Republic of China”.

https://ustr.gov/countries-regions/china-mongolia-taiwan/peoples-republic-china

Federal Reserve Economic Data. “Trade between the U.S. and China: Steady as she goes?”

https://fredblog.stlouisfed.org/2020/02/trade-between-the-u-s-and-china-steady-as-she-goes/?utm_source=series_page&utm_medium=related_content&utm_term=related_resources&utm_campaign=fredblog

Semester

Spring 2020

Researcher

Simin Na
