Income Inequality and Macroeconomic Activity: A Study of the Peruvian case

The relationship between income inequality and macroeconomic activity has been extensively investigated for many countries, with mixed evidence over the years. In this context, the aim of the paper is to study this relationship for the Peruvian economy, with emphasis on the statistical properties of the variables. To that end, bootstrap simulations are applied to unit root tests on the time series; additionally, social and financial variables are used to explain changes in inequality. No evidence of the relationship is found once the time series properties of the variables are taken into account. New empirical evidence on this issue is critical in light of rising income inequality worldwide and its pervasive effects on society.


Introduction
The relationship between income inequality and macroeconomic activity was analyzed for the first time by Blinder and Esaki (1978). In the more than 30 years since that seminal paper, new econometric methods have been developed, more reliable datasets have become available and new economic variables have been introduced into the relationship. As a result, the way the relationship is analyzed has become more complex, and additional care is needed in the application of statistical procedures as well as in dataset elaboration. With this in mind, the purpose of the paper is to analyze the relationship between income inequality and macroeconomic activity for the Peruvian economy in a rigorous way. The Peruvian economy is attractive to study because of its historically high levels of income inequality and its recent economic expansion. Additionally, results from national household surveys are available for a range of years, which reduces the possibility of obtaining biased estimates of inequality.
The paper is relevant given that income inequality and economic disparities are on the rise in developing and developed countries alike, and understanding the relationship between income inequality and macroeconomic activity is crucial for designing economic policy tools, given the pervasive effects of inequality on society (more details in Neckerman and Torche (2007)). In this respect, the vast majority of papers on the subject analyze the relationship for developed countries and few focus on developing countries; consequently, there is no evidence that the relationship holds in the same way for developing economies. Additionally, in many of the related works the time series properties of the variables receive less attention than estimation procedures, and the same applies to statistical test procedures and dataset elaboration, which implies that some statistical methods may not work properly with some kinds of economic data.
The paper is organized as follows: section 2 provides an account of the previous research on the subject. Section 3 shows how the data is obtained and then analyzes its time series properties. In section 4 different models are proposed to test the relationship between income inequality and macroeconomic activity, taking into account these time series properties. Finally, section 5 summarizes the main findings.

Previous research
The work of Blinder and Esaki (1978) proposed a statistical model to explain income inequality, using inflation and unemployment as explanatory variables and quintile shares of the income distribution as the dependent variable. Thereafter, research took place in many directions. One direction attempts to explain income inequality using variables different from the ones commonly used to characterize business cycles (inflation, Gross Domestic Product (GDP), unemployment and interest rates, among others); see for instance Deaton and Paxson (1994), Igiyum and Owen (1994) and Rafsanaji, Zeinolabedini, Mir y Far (2014). Works that use business cycle variables can be classified in different ways; Jäntti and Jenkins (2001) classified all previous research on the issue into first and second generation studies. A first generation study involves the use of linear regression, estimated by least squares, with an inequality index as the dependent variable. For instance, Blinder and Esaki (1978) showed that, for the U.S. economy, unemployment significantly affects income inequality, whereas inflation has a weak effect across quintiles. Buse (1982), who found no evidence of any relationship between inflation, unemployment and income inequality for the Canadian economy, also falls into this category. A second generation study applies methods from time series econometrics: testing for unit roots and estimating a cointegration relationship in case one exists. The work of Mocan (1999), which tested for the presence of unit roots in the variables and showed that inflation has a progressive impact on income inequality measured by income shares, belongs to this category. Parker (1999) used various cointegration models to test the relationship between macroeconomic activity and income distribution and found a long-run relationship between inequality and unemployment, inflation and productivity, among other variables, for the U.S. economy.
With respect to research in developing countries, the literature is focused on the Kuznets hypothesis, that is, an inverted-U relationship between economic development and income inequality (see Kuznets 1955), as well as on income inequality and macroeconomic activity. Dobson and Ramlogan (2008) found evidence that supports the Kuznets hypothesis for 18 Latin American countries. Subarna and Heyse (2006) used panel data models to test the Kuznets hypothesis and found a positive relationship among the variables. Blejer and Guerrero (1990) found that inflation, exchange rates, government spending and interest rates have an impact on income inequality, measured by the decile ratio, for the Philippine economy. Rafsanaji, Zeinolabedini, Mir y Far (2014) found that, for the Iranian economy, foreign direct investment, per capita GDP and inflation have a positive effect on income inequality measured by the Gini index, whereas literacy rates have a negative effect. Works on the subject for the Peruvian economy include Pozo (2008), who uses simulated data on income inequality to test the relationship proposed by Blinder and Esaki, and Cieza (2007), who analyzed the relationship between income distribution and economic growth from a theoretical and empirical view; both report mixed results.

Data analysis
This section is divided into two subsections. The first explains in detail how the data was obtained. In the second, bootstrap simulations are applied to test for the presence of unit roots in the variables (for an introduction to bootstrap techniques for time series, see Härdle, Horowitz and Kreeis 2001).

The data
As stated in section 2, the relationship between income inequality and macroeconomic activity is traditionally analyzed through summary indexes of inequality and business cycle variables. This paper follows this tradition: the dependent variables are the Gini coefficient and the Generalized Entropy Index (GEI), whereas inflation, real GDP and the real interest rate are taken as independent variables. The data covers the period 1997-2015.
Annual data is used to sidestep the problem of seasonal adjustment that affects datasets sampled at intervals shorter than a year, and to avoid the temporal aggregation problem, in which a series exhibits fluctuation patterns that correspond not to its true nature but to the wrong periodicity, resulting in the inference of false causal relationships, as stated in Marcelino (1999). Additionally, the data is constructed in real terms with the same base year to deal with false causal relationships that may appear as a product of cross correlation in nominal data. With these points in mind, the next paragraphs describe the data elaboration.

Gini coefficient and Generalized Entropy Index
One of the most widely used measures of income inequality is the Gini coefficient; it takes values between 0 and 1, where 0 means perfect equality and 1 means perfect inequality (different procedures to construct Gini coefficients are explained in Xu [2004]). The Gini coefficient is selected among other inequality indexes because it is a one-dimensional measure of inequality, which means 'all the different features of inequality are compressed into a single number' (Cowell, 2009, p.7), whereas multidimensional measures may not indicate effectively whether inequality increased or decreased over time (see Savaglio [2004] for a survey of multidimensional inequality indexes). Besides, the Gini coefficient satisfies 3 of the 5 basic criteria for inequality measures and weakly satisfies a fourth (Cowell 2009). Finally, the Gini coefficient is broadly used in previous works on the subject. On the other hand, the Generalized Entropy Index (GEI) is based on entropy as a measure of disorder or deviation in income distributions, and it satisfies all 5 basic criteria for inequality measures. The GEI is not bounded to any interval: it takes values from 0 to infinity, with 0 as perfect equality.
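As an illustration of the two indexes described above, the following is a minimal sketch of how they could be computed from a vector of household incomes. It is not the procedure used with the ENAHO data: survey weights, equivalence scales and price adjustments are omitted, the function names are hypothetical, and the sensitivity parameter alpha of the GEI is an assumption, since the text does not state which value is used.

```python
import numpy as np

def gini(income):
    """Gini coefficient of a 1-D array of non-negative incomes (unweighted)."""
    x = np.sort(np.asarray(income, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    # Rank-based formula: G = 2*sum(i*x_i) / (n*sum(x)) - (n + 1)/n
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def generalized_entropy(income, alpha=1.0):
    """Generalized entropy index GE(alpha); incomes must be strictly positive.

    alpha is an assumed sensitivity parameter (alpha=1 gives the Theil index,
    alpha=0 the mean log deviation).
    """
    x = np.asarray(income, dtype=float)
    r = x / x.mean()  # incomes relative to the mean
    if alpha == 0:
        return -np.mean(np.log(r))
    if alpha == 1:
        return np.mean(r * np.log(r))
    return np.mean(r**alpha - 1.0) / (alpha * (alpha - 1.0))
```

Both functions return 0 for a perfectly equal distribution. In a finite sample of n households the Gini coefficient computed this way reaches at most (n - 1)/n, whereas the GEI has no fixed upper bound.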
The Gini coefficient and the GEI are obtained using data on income before taxes from the Encuesta Nacional de Hogares (ENAHO), which is conducted annually by the Instituto Nacional de Estadística e Informática (INEI). Results from the ENAHO are available from 1997 to 2015. From 1997 to 2002 the surveys were conducted only in the last quarter, and since 2003 the questionnaire has been expanded; the data is still comparable despite these drawbacks. As an advantage, the construction of annual indexes removes the missing data problem that distorts population surveys and can invalidate any measure extracted from them.
In detail, the household is taken as the basic unit of analysis and the definition of income before taxes is the one used in the ENAHO. Also, an equivalence scale of the parametric type is applied to account for the fact that households have different needs and features (see Coulter, Cowell and Jenkins [1992]; Deaton [1997] and Figini [1998]). Glewwe (1988) applied this kind of scale to the Peruvian case, but using parameter values different from the ones proposed by the SEDLAC (Socio-Economic Database for Latin America and the Caribbean), which are the ones applied in this work. As stated before, monetary income is price adjusted by a Laspeyres index based on the consumer price index of 2009 to obtain the Gini coefficient in real terms (an alternative method of price adjustment is found in Pendakur 2002). Similarly, a spatial deflation was applied to the household data to take into account differences in the cost of living among regions (UNSD 2005; Deaton and Zaidi 1999). To this end, a Laspeyres index is constructed using the consumer price index of the main cities in each region. It should be highlighted that the values of both the Gini index and the GEI, like those of inequality indexes in general, are highly dependent on income definitions, methods of adjustment for prices and needs, the basic unit of analysis and survey design. Consequently, causal relationships derived from income inequality regression models are indirectly affected by the measurement procedures used to obtain the inequality index.

Inflation, interest rate and GDP
To measure macroeconomic activity, inflation, interest rates and GDP are selected. The inflation rate in the period 1997-2015 was calculated as the annual percentage change of the Consumer Price Index (CPI) for the Lima metropolitan area with base year 2009, published by the INEI. The interest rate is the average lending rate (tasa activa) in national currency, taken in real terms and published in the annual report of the Central Reserve Bank of Peru (BCRP). Nominal GDP, published by the INEI, is deflated using the Lima metropolitan CPI with base year 2009 to obtain real GDP expressed in billions of new soles per year. Alternatively, some works on the subject take unemployment rather than GDP as the key variable to measure the influence of business cycles, owing to its inverse relationship with economic activity, but for the Peruvian economy no reliable unemployment data is collected except for the Lima metropolitan area.
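The two transformations described above (inflation as the annual percentage change of the CPI, and real GDP as nominal GDP deflated by the CPI) can be sketched as follows. The numbers are purely illustrative and are not the INEI or BCRP series; the variable names are hypothetical.

```python
import numpy as np

# Illustrative values only (assumed, not official data).
cpi = np.array([95.0, 100.0, 103.0])            # CPI with base year = 100
nominal_gdp = np.array([380.0, 420.0, 470.0])   # billions of soles, current prices

# Annual inflation: percentage change of the CPI between consecutive years.
inflation = 100.0 * (cpi[1:] / cpi[:-1] - 1.0)

# Real GDP: nominal GDP expressed in base-year prices.
real_gdp = nominal_gdp / (cpi / 100.0)
```

Note that the inflation series is one observation shorter than the CPI series, because the first year has no preceding value to compare with.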

Non-traditional exports, social spending and ratio of liquid liabilities to GDP
Additionally, data is collected on variables for external factors, financial access and redistributive policy. Non-traditional exports, defined as products that are transformed to a certain degree and have historically been traded in low quantities, are selected to measure external factors because they reflect external shocks with little or no delay. Government spending on social programs (such as welfare checks for elderly people, nutrition assistance and other programs), measured in millions of soles per year, is used because welfare is redistributed among the population by these programs. Finally, the ratio of liquid liabilities (financial obligations of financial institutions to the private sector) to GDP is used as a proxy for financial access. Data were collected from the INEI and BCRP annual statistical series for the time period under consideration.

Data properties
The importance of the temporal properties of time series has long been recognised in econometrics. Thus, it is common practice to test whether a series is stationary or non-stationary, and to test for unit roots in particular. This subsection deals with the application of unit root tests to the dataset and discusses their strengths and weaknesses in some detail.
The unit root testing literature is ever growing, uses a wide range of mathematical techniques and is supported by advanced theorems in mathematical statistics. The reader is referred to Stock (1994) and Campbell and Perron (1991) for detailed introductions to unit root testing. There are literally dozens of unit root tests with different features. These tests are designed to be applied in either the time or the frequency domain, and to test the null hypothesis of a unit root, stationarity, a structural break or a change in variance. Also, some unit root tests are designed only for time series that follow a specific stochastic process. Moreover, two unit root tests applied to the same time series may give contradictory results, as demonstrated in McCallum (1993) for the GDP series of the U.S. economy.
One of the most popular unit root tests is the Dickey-Fuller (DF) test, developed by Dickey and Fuller (1979, 1981) to test for the existence of unit roots in autoregressive (AR) processes. An extension of the DF test to autoregressive moving average (ARMA) processes was developed by Said and Dickey (1984) and is known as the Augmented Dickey-Fuller (ADF) test. Despite its wide use in applied work, the ADF test has important limitations. First, ADF tests suffer from reduced power whenever the sample size is small, the autoregressive parameter is close to unity, the deterministic component of the true data generating process (DGP) differs from that of the model used to test for the unit root (Rehman and Zaman 2009), or the true DGP and the model follow different stochastic processes (Schewert 1988). Second, exact inference is not possible in unit root tests because the test statistic is only asymptotically pivotal. As a result, only approximate inference, in other words inference using limit distributions, is possible (Haldrup and Janson 2005). This second point is relevant because in economics a large sample size is the exception, not the rule. Third, estimators of autoregressive models are biased in small samples (Shaman and Stine 1988; Andrews 1991). Thus, unit root test statistics are biased in small samples. Finally, it is not possible to know in advance the true stochastic process that the data follows; consequently, sequential testing procedures are often applied, but sequential procedures are more complex in nature and face more difficulties than individual tests. Overall, ignoring the limitations of ADF tests can lead to either a type I or a type II error and invalidate further results.
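For reference, the ADF test discussed above is usually based on a regression of the following form. The notation is an assumption introduced here for illustration (y_t is the series, p the number of lagged differences, and the constant and trend terms are included or dropped according to the deterministic specification); it is not taken from the paper's appendices.

```latex
\Delta y_t = \mu + \beta t + \rho\, y_{t-1} + \sum_{j=1}^{p} \gamma_j\, \Delta y_{t-j} + \varepsilon_t,
\qquad H_0:\ \rho = 0 \quad \text{against} \quad H_1:\ \rho < 0,
\qquad \tau = \frac{\hat{\rho}}{\operatorname{se}(\hat{\rho})}
```

Under the null the statistic τ does not follow a Student t distribution, which is why its critical values must come either from tabulated limit distributions or from simulation.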
An alternative to the use of conventional asymptotic theory in hypothesis testing is the application of bootstrap techniques. These techniques use simulations intensively to approximate the distribution function of the test statistic (Jeong and Maddala 1993; Davidson and Mackinnon 2004). For bootstrapping unit root tests, there are two procedures in the literature: the block bootstrap and the sieve bootstrap. The former is based on partitioning the series into blocks and resampling them to obtain the distribution of the test statistic (Lahiri 1999; Paparoditis and Politis 2003); the latter uses the residuals from a regression under the null of a unit root to recursively construct a simulated series and obtain the distribution of the test statistic (Psaradakis 2001; Chang and Park 2003). The application of bootstrap simulations to the ADF unit root test offers several advantages. One is the existence of asymptotic refinements, that is, the error in rejection probability (ERP) of the bootstrap test converges to zero faster than that of the asymptotic test as the sample size increases (Park 2003). Power and size improvements of bootstrap tests with respect to traditional asymptotic theory are found in simulation experiments by Palm, Smeekes and Urbain (2006). In addition, bootstrap simulations achieve a significant bias reduction in autoregressive estimators compared with standard procedures, as demonstrated in Patterson (2011, p.145). On the other hand, the bootstrap unit root test has low power if the root of the MA component is close to unity, as stated in Davidson (2009).
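The sieve bootstrap idea described above can be sketched in a few lines. This is a simplified illustration in Python, not the author's STATA code: the function names, the fixed lag order p, the burn-in length and the use of a constant-only ADF regression are all assumptions made here. The sketch also reports lower-tail quantiles as critical values, on the assumption that the test rejects for large negative values of the ADF t-statistic.

```python
import numpy as np

def adf_tstat(y, p):
    """t-statistic on rho in: dy_t = c + rho*y_{t-1} + sum_j g_j*dy_{t-j} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    T = dy.size
    # Regressors for rows t = p..T-1: constant, lagged level, p lagged differences.
    cols = [np.ones(T - p), y[p:-1]]
    cols += [dy[p - j:T - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    z = dy[p:]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    s2 = resid @ resid / (X.shape[0] - X.shape[1])
    se_rho = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_rho

def sieve_bootstrap_adf(y, p=1, n_boot=999, seed=0, burn=50):
    """Sieve bootstrap of the ADF t-statistic under the unit-root null."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    stat = adf_tstat(y, p)
    # Step 1: impose the null (u_t = dy_t) and fit an AR(p) to u_t by OLS.
    u = np.diff(y)
    u = u - u.mean()
    T = u.size
    if p > 0:
        U = np.column_stack([u[p - j:T - j] for j in range(1, p + 1)])
        phi, *_ = np.linalg.lstsq(U, u[p:], rcond=None)
        e = u[p:] - U @ phi
    else:
        phi, e = np.zeros(0), u
    e = e - e.mean()  # centered residuals to resample from
    # Step 2: rebuild series recursively from resampled residuals.
    boot = np.empty(n_boot)
    for b in range(n_boot):
        e_star = rng.choice(e, size=T + burn, replace=True)
        u_star = np.zeros(T + burn)
        for t in range(T + burn):
            ar = sum(phi[j] * u_star[t - 1 - j] for j in range(p) if t - 1 - j >= 0)
            u_star[t] = ar + e_star[t]
        # Integrate the simulated differences so that y_star has a unit root.
        y_star = np.concatenate(([0.0], np.cumsum(u_star[burn:])))
        boot[b] = adf_tstat(y_star, p)
    crit = np.quantile(boot, [0.01, 0.05, 0.10])  # lower-tail critical values
    p_value = np.mean(boot <= stat)
    return stat, crit, p_value
```

A call such as sieve_bootstrap_adf(series, p=1, n_boot=999) returns the statistic on the original data, the bootstrap critical values and a bootstrap p-value. In practice the lag order would be chosen by an information criterion rather than fixed in advance.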
This paper uses the bootstrap ADF test proposed by Chang and Park (2003) and explained in appendix 1. Each series is demeaned or detrended before the test. Information criteria are used to select the number of lags in the model, and the model is estimated by ordinary least squares (OLS). For each series 100,000 simulations were performed to construct the bootstrap distribution of the estimator; the 90th, 95th and 99th percentiles were selected from the bootstrap distribution as critical values. In addition, two other sets of simulations were performed to check the robustness of the results. One set uses recursively adjusted series to avoid correlation between dependent variables and errors (So and Shin 1999; Taylor 2002); the adjustment procedures are explained in appendix 2. The other set uses a bootstrap procedure that offers power improvements in discriminating between unit root and trend stationary processes for series with a trend (Smeekes 2009). All simulations were performed in STATA (version 11.1) with seed 4590, using code written by the author. Table 1 shows the results. Note: the ADF test value is obtained from the original data. The inflation variable is adjusted by demeaning, the others by detrending.
Table 1 is divided into three panels displaying test statistics and critical values from the bootstrap distributions for each variable. In the first panel the original Chang and Park bootstrap unit root test is applied to each series: at the 5% level the null of a unit root is not rejected only for real GDP, whereas it is rejected for the rest of the variables. The second panel shows the Chang and Park bootstrap unit root test with recursive adjustment for each series: the null of a unit root is not rejected at the 5% significance level for any variable except inflation. It is clear from the two panels that real GDP has a unit root and the inflation rate does not, while for the rest of the variables the evidence is inconclusive. In the third panel of Table 1 a non-stationary time series bootstrap procedure is applied to distinguish between unit roots and trend components: the null of a unit root is not rejected for real GDP and the Generalized Entropy Index, and is rejected for the Gini coefficient and the real interest rate at the 5% level. From the results of Table 1, inflation follows a stationary process, the Gini coefficient and the interest rate follow trend stationary processes, and both real GDP and the Generalized Entropy Index follow a process with a trend and a unit root.

Model and results
The relationship between income inequality and macroeconomic activity is usually thought to run from macroeconomic activity to inequality; in other words, income inequality is changed by business cycles. As noted in the introduction, regression models are the main tool used to investigate the relationship between income inequality and macroeconomic activity; although the relationship lacks a foundation in economic theory, regressions offer a procedure to test it empirically. Among the many models available to test time series relationships, regression models require fewer observations than more sophisticated models used to analyse dynamic relationships (namely transfer function models, error correction models, state-space models and others), which in some cases require hundreds of observations to yield plausible results. On the other hand, regressions rely on a set of assumptions (no autocorrelation of errors, no collinearity among regressors, homoscedasticity of errors, linearity in the regressors and normality of residuals) to yield meaningful statistical results; a violation of any of these assumptions affects the statistical properties of the estimators and implies that the proposed model is inadequate to test the relationship at hand. In the context of time series regressions, the variables must follow stationary processes for the estimators to be consistent, so each variable is detrended, differenced or demeaned according to its statistical properties.

The models
To analyze the relationship between income inequality and macroeconomic activity, linear models are estimated under 3 different specifications. Model 1 is the basic specification: the Gini coefficient is the dependent variable, with the inflation rate, the real interest rate and real Gross Domestic Product as regressors on the right-hand side. In Model 2 the Generalized Entropy Index is the dependent variable, with the same regressors as in model 1. Both models measure the effect of the independent variables on income inequality. Two different measures of income inequality are used in order to see whether the relationship between income inequality and macroeconomic activity is influenced by the choice of inequality measure. In model 3 the same variables as in model 1 are used, but the statistical properties of the variables are not taken into account. Table 2 presents the results. Table 2 is divided into two panels: the first shows the coefficients, standard errors and p-values for each variable. The second panel displays the F-test, the adjusted R² and other parametric tests used to verify the regression assumptions. In this respect, the Breusch-Godfrey test is applied to test for autocorrelated errors. The homoscedasticity assumption is tested by the Breusch-Pagan test. Multicollinearity in the regressors is tested by the variance inflation factor (VIF) method. The Shapiro-Wilk test is applied to test normality of the residuals, and the RESET test for omitted variables is also applied. From Table 2 it is evident that none of the regressors is individually or jointly significant in model 1, and normality of the residuals is the only assumption violated. In model 2 no regressors are significant, individually or jointly. At the same time, none of the assumptions is rejected, implying that model 2 is well specified. In model 3 real GDP and the constant term are individually significant and all variables are jointly significant. No autocorrelation, heteroskedasticity, multicollinearity or misspecification is observed, but the assumption of normality of the residuals is rejected.
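Of the diagnostics listed above, the variance inflation factor is the simplest to reproduce by hand. The sketch below shows one way to compute it with OLS auxiliary regressions; the function name and the input layout (one column per regressor, no constant column) are assumptions made for illustration, and this is not the code behind Table 2.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_obs x k, no constant)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Auxiliary regression of regressor j on a constant and the other regressors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)  # VIF_j = 1 / (1 - R_j^2)
    return out
```

A value close to 1 indicates that a regressor is nearly uncorrelated with the others; a common rule of thumb treats values above 10 as a sign of problematic multicollinearity.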

Results and analysis
At first glance, the results in Table 2 suggest that model 3 is the best model because of the statistical significance of its estimators and tests. However, its estimators have poor statistical properties because the time series properties of the variables were not taken into consideration. This becomes evident once a time variable is included among the regressors: none of the regressors is then significant, which indicates that the variables in model 3 are not truly related but follow a spurious relationship. On the other hand, in model 1 and model 2 the time series properties of the variables are taken into consideration. Table 2 shows that none of the variables is statistically significant and that, at the same time, none of the regression assumptions is rejected for either model (except normality for model 1, which is of lesser importance in a time series context). From both models there is no evidence that income inequality is related to macroeconomic activity, and the choice of inequality measure has no effect on the statistical significance of this relationship.
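The spurious regression problem mentioned above can be illustrated with a small Monte Carlo experiment. The sketch below regresses one simulated random walk on another, independent one and counts how often the slope appears significant, first in levels and then in first differences. The sample size of 19 mirrors the 1997-2015 annual span; the function names, the number of replications and the approximate 5% critical value of 2.11 are assumptions made for illustration.

```python
import numpy as np

def slope_tstat(y, x):
    """OLS t-statistic on the slope in y = a + b*x + e."""
    X = np.column_stack([np.ones(x.size), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (x.size - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

def rejection_rates(n_obs=19, n_sim=2000, seed=0, crit=2.11):
    """Share of simulations in which |t| exceeds crit, in levels and in differences."""
    rng = np.random.default_rng(seed)
    rej_levels = 0
    rej_diffs = 0
    for _ in range(n_sim):
        # Two independent random walks: no true relationship exists between them.
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))
        rej_levels += abs(slope_tstat(y, x)) > crit
        rej_diffs += abs(slope_tstat(np.diff(y), np.diff(x))) > crit
    return rej_levels / n_sim, rej_diffs / n_sim
```

In runs of this kind the rejection rate in levels is far above the nominal 5%, whereas in differences it stays close to it, which is the pattern that distinguishes a specification like model 3 from models 1 and 2.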
From a logical perspective the results in Table 2 make sense. Low inflation rates hardly affect inequality despite their impact on the population's purchasing power, because it would take more than 20 years to halve purchasing power if the inflation rate were 3% each year; in the case of the Peruvian economy, the monetary authority has implemented an inflation target of between 1% and 3% since 2002. The impact of lower real interest rates is limited owing to the small financial markets and developing banking system of the Peruvian economy, which means that wide segments of the population have no access to credit and capital. In the case of real GDP, fishing, mining and manufacturing are three of the largest sectors in the Peruvian economy, and mining and fishing have grown considerably faster than manufacturing; since manufacturing is the labor-intensive sector, changes in real GDP have little effect on the structure of income inequality.
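The claim about purchasing power can be checked with a short calculation, assuming a constant annual inflation rate π = 3% and compounding once a year (the symbols π and n are introduced here for illustration). Purchasing power is halved when the price level doubles:

```latex
(1 + \pi)^{n} = 2
\;\Longrightarrow\;
n = \frac{\ln 2}{\ln(1 + \pi)} = \frac{\ln 2}{\ln 1.03} \approx \frac{0.6931}{0.02956} \approx 23.4 \ \text{years}
```

At the lower bound of the target range, π = 1%, the same formula gives roughly 70 years.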
After rejecting the role of traditional macroeconomic variables in income inequality, a new set of macroeconomic variables related to external factors, financial access and redistributive policy is taken as regressors to investigate whether there is a relationship between macroeconomic activity and income inequality not considered in the previous models. In order to analyse these new variables two additional regression models are proposed. In model 4 the Gini coefficient is the dependent variable, with real social spending, non-traditional exports and the ratio of liquid liabilities to GDP as independent variables; similarly, model 5 takes the same independent variables, but with the Generalized Entropy Index on the left-hand side. As before, the variables in model 4 and model 5 are transformed to take into account their time series properties. Table 3 shows the results. In model 4 the coefficients have the expected sign, but none of the regressors is significant and no regression assumption is rejected except normality. In model 5 no regression coefficient is significant and there is no statistical evidence to reject any of the regression assumptions; for both models the regressors are not jointly significant. The results in Table 3 seem counterintuitive: one would expect social spending to reduce inequality by satisfying the needs of the low-income population. Also, wider access to financial services should reduce inequality by funding small business operations, and most non-traditional exports are labour-intensive in the Peruvian economy.
The explanation of why neither macroeconomic activity variables nor social and financial ones are related to income inequality rests on two arguments: the nature of income inequality and the structure of income inequality in the Peruvian economy. On the former, income inequality is a highly complex economic phenomenon influenced by many factors covering a wide range of characteristics such as sex, education, age, industry, skills, hierarchy, financial assets, employment and others (see Hartog 1981). Therefore, individual income is not determined solely by the forces of supply and demand in the labor market.
On the latter, the structure of income inequality in the Peruvian economy follows a deeply heterogeneous and complex process not confined to economic factors; demographic, educational, sociological, geographical and intergenerational factors also influence income inequality in the Peruvian economy. This is illustrated in Figure 1, which shows the evolution of the Gini coefficient for 2 geographical regions of the Peruvian economy. The north Andean region sustains high levels of inequality throughout the period considered; this region suffers from a lack of infrastructure, social unrest, nutritional deficiencies in children, a high birth rate and poor public administration, despite its natural resources and mining investment. In contrast, in the central coastal region income inequality has been declining over the years; historically this region has had better transport systems, greater integration of women in the workforce and easier access to higher education than the Andean region, despite corruption and poor public management. As a result of these differences, income inequality is not correlated among regions, as seen in Figure 1: the north Andean region exhibits no trend, in contrast to the negative trend of the central coastal region over the time span. This means that changes in some regional income inequality indexes are not correlated with macroeconomic variables: real GDP expanded for almost the whole period whereas income inequality in the Andean region remained unchanged, and the same applies to other macroeconomic variables. Thus, the highly heterogeneous income inequality of the Peruvian economy helps to explain the lack of a relationship with macroeconomic activity variables.

Conclusions
This paper examines the relationship between income inequality and macroeconomic activity for the Peruvian economy. The main finding is that reductions in income inequality, measured by the Gini coefficient or the Generalized Entropy Index, are not explained by economic, financial or redistributive variables once their time series properties are taken into account. One of the causes of the absence of such a relationship is the heterogeneous structure of income inequality, which results in enormous disparities between the inequality indexes of different regions of the Peruvian economy. On the other hand, the effect of variables such as inflation, interest rates, real GDP, financial access and non-traditional exports is weakened by the particular features of the Peruvian economy during the period considered. The absence of this relationship for the Peruvian economy does not mean that such a relationship is impossible or that all previous research is invalidated in light of modelling and data considerations; it only indicates that inequality is highly complex and that macroeconomic policy should be applied more carefully in order to reduce income inequality.

Table 1: Unit root tests

Table 2: Model specifications

Table 3: Model specifications