[STATA] Unit Root and Cointegration Tests in Panel Models

by e-money2580 2023. 2. 6.


** Problems in panel data analysis: 1. the time series for any single panel has fewer observations than the pooled data; 2. the time series of each panel may be nonstationary

/* 
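To address problem 2, run unit-root tests => (difficulty) heterogeneous panels: some panels have a unit root (nonstationary series) while others do not (stationary series)
Even if the dependent and explanatory variables are I(1) (stationary after first differencing) in every panel, the cointegrating vector may be estimated differently for each panel; whether a cointegrating vector common to all panels exists, and whether it can be estimated, remains an open question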

 


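** Panel unit-root test (LLC: Levin, Lin and Chu 2002)
Null hypothesis: the series is nonstationary in every panel (common coefficient c = 0)
Alternative hypothesis: the series is stationary in every panel, with the same coefficient c < 0 in all panels

LLC model: Δy_it = c·y_i,t-1 + Σ_j Φ_j·Δy_i,t-j + b·Z_it + e_it

Z_it includes the panel fixed effects
Under the null, y_it is nonstationary, so the usual t distribution cannot be used to test c
LLC show that, under certain conditions, the adjusted t statistic for c is asymptotically standard normal
The LLC statistic has this asymptotic distribution only as T grows without bound, whereas the HT test derives its statistic for fixed T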

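HT (Harris-Tzavalis 1999) model: y_it = ρ·y_i,t-1 + b·Z_it + e_it

The HT result holds only as the number of panels N tends to infinity

Choi (2001) and Im, Pesaran and Shin (2003) work with more general hypotheses

Null hypothesis: every panel's series contains a unit root (nonstationary)
Alternative hypothesis: at least some panels are stationary, and the parameters (coefficients) may differ across panels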


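IPS test: estimate a Dickey-Fuller regression separately for each panel, then use the mean and variance of the resulting t statistics to form the test statistic and its critical values. As N tends to infinity, the IPS statistic is normally distributed.

Choi test: as T tends to infinity, the test statistic P follows a chi-squared distribution. The larger the P statistic (the test statistic, not the p-value), the stronger the evidence against the null. Rejection means that not every panel has a unit root (at least one panel is stationary).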


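** Panel cointegration tests

If the variables in a panel model are I(1) (stationary after first differencing), check whether they are cointegrated => if they are, the dynamic panel model should include an error-correction term

Westerlund cointegration test

Null hypothesis: the error-correction coefficient is 0 for every panel (no cointegration).
Alternative hypothesis: the coefficient is nonzero for at least one panel (= cointegration exists in the panel, i.e., a long-run equilibrium).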
*/


use "D:\STATA연습데이터\STATA시계열데이터분석\T_data14_1.dta",clear
// 1970-2003 (34 years), 151 countries; PPP: purchasing power parity
/* 
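PPP hypothesis: the price of a (tradable) good should be the same in every country
Real exchange rate between two countries: R = (E x P*)/P, where E is the nominal exchange rate, P* is the price of the good abroad, and P is the price of the good at home

y = lnR = logE + logP* - logP

If R converges to its equilibrium in the long run (PPP holds), y_t should be a stationary series. If y_t has a unit root (nonstationary), PPP does not hold; rejecting the unit-root null supports PPP.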
*/

tsset id year
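
// (Added sketch, not from the source) xtunitroot llc and xtunitroot ht require a strongly
// balanced panel, so it can help to inspect the panel structure before running the tests.
// xtdescribe reports the number of panels, the time span, and the participation pattern.
xtdescribe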

 



** LLC test (panel unit-root test: tests stationarity of the panel series under a coefficient common to all panels)

xtunitroot llc lnrxrate, lags(aic 10) // xtunitroot llc: LLC test command; lnrxrate: real exchange rate; lags(aic 10): allow up to 10 lags of Δy_it and choose the lag length p by AIC

/*
Levin–Lin–Chu unit-root test for lnrxrate
-----------------------------------------
H0: Panels contain unit roots               Number of panels  =    151
Ha: Panels are stationary                   Number of periods =     34

AR parameter: Common                        Asymptotics: N/T -> 0
Panel means:  Included
Time trend:   Not included

ADF regressions: 1.82 lags average (chosen by AIC)
LR variance:     Bartlett kernel, 10.00 lags average (chosen by LLC)
------------------------------------------------------------------------------
                    Statistic      p-value
------------------------------------------------------------------------------
 Unadjusted t       -28.3502
 Adjusted t*         -8.1550        0.0000
------------------------------------------------------------------------------
*/
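// Unadjusted t (-28.35) is the t statistic for the coefficient c; Adjusted t* corrects it for the bias that arises in finite samples
// The p-value is below 0.01, so the null (the panels contain unit roots) is rejected: the series is treated as stationary with a coefficient c common to all countries, which supports PPP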

xtunitroot llc lnrxrate if g7==1, lags(aic 10) // LLC test restricted to the G7 countries (the output reports 6 panels)
/*
Levin–Lin–Chu unit-root test for lnrxrate
-----------------------------------------
H0: Panels contain unit roots               Number of panels  =      6
Ha: Panels are stationary                   Number of periods =     34

AR parameter: Common                        Asymptotics: N/T -> 0
Panel means:  Included
Time trend:   Not included

ADF regressions: 1.00 lags average (chosen by AIC)
LR variance:     Bartlett kernel, 10.00 lags average (chosen by LLC)
------------------------------------------------------------------------------
                    Statistic      p-value
------------------------------------------------------------------------------
 Unadjusted t        -6.7538
 Adjusted t*         -4.0277        0.0000
------------------------------------------------------------------------------
*/ // AIC selects lag p=1; the unit-root null is rejected


** LLC test on cross-sectionally demeaned data (mitigates possible contemporaneous correlation across panels in y_it)

xtunitroot llc lnrxrate if g7==1, lags(aic 10) demean

/*
Levin–Lin–Chu unit-root test for lnrxrate
-----------------------------------------
H0: Panels contain unit roots               Number of panels  =      6
Ha: Panels are stationary                   Number of periods =     34

AR parameter: Common                        Asymptotics: N/T -> 0
Panel means:  Included
Time trend:   Not included                  Cross-sectional means removed

ADF regressions: 1.50 lags average (chosen by AIC)
LR variance:     Bartlett kernel, 10.00 lags average (chosen by LLC)
------------------------------------------------------------------------------
                    Statistic      p-value
------------------------------------------------------------------------------
 Unadjusted t        -5.5473
 Adjusted t*         -2.0813        0.0187
------------------------------------------------------------------------------
*/ // At the 1% level the unit-root null cannot be rejected (p = 0.0187), although it is rejected at the 5% level.


** HT unit-root test

xtunitroot ht lnrxrate
/*
Harris-Tzavalis unit-root test for lnrxrate
-------------------------------------------
H0: Panels contain unit roots               Number of panels  =    151
Ha: Panels are stationary                   Number of periods =     34

AR parameter: Common                        Asymptotics: N -> Infinity
Panel means:  Included                                   T Fixed
Time trend:   Not included
------------------------------------------------------------------------------
                    Statistic         z         p-value
------------------------------------------------------------------------------
 rho                  0.7534      -22.0272       0.0000
------------------------------------------------------------------------------

*/ // The null is rejected at the 1% level
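
// (Added sketch, not from the source) The LLC section above removed cross-sectional means
// to guard against contemporaneous correlation. The same demean option is available for
// the HT test, so a comparable robustness check might look like this:
xtunitroot ht lnrxrate, demean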


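** LLC, HT: assume that a single parameter (coefficient), common to all panels, determines whether there is a unit root

** IPS test: allows the parameter to differ across panels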

xtunitroot ips lnrxrate
/*
Im–Pesaran–Shin unit-root test for lnrxrate
-------------------------------------------
H0: All panels contain unit roots           Number of panels  =    151
Ha: Some panels are stationary              Number of periods =     34

AR parameter: Panel-specific                Asymptotics: T,N -> Infinity
Panel means:  Included                                        sequentially
Time trend:   Not included

ADF regressions: No lags included
------------------------------------------------------------------------------
                                              Fixed-N exact critical values
                    Statistic      p-value         1%      5%      10%
------------------------------------------------------------------------------
 t-bar               -3.3369                     -1.730  -1.670  -1.640
 t-tilde-bar         -2.6922
 Z-t-tilde-bar      -19.2620        0.0000
------------------------------------------------------------------------------
*/ // t-bar (-3.34) is the average of the panel-specific t statistics. It exceeds the 1% critical value (-1.73) in absolute value, so the null is rejected.

xtunitroot ips lnrxrate, lag(aic 8) // try up to 8 lags and choose the lag length by AIC
/*
Im–Pesaran–Shin unit-root test for lnrxrate
-------------------------------------------
H0: All panels contain unit roots           Number of panels  =    151
Ha: Some panels are stationary              Number of periods =     34

AR parameter: Panel-specific                Asymptotics: T,N -> Infinity
Panel means:  Included                                        sequentially
Time trend:   Not included

ADF regressions: 1.38 lags average (chosen by AIC)
------------------------------------------------------------------------------
                    Statistic      p-value
------------------------------------------------------------------------------
 W-t-bar            -16.0206        0.0000
------------------------------------------------------------------------------
*/ // The null that all panels contain unit roots is rejected


** Choi unit-root test (Fisher-type unit-root test)

xtunitroot fisher lnrxrate, dfuller lags(1) demean
/*
Fisher-type unit-root test for lnrxrate
Based on augmented Dickey–Fuller tests
---------------------------------------
H0: All panels contain unit roots           Number of panels  =    151
Ha: At least one panel is stationary        Number of periods =     34

AR parameter: Panel-specific                Asymptotics: T -> Infinity
Panel means:  Included
Time trend:   Not included                  Cross-sectional means removed
Drift term:   Not included                  ADF regressions: 1 lag
------------------------------------------------------------------------------
                                  Statistic      p-value
------------------------------------------------------------------------------
 Inverse chi-squared(302)  P       511.4256       0.0000
 Inverse normal            Z        -4.7245       0.0000
 Inverse logit t(759)      L*       -5.7359       0.0000
 Modified inv. chi-squared Pm        8.5214       0.0000
------------------------------------------------------------------------------
 P statistic requires number of panels to be finite.
 Other statistics are suitable for finite or infinite number of panels.
------------------------------------------------------------------------------
*/
// ADF regressions are estimated panel by panel and their p-values are combined; the null is rejected
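
// (Added sketch, not from the source) xtunitroot fisher can also combine Phillips-Perron
// tests instead of ADF tests. With pperron, lags(1) sets the number of Newey-West lags
// rather than the number of lagged differences. Keeping lags(1) and demean is an
// assumption made only to mirror the ADF call above.
xtunitroot fisher lnrxrate, pperron lags(1) demean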


** Westerlund cointegration test

// findit xtwest

use "D:\STATA연습데이터\STATA시계열데이터분석\T_data14_2.dta",clear
// 20 OECD countries, 32 years; hex = y_it: per capita health expenditure, gdp = x_it: per capita GDP

tsset ctr year

xtunitroot fisher loghex, lag(2) demean dfuller

/*
Fisher-type unit-root test for loghex
Based on augmented Dickey–Fuller tests
--------------------------------------
H0: All panels contain unit roots           Number of panels  =     20
Ha: At least one panel is stationary        Number of periods =     32

AR parameter: Panel-specific                Asymptotics: T -> Infinity
Panel means:  Included
Time trend:   Not included                  Cross-sectional means removed
Drift term:   Not included                  ADF regressions: 2 lags
------------------------------------------------------------------------------
                                  Statistic      p-value
------------------------------------------------------------------------------
 Inverse chi-squared(40)   P        50.6458       0.1207
 Inverse normal            Z        -0.7597       0.2237
 Inverse logit t(104)      L*       -0.9879       0.1627
 Modified inv. chi-squared Pm        1.1902       0.1170
------------------------------------------------------------------------------
 P statistic requires number of panels to be finite.
 Other statistics are suitable for finite or infinite number of panels.
------------------------------------------------------------------------------
*/ // For loghex, the null that a unit root is present cannot be rejected

xtunitroot fisher loggdp, lag(2) demean dfuller

/*
Fisher-type unit-root test for loggdp
Based on augmented Dickey–Fuller tests
--------------------------------------
H0: All panels contain unit roots           Number of panels  =     20
Ha: At least one panel is stationary        Number of periods =     32

AR parameter: Panel-specific                Asymptotics: T -> Infinity
Panel means:  Included
Time trend:   Not included                  Cross-sectional means removed
Drift term:   Not included                  ADF regressions: 2 lags
------------------------------------------------------------------------------
                                  Statistic      p-value
------------------------------------------------------------------------------
 Inverse chi-squared(40)   P        37.5965       0.5790
 Inverse normal            Z         0.6986       0.7576
 Inverse logit t(104)      L*        0.9229       0.8209
 Modified inv. chi-squared Pm       -0.2687       0.6059
------------------------------------------------------------------------------
 P statistic requires number of panels to be finite.
 Other statistics are suitable for finite or infinite number of panels.
------------------------------------------------------------------------------
*/ // For loggdp, the null that a unit root is present cannot be rejected
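
// (Added sketch, not from the source) Failing to reject a unit root in levels does not by
// itself show that the variables are I(1). One way to check is to repeat the test on the
// first differences: rejecting the null there is consistent with I(1).
// Reusing lags(2) and demean from the level tests is an assumption.
xtunitroot fisher d.loghex, dfuller lags(2) demean
xtunitroot fisher d.loggdp, dfuller lags(2) demean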


** Test whether a long-run equilibrium relationship exists between per capita health expenditure (loghex) and per capita GDP (loggdp)

xtwest loghex loggdp, lags(1 3) leads(0 3) lrwindow(3) constant
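// lags(1 3): try lag orders 1 to 3 for the differenced dependent and explanatory variables and choose the order by AIC
// leads(0 3): try lead orders 0 to 3 for the differenced explanatory variable and choose the order by AIC
// lrwindow(3): set the Bartlett kernel window used in computing the test statistics to 3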
/*
Calculating Westerlund ECM panel cointegration tests..........

Results for H0: no cointegration
With 20 series and 1 covariate
Average AIC selected lag length: 1.2
Average AIC selected lead length: .25

-----------------------------------------------+
 Statistic |   Value   |  Z-value  |  P-value  |
-----------+-----------+-----------+-----------|
     Gt    |   -3.015  |   -6.160  |   0.000   |
     Ga    |  -17.245  |   -8.300  |   0.000   |
     Pt    |  -11.037  |   -4.589  |   0.000   |
     Pa    |  -14.521  |  -10.367  |   0.000   |
-----------------------------------------------+
*/
// Gt and Ga are group-mean tests, Pt and Pa are panel tests: every test rejects the null = cointegration exists = loghex and loggdp share a long-run equilibrium relationship


// findit matvsort

set matsize 3000

* (Note) with the bootstrap option the command can take a very long time to run
xtwest loghex loggdp, lags(1 3) leads(0 3) lrwindow(3) constant bootstrap(100)
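// bootstrap(100): if the asymptotic conditions are not met, the statistics cannot be assumed to be normally distributed. In that case the p-values are computed by bootstrap (100 replications here), and a robust p-value column is added to the output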



** When the null of no cointegration is rejected (a long-run equilibrium relationship exists), estimate an error-correction model
/*
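Error-correction model used when the null of no cointegration between health expenditure log(H) and GDP log(G) is rejected

Error-correction model with one lag:
ΔlogH_it = μ_i + α_i·(logH_i,t-1 - θ_i - β_i·logG_i,t-1) + λ_i·ΔlogH_i,t-1 + γ_i·ΔlogG_i,t-1 + ε_it


<Three estimators can be applied> */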


** 1. Dynamic fixed effects (DFE) estimator = assumes that the short-run and long-run parameters are the same across all panels

xtreg d.loghex ld.loghex ld.loggdp l.loghex l.loggdp, fe
/*
Fixed-effects (within) regression               Number of obs     =        600
Group variable: ctr                             Number of groups  =         20

R-squared:                                      Obs per group:
     Within  = 0.2094                                         min =         30
     Between = 0.0025                                         avg =       30.0
     Overall = 0.0213                                         max =         30

                                                F(4,576)          =      38.14
corr(u_i, Xb) = -0.9371                         Prob > F          =     0.0000

------------------------------------------------------------------------------
    D.loghex | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      loghex |
         LD. |   .1272998   .0373939     3.40   0.001     .0538548    .2007448
             |
      loggdp |
         LD. |   .1030366    .065654     1.57   0.117    -.0259138     .231987
             |
      loghex |
         L1. |  -.1282971   .0144308    -8.89   0.000    -.1566405   -.0999538
             |
      loggdp |
         L1. |   .1636673   .0238397     6.87   0.000     .1168441    .2104906
             |
       _cons |  -.6821439   .1492117    -4.57   0.000    -.9752093   -.3890786
-------------+----------------------------------------------------------------
     sigma_u |  .06114266
     sigma_e |  .03720047
         rho |  .72983327   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(19, 576) = 3.01                     Prob > F = 0.0000
*/
// Short-run coefficient λ = 0.127; long-run adjustment speed (error-correction coefficient) α = -0.128 < 0

nlcom -_b[l.loggdp]/_b[l.loghex] // nlcom: nonlinear combination of the estimates
/*
       _nl_1: -_b[l.loggdp]/_b[l.loghex]

------------------------------------------------------------------------------
    D.loghex | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _nl_1 |    1.27569   .0705079    18.09   0.000     1.137497    1.413883
------------------------------------------------------------------------------
*/ // The estimated parameter β of the cointegrating relationship is about 1.276
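
// (Added sketch, not from the source) One way to read α: if the short-run lag terms are
// ignored (a simplifying assumption), a share (1 + α) of any deviation from the long-run
// relationship survives each year, so the implied half-life is ln(0.5)/ln(1 + α).
// _b[l.loghex] still refers to the xtreg estimates because nlcom was run without post.
display "implied half-life (years) = " ln(0.5)/ln(1 + _b[l.loghex])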


** 2. Mean group (MG) estimator: estimate each group separately, then take the arithmetic mean of the estimates

statsby long1=_b[l.loghex] short1=_b[ld.loghex], by(ctr) saving("D:\STATA연습데이터\STATA시계열데이터분석\coint1.dta", replace) : reg d.loghex ld.loghex ld.loggdp l.loghex l.loggdp  
// Estimate the parameters country by country

use "D:\STATA연습데이터\STATA시계열데이터분석\coint1.dta",clear

su long1 short1
/*
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       long1 |         20     -.24507    .1292326  -.5063397  -.0624323
      short1 |         20    .1249033    .2432512  -.3109908   .5951226
*/ // The Mean column is the mean group (MG) estimate
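
// (Added sketch, not from the source) summarize reports the standard deviation across
// countries, not the precision of the MG estimate. The mean command reports the standard
// error of each average (sd/sqrt(N)) with a confidence interval, which is the usual
// nonparametric standard error for a mean group estimate.
mean long1 short1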


** 3. Pooled mean group (PMG) estimator

use "D:\STATA연습데이터\STATA시계열데이터분석\T_data14_2.dta",clear
tsset ctr year

reshape wide hex gdp loggdp loghex, i(year) j(ctr) // reshape the long data to wide form (one set of columns per country)

set more off
// Constrain the long-run coefficients (on l.loghex and l.loggdp) to be equal across countries
forvalues ii=2/20 {
    constraint def `ii' [D_loghex1]:l.loghex1=[D_loghex`ii']:l.loghex`ii'
    local ij=`ii'+19
    constraint def `ij' [D_loghex1]:l.loggdp1=[D_loghex`ii']:l.loggdp`ii'
}

tsset year
// Build one equation per country for the SUR system
local eq_list " "
forvalues i=1/20 {
    local eq_list `eq_list' (d.loghex`i' ld.loghex`i' ld.loggdp`i' l.loghex`i' l.loggdp`i')
}

sureg `eq_list', constraint(2/39)

capture drop short2
gen short2=.
forvalues j=1/20{
replace short2=_b[D_loghex`j':ld.loghex`j'] in `j'
}

su short2
di "short-run dynamics average = " r(mean)
di "long-run adjustment speed = " _b[D_loghex1:l.loghex1]
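
// (Added sketch, not from the source) The user-written xtpmg command (Blackburne and Frank),
// assumed to be installed (e.g. via "ssc install xtpmg"), estimates PMG by maximum likelihood
// on the long-format data, so it may serve as a cross-check on the constrained SUR approach
// above. How this model maps onto the xtpmg call below is an assumption, and its results need
// not match the sureg output exactly.
use "D:\STATA연습데이터\STATA시계열데이터분석\T_data14_2.dta",clear
tsset ctr year
xtpmg d.loghex ld.loghex ld.loggdp, lr(l.loghex l.loggdp) ec(ec) replace pmg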

Source: STATA 시계열 데이터 분석 (STATA Time Series Data Analysis), 민인식 and 최필선
