Chapter 9 Generalized Linear Models

9.1 Import Data

#tfiling=read.table("c:\\data\\tfiling.txt", header=TRUE, sep="\t") # the two missing observations were already removed

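# Read the tab-delimited data and drop rows with missing values
tfiling.na <- read.table("TXTData/TFiling.txt", sep = "\t", quote = "", header = TRUE)
tfiling <- na.omit(tfiling.na)
# Rescale gross state product and population; convert the time index to a calendar year
tfiling$GSTATEP <- tfiling$GSTATEP / 10000
tfiling$POP <- tfiling$POPULATI / 1000
tfiling$YEAR <- tfiling$TIME + 1983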

There is a widespread belief that, in the United States, parties have become increasingly willing to go to the judicial system to settle disputes. This is particularly true in the insurance industry, an industry designed to spread risk among individuals who are subject to unfortunate events that threaten their livelihoods. Litigation in the insurance industry arises from two types of disagreement among parties, breach of faith and tort. A breach of faith is a failure by a party to the contract to perform according to its terms. This type of dispute is relatively confined to issues of fact, including the nature of the duties and the actions of each party. A tort action is a civil wrong, other than breach of contract, for which the court will provide a remedy in the form of action for damages. A civil wrong may include malice, wantonness, oppression, or capricious behavior by a party. Generally, much larger damages can be collected for tort actions because the award may be large enough to “sting” the guilty party. Since large insurance companies are viewed as having “deep pockets,” these awards can be quite large indeed.

Variable   Description
FILINGS    Number of filings of tort actions against insurance companies (NUMFILE in the code below).
POPLAWYR   The population per lawyer.
VEHCMILE   Number of automobile miles per mile of road, in thousands.
GSTATEP    Percentage of gross state product from manufacturing and construction.
POPDENSY   Number of people per ten square miles of land.
WCMPMAX    Maximum workers’ compensation weekly benefit.
URBAN      Percentage of population living in urban areas.
UNEMPLOY   State unemployment rate, in percentages.
J&SLIAB    An indicator of joint and several liability reform (JSLIAB in the code below).
COLLRULE   An indicator of collateral source reform.
CAPS       An indicator of caps on non-economic reform.
PUNITIVE   An indicator of limits of punitive damage.
TIME       Year identifier, 1-6.
STATE      State identifier, 1-19.

9.2 Example: Tort Filings (Page 356)

There is a widespread belief that, in the United States, contentious parties have become increasingly willing to go to the judicial system to settle disputes. This is particularly true when one party is from the insurance industry, an industry designed to spread risk among individuals. Litigation in the insurance industry arises from two types of disagreement among parties, breach of faith and tort. A breach of faith is a failure by a party to the contract to perform according to its terms. A tort action is a civil wrong, other than breach of contract, for which the court will provide a remedy in the form of action for damages. A civil wrong may include malice, wantonness, oppression, or capricious behavior by a party. Generally, large damages can be collected for tort actions because the award may be large enough to “sting” the guilty party. Because large insurance companies are viewed as having “deep pockets,” these awards can be quite large.

9.2.1 TABLE 10.3 Averages with explanatory binary variables

library(Hmisc)
summary(tfiling[, c("JSLIAB", "COLLRULE", "CAPS", "PUNITIVE")])
     JSLIAB          COLLRULE           CAPS           PUNITIVE     
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
 Mean   :0.4911   Mean   :0.3036   Mean   :0.2321   Mean   :0.3214  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
summarize(tfiling$NUMFILE, tfiling$JSLIAB, mean)
  tfiling$JSLIAB tfiling$NUMFILE
1              0        15330.07
2              1        25886.76
summarize(tfiling$NUMFILE, tfiling$COLLRULE, mean)
  tfiling$COLLRULE tfiling$NUMFILE
1                0        20726.64
2                1        20026.71
summarize(tfiling$NUMFILE, tfiling$CAPS, mean)
  tfiling$CAPS tfiling$NUMFILE
1            0       24682.488
2            1        6726.615
summarize(tfiling$NUMFILE, tfiling$PUNITIVE, mean)
  tfiling$PUNITIVE tfiling$NUMFILE
1                0        17693.38
2                1        26469.14

In Table 10.3 we see that 23.2% of the 112 state-year observations were under limits (caps) on noneconomic reform. Observations not under these limits had a larger average number of filings (about 24,682 versus 6,727).
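The two kinds of numbers in Table 10.3 can also be reproduced with base R alone. The sketch below is only an illustration: the data frame `toy` and its values are made up (they are not the tort filings data), and `tapply()` is used in place of `Hmisc::summarize()`. It shows that the share of observations under a reform is the mean of the 0/1 indicator, and that the group averages are means of the response within each level of the indicator.

```r
# Toy data (hypothetical values, not the tort filings data)
toy <- data.frame(
  NUMFILE = c(100, 200, 300, 400, 500, 600),
  CAPS    = c(0, 0, 0, 1, 1, 0)
)

# Share of observations with CAPS = 1: the mean of a 0/1 indicator
caps_share <- mean(toy$CAPS)

# Average number of filings within each level of the indicator
caps_means <- tapply(toy$NUMFILE, toy$CAPS, mean)

print(caps_share)
print(caps_means)
```

Applied to the real data, the same two calls would use `tfiling$CAPS` and `tfiling$NUMFILE`.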

9.2.2 TABLE 10.4 Summary statistics for other variables

summary(tfiling[,c("NUMFILE", "POP", "POPLAWYR", "VEHCMILE", "GSTATEP", "POPDENSY", "WCMPMAX", "URBAN", "UNEMPLOY")])
    NUMFILE            POP            POPLAWYR        VEHCMILE     
 Min.   :   512   Min.   : 0.521   Min.   :211.0   Min.   :  63.0  
 1st Qu.:  1790   1st Qu.: 1.109   1st Qu.:315.8   1st Qu.: 267.0  
 Median :  9085   Median : 3.353   Median :382.5   Median : 510.5  
 Mean   : 20514   Mean   : 6.679   Mean   :377.3   Mean   : 654.8  
 3rd Qu.: 31227   3rd Qu.:10.752   3rd Qu.:426.2   3rd Qu.: 933.5  
 Max.   :137455   Max.   :29.064   Max.   :537.0   Max.   :1899.0  
    GSTATEP          POPDENSY          WCMPMAX           URBAN       
 Min.   : 1.000   Min.   :   0.90   Min.   : 203.0   Min.   : 18.90  
 1st Qu.: 1.982   1st Qu.:  20.75   1st Qu.: 275.8   1st Qu.: 44.98  
 Median : 6.243   Median :  63.90   Median : 319.0   Median : 78.90  
 Mean   :12.667   Mean   : 168.18   Mean   : 350.0   Mean   : 69.36  
 3rd Qu.:17.673   3rd Qu.: 212.00   3rd Qu.: 382.0   3rd Qu.: 90.50  
 Max.   :69.738   Max.   :1043.00   Max.   :1140.0   Max.   :100.00  
    UNEMPLOY     
 Min.   : 2.600  
 1st Qu.: 5.075  
 Median : 5.950  
 Mean   : 6.217  
 3rd Qu.: 7.225  
 Max.   :10.800  
cor(tfiling$NUMFILE, tfiling[, c("POP", "POPLAWYR", "VEHCMILE", "GSTATEP", "POPDENSY", "WCMPMAX", "URBAN", "UNEMPLOY", "JSLIAB", "COLLRULE", "CAPS", "PUNITIVE")], use="pairwise.complete.obs")
          POP   POPLAWYR  VEHCMILE   GSTATEP  POPDENSY    WCMPMAX
[1,] 0.901947 -0.3781212 0.5175764 0.9145287 0.3678268 -0.2655063
         URBAN    UNEMPLOY    JSLIAB    COLLRULE       CAPS  PUNITIVE
[1,] 0.5501013 0.007600309 0.1825544 -0.01113243 -0.2622334 0.1417713

The correlations in Table 10.4 show that several of the economic and demographic variables appear to be related to the number of filings. In particular, the number of filings is strongly correlated with the state population (POP, correlation about 0.90) and with GSTATEP (about 0.91).

9.3 Section 10.2 Homogeneous model

# Rescale several explanatory variables by 1000
tfiling$POPLAWYR <- tfiling$POPLAWYR / 1000
tfiling$VEHCMILE <- tfiling$VEHCMILE / 1000
tfiling$GSTATEP <- tfiling$GSTATEP / 1000
tfiling$POPDENSY <- tfiling$POPDENSY / 1000
tfiling$WCMPMAX <- tfiling$WCMPMAX / 1000
tfiling$URBAN <- tfiling$URBAN / 1000
# Log population, used as the offset in the count models
tfiling$LNPOP <- log(tfiling$POPULATI * 1000)
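
The variable LNPOP enters the models that follow as an offset, that is, as a term in the linear predictor whose coefficient is fixed at one. As a sketch of what this implies (the symbols $\mu_{it}$, $\mathbf{x}_{it}$ and $\boldsymbol{\beta}$ are notation introduced here for illustration, not names from the code), a Poisson regression with a log link and this offset specifies the expected number of filings in state $i$ and year $t$ as

```latex
\ln \mu_{it} = \mathrm{LNPOP}_{it} + \mathbf{x}_{it}^{\prime} \boldsymbol{\beta}
\quad\Longleftrightarrow\quad
\frac{\mu_{it}}{\exp(\mathrm{LNPOP}_{it})} = \exp\left( \mathbf{x}_{it}^{\prime} \boldsymbol{\beta} \right)
```

Because $\exp(\mathrm{LNPOP}_{it})$ is the population term `POPULATI*1000`, the coefficients describe the filing rate relative to population rather than the raw count.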

9.3.1 TABLE 10.5 Tort filings model coefficient estimates

glm(NUMFILE ~ POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, data=tfiling, family=poisson(link="log"), offset=LNPOP)

Call:  glm(formula = NUMFILE ~ POPLAWYR + VEHCMILE + POPDENSY + WCMPMAX + 
    URBAN + UNEMPLOY + JSLIAB + COLLRULE + CAPS + PUNITIVE, family = poisson(link = "log"), 
    data = tfiling, offset = LNPOP)

Coefficients:
(Intercept)     POPLAWYR     VEHCMILE     POPDENSY      WCMPMAX  
   -7.94343      2.16331      0.86188      0.39182     -0.80195  
      URBAN     UNEMPLOY       JSLIAB     COLLRULE         CAPS  
    0.89183      0.08664      0.17678     -0.02982     -0.03193  
   PUNITIVE  
    0.02953  

Degrees of Freedom: 111 Total (i.e. Null);  101 Residual
Null Deviance:      430300 
Residual Deviance: 118300   AIC: 119500
tfiling$TIMEFAC<-factor(tfiling$TIME)
glm(NUMFILE ~ TIMEFAC+POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE-1, data=tfiling, family=poisson(link="log"), offset=LNPOP)

Call:  glm(formula = NUMFILE ~ TIMEFAC + POPLAWYR + VEHCMILE + POPDENSY + 
    WCMPMAX + URBAN + UNEMPLOY + JSLIAB + COLLRULE + CAPS + PUNITIVE - 
    1, family = poisson(link = "log"), data = tfiling, offset = LNPOP)

Coefficients:
TIMEFAC1  TIMEFAC2  TIMEFAC3  TIMEFAC4  TIMEFAC5  TIMEFAC6  POPLAWYR  
-7.97398  -7.90048  -7.83975  -7.92226  -7.88501  -7.88776   2.12339  
VEHCMILE  POPDENSY   WCMPMAX     URBAN  UNEMPLOY    JSLIAB  COLLRULE  
 0.85617   0.38357  -0.82607   0.97667   0.08605   0.12953  -0.02347  
    CAPS  PUNITIVE  
-0.05575   0.05281  

Degrees of Freedom: 112 Total (i.e. Null);  96 Residual
Null Deviance:      1.465e+09 
Residual Deviance: 115500   AIC: 116700

Table 10.5 summarizes the fit of three Poisson models. With the basic homogeneous Poisson model, all explanatory variables turn out to be statistically significant, as evidenced by the small p-values. However, the Poisson model assumes that the variance equals the mean, which is often a restrictive assumption in empirical work. Thus, to account for potential overdispersion, Table 10.5 also summarizes a homogeneous Poisson model with an estimated scale parameter. Table 10.5 emphasizes that, although the regression coefficient estimates do not change with the introduction of the scale parameter, the estimated standard errors, and thus the p-values, do change.
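
The effect of an estimated scale parameter can be illustrated with a small simulation. The sketch below rests on two assumptions: the data are simulated (negative binomial counts, not the tort filings), and the scale parameter is estimated with R's `quasipoisson` family, which is one common way to do this in R and not necessarily the procedure behind Table 10.5. It shows the coefficient estimates staying the same while the standard errors are multiplied by the square root of the estimated scale.

```r
set.seed(1)

# Simulated overdispersed counts (hypothetical data, not the tort filings)
n  <- 200
x  <- rnorm(n)
mu <- exp(1 + 0.5 * x)
y  <- rnbinom(n, mu = mu, size = 2)   # variance exceeds the mean

# Same mean model under two variance assumptions
fit_pois  <- glm(y ~ x, family = poisson(link = "log"))
fit_quasi <- glm(y ~ x, family = quasipoisson(link = "log"))

# Estimated scale (dispersion): Pearson chi-square / residual degrees of freedom
phi <- summary(fit_quasi)$dispersion

# Standard errors from each fit
se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]

print(coef(fit_pois))       # identical to the next line
print(coef(fit_quasi))
print(phi)
print(se_quasi / se_pois)   # each ratio equals sqrt(phi)
```

For the tort filings models, the analogous change would be to replace `family=poisson(link="log")` with `family=quasipoisson(link="log")` in the `glm()` calls above and inspect the result with `summary()`.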

9.4 Section 10.3 Marginal Models

9.4.1 Within-state correlation: independent

library(gee)
gee(NUMFILE ~ offset(LNPOP)+POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, id=STATE, data=tfiling, family=poisson(link="log"), corstr="independence") 
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept)    POPLAWYR    VEHCMILE    POPDENSY     WCMPMAX       URBAN 
-7.94343077  2.16331290  0.86187552  0.39181865 -0.80195312  0.89182723 
   UNEMPLOY      JSLIAB    COLLRULE        CAPS    PUNITIVE 
 0.08663651  0.17677542 -0.02982377 -0.03193075  0.02952586 

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998) 

Model:
 Link:                      Logarithm 
 Variance to Mean Relation: Poisson 
 Correlation Structure:     Independent 

Call:
gee(formula = NUMFILE ~ offset(LNPOP) + POPLAWYR + VEHCMILE + 
    POPDENSY + WCMPMAX + URBAN + UNEMPLOY + JSLIAB + COLLRULE + 
    CAPS + PUNITIVE, id = STATE, data = tfiling, family = poisson(link = "log"), 
    corstr = "independence")

Number of observations :  112 

Maximum cluster size   :  6 


Coefficients:
(Intercept)    POPLAWYR    VEHCMILE    POPDENSY     WCMPMAX       URBAN 
-7.94343079  2.16331290  0.86187552  0.39181865 -0.80195312  0.89182735 
   UNEMPLOY      JSLIAB    COLLRULE        CAPS    PUNITIVE 
 0.08663651  0.17677542 -0.02982377 -0.03193075  0.02952586 

Estimated Scale Parameter:  1285.7
Number of Iterations:  1

Working Correlation[1:4,1:4]
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1


Returned Error Value:
[1] 0
gee(NUMFILE ~ offset(LNPOP)+POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, id=STATE, data=tfiling, family=poisson(link="log"), corstr="AR-M", Mv=1) 
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept)    POPLAWYR    VEHCMILE    POPDENSY     WCMPMAX       URBAN 
-7.94343077  2.16331290  0.86187552  0.39181865 -0.80195312  0.89182723 
   UNEMPLOY      JSLIAB    COLLRULE        CAPS    PUNITIVE 
 0.08663651  0.17677542 -0.02982377 -0.03193075  0.02952586 

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998) 

Model:
 Link:                      Logarithm 
 Variance to Mean Relation: Poisson 
 Correlation Structure:     AR-M , M = 1 

Call:
gee(formula = NUMFILE ~ offset(LNPOP) + POPLAWYR + VEHCMILE + 
    POPDENSY + WCMPMAX + URBAN + UNEMPLOY + JSLIAB + COLLRULE + 
    CAPS + PUNITIVE, id = STATE, data = tfiling, family = poisson(link = "log"), 
    corstr = "AR-M", Mv = 1)

Number of observations :  112 

Maximum cluster size   :  6 


Coefficients:
(Intercept)    POPLAWYR    VEHCMILE    POPDENSY     WCMPMAX       URBAN 
-7.99997854  1.88219159  0.69338537  0.37164593  0.05604892  4.93610043 
   UNEMPLOY      JSLIAB    COLLRULE        CAPS    PUNITIVE 
 0.04340498  0.17025340 -0.06500658  0.09194548 -0.04663443 

Estimated Scale Parameter:  1444.921
Number of Iterations:  9

Working Correlation[1:4,1:4]
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.8517403 0.7254616 0.6179048
[2,] 0.8517403 1.0000000 0.8517403 0.7254616
[3,] 0.7254616 0.8517403 1.0000000 0.8517403
[4,] 0.6179048 0.7254616 0.8517403 1.0000000


Returned Error Value:
[1] 0
# Note: these estimates differ slightly from the SAS estimates
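
The working correlation matrix printed above has the AR(1) pattern: the correlation between two periods depends only on how far apart they are, and equals the lag-one correlation raised to that distance. As a quick check, the sketch below rebuilds the 4 x 4 block from the single value 0.8517403 read off the output; the object names `rho`, `lags` and `R_ar1` are introduced here for illustration only.

```r
# Lag-one correlation read from the working correlation output above
rho <- 0.8517403

# AR(1) structure: correlation between periods s and t is rho^|s - t|
lags  <- abs(outer(1:4, 1:4, "-"))
R_ar1 <- rho^lags

print(round(R_ar1, 7))
```

The off-diagonal entries match the printed values (0.7254616 at lag two, 0.6179048 at lag three), so the whole matrix is determined by one estimated parameter.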

9.4.2 Random effects model

# Model without random effects
glm(NUMFILE ~ POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, data=tfiling, family=poisson(link="log"), offset=LNPOP)

Call:  glm(formula = NUMFILE ~ POPLAWYR + VEHCMILE + POPDENSY + WCMPMAX + 
    URBAN + UNEMPLOY + JSLIAB + COLLRULE + CAPS + PUNITIVE, family = poisson(link = "log"), 
    data = tfiling, offset = LNPOP)

Coefficients:
(Intercept)     POPLAWYR     VEHCMILE     POPDENSY      WCMPMAX  
   -7.94343      2.16331      0.86188      0.39182     -0.80195  
      URBAN     UNEMPLOY       JSLIAB     COLLRULE         CAPS  
    0.89183      0.08664      0.17678     -0.02982     -0.03193  
   PUNITIVE  
    0.02953  

Degrees of Freedom: 111 Total (i.e. Null);  101 Residual
Null Deviance:      430300 
Residual Deviance: 118300   AIC: 119500