Chapter 9 Generalized Linear Models
9.1 Import Data
#tfiling=read.table("c:\\data\\tfiling.txt", header=TRUE, sep="\t") # the two missing observations were already removed
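# Read the tab-delimited data, then drop rows with missing values
tfiling.na <- read.table("TXTData/TFiling.txt", sep = "\t", quote = "", header = TRUE)
tfiling <- na.omit(tfiling.na)
# Rescale and derive the variables used in the summary tables
tfiling$GSTATEP <- tfiling$GSTATEP / 10000
tfiling$POP <- tfiling$POPULATI / 1000
tfiling$YEAR <- tfiling$TIME + 1983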
There is a widespread belief that, in the United States, parties have become increasingly willing to go to the judicial system to settle disputes. This is particularly true in the insurance industry, an industry designed to spread risk among individuals who are subject to unfortunate events that threaten their livelihoods. Litigation in the insurance industry arises from two types of disagreement among parties, breach of faith and tort. A breach of faith is a failure by a party to the contract to perform according to its terms. This type of dispute is relatively confined to issues of fact, including the nature of the duties and the actions of each party. A tort action is a civil wrong, other than breach of contract, for which the court will provide a remedy in the form of action for damages. A civil wrong may include malice, wantonness, oppression, or capricious behavior by a party. Generally, much larger damages can be collected for tort actions because the award may be large enough to “sting” the guilty party. Since large insurance companies are viewed as having “deep pockets,” these awards can be quite large indeed.
Variable | Description |
---|---|
FILINGS | Number of filings of tort actions against insurance companies (named NUMFILE in the data file). |
POPLAWYR | The population per lawyer. |
VEHCMILE | Number of automobile miles per mile of road, in thousands. |
GSTATEP | Percentage of gross state product from manufacturing and construction. |
POPDENSY | Number of people per ten square miles of land. |
WCMPMAX | Maximum workers’ compensation weekly benefit. |
URBAN | Percentage of population living in urban areas. |
UNEMPLOY | State unemployment rate, in percentages. |
J&SLIAB | An indicator of joint and several liability reform (named JSLIAB in the data file). |
COLLRULE | An indicator of collateral source reform. |
CAPS | An indicator of caps on non-economic reform. |
PUNITIVE | An indicator of limits of punitive damage. |
TIME | Year identifier, 1-6. |
STATE | State identifier, 1-19. |
9.2 Example: Tort Filings (Page 356)
There is a widespread belief that, in the United States, contentious parties have become increasingly willing to go to the judicial system to settle disputes. This is particularly true when one party is from the insurance industry, an industry designed to spread risk among individuals. Litigation in the insurance industry arises from two types of disagreement among parties, breach of faith and tort. A breach of faith is a failure by a party to the contract to perform according to its terms. A tort action is a civil wrong, other than breach of contract, for which the court will provide a remedy in the form of action for damages. A civil wrong may include malice, wantonness, oppression, or capricious behavior by a party. Generally, large damages can be collected for tort actions because the award may be large enough to “sting” the guilty party. Because large insurance companies are viewed as having “deep pockets,” these awards can be quite large.
9.2.1 TABLE 10.3 Averages with explanatory binary variables
library(Hmisc)
summary(tfiling[, c("JSLIAB", "COLLRULE", "CAPS", "PUNITIVE")])
JSLIAB COLLRULE CAPS PUNITIVE
Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
Median :0.0000 Median :0.0000 Median :0.0000 Median :0.0000
Mean :0.4911 Mean :0.3036 Mean :0.2321 Mean :0.3214
3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:1.0000
Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
summarize(tfiling$NUMFILE, tfiling$JSLIAB, mean)
tfiling$JSLIAB tfiling$NUMFILE
1 0 15330.07
2 1 25886.76
summarize(tfiling$NUMFILE, tfiling$COLLRULE, mean)
tfiling$COLLRULE tfiling$NUMFILE
1 0 20726.64
2 1 20026.71
summarize(tfiling$NUMFILE, tfiling$CAPS, mean)
tfiling$CAPS tfiling$NUMFILE
1 0 24682.488
2 1 6726.615
summarize(tfiling$NUMFILE, tfiling$PUNITIVE, mean)
tfiling$PUNITIVE tfiling$NUMFILE
1 0 17693.38
2 1 26469.14
In Table 10.3 we see that 23.2% of the 112 state-year observations were under limits (caps) on noneconomic reform. Those observations not under limits on noneconomic reforms had a larger average number of filings.
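The group averages above use `summarize` from Hmisc. As a minimal base-R sketch of the same calculation, assuming a small made-up data frame `toy` (hypothetical values, not the tort data), the mean of a 0/1 indicator gives the share of observations under a reform, and `tapply` gives the average response within each level of the indicator:

```r
# Hypothetical toy data: a count response and one binary reform indicator.
toy <- data.frame(
  NUMFILE = c(100, 200, 300, 400, 500, 600),
  CAPS    = c(0, 0, 0, 0, 1, 1)
)

# Share of observations under the reform (mean of a 0/1 variable).
share_caps <- mean(toy$CAPS)

# Average response within each level of the indicator.
group_means <- tapply(toy$NUMFILE, toy$CAPS, mean)

print(share_caps)
print(group_means)
```

The same two lines, applied to `tfiling$CAPS` and `tfiling$NUMFILE`, should correspond to the 23.2% figure and the CAPS averages reported above.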
9.2.2 TABLE 10.4 Summary statistics for other variables
summary(tfiling[,c("NUMFILE", "POP", "POPLAWYR", "VEHCMILE", "GSTATEP", "POPDENSY", "WCMPMAX", "URBAN", "UNEMPLOY")])
NUMFILE POP POPLAWYR VEHCMILE
Min. : 512 Min. : 0.521 Min. :211.0 Min. : 63.0
1st Qu.: 1790 1st Qu.: 1.109 1st Qu.:315.8 1st Qu.: 267.0
Median : 9085 Median : 3.353 Median :382.5 Median : 510.5
Mean : 20514 Mean : 6.679 Mean :377.3 Mean : 654.8
3rd Qu.: 31227 3rd Qu.:10.752 3rd Qu.:426.2 3rd Qu.: 933.5
Max. :137455 Max. :29.064 Max. :537.0 Max. :1899.0
GSTATEP POPDENSY WCMPMAX URBAN
Min. : 1.000 Min. : 0.90 Min. : 203.0 Min. : 18.90
1st Qu.: 1.982 1st Qu.: 20.75 1st Qu.: 275.8 1st Qu.: 44.98
Median : 6.243 Median : 63.90 Median : 319.0 Median : 78.90
Mean :12.667 Mean : 168.18 Mean : 350.0 Mean : 69.36
3rd Qu.:17.673 3rd Qu.: 212.00 3rd Qu.: 382.0 3rd Qu.: 90.50
Max. :69.738 Max. :1043.00 Max. :1140.0 Max. :100.00
UNEMPLOY
Min. : 2.600
1st Qu.: 5.075
Median : 5.950
Mean : 6.217
3rd Qu.: 7.225
Max. :10.800
cor(tfiling$NUMFILE, tfiling[, c("POP", "POPLAWYR", "VEHCMILE", "GSTATEP", "POPDENSY", "WCMPMAX", "URBAN", "UNEMPLOY", "JSLIAB", "COLLRULE", "CAPS", "PUNITIVE")], use="pairwise.complete.obs")
POP POPLAWYR VEHCMILE GSTATEP POPDENSY WCMPMAX
[1,] 0.901947 -0.3781212 0.5175764 0.9145287 0.3678268 -0.2655063
URBAN UNEMPLOY JSLIAB COLLRULE CAPS PUNITIVE
[1,] 0.5501013 0.007600309 0.1825544 -0.01113243 -0.2622334 0.1417713
The correlations in Table 10.4 show that several of the economic and demographic variables appear to be related to the number of filings. In particular, we note that the number of filings is highly related to the state population.
9.3 Section 10.2 Homogeneous model
# Rescale the covariates by 1000 before fitting
tfiling$POPLAWYR <- tfiling$POPLAWYR / 1000
tfiling$VEHCMILE <- tfiling$VEHCMILE / 1000
tfiling$GSTATEP <- tfiling$GSTATEP / 1000
tfiling$POPDENSY <- tfiling$POPDENSY / 1000
tfiling$WCMPMAX <- tfiling$WCMPMAX / 1000
tfiling$URBAN <- tfiling$URBAN / 1000
# Log population, used as the offset in the models below
tfiling$LNPOP <- log(tfiling$POPULATI * 1000)
9.3.1 TABLE 10.5 Tort filings model coefficient estimates
glm(NUMFILE ~ POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, data=tfiling, family=poisson(link="log"), offset=LNPOP)
Call: glm(formula = NUMFILE ~ POPLAWYR + VEHCMILE + POPDENSY + WCMPMAX +
URBAN + UNEMPLOY + JSLIAB + COLLRULE + CAPS + PUNITIVE, family = poisson(link = "log"),
data = tfiling, offset = LNPOP)
Coefficients:
(Intercept) POPLAWYR VEHCMILE POPDENSY WCMPMAX
-7.94343 2.16331 0.86188 0.39182 -0.80195
URBAN UNEMPLOY JSLIAB COLLRULE CAPS
0.89183 0.08664 0.17678 -0.02982 -0.03193
PUNITIVE
0.02953
Degrees of Freedom: 111 Total (i.e. Null); 101 Residual
Null Deviance: 430300
Residual Deviance: 118300 AIC: 119500
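The model above passes log population through the `offset` argument of `glm`, while the `gee` calls in Section 10.3 write `offset(LNPOP)` inside the formula. As a minimal sketch on simulated data (the names `sim`, `expo`, and `lnexpo` are hypothetical and not part of the tort data), the two forms give the same Poisson fit. With a log link, the offset makes the linear predictor describe the expected count per unit of exposure:

```r
set.seed(1)
n <- 200
expo <- runif(n, 1e4, 1e6)   # hypothetical exposure, e.g. a population size
x <- rnorm(n)
# Counts whose mean is exposure times a rate that depends on x.
y <- rpois(n, lambda = expo * exp(-8 + 0.3 * x))
sim <- data.frame(y = y, x = x, lnexpo = log(expo))

# Offset supplied as an argument, as in the glm call above.
fit_arg <- glm(y ~ x, family = poisson(link = "log"),
               offset = lnexpo, data = sim)

# Offset supplied as a formula term, as in the gee calls.
fit_term <- glm(y ~ x + offset(lnexpo), family = poisson(link = "log"),
                data = sim)

print(coef(fit_arg))
print(coef(fit_term))
```

Because the offset enters with a coefficient fixed at one, `exp` of the intercept can be read as a baseline rate per unit of exposure rather than a baseline count.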
tfiling$TIMEFAC<-factor(tfiling$TIME)
glm(NUMFILE ~ TIMEFAC+POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE-1, data=tfiling, family=poisson(link="log"), offset=LNPOP)
Call: glm(formula = NUMFILE ~ TIMEFAC + POPLAWYR + VEHCMILE + POPDENSY +
WCMPMAX + URBAN + UNEMPLOY + JSLIAB + COLLRULE + CAPS + PUNITIVE -
1, family = poisson(link = "log"), data = tfiling, offset = LNPOP)
Coefficients:
TIMEFAC1 TIMEFAC2 TIMEFAC3 TIMEFAC4 TIMEFAC5 TIMEFAC6 POPLAWYR
-7.97398 -7.90048 -7.83975 -7.92226 -7.88501 -7.88776 2.12339
VEHCMILE POPDENSY WCMPMAX URBAN UNEMPLOY JSLIAB COLLRULE
0.85617 0.38357 -0.82607 0.97667 0.08605 0.12953 -0.02347
CAPS PUNITIVE
-0.05575 0.05281
Degrees of Freedom: 112 Total (i.e. Null); 96 Residual
Null Deviance: 1.465e+09
Residual Deviance: 115500 AIC: 116700
Table 10.5 summarizes the fit of three Poisson models. With the basic homogeneous Poisson model, all explanatory variables turn out to be statistically significant, as evidenced by the small p-values. However, the Poisson model assumes that the variance equals the mean; this is often a restrictive assumption for empirical work. Thus, to account for potential overdispersion, Table 10.5 also summarizes a homogeneous Poisson model with an estimated scale parameter. Table 10.5 emphasizes that, although the regression coefficient estimates do not change with the introduction of the scale parameter, estimated standard errors and thus p-values do change.
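The effect of an estimated scale parameter can be illustrated with the `quasipoisson` family in base R, which estimates the dispersion from the Pearson residuals. This is a sketch on simulated overdispersed counts (the data frame `sim` is hypothetical), and it is an assumption that this estimator matches the one behind Table 10.5. It shows that the coefficients agree with the plain Poisson fit while each standard error is inflated by the square root of the estimated dispersion:

```r
set.seed(2)
n <- 300
x <- rnorm(n)
mu <- exp(1 + 0.5 * x)
# Negative binomial draws have variance larger than the mean (overdispersion).
y <- rnbinom(n, size = 2, mu = mu)
sim <- data.frame(y = y, x = x)

fit_pois  <- glm(y ~ x, family = poisson(link = "log"), data = sim)
fit_quasi <- glm(y ~ x, family = quasipoisson(link = "log"), data = sim)

# Estimated scale (dispersion) parameter from the quasi-Poisson fit.
phi <- summary(fit_quasi)$dispersion

se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]

print(cbind(poisson = coef(fit_pois), quasi = coef(fit_quasi)))
print(phi)
print(se_quasi / se_pois)   # each ratio equals sqrt(phi)
```

For the tort data, replacing `poisson` with `quasipoisson` in the `glm` call above and wrapping the result in `summary()` would print the standard errors and p-values that the plain `glm` printout omits.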
9.4 Section 10.3 Marginal Models
9.4.1 Within-state correlation: independent and AR(1)
library(gee)
gee(NUMFILE ~ offset(LNPOP)+POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, id=STATE, data=tfiling, family=poisson(link="log"), corstr="independence")
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept) POPLAWYR VEHCMILE POPDENSY WCMPMAX URBAN
-7.94343077 2.16331290 0.86187552 0.39181865 -0.80195312 0.89182723
UNEMPLOY JSLIAB COLLRULE CAPS PUNITIVE
0.08663651 0.17677542 -0.02982377 -0.03193075 0.02952586
GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
gee S-function, version 4.13 modified 98/01/27 (1998)
Model:
Link: Logarithm
Variance to Mean Relation: Poisson
Correlation Structure: Independent
Call:
gee(formula = NUMFILE ~ offset(LNPOP) + POPLAWYR + VEHCMILE +
POPDENSY + WCMPMAX + URBAN + UNEMPLOY + JSLIAB + COLLRULE +
CAPS + PUNITIVE, id = STATE, data = tfiling, family = poisson(link = "log"),
corstr = "independence")
Number of observations : 112
Maximum cluster size : 6
Coefficients:
(Intercept) POPLAWYR VEHCMILE POPDENSY WCMPMAX URBAN
-7.94343079 2.16331290 0.86187552 0.39181865 -0.80195312 0.89182735
UNEMPLOY JSLIAB COLLRULE CAPS PUNITIVE
0.08663651 0.17677542 -0.02982377 -0.03193075 0.02952586
Estimated Scale Parameter: 1285.7
Number of Iterations: 1
Working Correlation[1:4,1:4]
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 1 0 0
[3,] 0 0 1 0
[4,] 0 0 0 1
Returned Error Value:
[1] 0
gee(NUMFILE ~ offset(LNPOP)+POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, id=STATE, data=tfiling, family=poisson(link="log"), corstr="AR-M", Mv=1)
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept) POPLAWYR VEHCMILE POPDENSY WCMPMAX URBAN
-7.94343077 2.16331290 0.86187552 0.39181865 -0.80195312 0.89182723
UNEMPLOY JSLIAB COLLRULE CAPS PUNITIVE
0.08663651 0.17677542 -0.02982377 -0.03193075 0.02952586
GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
gee S-function, version 4.13 modified 98/01/27 (1998)
Model:
Link: Logarithm
Variance to Mean Relation: Poisson
Correlation Structure: AR-M , M = 1
Call:
gee(formula = NUMFILE ~ offset(LNPOP) + POPLAWYR + VEHCMILE +
POPDENSY + WCMPMAX + URBAN + UNEMPLOY + JSLIAB + COLLRULE +
CAPS + PUNITIVE, id = STATE, data = tfiling, family = poisson(link = "log"),
corstr = "AR-M", Mv = 1)
Number of observations : 112
Maximum cluster size : 6
Coefficients:
(Intercept) POPLAWYR VEHCMILE POPDENSY WCMPMAX URBAN
-7.99997854 1.88219159 0.69338537 0.37164593 0.05604892 4.93610043
UNEMPLOY JSLIAB COLLRULE CAPS PUNITIVE
0.04340498 0.17025340 -0.06500658 0.09194548 -0.04663443
Estimated Scale Parameter: 1444.921
Number of Iterations: 9
Working Correlation[1:4,1:4]
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.8517403 0.7254616 0.6179048
[2,] 0.8517403 1.0000000 0.8517403 0.7254616
[3,] 0.7254616 0.8517403 1.0000000 0.8517403
[4,] 0.6179048 0.7254616 0.8517403 1.0000000
Returned Error Value:
[1] 0
# These estimates differ slightly from the SAS estimates
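The AR-M working correlation with M = 1 printed above has a simple structure: the correlation between two observations in the same state is the lag-one correlation raised to the power of their separation in time. A short sketch, taking the lag-one value from the output above (the names `rho`, `lags`, and `R_ar1` are illustrative), rebuilds the 4 x 4 block:

```r
# Lag-one working correlation as printed in the gee output above.
rho <- 0.8517403

# Matrix of absolute time separations |i - j| for four time points.
lags <- abs(outer(1:4, 1:4, "-"))

# AR(1) working correlation: rho raised to the separation.
R_ar1 <- rho^lags

print(round(R_ar1, 7))
```

Up to rounding of `rho`, the entries agree with the printed working correlation, so the single estimated parameter determines the whole matrix.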
9.4.2 Random effects model
# Model without random effects
glm(NUMFILE ~ POPLAWYR+VEHCMILE+POPDENSY+WCMPMAX+URBAN+UNEMPLOY+JSLIAB+COLLRULE+CAPS+PUNITIVE, data=tfiling, family=poisson(link="log"), offset=LNPOP)
Call: glm(formula = NUMFILE ~ POPLAWYR + VEHCMILE + POPDENSY + WCMPMAX +
URBAN + UNEMPLOY + JSLIAB + COLLRULE + CAPS + PUNITIVE, family = poisson(link = "log"),
data = tfiling, offset = LNPOP)
Coefficients:
(Intercept) POPLAWYR VEHCMILE POPDENSY WCMPMAX
-7.94343 2.16331 0.86188 0.39182 -0.80195
URBAN UNEMPLOY JSLIAB COLLRULE CAPS
0.89183 0.08664 0.17678 -0.02982 -0.03193
PUNITIVE
0.02953
Degrees of Freedom: 111 Total (i.e. Null); 101 Residual
Null Deviance: 430300
Residual Deviance: 118300 AIC: 119500