## Thursday, July 21, 2016

### Bayesian predicted slopes with interaction in multiple linear regression

Suppose we have a multiple linear regression with interaction: $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{1\times 2} x_1 x_2$ Notice that the slope on $x_1$ is not just $\beta_1$, it's $\beta_1 + \beta_{1\times 2} x_2$: $\hat{y} = \beta_0 + ( \beta_1 + \beta_{1\times 2} x_2) x_1 + \beta_2 x_2$ In other words, the slope on $x_1$ depends on the value of $x_2$. That's the essence of interaction. Of course, the same applies analogously to the slope on $x_2$.

This is explained in Ch. 18 of DBDA2E, and an example is given in Figure 18.9 on p. 529 that plots the predicted slopes and their 95% HDIs as a function of the value of the other predictor. The variable "Spend" is $x_1$ and the variable "PrcntTake" is $x_2$. The top panel (below) plots the posterior predicted slope on $x_1$ for values of $x_2$:
 After Figure 18.9, p. 529, of DBDA2E

A reader asked me how to compute those predicted slopes in R and make the graph. I thought I already had the R code in the package of programs, but upon searching discovered that it was not there. (Sorry; with over 700 pages of text and many dozens of programs, some things get overlooked.) This blog post supplies the R code!  :-)

First, from the folder of programs, open the file Jags-Ymet-XmetMulti-Mrobust-Example.R Then, at the top of the file, comment out all the data-loading examples and put in the following code for reading in the relevant data and creating an interaction variable. This is explained on p. 528 of DBDA2E:

# Two predictors with interaction:
# Read the SAT data file:

# Append named column with interaction product:

myData = cbind( myData , SpendXPrcnt=myData[,"Spend"]*myData[,"PrcntTake"] )
# Specify the column names:

yName = "SATT" ; xName = c("Spend","PrcntTake","SpendXPrcnt")
# Put the saved files into a sub-directory:

dirName = "Guber1999data-Jags-Inter"
if( !dir.exists(dirName) ) { dir.create(dirName) }
fileNameRoot = paste0(dirName,"/Guber1999data-Jags-Inter-")
# Specify the number of saved steps and the thinning (merely to save memory):

numSavedSteps=15000 ; thinSteps=20 # smaller thinSteps will run faster
# Specify the preferred graph file format:
graphFileType = "png"
Then run the MCMC, as already specified in the script. I'm repeating it here for completeness:

# Load the relevant model into R's working memory:
source("Jags-Ymet-XmetMulti-Mrobust.R")
# Generate the MCMC chain:

mcmcCoda = genMCMC( data=myData , xName=xName , yName=yName ,
numSavedSteps=numSavedSteps , thinSteps=thinSteps ,
saveName=fileNameRoot )

The script also has commands for displaying MCMC diagnostics, numerical summary, and basic plots of the posterior distribution.

Now the novel R code for computing the slopes and making the graph! At the end of the script, run the following code:

# Convert mcmc coda object to a matrix:
mcmcMat=as.matrix(mcmcCoda,chains=TRUE)

# Make combs of X1 and X2 values for displaying slopes and their HDIs:

X1Comb = seq(min(myData[,xName[1]]),max(myData[,xName[1]]),length=20)
X2Comb = seq(min(myData[,xName[2]]),max(myData[,xName[2]]),length=20)

# Create place holders for X1 median slopes and HDIs:

X1SlopeMed = rep(0,length(X2Comb))
X1SlopeHDI = matrix(0,nrow=2,ncol=length(X2Comb))
# Compute the slope on X1 for each value of X2,
# as explained at top of p. 530 of DBDA2E:

for ( combIdx in 1:length(X2Comb) ) {
credSlope = mcmcMat[,"beta[1]"] + mcmcMat[,"beta[3]"]*X2Comb[combIdx]
X1SlopeHDI[,combIdx] = HDIofMCMC( credSlope )
X1SlopeMed[combIdx] = median( credSlope )
}

# Same as above but for X2:

X2SlopeHDI = matrix(0,nrow=2,ncol=length(X1Comb))
X2SlopeMed = rep(0,length(X1Comb))
for ( combIdx in 1:length(X1Comb) ) {
credSlope = mcmcMat[,"beta[2]"] + mcmcMat[,"beta[3]"]*X1Comb[combIdx]
X2SlopeHDI[,combIdx] = HDIofMCMC( credSlope )
X2SlopeMed[combIdx] = median( credSlope )
}

# Make the graph as in lower part of Figure 18.9, p. 529 of DBDA2E:

openGraph()
layout(matrix(1:2,nrow=2))
par( mar=c(3.5,3.5,2,1) , mgp=c(2,0.7,0) )
# Panel to display slope on X1:

plot( X2Comb , X1SlopeMed , ylim=range(X1SlopeHDI) ,
pch=19 , cex=1.5 , col="skyblue" ,
ylab=paste("Slope on",xName[1]) , xlab=paste("Value of",xName[2]) ,

cex.lab=1.5 )
segments( x0=X2Comb , y0=X1SlopeHDI[1,] ,x1=X2Comb , y1=X1SlopeHDI[2,] ,
col="skyblue" , lwd=3 )
abline(h=0,lty="dotted")
# Panel to display slope on X2:

plot( X1Comb , X2SlopeMed , ylim=range(X2SlopeHDI) , pch=19 , cex=1.5 ,
col="skyblue" ,
ylab=paste("Slope on",xName[2]) , xlab=paste("Value of",xName[1]) ,

cex.lab=1.5 )
segments( x0=X1Comb , y0=X2SlopeHDI[1,] ,x1=X1Comb , y1=X2SlopeHDI[2,] ,
col="skyblue" , lwd=3 )
abline(h=0,lty="dotted")
# Save it:

saveGraph( file=paste0(fileNameRoot,"Slopes") , type=graphFileType )

Hopefully the comments in the script (above) are sufficient for you to decipher each step.