Download The Generalized Exponential Random Graph Model for Weighted

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
The Generalized Exponential Random Graph Model
for Weighted Networks
James D. Wilson1 , Matthew J. Denny2 , Shankar Bhamidi3 , Skyler
Cranmer4 , Bruce Desmarais2
1
University of San Fransisco
4
2
Penn State
3
UNC Chapel Hill
OSU
June 24th, 2016
[email protected], mjdenny.com
Matt Denny
The GERGM
June 24th, 2016
1 / 23
Outline
1
ERGMs for Unweighted Graphs
2
The Generalized Exponential Random Graph Model (GERGM)
3
Metropolis Hastings: Expanded Applicability
4
Likelihood Degeneracy
5
Software!
Matt Denny
The GERGM
June 24th, 2016
2 / 23
Assessing Network Structure
Matt Denny
The GERGM
June 24th, 2016
3 / 23
Exponential Random Graph Models (ERGMs)
Setting:
Network x composed of n nodes and M edges
Assume binary edges between nodes i, j: xi,j ∈ {0, 1}
X : family of all graphs with n nodes and binary edges
PX : probability measure on X
Aim: Identify (and then estimate) the relational covariates that capture
network structure through PX
Matt Denny
The GERGM
June 24th, 2016
4 / 23
Exponential Random Graph Models (ERGMs)
The probability (likelihood) of observing network x:
PX (x, θ) =
exp(θ T h(x))
T
∑ exp(θ h(z))
,
x ∈ {0, 1}M
z∈X
h ∶ {0, 1}M → Rp : Network covariates
θ ∈ Rp : unknown parameters (we have to estimate these!)
Matt Denny
The GERGM
June 24th, 2016
5 / 23
Weighted Graphs
Setting:
Network y composed of n nodes and M edges
Edges are continuous valued between nodes i, j: yi,j ∈ (−∞, ∞)
Y : family of all graphs with n nodes
PY : probability measure on Y
Examples:
Migration, Finance, Communication, Voting, Biological, Trade, etc.
Matt Denny
The GERGM
June 24th, 2016
6 / 23
Generalized Exponential Random Graph Model
Probability measure on family of weighted graphs w/ n nodes, M edges
Model: Two Steps
1) Model joint structure of Y on restricted network X ∈ [0, 1]M :
fX (x, θ) =
exp (θ T h(x))
T
∫[0,1]m exp (θ h(z)) dz
,
x ∈ [0, 1]M
h ∶ [0, 1]M → Rp : Network covariates (endogeneous or exogenous)
θ ∈ Rp : unknown structural parameters
Matt Denny
The GERGM
June 24th, 2016
7 / 23
Generalized Exponential Random Graph Model
2) Transform onto space of continuous weights:
fY (y , θ, Λ) =
exp (θ T h(T (y , β)))
T
∫[0,1]m exp (β h(z)) dz
∏ tij (y , β),
y ∈ RM
ij
T ∶ RM → [0, 1]m : parametric transformation function
Monotonically increasing
An appropriate choice: cumulative distribution functions
β ∈ Rq : unknown transformation parameters
Matt Denny
The GERGM
June 24th, 2016
8 / 23
Features of the GERGM
Flexible model for any type of weighted network
Estimation via Markov Chain Monte-Carlo or Maximum
pseudo-likelihood
Simplifies to multivariate linear regression when h(x) = 0
Matt Denny
The GERGM
June 24th, 2016
9 / 23
Model Specification: Weighted Network Statistics
Network Statistic Parameter
Value
αR
Reciprocity
θR
⎛
⎞
∑ xij xji
⎝ i<j
⎠
Cyclic Triads
θCT
⎞
⎛
∑ (xij xjk xki + xik xkj xji )
⎠
⎝i<j<k
In-Two-Stars
θITS
⎛
⎞
∑ ∑ xji xki
⎝ i j<k ≠i
⎠
θOTS
⎛
⎞
∑ ∑ xij xik
⎝ i j<k ≠i
⎠
αCT
αITS
αOTS
Out-Two-Stars
αE
Edge Density
θE
Transitive Triads
θTT
⎛
⎞
∑ xij
⎝ i≠j ⎠
⎛
∑ (xij xjk xik + xij xkj xki + xij xkj xik ) +
⎝i<j<k
αTT
⎞
∑ (xji xjk xki + xji xjk xik + xji xkj xki )
⎠
i<j<k
Matt Denny
The GERGM
June 24th, 2016
10 / 23
Inference on the GERGM
Likelihood of GERGM is intractable; relies on MCMC
Gibbs (Desmarais, et al., 2012)
Major Issue: Restricts model specification by requiring first order
network statistics
∂ 2 h(x)
= 0, i, j ∈ [n]
∂xij2
Metropolis-Hastings (Wilson, et al. 2015): Removes above restriction
on GERGM
Matt Denny
The GERGM
June 24th, 2016
11 / 23
Metropolis-Hastings Sampling
Framework: Acceptance/Rejection algorithm for weighted edges w/
multivariate truncated normal proposal distribution.
Proposal:
qσ (w∣x) =
σ −1 φ( w−x
σ )
−x
Φ( 1−x
σ ) − Φ( σ )
,
0≤w ≤1
Advantages:
Flexible model specification
Interaction effects
Exponential weighting of covariates
New available models can avoid likelihood degeneracy
Matt Denny
The GERGM
June 24th, 2016
12 / 23
Comparison to Gibbs Sampling: U.S. Migration Data
Describes the inter-state migration in U.S. from 2006 to 2007.
Fit a GERGM with 5 network statistics, 11 demographic covariates
Matt Denny
The GERGM
June 24th, 2016
13 / 23
Equivalent Covariate Estimates
Matt Denny
The GERGM
June 24th, 2016
14 / 23
Goodness of Fit
M-H and Gibbs have comparable (and good) performance
Matt Denny
The GERGM
June 24th, 2016
15 / 23
Likelihood Degeneracy
GERGM?
3
2
1
0
Density
4
5
Simulated
Network Densities
ERGM
0.0
0.2
0.4
0.6
0.8
1.0
Network Density
Matt Denny
The GERGM
June 24th, 2016
16 / 23
Case Study: In-Two-Stars Model
Model:
fX (x, θ, α) =
exp (θE hE (x) + θITS hITS (x))
,
C(θE , θITS )
x ∈ [0, 1]m
Edge density: hE (x) = ∑i≠j xij /m
α
In-Two-Stars: hITS (x, α) = (∑i ∑j<k ≠i xji xki )
Known to suffer from degeneracy issues in the binary case.
Matt Denny
The GERGM
June 24th, 2016
17 / 23
Exponential Down-Weighting
In-Two-Stars
350
α
300
0.1
0.25
0.5
0.75
0.9
1
Edge Density
1.0
0.8
250
0.6
200
150
0.4
100
α
0
8
7
6
5
4
3
2
1
0
−1
−2
−3
−4
−5
−6
−7
−8
−9
−10
9
8
10
7
6
5
4
3
2
1
0
−1
−2
−3
−4
−5
−6
−7
−8
−9
−10
0.0
9
0.1
0.25
0.5
0.75
0.9
1
10
0.2
50
Figure: Statistics from 1M simulated In-Two-Stars networks with various
values of α weighting.
Matt Denny
The GERGM
June 24th, 2016
18 / 23
Non-Degeneracy
= 0.75
0.75
0.75
0.75
0.50
0.25
Scaled Frequency
1.00
0.00
0.50
0.25
6
7
0.50
0.25
0.00
0.00
5
8
30
Simulated In−Two−Stars Value
40
80
50
1.00
0.75
0.75
0.75
0.25
0.00
Scaled Frequency
1.00
1.00
0.50
0.50
0.25
0.3
0.4
Simulated Network Density
0.5
160
200
0.50
0.25
0.00
0.00
0.2
120
Simulated In−Two−Stars Value
Simulated In−Two−Stars Value
Scaled Frequency
Scaled Frequency
= 1.00
1.00
Scaled Frequency
Scaled Frequency
= 0.50
1.00
0.4
0.5
0.6
Simulated Network Density
0.7
0.5
0.6
0.7
0.8
Simulated Network Density
Matt Denny
GERGM and network density
June 24th,
2016
19
Figure: Histograms
of simulated in The
two-stars
statistics
for/ 23
Defeating Degeneracy
Causes:
Bimodal density distribution vs. poor initialization.
Solutions:
Estimate intercept via transformation.
Exponential down-weighting.
Weighted MPLE.
Adaptive grid search.
Matt Denny
The GERGM
June 24th, 2016
20 / 23
GERGM Software in R
github.com/matthewjdenny/GERGM
Matt Denny
The GERGM
June 24th, 2016
21 / 23
Thank you!
Funding: National Science Foundation
Grants: DGE-1144860, DMS-1105581, DMS- 1310002,
SES-1357622, SES-1357606, SES-1461493, and CISE-1320219
Materials:
Paper: http://ssrn.com/abstract=2795219
Github: github.com/matthewjdenny/GERGM
CRAN: GERGM
Matt Denny
The GERGM
June 24th, 2016
22 / 23
GERGM - Metropolis Hastings Procedure
1
(t)
(t)
For i, j ∈ [n], generate proposal edge x̃i,j ∼ qσ (⋅∣xi,j )
independently across edges.
2
Set
x
(t+1)
⎧
(t)
(t)
⎪
⎪
⎪x̃ = (x̃i,j )i,j∈[n]
=⎨
⎪
⎪x (t)
⎪
⎩
w.p.
ρ(x (t) , x̃ (t) )
w.p.
1 − ρ(x (t) , x̃ (t) )
where
ρ(x, y ) = min (
fX (y ∣θ) m qσ (xi ∣yi )
, 1)
∏
fX (x∣θ) i=1 qσ (yi ∣xi )
m
= min (exp (θ′ (h(y ) − h(x))) ∏
i=1
Matt Denny
The GERGM
qσ (xi ∣yi )
, 1)
qσ (yi ∣xi )
June 24th, 2016
23 / 23
Related documents