Data Science Research and Research Interests

My data science research covers a breadth of issues related to social phenomena, ranging from the roles actors adopt in networks to the factors influencing death penalty executions. To date, my research has produced over half a million dollars in federal grant funding.

My substantive research interests include:

  • Substance use disorder treatment

  • Social interdependencies

  • Peace science

  • Alliance politics

  • Systemic theory

Methodologically, I utilize the tools of:

  • Network analysis

  • Machine learning

  • Time series analysis

  • Survival analysis

The following is a sample of my work, organized thematically.


Methodological and Statistical Development

Kapferer Network Role Assignment, Wave 1.  In-Groupers, Purple. Out-Groupers, Orange. Movers, Purple.

Detecting Heterogeneity and Inferring Latent Roles in Longitudinal Networks

Network analysis has typically examined the formation of whole networks while neglecting variation within or across networks. Actors within networks often adopt particular roles.  While cross-sectional approaches for inferring latent roles exist, there is a paucity of approaches for considering roles in longitudinal networks.  This paper explores the conceptual dynamics of temporally observed roles while deriving and introducing a novel statistical tool, the ego-TERGM, capable of uncovering these latent dynamics.  Estimated through an Expectation-Maximization algorithm, the ego-TERGM is quick and accurate in classifying roles within a broader temporal network.  An application to the Kapferer strike network illustrates the model's utility.

Political Analysis, 2018

Software available through CRAN

Decomposition of Recommender System Influence.

Inferring Influence Networks from Longitudinal Bipartite Relational Data

with Frank Marrs, Bailey Fosdick, Skyler Cranmer, and Tobias Böhmelt

Longitudinal bipartite relational data characterize the evolution of relations between pairs of actors, where each actor is one of two distinct types and relations exist only between disparate actor types.  An example of such data is countries’ treaty ratifications: in a given year, a relation is generated between country i and treaty k if i ratifies k.  A common task in examining bipartite data is estimating the influence network between actors of the same type.  That is, if state i ratifies a treaty in a given year, is it likely that state j ratifies the same treaty in a future year?  This task is typically accomplished by projecting the bipartite network to a one-mode network, for which generative models are lacking.  To address this shortcoming, we propose a generative model for (discrete-time, weighted) longitudinal bipartite relational data to infer (weighted, directed) influence networks.  We compare the performance of the proposed model to existing models on a simulated dataset and on a dataset of relations between states.  Our approach provides improved interpretability and estimability over existing models while performing at least as well in explaining data variation and predicting out of sample.  As our model may be expressed as a linear regression, it extends readily to unweighted and categorical relational data.  Model parsimony is achieved via regularization or by applying reduced rank structure to the estimated influence networks.  Finally, we prove that our model asymptotically captures the desired dependencies, even under misspecification.

Revise and Resubmit at Journal of Computational and Graphical Statistics

Software on CRAN
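
To make the one-mode projection mentioned in the abstract above concrete, here is a minimal sketch of a naive "lead-lag" projection of ratification data. It is not the generative model proposed in the paper. The function name, the input layout, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def lead_lag_projection(ratifications):
    """Naive one-mode projection of bipartite ratification data.

    ratifications: dict mapping treaty -> {country: year of ratification}
    (a hypothetical layout assumed for this sketch).
    Returns a dict where counts[(i, j)] is the number of treaties that
    country i ratified strictly before country j.
    """
    counts = defaultdict(int)
    for years in ratifications.values():
        for i, year_i in years.items():
            for j, year_j in years.items():
                if i != j and year_i < year_j:
                    counts[(i, j)] += 1
    return dict(counts)


# Hypothetical toy data: three treaties, three countries.
toy = {
    "treaty_1": {"A": 1990, "B": 1992, "C": 1992},
    "treaty_2": {"A": 1995, "B": 1996},
    "treaty_3": {"B": 2000, "A": 2003},
}
projection = lead_lag_projection(toy)
print(projection)
```

A projection like this discards the timing and the treaty-level structure of the data, which is one way to see why a generative model defined on the bipartite data itself can be attractive.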

FERGM Improvements over ERGM.

Substantive Implications of Unobserved Heterogeneity: Testing the Frailty Approach to Exponential Random Graph Models

with Jan Box-Steffensmeier and Dino Christenson

Exponential Random Graph Models (ERGMs) are an increasingly common tool for inferential network analysis.  However, a potential problem for these models is the assumption of correct model specification. Through six substantive applications (Mesa High, Florentine Marriage, Military Alliances, Militarized Interstate Disputes, Regional Planning, Brain Complexity), we illustrate how unobserved heterogeneity and confounding lead to degenerate model specifications, inferential errors, and poor model fit.  In addition, we present evidence that a better approach exists in the form of the Frailty Exponential Random Graph Model (FERGM), which extends the ERGM to account for unit or group-level heterogeneity in tie formation. In each case, the ERGM is prone to producing inferential errors and forecasting ties with lower accuracy than the FERGM.

Social Networks, 2019

Software on CRAN


Applications of Role Analysis

Role Assignments from Ego-TERGM.  Aggregators, Red. Balancers, Gold. Reformers, Blue. Consolidators, Off White.

Measuring and Assessing Latent Variation in Alliance Design and Objectives

The alliance politics literature foundationally assumes that states, motivated by an external threat, form alliances to ensure survival.  Treating alliance objectives as homogeneous assumes that alliances' generating process has never changed and that their objectives do not vary, yielding underdeveloped theories and potentially problematic inferences. These studies have been handcuffed by insufficient data on the goals underpinning alliance design and by a lack of methods capable of inferring these latent objectives. I propose a novel role-based framework for considering alliance design and objectives, which enumerates the roles a state can adopt within the alliance network and considers the relationship between a state's role in the alliance network and how it designs its local alliance network to accomplish role-based objectives.  To detect this unobserved variation, I employ a novel methodological tool, the ego-TERGM.  Results indicate that the conventional model of alliances is inadequately specified and that scholars should consider the varying alliance network roles states adopt.

Roles in the Interest Group Coalition Network, Environmental Cases.

Role Analysis Using the Ego-ERGM: A Look at Environmental Interest Group Coalitions

with Jan Box-Steffensmeier, Dino Christenson, and Zack Navabi

Interest groups coordinate to achieve political goals.  However, these groups are heterogeneous, and the division of labor within these coalitions varies.  We explore the presence of distinct roles in coalitions of environmental interest groups, and analyze which factors predict whether an organization takes on a particular role.  To model these latent dynamics, we introduce the ego-ERGM.  We find that a group's budget, member size, staff size, and degree centrality are influential in distinguishing between three role assignments.  These results provide insight into the roles adopted in carrying out coalition tasks.  This approach shows promise for understanding a host of networks.

Social Networks, 2018


Survival Analysis, Event Dependence, and the Death Penalty


Event Dependence in U.S. Executions

with Jan Box-Steffensmeier and Frank Baumgartner

Since 1976, the United States has seen over 1,400 judicial executions concentrated in only a few states and counties.  The number of executions across counties appears to fit a stretched distribution.  Our tests estimate the monthly hazard of an execution in a given county, accounting for the number of previous executions, homicides, poverty, population demographics, and state, covering the entire period from 1976 through 2015, across all counties in states with the death penalty. We find that the number of prior executions a county experiences increases the probability of the next execution, in addition to accelerating the rate at which executions occur.  Once a jurisdiction goes down a given path, it tends to accelerate, causing the counties to separate out into those never executing (the vast majority of counties) and those which use the punishment frequently. This finding is of great legal and normative concern, and ultimately, may not be consistent with the equal protection clause of the U.S. Constitution.

PLOS ONE, 2018

Covered in Multiple Media Outlets
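
As an illustration of the kind of data layout that an event-dependence analysis like the one above typically starts from, the following sketch builds county-month records carrying a running count of prior executions. It assumes a simple discrete-time setup and is not the estimation procedure used in the paper. The function name, the record layout, and the toy data are hypothetical.

```python
from collections import Counter


def county_month_records(event_months, n_months):
    """Build (county, month, prior_events, event) rows.

    event_months: dict mapping county -> list of 0-based month indices in
    which an execution occurred (a hypothetical layout for this sketch).
    prior_events counts executions strictly before the given month;
    event is 1 if at least one execution occurred in that month.
    """
    rows = []
    for county, months in sorted(event_months.items()):
        per_month = Counter(months)
        prior = 0
        for month in range(n_months):
            n_events = per_month.get(month, 0)
            rows.append((county, month, prior, int(n_events > 0)))
            prior += n_events  # update only after the row is recorded
    return rows


# Hypothetical toy data: one county with two executions, one with none.
rows = county_month_records({"county_a": [2, 5], "county_b": []}, n_months=6)
for row in rows:
    print(row)
```

Rows in this form can then be passed to a hazard model with prior_events as a time-varying covariate, alongside controls such as homicides, poverty, and demographics.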

Geographic Distribution of Death Sentences.

The Hot Spots of Capital Punishment: Event Dependency and Local Norms in Death Sentencing and Executions

with Jan Box-Steffensmeier, Frank Baumgartner, and Alex Bennett

The death penalty is thought to be judiciously applied and saved for the most heinous crimes. However, in the United States, a few counties generate much higher numbers of death sentences and executions than would be expected based on the number of homicides, whereas the majority have rendered not a single death verdict nor seen a single execution in the 40+ years of the modern death penalty system. We explain these high concentrations by looking at the self-reinforcing nature of local legal communities. Where a community has gone down a given path on the death penalty (either using it or not), it tends to accelerate, as future decisions are based on past ones. We document powerful event-dependency effects for both death sentences and executions. Results provide a statistical demonstration of the arbitrary nature of the United States' death penalty system.

Under Review


Social Network Treatments for Substance Abuse Disorder


Therapeutic community graduates cluster together in social networks: Evidence for spatial selection as a cooperative mechanism in therapeutic communities

with Skyler Cranmer, Carole Harvey, and Keith Warren

It is natural to think of affirmations in therapeutic communities (TCs) as simply a means of positively reinforcing prosocial behavior, a distributed equivalent of the behavioral reward systems that play an important role in many correctional rehabilitation programs. TC clinical theory and research suggest that this reductionist assumption is inadequate to understand the role of affirmations in the TC, where they serve as a motivational tool and as a means of counterbalancing the peer corrections that also play a key role in treatment. This analysis adds that affirmations may also play a role in establishing and reinforcing networks of individuals who are cooperating to overcome substance abuse. The reductionist techniques that underlie most studies of mental health and substance abuse treatment are inadequate for analyzing such networks. Systems science techniques such as social network analysis, which allow researchers to consider the interactions between TC residents, are necessary if we are to fully understand the way in which the TC community acts as a method of clinical intervention in these programs.

Addictive Behaviors, 2018


Relationship between network clustering in a therapeutic community and reincarceration following discharge

with Skyler Cranmer, Nathan Dugan, Carole Harvey, and Keith Warren

Recent qualitative work on Therapeutic Communities (TCs) suggests that they help residents change by creating an environment that is simultaneously challenging and supportive. There is evidence that social networks that feature numerous closed triads are both more supportive and more likely to influence individual behavior.  This implies that TC residents whose peer social networks include more closed triads should have improved outcomes. The social network in this study consists of the affirmations exchanged between 1,312 men who resided at a 90-bed TC in a Midwestern state over a period of eight years and includes a total of 34,667 weighted edges.  The network was analyzed using a semiparametric Cox model based on the Temporal Network Autocorrelation Model (TNAM), a statistical methodology that accounts for dependence between individuals in the network.  Participants whose social networks of TC peers included a higher percentage of closed triads were at a decreased hazard of reincarceration following termination when controlling for age, length of stay, and the number of peers who affirmed the resident and eventually graduated.  These results support the longstanding TC contention that the community as a whole is the method of clinical treatment.  Further quantitative research into TC processes and outcomes should ideally include social network surveys and statistics in order to avoid biases associated with violations of statistical independence assumptions.

Journal of Substance Abuse Treatment, 2018
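
To show what a "percentage of closed triads" around a resident can look like in practice, the sketch below computes the share of a node's neighbor pairs that are themselves connected, treating the network as undirected and unweighted. This is the standard local clustering coefficient, used here as an assumption; the study's exact measure on the weighted affirmation network may differ. The function name and toy edges are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def closed_triad_share(edges, node):
    """Share of a node's neighbor pairs that are directly connected.

    edges: iterable of (u, v) pairs, treated as undirected and unweighted
    (a simplifying assumption for this sketch).
    Returns 0.0 when the node has fewer than two neighbors.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    pairs = list(combinations(sorted(neighbors[node]), 2))
    if not pairs:
        return 0.0
    closed = sum(1 for a, b in pairs if b in neighbors[a])
    return closed / len(pairs)


# Hypothetical toy network of affirmations among four residents.
toy_edges = [("r1", "r2"), ("r1", "r3"), ("r2", "r3"), ("r1", "r4")]
print(closed_triad_share(toy_edges, "r1"))
print(closed_triad_share(toy_edges, "r2"))
```

In the toy network, r1 has three neighbors but only one connected neighbor pair, while both of r2's neighbors know each other, so r2 sits in a more tightly closed local network.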



Examining Social Interdependencies

Network of Correlates of War Fatal Militarized Interstate Disputes, 1938.

Triangulating War: Network topology and the democratic peace

with Skyler Cranmer and Bruce Desmarais

The principal finding of Peace Science, the democratic peace, rests upon a dyadic understanding of conflict; decades of research have found, with law-like regularity, that jointly democratic dyads do not engage in conflict.  However, such conflicts may rarely reflect purely dyadic phenomena: even if a conflict is not multi-party, multiple states may be engaged in distinct disputes with the same common enemy.  We postulate a network theory of conflict which indicates that the democratic peace is a function of the competing interests of mixed-regime dyads and the strategic inefficiencies associated with fighting enemies of enemies.  We find strong evidence of interdependence in the conflict network: a state's decision to attack another is conditioned upon a third state having that same target.  When this effect is accounted for, evidence of the democratic peace is much less clear.

Under Review

Network of Positive State Influence Relationships

Latent influence networks in global environmental politics

with Tobias Böhmelt, Skyler Cranmer, Bailey Fosdick, and Frank Marrs

Leaders' decisions in international politics are driven not only by domestic but also by international factors.  However, while policymakers and diplomats have known for a long time that countries influence each other in international decision-making, systematically assessing and estimating this influence has proven to be more difficult.  Previous research has not demonstrated that a latent impact affecting all states persists in the network of treaties and countries, only that abstractly defined policy decisions in one context may influence what another country does.  Here we present findings that allow us, for the first time, to examine the latent influence network that underlies global politics and its impact in one of the most important policy issues of our time: environmental protection.  In light of diplomats' intuition and anecdotal evidence, we develop the argument for a latent influence network and demonstrate how it operates in the context of international environmental agreements.  To test the hypothesis that ratification is affected by a latent network linking treaties and countries, we analyze newly compiled data with an innovative estimator and show that strong positive and negative influences among countries and treaties do exist.  The analysis also examines the predictive power of the model, highlighting that we significantly improve upon earlier work that has failed to incorporate the network effect.  Our findings constitute the first systematic demonstration of an underlying latent influence network in international politics, different from previously identified domestic factors or diffusion effects.

In Press at PLOS ONE

Network of Military Alliances, 19

The Contagion of Democracy Through International Networks: New Evidence for Second Order Effects

with Skyler Cranmer and Bruce Desmarais

Work on democratization has considered the diffusion of democracy through interstate partnerships such as alliances and intergovernmental organizations (IGOs).  However, such partnerships constitute complex networks of interaction that scholars have yet to fully explore as vectors for the spread of democracy.  We postulate a network theory of democratization in which we characterize these networks as epistemic communities that condition and socialize elites' attitudes towards favorable regime types.  In addition to democratic transitions spreading through direct ties, our theory predicts that indirect ties are also vectors of democratization.  We find that higher-order diffusion pressures are transmitted through like first-order connections (i.e., neighboring democracies with democratic neighbors).  In contrast to the field's current understanding, which emphasizes the roles of IGOs in democratic transitions and neglects the potential for higher-order effects, we find higher-order effects are transmitted only through the defensive alliance network.

Under Review


I Get By With a Little Help from My Friends: Leveraging Campaign Resources to Maximize Congressional Power

with Jan Box-Steffensmeier and Andrew Podob

Despite the breadth of scholarship on legislative collaboration, scholars have little empirical understanding of how members of Congress collaborate electorally. We posit that there are four main reasons members collaborate electorally; in short, doing so complements and indirectly benefits their legislative agendas. Using a quasi-experiment of the 2016 election and nearly 3.2 million FEC records from 2010-2016, we build upon earlier scholarship, empirically separating legislative and campaign behaviors and showing the similarities and differences in these processes. Specifically, we find that party and committee membership drive most campaign collaboration, though some is cross-party. This finding sheds light on how campaign behavior can be entirely isolated from legislative behavior yet still be undertaken to obtain similar ends. While collaboration between members is routinely used to achieve policy goals, we show electoral collaboration mimics legislative collaboration.

Revise and Resubmit at American Journal of Political Science


Supervised Learning and Causal Inference


Alliances and Conflict, or Conflict and Alliances? Appraising the Causal Effect of Alliances on Conflict

Decades of research have examined the deterrent or provoking effect of defensive alliances.  It is typically believed that if a state has a relevant defensive alliance, it is less likely to be the target of a militarized dispute.  When theoretically and empirically modeling this relationship, scholars typically assume that alliances are exogenous: that alliances are pseudo-randomly assigned and not influenced by a dyad's baseline probability of experiencing conflict.  This is problematic: while alliances may influence the probability of conflict, the expectation of conflict may influence the probability of alliances.  I synthesize theories of alliance formation and the alliance-conflict relationship, innovating an endogenous theory of alliances and conflict.  I empirically evaluate this theory in dyadic and systemic perspective, finding that once alliances are endogenized and all relevant observed and unobserved confounders are accounted for, alliances neither deter nor provoke aggression.  This has significant implications for our understanding of interstate conflict and alliance politics.