Related Datasets

The datasets below build on the MID dataset, but are not compiled by the Penn State MID Team.

Dyadic MID Dataset

This dataset (available here) is compiled by a team at UC-Davis, led by Zeev Maoz. It consists of pairs of states (i.e., dyads) on opposing sides of a MID. For multilateral MIDs, it includes only those pairs of states that interacted directly with each other during the MID. This is an important improvement in accuracy over simply assuming that every state on Side A interacted with every state on Side B. The Dyadic MID dataset also adjusts some of the other MID variables to be accurate by dyad. The Dyadic MID Codebook shows which variables are adjusted: variables that say MID 4.2 in the Version column of the chart have not been adjusted.

The current version of the Dyadic MID dataset does not adjust the Side A and Side B codings by dyad. Therefore, Side A in the Dyadic MID dataset is not necessarily the first mover in the dyadic interaction. Any discrepancies in the Side A and Side B codings between the Dyadic MID dataset and the MID 4.3 dataset are probably errors rather than substantive changes. These discrepancies are most common in the years 1993-2010, and the UC-Davis team is investigating their origin.
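The difference between the two ways of building dyads for a multilateral MID can be seen in a minimal Python sketch. The state labels and the set of interacting pairs are invented for illustration; they are not drawn from the Dyadic MID dataset or from MID 4.3.

```python
from itertools import product

# Hypothetical multilateral dispute: all labels are illustrative only.
side_a = ["A1", "A2"]
side_b = ["B1", "B2", "B3"]

# Naive approach: assume every Side A state faced every Side B state.
naive_dyads = set(product(side_a, side_b))

# Pairs that interacted directly during the dispute (hypothetical).
interacted = {("A1", "B1"), ("A1", "B2"), ("A2", "B3")}

# Dyads the naive approach adds even though no direct interaction occurred.
spurious = naive_dyads - interacted

print(len(naive_dyads), len(interacted), len(spurious))  # prints: 6 3 3
```

In this toy case, half of the dyads produced by the cross product never interacted, which is the kind of overcounting the dyadic data are designed to avoid.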

Recommendation for Users: We recommend using the Dyadic MID dataset to obtain dyadic MIDs for 1816-1992. For 1993-2010, we recommend using the MID 4.3 incident-level data to create dyadic MIDs. This approach not only eliminates dyads that never actually interacted, but also ensures that State A in the dyad is the dyadic first mover in the years 1993-2010. Stata code for combining these data sources is available here. Users of the combined data should cite both the Dyadic MID dataset article of record and the MID 4.3 article of record.
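For readers who do not use Stata, the logic of this recommendation can be sketched in Python. This is not the Stata code referred to above, and every field name (`dispnum`, `year`, `date`, `actor`, `target`, `state_a`, `state_b`) is a hypothetical stand-in rather than a variable name from either dataset. The sketch assumes each incident record names an acting state and a target state, treats the actor in the earliest incident between a pair as the dyadic first mover, and assigns each incident-derived dyad the year of that earliest incident.

```python
def dyads_from_incidents(incidents):
    """Build one dyad per (dispute, state pair) from incident records.

    State A is the actor in the earliest incident between the pair.
    Dates are ISO strings (YYYY-MM-DD), so string comparison orders them.
    """
    earliest = {}
    for inc in incidents:
        key = (inc["dispnum"], frozenset((inc["actor"], inc["target"])))
        if key not in earliest or inc["date"] < earliest[key]["date"]:
            earliest[key] = inc
    return [
        {
            "dispnum": inc["dispnum"],
            "year": int(inc["date"][:4]),
            "state_a": inc["actor"],
            "state_b": inc["target"],
        }
        for inc in earliest.values()
    ]


# Toy inputs (hypothetical values, not real MID records).
dyadic_rows = [
    {"dispnum": 1, "year": 1990, "state_a": "X", "state_b": "Y"},
    {"dispnum": 2, "year": 1995, "state_a": "X", "state_b": "Z"},
]
incidents = [
    {"dispnum": 2, "date": "1995-03-10", "actor": "Z", "target": "X"},
    {"dispnum": 2, "date": "1995-04-01", "actor": "X", "target": "Z"},
]

# Keep dyadic rows for 1816-1992 and incident-derived dyads for 1993-2010.
early = [r for r in dyadic_rows if 1816 <= r["year"] <= 1992]
late = [d for d in dyads_from_incidents(incidents) if 1993 <= d["year"] <= 2010]
combined = early + late

for row in combined:
    print(row)
```

In the toy data, the 1995 dyad is taken from the incident records, so State A becomes Z (the actor in the earliest incident) rather than X. A real implementation would also need a rule for disputes that span the 1992/1993 boundary, which this sketch does not handle.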

GML Dataset

In a recent article, Douglas Gibler, Steven Miller, and Erin Little (GML) conduct an extensive review of the MID data for the years 1816 to 2001, highlighting possible inaccuracies and recommending a substantial number of changes to the data. They contend that, in several instances, analyses with their revised GML MID dataset lead to substantively different inferences. In Palmer et al. (forthcoming) and our appendix, we review GML’s MID drop and merge recommendations and reevaluate the substantive impact of their changes. We agree with about 76 percent of the recommended drops and merges. However, we find that some of the purportedly overturned findings in GML’s replications are due not to their data but to the strategies they employ for replication. We re-examine these findings and conclude that the remaining differences in inference stemming from the variations in the MID data are rare and modest in scope. The changes on which we agree with GML are incorporated into the MID 4.3 data. Please see Palmer et al. (forthcoming) for additional details.

Recommendation for Users: Based on the replications in Palmer et al. (forthcoming), we think that cases in which an otherwise robust result changes when switching between the official MID dataset and the GML version will be rare. Changes are probably most likely to occur in studies of duration. Concerned users may wish to estimate their regressions using both versions of the data, but we do not advocate this as a required practice because it is not clear that lack of replicability across both versions means that a result is invalid.
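For users who do want to run the optional comparison, the following Python sketch shows one way to organize it: fit the same specification to both versions of the data and check whether the estimate keeps its sign. A bivariate least-squares slope stands in for whatever model the user actually estimates, and the function names, keys, and toy values are all hypothetical.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def compare_versions(official, alternative, x_key, y_key):
    """Fit the same specification to both datasets and compare signs."""
    out = {}
    for label, rows in (("official", official), ("alternative", alternative)):
        xs = [r[x_key] for r in rows]
        ys = [r[y_key] for r in rows]
        out[label] = ols_slope(xs, ys)
    out["same_sign"] = (out["official"] > 0) == (out["alternative"] > 0)
    return out


# Toy data: the alternative version drops one observation (hypothetical values).
official = [
    {"x": 0, "y": 1},
    {"x": 1, "y": 2},
    {"x": 2, "y": 2},
    {"x": 3, "y": 4},
]
alternative = [row for row in official if row != {"x": 2, "y": 2}]

result = compare_versions(official, alternative, "x", "y")
print(result)
```

A difference between the two estimates is a prompt to look at which cases changed, not evidence on its own that either result is invalid.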

Militarized Interstate Dispute Locations Dataset

This dataset (available here) records the geographic locations of MIDs, and incidents within MIDs, in latitude/longitude coordinates. It is compiled by Alex Braithwaite, a former MID project coder who is now a Professor at the University of Arizona. This dataset is a useful resource for those who seek to study the geographic distribution and geopolitical attributes of interstate conflict.

Alternative Datasets

We think that the MID dataset is the best choice for many research questions, given its temporal scope, coding of both major and minor disputes, and clear directionality, which makes it relatively easy to convert to dyads. But if the MID dataset does not meet your research needs, the following are some other conflict datasets that you could consider:

International Crisis Behavior Data

UCDP/PRIO Armed Conflict Data

Issue Correlates of War Data

Militarized Compellent Threat Data