3.6 Case study on Disease and Health

With all this information available to academics, numerous research projects have put big data to use. One such project, conducted at the University of Pennsylvania, involved graduate students from multiple departments and disciplines. Their research identified a predictive relationship between the content of Twitter posts and heart disease [33].


Figure 7: University of Pennsylvania’s comparison of reported heart disease rates vs predicted heart disease rates [33]

Emotional factors play a role in identifying heart disease risk, but measuring emotions directly can be difficult and expensive. By using Twitter as an indirect measure, the University of Pennsylvania researchers have helped shed more light on this relationship.

Traditional indicators such as low income, smoking, and stress are commonly used to identify the risk of heart disease. The Penn researchers, however, demonstrated that Twitter data can predict heart disease risk better than the traditional factors combined. Their study measured how well Twitter data performs against each of the following predictors:

  1. Income and education
  2. Smoking
  3. Diabetes
  4. Hypertension
  5. Obesity
  6. Black (%)
  7. Female (%)
  8. Married (%)
  9. Hispanic (%)

The results are surprising: as a predictor of heart disease, Twitter outperformed all the other indicators combined.


Figure 8: University of Pennsylvania’s results of Twitter vs traditional predictors as indicators for heart disease [43]
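One common way to make a comparison of this kind concrete is to correlate each model's predicted rates with the reported rates and rank the models by that correlation. The sketch below assumes Pearson correlation as the metric and uses made-up numbers purely for illustration. The names `pearson_r`, `rank_models`, `model_a`, and `model_b` are hypothetical and are not taken from the study [33].

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_models(reported, predictions_by_model):
    """Return (model_name, r) pairs sorted from best to worst agreement."""
    scores = {
        name: pearson_r(preds, reported)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up illustrative values, NOT data from the study.
reported = [10.0, 12.0, 15.0, 11.0, 18.0]
predictions = {
    "model_a": [10.5, 11.5, 14.0, 11.5, 17.0],  # hypothetical language-based model
    "model_b": [12.0, 11.0, 13.0, 14.0, 15.0],  # hypothetical traditional-factor model
}

for name, r in rank_models(reported, predictions):
    print(f"{name}: r = {r:.2f}")
```

A higher correlation means the model's predictions track the reported rates more closely. A real evaluation would also need predictions made on data the model was not fitted to, which this sketch leaves out.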

The University of Pennsylvania is far from the only research group investigating relationships like this. Another group, at UCLA, has analyzed Twitter data to monitor HIV and drug-related behavior. The UCLA researchers conclude that Twitter data allows them to understand, and possibly predict, where HIV cases and drug use occur [34].


Figure 9: UCLA mapping of HIV-related tweets [34]

Likewise, Twitter data can be used to analyze linguistic signals and how they relate to mental health. Researchers at Johns Hopkins University used Twitter and language analysis to find and classify users who express post-traumatic stress disorder, depression, bipolar disorder, and seasonal affective disorder [35]. While the method is not perfect, these researchers propose that as more novel ways of collecting and analyzing Twitter data for mental health are developed, such data may become a crucial complement to existing survey-based methods.