Sunday, 22 December 2024
A role for qualitative methods in researching Twitter data on a popular science article's communication
Written for scholars and students who are interested in using qualitative research methods for research with small data, such as tweets on X.
The open-access paper that I co-authored with Dr Corrie Uys, Dr Pat Harpur and Prof Izak van Zyl, 'A role for qualitative methods in researching Twitter data on a popular science article's communication', identifies several potential qualitative research contributions in analysing small data from microblogging communications:
Qualitative research can provide a rich contextual framing for how micro-practices (such as tweet shares for journal articles...) relate to important social dynamics (... like debates on paradigms within higher-level social strata in the Global Health Science field) plus professionals' related identity work. Also, in-depth explorations of microblogging data following qualitative methods can contribute to the research process by supporting meta-level critiques of missing data, (mis-) categorisations, and flawed automated (and manual) results.
Published in Frontiers in Research Metrics and Analytics journal's special topic, Network Analysis of Social Media Texts, our paper responds to calls from Big Data communication researchers for qualitative analysis of online science conversations to better explore their meaning. We identified a scholarly gap in the Science Communication field regarding the role that qualitative methods might play in researching small data regarding micro-bloggers' article communications. Although social media attention assists with academic article dissemination, qualitative research into related microblogging practices is scant. To support calls for the qualitative analysis of such communications, we provided a practical example:
Mixed methods were applied to better understand an unorthodox, but popular, article (Diet, Diabetes Status, and Personal Experiences of Individuals with Type 2 diabetes Who Self-Selected and Followed a Low Carbohydrate High Fat diet) and its shares by Twitter users over two years. Big Data studies describe patterns in micro-bloggers' activities from large sets of data. In contrast, this small data set was analysed in NVivo™ by me (a pragmatist), and in MAXQDA™ by Corrie (a statistician). As part of the data preparation and cleaning, a comprehensive view of hyperlink sharing and conversations was developed, which quantitative extraction alone could not support. For example, such extraction neglects the general publication paths that fall outside listed academic publications, and related formal correspondence (such as academic letters, and sharing via open resources).
My multimodal content analysis found that links related to the article were largely shared by health professionals. Its popularity related to its position as a communication event within a longstanding debate in the Health Sciences. This issue arena sees an emergent Insulin Resistance (IR) paradigm contesting the dominant “cholesterol” model of chronic disease development. The profiles of the health experts who shared the article reflected support for the emergent IR paradigm. We identified that these professionals followed a wider range of deliberation practices than previously described by quantitative SciComm Twitter studies. Practices ranged from including the article in a lecture reading list, to language localisation by translating the article's title from English to Spanish, and study participants mentioning their involvement. Contributing under their genuine identities, expert contributors carried the formal norms for civil communication into the scientific Twitter genre. There were no original URL shares from IR critics, suggesting that sharing evidence for an unconventional low-carbohydrate, healthy fats approach might be viewed as undermining orthodox identities. However, critics did respond with pro-social replies and constructive criticism linked to the article's content and its methodological limitations.
The statistician's semantic network analysis (SNA) confirmed that terms used by the article's tweeters related strongly to the article's content, and that its discussion was pro-social. A few prominent IR individual advocates and organisations shared academic links to the article repeatedly, with its most influential tweeters and sharers being from England and South Africa. Using the automated sentiment analysis tools in Atlas.ti and MAXQDA, the statistician found many instances where sentiment was inaccurately described as negative when it should have been positive. This suggested a methodological limitation of quantitative approaches, such as automated analysis in qualitative data analysis software (QDAS), in (i) accurately analysing microblogging data. The SNA also uncovered concerns with (ii) incorrect automated counts for link shares. Concerns i & ii indicate how microblogging statistics may oversimplify complex categories, leading to inaccurate comparisons. In response, close readings of microblogging data present a distinct opportunity for meta-critique. Qualitative research can support critiques of microblogging data sources, as well as of their use in QDAS. The lack of support for analysing static Twitter data spreadsheets was also concerning.
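To illustrate how a close reading might flag concern (i), here is a minimal Python sketch that compares automated sentiment labels with labels assigned during manual reading. It is not the procedure used in the paper. The field names (`id`, `auto`, `manual`), the function names and the sample records are all hypothetical, and the labels are invented rather than taken from the study's data.

```python
from collections import Counter

# Hypothetical records: each tweet has an automated sentiment label and a
# label assigned during manual close reading. All values are invented.
SAMPLE = [
    {"id": "t1", "auto": "negative", "manual": "positive"},
    {"id": "t2", "auto": "positive", "manual": "positive"},
    {"id": "t3", "auto": "negative", "manual": "negative"},
    {"id": "t4", "auto": "negative", "manual": "positive"},
]


def disagreements(rows):
    """Return the tweets whose automated label differs from the manual one."""
    return [row for row in rows if row["auto"] != row["manual"]]


def label_pairs(rows):
    """Count (manual, auto) label pairs, giving a simple confusion table."""
    return Counter((row["manual"], row["auto"]) for row in rows)


if __name__ == "__main__":
    # List each tweet that a researcher may want to re-read in context.
    for row in disagreements(SAMPLE):
        print(row["id"], "manual:", row["manual"], "auto:", row["auto"])
    # Number of tweets read as positive but auto-labelled negative.
    print(label_pairs(SAMPLE)[("positive", "negative")])
```

A table like this only shows where the two sets of labels diverge. Deciding which label is right still depends on reading each tweet in its conversational context.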
Meta-inferences were then derived from the two methods' varied claims above. These findings flagged the importance of contextualising a health science article's sharing in relation to tweeters' professional identities and stances on what is healthy. In addition, meta-critiques spotlighted challenges with preparing accurate tweet data, and their analysis via qualitative data analysis software. Such findings suggest the valuable contributions that qualitative research can make to research with microblogging data in science communication.
The manuscript's development history
In 2020, Dr Pat Harpur and I selected an outlier IR scientific publication based on its unusually high Twitter popularity. At that time, the editorial, 'It is time to bust the myth of physical inactivity and obesity: you cannot outrun a bad diet' had been tweeted about over 3,000 times (now nearing 4,000 according to Altmetric!). However, analysing this highly popular outlier stalled after its static export in qualitative data analysis software proved unsuitable for efficient coding. The large volume of tweet data also proved very difficult to analyse. Accordingly, we shifted focus to a popular article that had been shared as an episode of a broader, long-running IR versus cholesterol debate. Even with its relatively small volume of tweets, organising this data for qualitative analysis proved challenging. For example, it was necessary to refine the Python extraction code, while cross-checks of the static export against Twitter search results necessitated the capture of “missing” conversations.
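A cross-check of this kind can be sketched in a few lines of Python. This is an illustration only and not the extraction code used for the paper. It assumes that each tweet can be identified by an ID string, and the function name and both ID lists are hypothetical.

```python
def missing_conversations(export_ids, search_ids):
    """Return tweet IDs found via manual search but absent from the export."""
    return sorted(set(search_ids) - set(export_ids))


# Hypothetical tweet IDs, invented for illustration.
EXPORT_IDS = ["1001", "1002", "1004"]                  # from the static export
SEARCH_IDS = ["1001", "1002", "1003", "1004", "1005"]  # from manual searches

if __name__ == "__main__":
    # Tweets to capture by hand before qualitative coding begins.
    print(missing_conversations(EXPORT_IDS, SEARCH_IDS))
```

The set difference only lists candidates for manual capture. Whether a "missing" tweet belongs in the data set remains a judgement for the researcher.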
We originally developed a multimodal analysis of these tweets, which focused on their relationship to Twitter users' profiles, potentially reflecting a wide range of communication goals. Our manuscript was submitted in 2022 to Science Communication, where Professor Susanna Priest kindly gave in-depth feedback on changing the original manuscript's contribution to a methodological one. We tackled this by developing a rationale for qualitative research with small data in the majorly revised article, for which Dr Corrie Uys did a semantic network analysis, while I revisited the social semiotic analysis.
If you have any questions, comments or concerns about our article, please comment below.