BREAKING NEWS: Psycholinguistic and Behavioural Differences in (Un)‐Trustworthy Online News Source Interaction (Poster)

Aluffi, Pietro Alessandro; Meynent, Léo; Stachl, Clemens

Conference Object

Author(s) / Creator(s)

Aluffi, Pietro Alessandro

Meynent, Léo

Stachl, Clemens

Abstract / Description

The psychological need to share information, and the impact of that sharing, have been amplified by the advent of online activities and social media. The pervasiveness of unfiltered and non-curated information on social media exposes users to untrustworthy information (i.e., low-factual information sources). This exposure makes it increasingly challenging for users to discern trustworthy from untrustworthy online news sources. Moreover, the social character of these platforms raises questions about the diffusion of untrustworthy news sources. Understanding the characteristics of users who share untrustworthy online news sources could help identify and prevent the diffusion of untrustworthy content. Past work has mainly focused on the experimental investigation of specific individual differences in the interaction with untrustworthy online news sources. To comprehensively analyse the human factors related to the online dissemination of untrustworthy news, a better understanding of users' characteristics in the complexity of a real-world setting is called for. Here, we aim to answer the following research question: What are the systematic differences in users' demographic characteristics, psycholinguistic characteristics, and online posting behaviour with regard to interaction with (un)trustworthy news sources? To address this question, we use open-access social media data from the Reddit platform (Pushshift) and independent data on the political bias and trustworthiness of online media (e.g., Media Bias/Fact Check) to identify groups of users who tend to share less factual content on those platforms. We assign users to two groups: trustworthy news sharers and untrustworthy news sharers. For group assignment, we adopt community embeddings to ensure that users in the two groups belong to similar communities. For predictive modelling, we extract three types of features: psycholinguistic, online posting behaviour, and demographic.
First, we quantify psycholinguistic characteristics by calculating Linguistic Inquiry and Word Count (LIWC) scores from the text of comments that users leave on posts containing a news article. We complement the LIWC scores with general text-based characteristics such as lexical diversity and readability scores. Second, we extract online posting behaviour features based on the popularity of the user, posting frequency, and network size. Lastly, we use a Bidirectional Encoder Representations from Transformers (BERT) model trained on the RedDust dataset to infer users' demographic characteristics such as age and gender. We use the combined set of features in a machine learning approach (e.g., random forest) to predict group membership from Redditors' characteristics. Furthermore, to gain insight into which of these factors has a predominant impact on group membership, we perform a Shapley value analysis to extract and interpret feature importance. Insights from this project will contribute to a more nuanced understanding of how users' characteristics are associated with different consumption patterns of (un)trustworthy news sources, and may help in predicting the process of misinformation spread. We anticipate that our findings can be used to identify groups who are more susceptible to consuming and spreading disinformation. This in turn could help individuals make informed decisions and avoid exposure to false information. Finally, we will discuss the implications of our findings for the development of interventions and policies to debunk disinformation and to increase media literacy.
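
The feature-importance step mentioned in the abstract can be illustrated with a minimal sketch of how Shapley values attribute a model score to individual features. This is not the authors' implementation: the scoring function `toy_score` is a hypothetical stand-in for a trained classifier (the abstract names a random forest as one option), the feature names and values are made up for illustration, and the all-zero baseline is an assumption. The sketch computes exact Shapley values by averaging each feature's marginal contribution over every feature ordering, which is only feasible for a handful of features; in practice an approximation from a dedicated library would likely be used.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Averages each feature's marginal contribution to `predict`
    over all orderings in which features are switched from the
    baseline value to the instance value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)        # start from the baseline user
        previous = predict(current)
        for name in order:
            current[name] = instance[name]   # reveal one feature
            score = predict(current)
            totals[name] += score - previous  # marginal contribution
            previous = score
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_score(x):
    # Hypothetical score, NOT a fitted model: one additive term
    # plus one interaction between two text-based features.
    return (0.5 * x["posting_frequency"]
            + 0.2 * x["lexical_diversity"] * x["readability"])


# Illustrative values only; the baseline of zeros is an assumption.
baseline = {"posting_frequency": 0.0, "lexical_diversity": 0.0, "readability": 0.0}
user = {"posting_frequency": 2.0, "lexical_diversity": 1.0, "readability": 3.0}

phi = shapley_values(toy_score, user, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the additive term is credited entirely to `posting_frequency`, while the interaction term is split evenly between `lexical_diversity` and `readability`. The attributions sum to the difference between the user's score and the baseline score, which is the property that makes Shapley values convenient for interpreting group-membership predictions.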

Persistent Identifier

https://doi.org/10.23668/psycharchives.13024

Date of first publication

2023-07-24

Is part of

Big Data & Research Syntheses 2023, Frankfurt, Germany

Publisher

ZPID (Leibniz Institute for Psychology)

Aluffi_Poster.pdf

Adobe PDF - 419.71KB

MD5: e109489b737a332bd9f2a0b10bef4ea7

Sharing Level 0 (Public Use) CC-BY-SA 4.0

Is related to

Conference Object
BREAKING NEWS: Psycholinguistic and Behavioural Differences in (Un)‐Trustworthy Online News Source Interaction (Slides)

Aluffi, Pietro Alessandro & Meynent, Léo & Stachl, Clemens, 2023-07-18, ZPID (Leibniz Institute for Psychology)


There are no other versions of this object.

Author(s) / Creator(s)

Aluffi, Pietro Alessandro
Meynent, Léo
Stachl, Clemens
PsychArchives acquisition timestamp

2023-07-24T10:51:52Z
Made available on

2023-07-24T10:51:52Z
Publication status

unknown
Review status

unknown
External description on another website

http://www.ressyn-bigdata.org
Persistent Identifier

https://hdl.handle.net/20.500.12034/8523
Language of content

eng
Is related to

https://hdl.handle.net/20.500.12034/8510
Dewey Decimal Classification number(s)

150
DRO type

conferenceObject
Visible tag(s)

ZPID Conferences and Workshops