Employing text mining to investigate language attitudes and social identity on Chinese social media

  • Zhenzhen Zhang
  • Haomin Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Language attitudes are a central focus in sociolinguistics, and the methods used to study them continue to advance. This study uses text mining: about 4.3k relevant Weibo posts were retrieved by keyword to build a corpus. BERTopic analysis reveals that discussions on Weibo focus primarily on three themes: cultural entertainment, language and regional issues, and education and exams. These topics show both short-term viral dissemination and long-term persistence. Manual coding indicates that discussions of language attitudes predominantly revolve around social contexts. Sentiment analysis shows that neutral posts form the largest share (36.72%), with positive sentiment (33.36%) slightly exceeding negative sentiment (29.92%). Further analysis highlights that language attitudes are shaped by interpersonal emotional projection, social recognition and identity construction, and the functional role of language as a communication tool. Moreover, a societal expectation of ‘accent standardization’ is evident in discussions concerning both Mandarin and regional dialects.
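
The keyword-retrieval and sentiment-tabulation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the keyword list, the function names `build_corpus` and `sentiment_shares`, and the label strings are all hypothetical, and the topic-modelling (BERTopic) and sentiment-classification steps are assumed to happen elsewhere.

```python
from collections import Counter

# Hypothetical search terms (Mandarin, dialect, accent); the study's actual
# keywords are not listed in this record.
KEYWORDS = ("普通话", "方言", "口音")


def build_corpus(posts, keywords=KEYWORDS):
    """Keep posts that mention at least one keyword, dropping exact duplicates."""
    seen = set()
    corpus = []
    for text in posts:
        text = text.strip()
        if text and text not in seen and any(k in text for k in keywords):
            seen.add(text)
            corpus.append(text)
    return corpus


def sentiment_shares(labels):
    """Return each sentiment label's percentage share, rounded to 2 decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 2) for label, n in counts.items()}
```

Given one sentiment label per post from any classifier, `sentiment_shares` yields the kind of percentage breakdown the abstract reports for neutral, positive, and negative posts.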

Original language: English
Journal: International Journal of Multilingualism
State: Accepted/In press - 2025
Externally published: Yes

Keywords

  • BERTopic
  • Language attitudes
  • sentiment analysis
  • social media
  • text mining
