This work analyzes tweets linking to scientific papers to determine whether they express positive opinions, negative opinions, or no opinion at all. The results inform the interpretation of tweets as a measure of impact in the context of altmetrics. The following research questions are examined:
- To what extent can sentiment analysis be used to detect positive or negative statements about scientific papers expressed on Twitter?
- Do tweets linking to scientific papers express positive or negative opinions? How do sentiments differ by academic discipline?
- How do the results affect the interpretation of tweets linking to scientific papers as an altmetric indicator?
Table of Contents
1. Introduction
2. Materials and Methods
2.1 Dataset
2.1.1 Bibliographic information of tweeted documents
2.1.2 Tweets
2.2 Sentiment analysis
2.2.1 The definition of a sentiment
2.2.2 Sentiment analysis
2.3 Methods
2.3.1 Intellectual coding of sentiments
2.3.2 Removing Twitter affordances
2.3.3 Removing title terms
2.3.4 Adapting the lexicon
2.3.5 Removing non-English terms
2.3.6 Calculating sentiments per discipline
3. Results and Discussions
3.1 The ground truth
3.1.1 Sentiment analysis I
3.1.2 Sentiment analysis II
3.1.3 Sentiment analysis III
3.1.4 Sentiment analysis IV
3.2 Automated analysis of all tweets
3.3 Results
3.4 Discipline specific results
4. Conclusion
Research Objectives and Themes
This thesis aims to analyze whether tweets linking to scientific papers express sentiment and how these sentiments can be detected automatically to inform the use of tweets as an altmetric indicator for research impact.
- Methodological development for sentiment analysis in scientific contexts
- Evaluation of sentiment analysis tools (Sentiment140 and SentiStrength)
- Assessment of scholarly discourse on Twitter across academic disciplines
- Comparison of manual intellectual coding vs. automated sentiment detection
- Investigation into the validity of Twitter activity as a measure of scientific impact
Excerpt from the Book
2.3.3 Removing title terms
Because certain research-specific terms caused tweets to be misclassified, it was decided to remove all terms that appear in both the tweet and the title of the linked article. To make the terms in titles and tweets comparable, both strings were split on spaces. In the final step, each term from the tweet string was compared with each term from the title string; terms occurring in both the title and the tweet were deleted from the tweet. Title terms reflect only the content or topic of the linked article. The scientific terms that led to misallocations may occur not only in the titles but also in the abstracts or full texts of the articles; however, only the title terms were removed, because their occurrence in the tweets is certain.
The remaining terms were used for the sentiment analysis as they were believed to contain sentiment-carrying information and the user’s opinion towards the linked article. The dataset was named ta.
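The title-removal step described above can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the thesis's actual code: the function name `remove_title_terms` and the case-insensitive matching are assumptions, since the excerpt only specifies that both strings are split on spaces and shared terms are deleted from the tweet.

```python
def remove_title_terms(tweet: str, title: str) -> str:
    """Delete every term from the tweet that also appears in the
    linked article's title. Hypothetical sketch of the procedure
    described in section 2.3.3; matching is assumed case-insensitive."""
    title_terms = {term.lower() for term in title.split()}
    kept = [term for term in tweet.split() if term.lower() not in title_terms]
    return " ".join(kept)

# Only the user's own commentary survives; terms shared with the
# title ("cancer", "treatment", "outcomes") are stripped.
print(remove_title_terms(
    "Great read on cancer treatment outcomes",
    "Cancer treatment outcomes in elderly patients"))
```

The surviving terms ("Great read on") are then what the sentiment tools score, which matches the rationale above: title terms carry topic, not opinion.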
Summary of Chapters
1. Introduction: This chapter defines the research context of altmetrics in scholarly communication and outlines the primary research questions regarding the sentiment of tweets linking to scientific publications.
2. Materials and Methods: This section details the data collection of 663,547 tweets and the subsequent processing steps, including intellectual coding and the adaptation of automated sentiment analysis tools.
3. Results and Discussions: This chapter presents the evaluation of different cleaning methods and analyzes the sentiment distribution of tweets across various academic disciplines.
4. Conclusion: The concluding chapter summarizes the findings, confirms that while scholars do express opinions on Twitter, most tweets lack sentiment, and discusses the implications for altmetrics.
Keywords
Sentiment Analysis, Twitter, Altmetrics, Scientific Papers, Scholarly Communication, Opinion Mining, Text Processing, Automated Detection, Academic Impact, Lexicon Adaptation, SentiStrength, Sentiment140, Web of Science, Intellectual Coding, Research Evaluation
Frequently Asked Questions
What is the core focus of this research?
The research investigates whether tweets that link to scientific papers contain sentiments (positive or negative opinions) and explores how effectively these sentiments can be identified using automated tools.
What are the primary themes discussed?
The work covers scientific communication on social media, the technical challenges of sentiment analysis in academic contexts, the validity of altmetrics, and the adaptation of sentiment lexicons.
What is the central research question?
The study examines if sentiment analysis can accurately detect statements toward scientific papers on Twitter, how these sentiments vary by discipline, and what this implies for the use of tweets as impact indicators.
Which scientific methods are applied?
The research utilizes quantitative methods, specifically comparing manual "ground truth" coding of a sample of 1,000 tweets against automated results from tools like SentiStrength and Sentiment140.
What does the main body cover?
The main body focuses heavily on methodology, particularly the complex cleaning processes required to remove noise (like paper titles and Twitter-specific affordances) from the dataset to improve analysis accuracy.
Which keywords best characterize this work?
The work is characterized by terms such as Altmetrics, Sentiment Analysis, Twitter, Scholarly Communication, and Lexicon Adaptation.
Why was it necessary to remove paper titles from the tweets?
It was discovered that automated tools frequently misinterpreted scientific terms in paper titles (e.g., 'cancer', 'violence') as sentiment-bearing words, causing neutral tweets to be wrongly classified as negative.
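The misclassification mechanism is easy to illustrate with a toy lexicon-based scorer. The word weights below are invented for illustration and are not SentiStrength's or Sentiment140's actual lexicon entries:

```python
# Hypothetical mini-lexicon: lexicon-based tools assign each known
# word a polarity weight and sum the weights over the text.
LEXICON = {"cancer": -2, "violence": -3, "great": 2, "interesting": 1}

def lexicon_score(text: str) -> int:
    """Sum the polarity weights of all known words (unknown words score 0)."""
    return sum(LEXICON.get(word.lower(), 0) for word in text.split())

# A topically neutral tweet scores negative purely because the
# paper's subject term "cancer" appears in the lexicon.
print(lexicon_score("Interesting paper on cancer screening"))  # prints -1
```

Removing title terms before scoring (section 2.3.3) prevents exactly this kind of topic-driven false negative.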
How do academic disciplines differ in their Twitter sentiment?
The research indicates that disciplines such as Arts and Humanities show significantly higher levels of sentiment than disciplines such as Chemistry or Mathematics, where tweets are mostly neutral.
- Cite this work
- Natalie Friedrich (Author), 2015, Applying sentiment analysis for tweets linking to scientific papers, Munich, GRIN Verlag, https://www.grin.com/document/312043