Classification of exaggerated news headlines

Rangata, Mapitsi R; Sefara, Tshephisho J

https://doi.org/10.1007/978-3-031-53731-8_20
http://hdl.handle.net/10204/13643

Abstract:

The amount of data online is increasing as companies publish news articles daily. The headlines of these articles often carry a level of exaggeration intended to attract readers. Because these companies compete with one another, writing appealing, exaggerated headlines is one way to win readers, and some exaggerated headlines contain misleading information. This paper therefore applies machine learning and natural language processing methods to detect exaggerated news headlines in the South African context. Machine learning models such as logistic regression, decision trees, support vector machines (SVM), and XGBoost are trained on labelled news headlines as a binary classification task. The models produced good results, with XGBoost and SVM obtaining 70% accuracy. The models were also evaluated with the F-measure, on which decision trees obtained 56%, followed by SVM with 53%. Classifying exaggerated news headlines is a difficult task, so we oversampled the data to obtain balanced labels, which improved the performance of the models: SVM obtained 84%, followed by logistic regression, XGBoost, and decision trees with accuracies of 78%, 72%, and 71%, respectively.
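The gap between accuracy and F-measure reported in the abstract is typical of imbalanced labels, and oversampling is one way to address it. The following is a minimal sketch of that idea, not the authors' implementation: the abstract does not say which oversampling method was used, so random oversampling (duplicating minority-class examples) is an assumption here. The toy labels, predictions, headline placeholders, and the function names `accuracy`, `f_measure`, and `random_oversample` are all hypothetical.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced.

    Assumption: plain random oversampling; the paper may use another method.
    """
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_label.values())
    out_samples, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Hypothetical toy data: 7 ordinary headlines (0) and 3 exaggerated ones (1).
headlines = [f"headline {i}" for i in range(10)]
y_true = [0] * 7 + [1] * 3
# Hypothetical predictions from a classifier biased toward the majority class.
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("F-measure:", round(f_measure(y_true, y_pred), 3))

balanced_headlines, balanced_labels = random_oversample(headlines, y_true)
print("label counts after oversampling:", Counter(balanced_labels))
```

On this toy data the accuracy is well above the F-measure, because most correct predictions come from the majority class. In practice, oversampling is usually applied only to the training split, so that duplicated examples do not leak into the evaluation data.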

Reference:

Rangata, M.R. & Sefara, T.J. 2024. Classification of exaggerated news headlines. Communications in Computer and Information Science, 2030. http://hdl.handle.net/10204/13643

Rangata, M. R., & Sefara, T. J. (2024). Classification of exaggerated news headlines. Communications in Computer and Information Science, 2030. http://hdl.handle.net/10204/13643

Rangata, Mapitsi R, and Tshephisho J Sefara. "Classification of exaggerated news headlines." Communications in Computer and Information Science, 2030 (2024). http://hdl.handle.net/10204/13643

Rangata MR, Sefara TJ. Classification of exaggerated news headlines. Communications in Computer and Information Science, 2030. 2024; http://hdl.handle.net/10204/13643.

Feb 2024

Online data increase
News headlines
Machine learning
Natural language
Exaggerated news

Source

Communications in Computer and Information Science, 2030

This item appears in the following Collection(s)

Journal Articles
