Evaluating Annotation Consistency in Offensive Language Detection: A Data Analytics Approach on the TweetEval Dataset
Abstract
Most machine learning models depend heavily not only on challenging
datasets but also on the quality of the labeled data they are
trained on, especially for offensive content detection.
In this paper, we study the TweetEval dataset and compare its
ground truth with manually annotated labels, using
inter-annotator agreement as the metric for assessing
annotation consistency. Cohen’s Kappa coefficient
is used to quantify how much each pair of annotators agreed and
where they differed. An in-depth examination of misclassifications
reveals further difficulties with manual labelling: subjective
interpretation, context dependency, and annotator bias. The
insights gathered show how manual annotation can affect
subsequent model training both positively and negatively,
highlighting the importance of standardized annotation guidelines.
The findings contribute to improving offensive
content detection models by promoting dataset reliability and
reducing inconsistencies in labeling.
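As an illustration of the agreement metric named in the abstract, Cohen's Kappa for two annotators is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies. The following is a minimal sketch, not the authors' implementation; the function names, annotator names, and label values are hypothetical and are not drawn from the paper's data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # p_o: fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # p_e: chance agreement from each annotator's marginal label counts.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def pairwise_kappa(annotations):
    """Kappa for every pair of annotators in a {name: labels} mapping."""
    return {
        (x, y): cohen_kappa(annotations[x], annotations[y])
        for x, y in combinations(sorted(annotations), 2)
    }


def disagreements(labels_a, labels_b):
    """Indices of items where the two label sequences differ."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


if __name__ == "__main__":
    # Hypothetical labels (1 = offensive, 0 = not offensive), for illustration only.
    annotations = {
        "ground_truth": [1, 0, 1, 1, 0, 0, 1, 0],
        "annotator_1": [1, 0, 1, 0, 0, 0, 1, 1],
    }
    for pair, kappa in pairwise_kappa(annotations).items():
        print(pair, round(kappa, 3))
    print(disagreements(annotations["ground_truth"], annotations["annotator_1"]))
```

The disagreement indices returned by the last helper are the items one would inspect by hand when examining where annotators and the ground truth diverge.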
Keywords:
TweetEval Dataset, Annotation Consistency, Inter-Annotator Agreement, Cohen’s Kappa, Offensive Language Detection, Hybrid Models, Annotator Bias
License
Copyright (c) 2025 International Journal on Emerging Research Areas

This work is licensed under a Creative Commons Attribution 4.0 International License.
All published work in this journal is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
