Evaluating Annotation Consistency in Offensive Language Detection: A Data Analytics Approach on the TweetEval Dataset
Abstract
Most machine learning models depend heavily not only on challenging
datasets but also on the quality of the labeled data they are trained
on, especially in offensive content detection. In this paper, we study
the TweetEval dataset and compare its ground truth with manually
annotated labels; inter-annotator agreement serves as the metric for
assessing annotation consistency. Cohen’s Kappa coefficient is used to
quantify how much each pair of annotators agreed and where they
differed. An in-depth examination of misclassifications reveals further
difficulties with manual labelling: subjective interpretation, context
dependency, and annotator bias. The insights gathered show how manual
annotation can affect subsequent model training both positively and
negatively, highlighting the importance of standardized annotation
guidelines. In practice, the findings contribute to improving offensive
content detection models by advocating for dataset reliability and the
reduction of labeling inconsistencies.
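The pairwise agreement measure mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cohen_kappa` and `pairwise_kappa`, the annotator names, and the label values are all hypothetical and chosen only to show how Cohen's Kappa compares observed agreement with the agreement expected by chance.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: share of items that received the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; treated as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def pairwise_kappa(annotations):
    """Kappa for every pair of label sources in a name -> labels mapping."""
    return {
        (a, b): cohen_kappa(annotations[a], annotations[b])
        for a, b in combinations(sorted(annotations), 2)
    }


# Hypothetical labels (1 = offensive, 0 = not offensive), not taken from TweetEval.
annotations = {
    "ground_truth": [1, 0, 1, 1, 0, 0, 1, 0],
    "annotator_1": [1, 0, 1, 0, 0, 0, 1, 0],
    "annotator_2": [1, 1, 1, 0, 0, 0, 1, 1],
}

for pair, kappa in pairwise_kappa(annotations).items():
    print(pair, round(kappa, 3))
```

A Kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so low values for a given pair point to the items where the labels should be inspected by hand.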
Keywords:
TweetEval Dataset, Annotation Consistency, Inter-Annotator Agreement, Cohen’s Kappa, Offensive Language Detection, Hybrid Models, Annotator Bias
License
Copyright (c) 2025 International Journal on Emerging Research Areas

This work is licensed under a Creative Commons Attribution 4.0 International License.
All published work in this journal is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
