Evaluating Annotation Consistency in Offensive Language Detection: A Data Analytics Approach on the TweetEval Dataset
Abstract
Most machine learning models depend heavily not only on
challenging datasets but also on the quality of the labeled
data they are trained on, especially for offensive content detection.
In this paper, we study the TweetEval dataset and compare its
ground truth with manually annotated labels, using
inter-annotator agreement as a metric for assessing the
consistency of annotation. Cohen’s Kappa coefficient
is used to quantify how much each pair of annotators agreed and
where they differed. An in-depth examination of misclassifications
reveals further difficulties with manual labelling: subjective
interpretation, context dependency, and annotator bias. The
insights gathered demonstrate how manual annotation can have
positive and negative effects on subsequent model training,
highlighting the importance of standardized annotation guidelines.
The findings contribute to improving offensive
content detection models by advocating dataset reliability and the
reduction of inconsistencies in labeling.
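As an illustration of the agreement metric named in the abstract, the following is a minimal sketch of Cohen's Kappa for one pair of annotators. It is not the authors' code: the function name `cohens_kappa` and the binary labels (1 = offensive, 0 = not offensive) are hypothetical and are not taken from the TweetEval data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same number of items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one labeled item is required")

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal proportions, summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    all_labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in all_labels) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label; agreement is total.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for eight tweets (illustrative only).
annotator_1 = [1, 0, 1, 1, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(round(cohens_kappa(annotator_1, annotator_2), 3))  # prints 0.5
```

Here the annotators agree on 6 of 8 items (observed agreement 0.75) while chance agreement is 0.5, so Kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5. The same function can be applied to each annotator against the dataset's ground-truth labels.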
Keywords:
TweetEval Dataset, Annotation Consistency, Inter-Annotator Agreement, Cohen’s Kappa, Offensive Language Detection, Hybrid Models, Annotator Bias
License
Copyright (c) 2025 International Journal on Emerging Research Areas

This work is licensed under a Creative Commons Attribution 4.0 International License.
All published work in this journal is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
