The Bangla Multi-task Hatespeech Identification shared task is designed to address the complex and nuanced problem of detecting and understanding hate speech in Bangla across multiple related subtasks such as type of hate, severity, and target group. In contrast to single-task approaches, this shared task embraces a multi-task learning framework, where models are trained to jointly learn several hate speech detection objectives. This approach is more reflective of real-world scenarios, where identifying hate speech requires understanding not just its presence, but also its type, target, and severity.
Participants identify the type of hate, its severity, and the targeted group in social media content. The goal is to develop robust systems that advance research in this area. The shared task has three subtasks:
We choose the following evaluation metrics considering imbalance across classes:
Please follow the steps to participate:
For a brief overview of the dataset, kindly refer to the README.md file located in the data directory.
Each file uses the tsv format. A row within the tsv adheres to the following structure:
id text label
Where:
490273 আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই Political Hate
Each file uses the tsv format. A row within the tsv adheres to the following structure:
id text label
Where:
490273 আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই Organization
Each file uses the tsv format. A row within the tsv adheres to the following structure:
id text hate_type hate_severity to_whom
Where:
490273 আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই "Political Hate" "Little to None" Organization
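As a minimal sketch of how such rows could be loaded, the snippet below parses tab-separated lines into dictionaries. It is not part of the task repository: the helper name `read_tsv`, the optional-header handling, and the stripping of double quotes around labels are all assumptions based on the example rows above.

```python
import csv
import io

# Column names taken from the row structure described above.
MULTITASK_COLUMNS = ["id", "text", "hate_type", "hate_severity", "to_whom"]

def read_tsv(handle, columns):
    """Parse tab-separated rows into dicts keyed by the given column names."""
    # QUOTE_NONE keeps stray quote characters inside social media text intact.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for fields in reader:
        if not fields or fields[0] == "id":
            continue  # skip blank lines and a header line, if one is present (assumption)
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
        row = {}
        for name, value in zip(columns, fields):
            # Labels may be wrapped in double quotes; the text column is left untouched.
            row[name] = value if name == "text" else value.strip().strip('"')
        rows.append(row)
    return rows

# Hypothetical in-memory sample standing in for a real data file.
sample = '490273\texample text\t"Political Hate"\t"Little to None"\tOrganization\n'
rows = read_tsv(io.StringIO(sample), MULTITASK_COLUMNS)
print(rows[0]["hate_type"], "|", rows[0]["to_whom"])
```

For the single-label files, the same helper can be called with `["id", "text", "label"]` as the column list.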
The scorer for the task is located in the scorer module of the project. It reports the official evaluation metric, along with other metrics, for a prediction file. The scorer invokes the format checker to verify that the output is well-formed, and it also checks that the predictions file contains every tweet in the gold file.
You can install all prerequisites with:
pip install -r requirements.txt
Launch the scorer for the task as follows:
python scorer/task.py --gold-file-path=<path_gold_file> --pred-file-path=<predictions_file>
python scorer/task.py --pred_files_path task_dev_output.txt --gold_file_path data/dev.tsv
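To make the metric concrete, here is a minimal sketch of micro-averaged F1 for a single-label subtask, including a check that every gold id has a prediction. This is an illustration, not the official scorer: the function name `micro_f1`, the id-to-label dictionaries, and the placeholder labels `A`, `B`, `C` are assumptions.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over two dicts mapping example id -> label."""
    missing = set(gold) - set(pred)
    if missing:
        raise ValueError(f"predictions are missing {len(missing)} gold ids")
    tp = fp = fn = 0
    for example_id, gold_label in gold.items():
        if pred[example_id] == gold_label:
            tp += 1  # correct: a true positive for the gold class
        else:
            fp += 1  # a false positive for the predicted class
            fn += 1  # and a false negative for the gold class
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

# Placeholder labels, not real task labels.
gold = {"1": "A", "2": "A", "3": "B", "4": "C"}
pred = {"1": "A", "2": "B", "3": "B", "4": "A"}
print(micro_f1(gold, pred))  # 0.5
```

When every example has exactly one gold label and one predicted label, micro-F1 equals plain accuracy, which is why two of four correct predictions give 0.5 here.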
The baselines module currently contains a majority baseline, a random baseline, and a simple n-gram baseline.
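For intuition, a majority baseline can be sketched in a few lines: it predicts the most frequent training label for every example. The code below is a hedged illustration rather than the code in the baselines module; the function names and the placeholder labels are assumptions.

```python
from collections import Counter

def majority_label(train_labels):
    """Return the most frequent label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(train_labels, n_examples):
    """Predict the majority training label for each of n_examples items."""
    return [majority_label(train_labels)] * n_examples

# Placeholder labels, not real task labels.
train_labels = ["A", "B", "A", "C"]
predictions = predict_majority(train_labels, 3)
print(predictions)  # ['A', 'A', 'A']
```

Because this baseline ignores the input text, its score simply reflects how dominant the largest class is, which makes it a useful floor for comparing real systems.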
Baseline Results for the task on Dev-Test set
Model | micro-F1 |
---|---|
Random Baseline | 0.1465 |
Majority Baseline | 0.5760 |
n-gram Baseline | 0.6075 |
Baseline Results for the task on Dev-Test set
Model | micro-F1 |
---|---|
Random Baseline | 0.2118 |
Majority Baseline | 0.6083 |
n-gram Baseline | 0.6279 |
Baseline Results for the task on Dev-Test set
Model | weighted micro-F1 |
---|---|
Random Baseline | 0.2300 |
Majority Baseline | 0.6222 |
n-gram Baseline | 0.6401 |
The format checkers for the task are located in the format_checker module of the project. The format checker verifies that your generated results file complies with the expected format.
Before running the format checker, please install all prerequisites:
pip install -r requirements.txt
To launch it, please run the following command:
python format_checker/task.py -p results_files
python format_checker/task.py -p ./subtask_1A.tsv
results_files: can be one path or space-separated list of paths
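The kind of check a format checker performs can be sketched as follows. This is not the project's format checker: the function name `check_lines`, the default column tuple, the optional-header handling, and the duplicate-id rule are all assumptions made for illustration.

```python
def check_lines(lines, expected_columns=("id", "label", "model")):
    """Return a list of problems; an empty list means the lines look well-formed."""
    problems = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if number == 1 and tuple(fields) == tuple(expected_columns):
            continue  # accept an optional header line (assumption)
        if len(fields) != len(expected_columns):
            problems.append(
                f"line {number}: expected {len(expected_columns)} fields, found {len(fields)}"
            )
            continue
        if fields[0] in seen_ids:
            problems.append(f"line {number}: duplicate id {fields[0]}")
        seen_ids.add(fields[0])
    return problems

# Hypothetical results lines with one short row and one repeated id.
bad_lines = ["1\tA\n", "2\tB\tmy-model\n", "2\tA\tmy-model\n"]
for problem in check_lines(bad_lines):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets a participant fix every issue in a results file in a single pass.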
Evaluation consists of two phases:
For each phase, please adhere to the following guidelines:
The submission file format is tsv (tab-separated values). A row within the tsv adheres to the following structure:
id label model
Where:
The submission file format is tsv (tab-separated values). A row within the tsv adheres to the following structure:
id hate_type hate_severity to_whom model
Where:
For updates and resources, visit the GitHub repository.
To get in touch, join our Slack channel.
Participants must agree to use the dataset for research purposes only and cite the shared task paper and dataset source in any publication or derivative work.
@inproceedings{hasan2025multihate,
title="BanglaMultiHate",
}