Bangla Multi-task Hatespeech Identification Shared Task @ BLP Workshop

Overview


The Bangla Multi-task Hatespeech Identification shared task addresses the complex and nuanced problem of detecting and understanding hate speech in Bangla across multiple related subtasks: the type of hate, its severity, and the targeted group. In contrast to single-task approaches, this shared task adopts a multi-task learning framework in which models jointly learn several hate speech detection objectives. This setting better reflects real-world scenarios, where identifying hate speech requires understanding not just its presence, but also its type, target, and severity.

Task Details


This shared task is designed to identify the type of hate, its severity, and the targeted group from social media content. The goal is to develop robust systems that advance research in this area. The shared task comprises three subtasks:

  1. Subtask 1A: Identify the type of hate expressed in a given text.
  2. Subtask 1B: Identify the group or entity targeted by the hate.
  3. Subtask 1C: Jointly predict the hate type, its severity, and the target (multi-task setting).

Official Evaluation Metrics

To account for the imbalance across classes, the official evaluation metric for all subtasks is macro-averaged F1 (F1-Macro), which weights every class equally regardless of its frequency.
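Macro-averaged F1 treats every class equally, which makes it suitable for imbalanced label distributions. The following is an illustrative plain-Python sketch (not the official scorer); the label strings and the `macro_f1` helper are hypothetical names for illustration:

```python
def macro_f1(gold, pred):
    """Average per-class F1 equally, so rare classes count as much as frequent ones."""
    labels = sorted(set(gold) | set(pred))
    f1s = []
    for c in labels:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

gold = ["Political Hate", "None", "Political Hate", "Religious Hate"]
pred = ["Political Hate", "None", "None", "Religious Hate"]
print(round(macro_f1(gold, pred), 4))  # → 0.7778
```

In practice participants would typically use a library implementation such as scikit-learn's `f1_score(..., average="macro")`, which computes the same quantity.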

Leaderboard


Subtask 1A


Rank username F1-Macro
1 shifat_islam 0.7362
2 SyntaxMind 0.7345
3 zannatul_007 0.7340
4 mahim_ju 0.7331
5 reyazul 0.7328
6 mohaiminulhoque 0.7323
7 nahidhasan 0.7305
8 adib709 0.7282
9 sahasourav17 0.7275
10 ashraf_989 0.7273
11 CUET-NLP_Zenith 0.7263
12 nsu_milab 0.7250
13 abid_al_hossain 0.7238
14 Penta Global Ltd 0.7178
15 mohaymen 0.7133
16 ttprama 0.7111
17 minjacodes9 0.7075
18 samin007 0.7070
19 pritampal98 0.7057
20 bahash_ai 0.7028
21 programophile 0.7013
22 fatin_anif 0.6954
23 heytamjid 0.6941
24 adriti12 0.6921
25 im_tushu_221 0.6901
26 sadman03samir 0.6871
27 cuet_sntx_srfrs 0.6867
28 abir_bot69 0.6840
29 antara_n_15 0.6815
30 UB 0.6761
31 quasar 0.6733
32 shahriar_9472 0.6689
33 intfloat 0.6634
34 naim-parvez 0.6587
35 Organizers 0.5638
36 teddymas 0.4589
37 mizba 0.1077

Subtask 1B


Rank username F1-Macro
1 mahim_ju 0.7356
2 shifat_islam 0.7335
3 mohaiminulhoque 0.7328
4 reyazul 0.7317
5 SyntaxMind 0.7317
6 zannatul_007 0.7315
7 abid_al_hossain 0.7286
8 nahidhasan 0.7279
9 adib709 0.7275
10 sahasourav17 0.7269
11 Penta Global Ltd 0.7256
12 mohaymen 0.7254
13 CUET-NLP_Zenith 0.7213
14 adriti12 0.7125
15 ashraf_989 0.7114
16 ttprama 0.7095
17 nsu_milab 0.6981
18 heytamjid 0.6979
19 pritampal98 0.6974
20 bahash_ai 0.6954
21 cuet_sntx_srfrs 0.6817
22 sadman03samir 0.6760
23 Organizers 0.5974
24 lamiaa 0.2848

Subtask 1C


Rank username F1-Macro
1 mahim_ju 0.7392
2 CUET-NLP_Zenith 0.7378
3 shifat_islam 0.7361
4 reyazul 0.7332
5 adib709 0.7312
6 mohaiminulhoque 0.7310
7 sahasourav17 0.7262
8 abid_al_hossain 0.7250
9 nur_163 0.7241
10 nahidhasan 0.7240
11 ttprama 0.7233
12 zannatul_007 0.7181
13 Penta Global Ltd 0.7159
14 pritampal98 0.7153
15 abir_bot69 0.7129
16 sadman03samir 0.7129
17 bahash_ai 0.6969
18 cuet_sntx_srfrs 0.6842
19 aacontest 0.6730
20 Organizers 0.6072
21 adriti12 0.3898

Participation


Please follow these steps to participate:

  1. Create an account on Codabench (required to participate in the competition).
  2. Register for the competition via the links below.

Competition Link

Subtask 1A

https://www.codabench.org/competitions/9559/

Subtask 1B

https://www.codabench.org/competitions/9560/

Subtask 1C

https://www.codabench.org/competitions/9561/

Dataset


Data Repository: https://github.com/AridHasan/blp25_task1

For a brief overview of the dataset, kindly refer to the README.md file located in the data directory.

Input data format

Subtask 1A

Each file uses the tsv format. A row within the tsv adheres to the following structure:

id	text	label

Where:

  - id: a unique identifier for the instance
  - text: the social media content
  - label: the type of hate (e.g., Political Hate)

Example
490273	আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই	Political Hate
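A minimal sketch for loading such a file, assuming each file carries a header row named `id`, `text`, and `label` (the path and function name below are illustrative):

```python
import csv

def read_subtask_rows(path):
    """Read an id<TAB>text<TAB>label TSV file; assumes a header row (id, text, label)."""
    with open(path, encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))

# Hypothetical usage:
# rows = read_subtask_rows("data/train.tsv")
# rows[0]["id"], rows[0]["text"], rows[0]["label"]
```

Using `csv.DictReader` with `delimiter="\t"` avoids accidental splitting on spaces inside the Bangla text field.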

Subtask 1B

Each file uses the tsv format. A row within the tsv adheres to the following structure:

id	text	label

Where:

  - id: a unique identifier for the instance
  - text: the social media content
  - label: the targeted group or entity (e.g., Organization)

Example
490273	আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই	Organization

Subtask 1C

Each file uses the tsv format. A row within the tsv adheres to the following structure:

id	text	hate_type	hate_severity	to_whom

Where:

  - id: a unique identifier for the instance
  - text: the social media content
  - hate_type: the type of hate (e.g., Political Hate)
  - hate_severity: the severity of the hate (e.g., Little to None)
  - to_whom: the targeted group or entity (e.g., Organization)

Example
490273	আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই	Political Hate	Little to None	Organization
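For the multi-task subtask, each row carries three labels instead of one. A minimal reader sketch, again assuming a header row with these column names (function and path names are illustrative):

```python
import csv

def read_multitask_rows(path):
    """Read id<TAB>text<TAB>hate_type<TAB>hate_severity<TAB>to_whom rows (header assumed)."""
    with open(path, encoding="utf-8") as f:
        return [
            (r["id"], r["text"], (r["hate_type"], r["hate_severity"], r["to_whom"]))
            for r in csv.DictReader(f, delimiter="\t")
        ]
```

Grouping the three labels into one tuple per instance keeps the joint-prediction targets aligned for multi-task training.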

Timeline


All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”).

Scorer and Official Evaluation Metrics


Scorers

The scorer for the task is located in the scorer module of the project. The scorer reports the official evaluation metric, along with other metrics, for a prediction file. It invokes the format checker to verify that the output is properly shaped, and it also checks that the provided prediction file covers all instances present in the gold file.

You can install all prerequisites with:

pip install -r requirements.txt

Launch the scorer for the task as follows:

python scorer/task.py --gold-file-path=<path_gold_file> --pred-file-path=<predictions_file>
Example
python scorer/task.py --gold-file-path data/dev.tsv --pred-file-path task_dev_output.txt
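The coverage check the scorer performs can be sketched as follows (this is an illustrative simplification, not the shared task's actual scorer code; the function name is hypothetical):

```python
def check_alignment(gold_ids, pred_ids):
    """Raise if the prediction file does not cover exactly the gold instance ids."""
    missing = set(gold_ids) - set(pred_ids)  # gold instances with no prediction
    extra = set(pred_ids) - set(gold_ids)    # predictions for unknown instances
    if missing or extra:
        raise ValueError(f"missing ids: {sorted(missing)}, extra ids: {sorted(extra)}")
```

A submission that passes this check is then scored against the gold labels.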

Baselines

The baselines module currently contains a majority-class baseline, a random baseline, and a simple n-gram baseline.

Subtask 1A

Baseline Results for the task on Dev-Test set

Model micro-F1
Random Baseline 0.1465
Majority Baseline 0.5760
n-gram Baseline 0.6075
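The majority baseline, for instance, simply predicts the most frequent training label for every test instance; a minimal sketch (the function name is illustrative, not the repository's actual implementation):

```python
from collections import Counter

def majority_baseline(train_labels, n_test):
    """Predict the most frequent training label for every test instance."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_test
```

Because the label distribution is skewed, this trivial strategy already sets a non-trivial score that learned systems must beat.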

Subtask 1B


Baseline Results for the task on Dev-Test set

Model micro-F1
Random Baseline 0.2118
Majority Baseline 0.6083
n-gram Baseline 0.6279

Subtask 1C


Baseline Results for the task on Dev-Test set

Model weighted micro-F1
Random Baseline 0.2300
Majority Baseline 0.6222
n-gram Baseline 0.6401

Format checker


The format checkers for the task are located in the format_checker module of the project. The format checker verifies that your generated results file complies with the expected format.

Before running the format checker, please install all prerequisites:

pip install -r requirements.txt

To launch it, please run the following command:

python format_checker/task.py -p results_files
Example
python format_checker/task.py -p ./subtask_1A.tsv

results_files: a single path or a space-separated list of paths
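The core check can be sketched as verifying that every row splits into the expected number of tab-separated fields (an illustrative simplification of the actual format checker; names are hypothetical):

```python
import csv

EXPECTED_COLS = ["id", "label", "model"]  # Subtask 1A/1B submission columns

def check_format(path):
    """Return True if every row has exactly the expected number of tab-separated fields."""
    with open(path, encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        return all(len(row) == len(EXPECTED_COLS) for row in reader)
```

Running such a check locally before uploading avoids losing a submission attempt to a malformed file.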

Submission


Guidelines

Evaluation consists of two phases:

  1. Development phase: This phase involves working on the dev-test set.
  2. Evaluation phase: This phase involves working on the test set, which will be released during the evaluation cycle.

For each phase, please adhere to the following guidelines:


Submission Format

Subtask 1A and 1B

The submission file format is TSV (tab-separated values). A row within the TSV adheres to the following structure:

id	label	model

Where:

  - id: the identifier of the instance
  - label: the predicted label
  - model: the name of the model/system that produced the prediction
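Writing the submission with the `csv` module keeps the separators consistent; this is a minimal sketch assuming a header row is expected (function and variable names are illustrative):

```python
import csv

def write_submission(path, predictions, model_name):
    """Write id<TAB>label<TAB>model rows; `predictions` maps instance id -> predicted label."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["id", "label", "model"])
        for ex_id, label in predictions.items():
            writer.writerow([ex_id, label, model_name])

# Hypothetical usage:
# write_submission("subtask_1A.tsv", {"490273": "Political Hate"}, "my_model_v1")
```

Run the format checker on the resulting file before uploading to Codabench.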

Subtask 1C

The submission file format is TSV (tab-separated values). A row within the TSV adheres to the following structure:

id	hate_type	hate_severity	to_whom	model

Where:

  - id: the identifier of the instance
  - hate_type: the predicted type of hate
  - hate_severity: the predicted severity of the hate
  - to_whom: the predicted targeted group or entity
  - model: the name of the model/system that produced the prediction

Organizers


Md Arid Hasan

University of Toronto
Website

Firoj Alam

Qatar Computing Research Institute
Website

Md Fahad Hossain

Daffodil International University
Website

Usman Naseem

Macquarie University
Website

Syed Ishtiaque Ahmed

University of Toronto
Website

Resources


For updates and resources, visit the GitHub repository.

To communicate with the organizers and other participants, join our Slack Channel

Data Uses

Participants must agree to use the dataset for research purposes only and cite the shared task paper and dataset source in any publication or derivative work.

Citation

@inproceedings{hasan2025multihate,
  title = {BanglaMultiHate},
}