On July 4, 2022, registration officially opened for the ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge (CSSD), jointly organized by the Institute of Acoustics of the Chinese Academy of Sciences, Northwestern Polytechnical University, the Institute for Infocomm Research at A*STAR, Singapore, Shanghai Jiao Tong University, and Magic Data (Beijing Aishu Smart Technology Co., Ltd.). Groups and individuals from academia and industry are welcome to register for the competition.
CSSD Challenge Registration: https://magichub.com/join-competition/?id=11559
Challenge Background
Dialogue is one of the most essential and most challenging scenarios for speech processing technology. In daily conversation, people respond to each other casually and carry the exchange forward with connected questions and comments rather than bluntly answering one another. Accurately detecting each speaker's voice activity in a conversation is therefore critical for many downstream tasks, such as natural language processing and machine translation. The diarization error rate (DER) has long served as the standard evaluation metric for speaker diarization, but because it weights errors by duration, it pays too little attention to short conversational phrases. These phrases last only a moment yet play an essential role at the semantic level, and the speech community still lacks a metric that effectively evaluates diarization accuracy on them.
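
To make this weakness concrete, here is a minimal sketch using the open-source pyannote.metrics toolkit (our choice for illustration only; the challenge does not prescribe any particular tool). It scores a system that attributes an entire recording, including a short back-channel turn, to a single speaker:

from pyannote.core import Annotation, Segment
from pyannote.metrics.diarization import DiarizationErrorRate

# Ground-truth speaker turns (times in seconds).
reference = Annotation()
reference[Segment(0.0, 4.0)] = "spk_A"
reference[Segment(4.0, 4.6)] = "spk_B"   # a short back-channel phrase
reference[Segment(4.6, 9.0)] = "spk_A"

# System output: the whole recording assigned to one speaker.
hypothesis = Annotation()
hypothesis[Segment(0.0, 9.0)] = "spk_1"

metric = DiarizationErrorRate()
print(f"DER = {metric(reference, hypothesis):.3f}")  # 0.6 / 9.0 ≈ 0.067

Mis-attributing the 0.6 s turn costs under 7% DER, even though an entire semantically meaningful response has been lost.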
To address this problem, we have open-sourced MagicData-RAMC, a Chinese conversational speech dataset containing 180 hours of manually annotated dialogue. For the CSSD evaluation, we have also prepared 20 hours of conversational test data with precisely, manually annotated speaker timestamps. In addition, we have designed a new accuracy metric for the challenge that scores speaker diarization at the sentence level. By advancing research on diarization of conversational data, we aim to further promote reproducible research in this field.
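
The official metric is defined by the organizers and released with the evaluation plan; purely to illustrate the idea, a sentence-level score can weight every utterance equally instead of weighting errors by duration:

def utterance_error_rate(pairs):
    """pairs: one (reference_speaker, hypothesized_speaker) tuple per
    sentence, with hypothesis labels already mapped to reference
    speakers. Every sentence counts equally, whatever its duration."""
    wrong = sum(ref != hyp for ref, hyp in pairs)
    return wrong / len(pairs)

# The toy conversation above has three sentences; only the short
# back-channel is mislabeled, yet the sentence-level error is 1/3,
# versus roughly 0.067 at the time level.
print(utterance_error_rate([("spk_A", "spk_A"),
                            ("spk_B", "spk_A"),   # the 0.6 s phrase
                            ("spk_A", "spk_A")]))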
Challenge Committee and Support Team
Questions about the challenge can be emailed to iscslp.cssd@gmail.com or open@magicdatatech.com with the subject line "Question about the Conversational Short-phrase Speaker Diarization Challenge".
Schedule
Scoring Method
Participants submit their inference results, and the competition committee computes the scores. The submission file format and the evaluation metric will be announced when the training stage of the competition opens.
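
While the official format is still to be announced, diarization results are conventionally exchanged in NIST RTTM format, one speaker turn per line. A hypothetical submission (the recording ID recording_0001 is invented for illustration) might look like:

SPEAKER recording_0001 1 0.00 4.00 <NA> <NA> spk_A <NA> <NA>
SPEAKER recording_0001 1 4.00 0.60 <NA> <NA> spk_B <NA> <NA>
SPEAKER recording_0001 1 4.60 4.40 <NA> <NA> spk_A <NA> <NA>

The columns are record type, recording ID, channel, turn onset and duration in seconds, and the speaker label, with "<NA>" filling the unused fields.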
Prizes
Three winning teams or individuals will be awarded the first, second, and third prizes, and the winners will have the opportunity to present their work at the ISCSLP 2022 conference.
Registration
Registration website: https://magichub.com/join-competition/?id=11559
Number of participants: at most 5 per team (5-member teams allowed)
More details: https://www.magichub.com
All challengers are welcome to sign up for the competition!