ID |
原文 |
译文 |
39256 |
本文所提出的方法的创新点主要在以下三个方面:1)在原有的提取各模态特征的模型基础上引入注意力机制,以此得到了视频和音频的特征选择模型,并筛选出相应的特征表达。 |
In particular, the contribution of proposed method is mainly manifested in the three aspects: 1)Using feature selection model constructed by attention-based LSTM model for choosing top k audio feature representation, and attention-based inception model for picking up visual feature representation. |
39257 |
2)使用了样本挖掘机制,剔除了无效样本,使得数据的训练更加高效。 |
2)Propose sample mining mechanism, for reconstructing the structure of the triplets that are ready to be trained, which helps the training more effective. |
39258 |
3)从计算模态间相似性和保持模态内结构不变两方面出发,利用了相应的损失函数进行模型的训练。 |
3)A novel combination of training loss function concerning inter-modal similarity and intra-modal invariance is used. |
39259 |
且所提出的模型在VEGAS数据集和自建数据集上都取得了较高的准确度。 |
Some promising experiment results evaluated by Recall K, MAP, and Precision-Recall Curves show that the proposed model can be applied to the task of video-music cross-modal retrieval. |
39260 |
发音偏误检测是计算机辅助发音训练(Computer Aided Pronunciation Training,CAPT)的重要组成部分。 |
Mispronunciation detection is an important part of computer aided pronunciation training(CAPT). |
39261 |
为了在机器辅助语料标注任务或者缺少标注语料的偏误检测任务上提高性能,本文提出解码时使用声韵母约束的扩展识别网络方法。 |
In order to improve the performance of machine aided corpus annotation task or error detection task without annotating corpus. |
39262 |
该方法将传统的语音识别中解码的自由文法循环(free grammar loop)部分换成结合声韵母交替以及字数限制规则的扩展识别网络,可以对全音素进行偏误检测,并且不会出现插入删除错误。 |
In this paper, we proposed a method of combining Deep Neural Network with initials and finals constrained extended recognition network(cERN), which replaced traditional ASR decoding's free grammar loop part with cERN. The proposed model can detect errors from all kind of phones and without insertion, deletion errors. |
39263 |
相比于传统的扩展识别网络,这种约束的扩展识别网络不需要大量的语料标注和分析。 |
Compared with the traditional extended recognition network, this constrained extended recognition network does not need a lot of corpus annotation and analysis. |
39264 |
相对于传统的发音良好度评价方法(Goodness of Pronunciation, GOP),基于这种拓展识别网络的方法不仅可以对二语学习者的发音进行正误的检测,还能给出具体的错误反馈。 |
Compared with Goodness of Pronunciation(GOP), this method can not only detect whether pronunciation is correct, but also can give learners error details. |
39265 |
实验结果表明,本文提出的基于声韵母约束拓展识别网络的方法在挑错任务上优于传统的发音质量评估(GOP)的方法,其错误接受率为29.2%,错误拒绝率为22.9%,诊断准确率为76.6%。 |
The experiment result shows that cERN is better than GOP in the task of pronunciation detection. The false accept rate is 29.2% and false reject rate is 22.9%. |