Chinese spelling correction method based on ChineseBert

Fan Cui, Jipeng Qiang, Yi Zhu, Yun Li


Table 2 Experimental results of different algorithms on the SIGHAN2013, SIGHAN2014, and SIGHAN2015 test sets


Column abbreviations: Char. = character level; Sent. = sentence level; Det. = detection; Cor. = correction.

| Dataset | Model | Char. Det. P | Char. Det. R | Char. Det. F | Char. Cor. P | Char. Cor. R | Char. Cor. F | Sent. Det. Acc | Sent. Det. P | Sent. Det. R | Sent. Det. F | Sent. Cor. Acc | Sent. Cor. P | Sent. Cor. R | Sent. Cor. F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SIGHAN2013 | SpellGCN | 82.6% | 88.9% | 85.7% | 98.4% | 88.4% | 93.1% | (-) | 80.1% | 74.4% | 77.2% | (-) | 78.3% | 72.7% | 75.4% |
| SIGHAN2013 | REALISE | (-) | (-) | (-) | (-) | (-) | (-) | 82.7% | 88.6% | 82.5% | 85.4% | 81.4% | 87.2% | 81.2% | 84.1% |
| SIGHAN2013 | DCN | (-) | (-) | (-) | (-) | (-) | (-) | (-) | 86.8% | 79.6% | 83.0% | (-) | 84.7% | 77.7% | 81.0% |
| SIGHAN2013 | Roberta | 80.5% | 88.0% | 84.1% | 98.0% | 86.5% | 91.9% | 77.3% | 85.1% | 76.9% | 80.8% | 75.6% | 83.6% | 76.0% | 79.6% |
| SIGHAN2013 | ChineseBert | 79.4% | 91.2% | 84.9% | 98.1% | 95.3% | 96.7% | 81.4% | 85.6% | 81.3% | 83.4% | 80.0% | 84.1% | 79.9% | 81.9% |
| SIGHAN2013 | SepSpell | 78.9% | 91.4% | 84.7% | 98.4% | 95.4% | 96.9% | 83.9% | 88.5% | 84.0% | 86.2% | 82.7% | 87.2% | 82.8% | 84.9% |
| SIGHAN2014 | SpellGCN | 83.6% | 78.6% | 81.0% | 97.2% | 76.4% | 85.5% | (-) | 65.1% | 69.5% | 67.2% | (-) | 63.1% | 67.2% | 65.3% |
| SIGHAN2014 | REALISE | (-) | (-) | (-) | (-) | (-) | (-) | 78.4% | 67.8% | 71.5% | 69.6% | 77.7% | 66.3% | 70.0% | 68.1% |
| SIGHAN2014 | DCN | (-) | (-) | (-) | (-) | (-) | (-) | (-) | 67.4% | 70.4% | 68.9% | (-) | 65.8% | 68.7% | 67.2% |
| SIGHAN2014 | Roberta | 82.6% | 78.0% | 80.2% | 96.9% | 75.9% | 85.1% | 74.1% | 61.2% | 67.3% | 64.1% | 73.6% | 60.3% | 66.4% | 63.2% |
| SIGHAN2014 | ChineseBert | 80.3% | 79.4% | 79.8% | 97.1% | 88.4% | 92.5% | 77.1% | 66.0% | 68.1% | 67.1% | 76.4% | 64.6% | 66.5% | 65.5% |
| SIGHAN2014 | SepSpell | 79.9% | 79.6% | 79.8% | 98.0% | 89.2% | 93.4% | 78.3% | 67.2% | 71.2% | 69.1% | 77.5% | 65.5% | 69.4% | 67.4% |
| SIGHAN2015 | SpellGCN | 88.9% | 87.7% | 88.3% | 95.7% | 83.9% | 89.4% | (-) | 74.8% | 80.7% | 77.7% | (-) | 72.1% | 77.7% | 75.9% |
| SIGHAN2015 | REALISE | (-) | (-) | (-) | (-) | (-) | (-) | 84.7% | 77.3% | 81.3% | 79.3% | 84.0% | 75.9% | 79.9% | 77.8% |
| SIGHAN2015 | DCN | (-) | (-) | (-) | (-) | (-) | (-) | (-) | 77.1% | 80.9% | 79.0% | (-) | 74.5% | 78.2% | 76.3% |
| SIGHAN2015 | Roberta | 86.9% | 87.3% | 87.1% | 95.1% | 82.0% | 88.1% | 82.9% | 73.2% | 80.4% | 76.7% | 81.7% | 71.0% | 78.0% | 74.5% |
| SIGHAN2015 | ChineseBert | 87.5% | 87.6% | 87.5% | 96.1% | 92.1% | 94.0% | 84.9% | 77.1% | 81.3% | 79.1% | 83.8% | 75.0% | 79.1% | 77.0% |
| SIGHAN2015 | SepSpell | 87.0% | 86.5% | 86.7% | 97.3% | 92.4% | 94.8% | 86.6% | 81.7% | 80.6% | 81.1% | 85.6% | 79.6% | 78.6% | 79.1% |
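
For readers checking the F columns, the following is a minimal sketch that assumes F denotes the standard F1 score, i.e. the harmonic mean of precision (P) and recall (R). The table itself does not state the formula, so this is an assumption, and the helper name `f1_score` is illustrative only. Because the reported P and R are already rounded to one decimal place, a recomputed F may differ from the tabulated value in the last digit for some rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F).

    Both arguments must use the same scale (both percentages or both fractions).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SepSpell on SIGHAN2015, sentence-level correction: P = 79.6, R = 78.6
print(round(f1_score(79.6, 78.6), 1))  # 79.1, matching the tabulated F
```

The zero check only guards against division by zero when both precision and recall are zero; no row in the table hits that case.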