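1. School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318
2. School of Mechanical Science and Engineering, Northeast Petroleum University, Daqing, Heilongjiang 163318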
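WANG Bing (1982- ), female, PhD, associate professor at the School of Computer and Information Technology, Northeast Petroleum University. Her main research interests include smart education, quantum image processing, and deep learning.
SHAN Ruixue (2000- ), female, master's student at the School of Computer and Information Technology, Northeast Petroleum University. Her main research interests include smart education and deep learning.
XING Haiyan (1971- ), female, PhD, professor and doctoral supervisor at the School of Mechanical Science and Engineering, Northeast Petroleum University. Her main research interests include smart education and the design of inspection robots for onshore and offshore drilling platforms and oil and gas pipelines.
LI Panchi (1969- ), male, PhD, professor and doctoral supervisor at the School of Computer and Information Technology, Northeast Petroleum University. His main research interests include smart education, quantum-inspired computing, and deep learning.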
Received: 2025-03-05,
Revised: 2025-06-12,
Accepted: 2025-07-17,
Published in print: 2025-09-15
WANG Bing, SHAN Ruixue, XING Haiyan, et al. An interpretable automated essay scoring model for English compositions based on SHAP algorithm[J]. Chinese Journal of Intelligent Science and Technology, 2025, 7(3): 370-380. DOI: 10.11959/j.issn.2096-6652.202530.
Automated scoring systems for English compositions often lack interpretability because they rely on complex deep learning models. To address this problem, an interpretable automated scoring model for English compositions based on the E5-SHAP algorithm is proposed. The model uses the E5-Base encoder to extract text features, combines a mean calculation with a regression layer to produce the score, and introduces an adaptive weighting mechanism that evaluates essay quality across six dimensions, including grammar, syntax, and vocabulary diversity. LoRA fine-tuning is used to optimize the parameters of specific layers and improve the model's adaptation to essay features. The SHAP algorithm computes the impact of each feature on the final score, which provides a clear scoring basis and explanation path and improves the transparency and credibility of the scoring process. Experimental results show that the model improves on existing models on both the ELLIPSE dataset and a self-built dataset, reaching a quadratic weighted kappa (QWK) of 0.84, and that it outperforms existing models in both accuracy and interpretability.
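The abstract names two quantities without defining them: the quadratic weighted kappa used for evaluation, and per-feature Shapley attributions of the kind the SHAP algorithm estimates. The sketch below illustrates both in plain Python and is not the authors' implementation. It assumes integer ratings for QWK, and it computes Shapley values by brute-force enumeration against a baseline input, whereas the SHAP library relies on approximations for larger models. The function names, the three-feature toy scorer, and its weights are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import factorial


def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Cohen's kappa with quadratic weights, for integer ratings."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("rating lists must be non-empty and equally long")
    n_cat = max_rating - min_rating + 1
    if n_cat < 2:
        raise ValueError("need at least two rating categories")
    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # joint counts of (true, predicted)
    hist_true = Counter(y_true)              # marginal counts of true ratings
    hist_pred = Counter(y_pred)              # marginal counts of predictions
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            w = (i - j) ** 2 / (n_cat - 1) ** 2
            num += w * observed[(i, j)] / n
            den += w * hist_true[i] * hist_pred[j] / (n * n)
    if den == 0.0:
        # Both raters used one and the same category throughout.
        return 1.0
    return 1.0 - num / den


def exact_shapley(value_fn, x, baseline):
    """Exact Shapley values of value_fn at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with len(x), so this suits toy cases only.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                chosen = set(subset)
                without_i = [x[j] if j in chosen else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


if __name__ == "__main__":
    # Hypothetical linear scorer over three of the dimensions named in the abstract.
    names = ["grammar", "syntax", "vocabulary_diversity"]

    def toy_scorer(z):
        return 0.5 * z[0] + 0.3 * z[1] + 0.2 * z[2] + 1.0

    contributions = exact_shapley(toy_scorer, [4.0, 3.0, 5.0], [2.0, 2.0, 2.0])
    for name, value in zip(names, contributions):
        print(f"{name}: {value:+.3f}")
    print("QWK:", quadratic_weighted_kappa([1, 2, 3, 4, 5], [1, 2, 3, 4, 4], 1, 5))
```

For a linear scorer, each attribution reduces to the feature weight times the feature's distance from the baseline, and the attributions sum to the difference between the score at the input and the score at the baseline. That additive property is what makes such values usable as a scoring explanation.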
