
Book2QA: A Framework for Integrating LLMs to Generate High-quality QA Data from Textbooks

  • Zhanhao Cui
  • Ye Wang
  • Xinya Huang
  • Wen Wu
  • Wenxin Hu*
  • *Corresponding author for this work
  • East China Normal University
  • Renmin University of China

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

The scarcity of high-quality question answering (QA) data remains a significant bottleneck in the development of intelligent educational systems, as traditional datasets lack the necessary scale and diversity to support personalized model training. To address this challenge, we propose the Book2QA framework, which integrates multiple language models to provide an effective and flexible approach for generating QA datasets derived from textbook content. To further enhance the quality of the generated data, we implement a hierarchical prompting strategy grounded in Bloom's taxonomy, substantially increasing both the depth and breadth of the QA datasets. In addition, we fine-tune our model using these data, with evaluations conducted by both human reviewers and GPT-4 confirming its strong performance in real-world questioning scenarios. Experimental results demonstrate that our framework excels not only in specific textbook domains but also shows promise for broader applications across diverse fields. We open source our data and code at https://github.com/Curtain2020/Book2QA.

Original language: English
Title of host publication: International Joint Conference on Neural Networks, IJCNN 2025 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798331510428
DOI
Publication status: Published - 2025
Event: 2025 International Joint Conference on Neural Networks, IJCNN 2025 - Rome, Italy
Duration: 30 Jun 2025 to 5 Jul 2025

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
ISSN (print): 2161-4393
ISSN (electronic): 2161-4407

Conference

Conference: 2025 International Joint Conference on Neural Networks, IJCNN 2025
Country/Territory: Italy
City: Rome
Period: 30/06/25 to 5/07/25
