From Cover to Story: AI-Driven Genre Classification and Illustrated Narrative Creation for Children's Literature

Publication year: 1405 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 17-page PDF.

National scientific document ID:

JR_IJWR-9-1_004

Indexing date: 30 Bahman 1404

Abstract:

Storytelling is a fundamental pillar of childhood development, where visual narratives play a crucial role in enhancing engagement and cognitive processing. While Generative Artificial Intelligence (GAI) has revolutionized content creation, its application for automated story generation from book covers remains largely unexplored. This study presents an innovative pipeline that combines computer vision for genre classification with GAI to create tailored illustrated stories. After evaluating four deep learning architectures widely used in image classification tasks, ConvNeXt-Tiny was selected as the final model, achieving a weighted F1-score of 0.6898 in categorizing children's books into 13 distinct genres through cover image analysis. To address the lack of benchmark datasets, we compiled and rigorously validated a specialized collection of 4,085 Persian children's book covers. The proposed system leverages both cover design elements and predicted genre features within structured prompts to generate coherent illustrated stories through LLMs and image-synthesis models. A sample of 26 generated stories was qualitatively evaluated by three child psychologists based on narrative coherence, genre alignment, age appropriateness, character continuity, and visual congruence. This research makes significant contributions to both Persian literary analysis and AI-driven creative systems, demonstrating how machine learning can enhance educational storytelling while preserving cultural authenticity.
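
The abstract reports a weighted F1-score for the 13-genre classifier. As a minimal sketch of how that metric is commonly computed (per-class F1 averaged with weights proportional to each class's support), the following uses toy labels. The function name `weighted_f1` and the genre names are illustrative assumptions, not taken from the paper, and this is not necessarily the authors' evaluation code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        score += (n / total) * f1
    return score


# Toy labels with made-up genre names (not the paper's data).
y_true = ["fantasy", "fantasy", "fantasy", "science", "poetry", "poetry"]
y_pred = ["fantasy", "fantasy", "science", "science", "poetry", "fantasy"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Weighting by support means frequent genres dominate the score, which matters when the genre distribution in a cover dataset is imbalanced.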

Keywords:

Genre Classification, Narrative Creation, Deep Learning, Large Language Model (LLM), Generative Artificial Intelligence (GAI), Children's Literature

Authors

Maedeh Mosharraf

Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran

Reyhaneh Naseri Moghadam

Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran
