VG-CGARN: Video Generation Using Convolutional Generative Adversarial and Recurrent Networks

Publication year: 1404 (Solar Hijri)
Document type: journal article
Language: English

The full text of this article is available as a 13-page PDF.

National scientific document ID: JR_IJWR-8-2_005

Indexing date: 16 Khordad 1404 (6 June 2025)

Abstract:

Generating dynamic videos from static images and accurately modeling object motion within scenes are fundamental challenges in computer vision, with broad applications in video enhancement, photo animation, and visual scene understanding. This paper proposes a novel hybrid framework that combines convolutional neural networks (CNNs), recurrent neural networks (RNNs) with long short-term memory (LSTM) units, and generative adversarial networks (GANs) to synthesize temporally consistent and spatially realistic video sequences from still images. The architecture incorporates splicing techniques, the Lucas-Kanade motion estimation algorithm, and a loop feedback mechanism to address key limitations of existing approaches, including motion instability, temporal noise, and degraded video quality over time. CNNs extract spatial features, LSTMs model temporal dynamics, and GANs enhance visual realism through adversarial training. Experimental results on the KTH dataset, comprising 600 videos of fundamental human actions, demonstrate that the proposed method achieves substantial improvements over baseline models, reaching a peak PSNR of 35.8 dB and SSIM of 0.96, a 20% performance gain. The model successfully generates high-quality, 10-second videos at a resolution of 720×1280 pixels with significantly reduced noise, confirming the effectiveness of the integrated splicing and feedback strategy for stable and coherent video generation.
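The paper itself ships no code; the following PyTorch-style sketch only illustrates how a CNN encoder, an LSTM over time, a decoder, and a per-frame GAN critic could be wired together with the loop-feedback idea the abstract describes. All module names, layer sizes, and the 64×64 working resolution are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: module names, layer sizes, and the 64x64 working
# resolution are assumptions, not the authors' released implementation.
import torch
import torch.nn as nn


class FrameEncoder(nn.Module):
    """CNN that maps one RGB frame to a compact spatial feature vector."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, x):                         # x: (B, 3, H, W)
        return self.fc(self.conv(x).flatten(1))   # (B, feat_dim)


class FrameDecoder(nn.Module):
    """Upsampling CNN that turns an LSTM hidden state into a 64x64 frame."""
    def __init__(self, in_dim=256):
        super().__init__()
        self.fc = nn.Linear(in_dim, 128 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):                         # z: (B, in_dim)
        return self.deconv(self.fc(z).view(-1, 128, 8, 8))   # (B, 3, 64, 64)


class VideoGenerator(nn.Module):
    """Encodes the still image, unrolls an LSTM over time, decodes each step
    to a frame, and re-encodes that frame as the next input: a simple
    stand-in for the loop-feedback mechanism described in the abstract."""
    def __init__(self, feat_dim=256, hidden_dim=256):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.encoder = FrameEncoder(feat_dim)
        self.lstm = nn.LSTMCell(feat_dim, hidden_dim)
        self.decoder = FrameDecoder(hidden_dim)

    def forward(self, image, n_frames=16):        # image: (B, 3, H, W)
        feat = self.encoder(image)
        h = image.new_zeros(image.size(0), self.hidden_dim)
        c = image.new_zeros(image.size(0), self.hidden_dim)
        frames = []
        for _ in range(n_frames):
            h, c = self.lstm(feat, (h, c))
            frame = self.decoder(h)
            frames.append(frame)
            feat = self.encoder(frame)             # feed the output back in
        return torch.stack(frames, dim=1)          # (B, T, 3, 64, 64)


class FrameDiscriminator(nn.Module):
    """Per-frame adversarial critic; a full system would likely add a
    temporal critic over frame sequences as well."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 8),                  # 8x8 map -> single logit
        )

    def forward(self, x):                          # x: (B, 3, 64, 64)
        return self.net(x).flatten(1)              # (B, 1) real/fake logits
```

A training loop would alternate discriminator and generator updates, combining the adversarial loss with a pixel-level reconstruction term, and would evaluate PSNR/SSIM against ground-truth KTH frames as reported in the paper; the splicing and Lucas-Kanade motion-estimation steps mentioned in the abstract are omitted here for brevity.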

Authors

Fatemeh Sobhani Manesh

MSc, Computer Engineering Department, Bu-Ali Sina University, Hamedan, Iran

Amin Nazari

Ph.D. Candidate, Computer Engineering Department, Bu-Ali Sina University, Hamedan, Iran

Muharram Mansoorizadeh

Associate Professor, Computer Engineering Department, Bu-Ali Sina University, Hamedan, Iran

MirHossein Dezfoulian

Assistant Professor, Computer Engineering Department, Bu-Ali Sina University, Hamedan, Iran
