Modeling spatiotemporal chromatic energy using attention-enhanced CNN-LSTM networks for deepfake video detection

Authors

  • Clive Ebomagune Asuai
    Department of Computer Science, Delta State Polytechnic, Otefe-Oghara, Nigeria
  • Ofualagba Mamuyovwi Helen
    Department of Computer Science, Delta State Polytechnic, Otefe-Oghara, Nigeria

Keywords

Deepfake detection, Attention mechanism, Spatiotemporal chromatic energy distributions, Video forensics, CNN-LSTM

Abstract

The proliferation of artificial-intelligence (AI)-generated synthetic media poses severe threats, including identity theft, misinformation campaigns, and political manipulation. Deepfake videos are frequently indistinguishable from genuine footage to human observers, rendering conventional forensic approaches insufficient. This paper presents a detection system that identifies subtle spatiotemporal anomalies in manipulated videos. We develop and evaluate a hybrid deep learning architecture that employs convolutional neural networks (CNNs) for spatial feature extraction and bidirectional long short-term memory (BiLSTM) networks for temporal sequence modeling. An attention mechanism enables the model to focus on the video segments most informative for classification. A key innovation is the introduction of spatiotemporal chromatic energy distributions (SCED) as input features, which model harmonic relationships in authentic video and are sensitive to artifacts introduced during synthetic video generation. The proposed hybrid CNN-BiLSTM-attention model achieves 98.7% accuracy, 96.2% precision, 96.4% recall, and a 97.7% F1-score, substantially outperforming standalone CNN (88.5% accuracy) and LSTM (91.4% accuracy) baselines. Combining SCED features with the CNN-BiLSTM-attention architecture offers an effective approach to deepfake video detection and contributes to digital media security and reliability.
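The attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of attention used, so this assumes a simple dot-product scoring of per-frame feature vectors (standing in for BiLSTM outputs) against a learned vector, followed by a softmax-weighted sum. The names `attention_pool`, `frame_features`, and `query`, and the toy values, are hypothetical and not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_features, query):
    """Collapse per-frame feature vectors into one clip-level vector.

    frame_features: list of T vectors (e.g. BiLSTM outputs), each of length D.
    query: hypothetical learned scoring vector of length D (an assumption).
    Returns (weights, pooled): T attention weights and a length-D vector.
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frame_features]
    # Normalize scores so the weights are positive and sum to one.
    weights = softmax(scores)
    # Weighted sum of frame vectors, computed one dimension at a time.
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(len(query))
    ]
    return weights, pooled

# Toy input: three frames with 2-D features; the second frame scores highest.
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = attention_pool(frames, query)
```

In this toy input the second frame receives the largest weight, so the pooled vector is dominated by its features. A classifier head would then operate on `pooled` rather than on every frame equally.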

Published

2026-06-18

How to Cite

Modeling spatiotemporal chromatic energy using attention-enhanced CNN-LSTM networks for deepfake video detection. (2026). Proceedings of the Nigerian Society of Physical Sciences, 3, 270. https://doi.org/10.61298/pnspsc.2026.3.270
