Comparative Analysis of cVAE, cGAN, and LSTM Models for Music Generation Based on Weather Conditions
Nedim Karavdić1, Bećir Isaković2*
1Department of Information Technology, International Burch University, Sarajevo, Bosnia and Herzegovina
2Department of Information Technology, International Burch University, Sarajevo, Bosnia and Herzegovina
* Corresponding author: becir.isakovic@ibu.edu.ba
Presented at the International Symposium on AI-Driven Engineering Systems (ISADES2025), Tokat, Turkiye, Jun 19, 2025
SETSCI Conference Proceedings, 2025, 22, Page(s): 89-95, https://doi.org/10.36287/setsci.22.11.001
Published Date: 10 July 2025
Generative artificial intelligence is growing in popularity and can use contextual data to personalize and enhance user experiences. This paper explores music generation conditioned on weather: MIDI music data are paired with corresponding weather attributes, such as sunny or cloudy conditions, so that the weather influences the generated compositions. Three generative models are compared for weather-based music generation: a Conditional Variational Autoencoder (cVAE), a Conditional Generative Adversarial Network (cGAN), and a Long Short-Term Memory (LSTM) network. The models are implemented and trained on a combination of a large MIDI corpus and historical weather information. Model performance is evaluated using metrics that capture musical diversity and quality (pitch range, number of unique pitches, and pitch variance), while fidelity to real data is measured by mean squared error and KL divergence. The results show that the cVAE produced the most diverse context-sensitive music, achieving the widest pitch range and the greatest variety of notes with low error. The cGAN was the runner-up: it generated relevant music, but with slightly less diversity. The LSTM showed higher error and was unable to integrate the weather context. The contributions of this work are the first comparative analysis of these architectures for weather-based music generation, an exploration of their strengths and limitations, and evidence that environmental data can be used to influence AI-generated music. The results demonstrate the relevance of blending weather data with music generation, which can serve as a foundation for applications such as adaptive game soundtracks, mood-based music therapy, or dynamic background music generated according to the user's environment.
Keywords - Music Generation, cVAE, cGAN, LSTM, MIDI
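The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state what the mean squared error and KL divergence are computed over, so this example assumes they compare normalized MIDI pitch histograms of generated and real note sequences; the function names and the smoothing constant `eps` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pitch_histogram(pitches, n_bins=128, eps=1e-8):
    """Normalized histogram over MIDI pitch numbers 0..127 (assumed representation)."""
    counts = np.bincount(np.asarray(pitches, dtype=int), minlength=n_bins).astype(float)
    counts += eps  # smoothing so the KL divergence stays finite for empty bins
    return counts / counts.sum()

def diversity_metrics(pitches):
    """Pitch range, number of unique pitches, and pitch variance of one sequence."""
    p = np.asarray(pitches, dtype=int)
    return {
        "pitch_range": int(p.max() - p.min()),
        "unique_pitches": int(np.unique(p).size),
        "pitch_variance": float(p.var()),
    }

def fidelity_metrics(generated, real):
    """MSE and KL(real || generated) between pitch histograms (an assumption)."""
    g = pitch_histogram(generated)
    r = pitch_histogram(real)
    mse = float(np.mean((g - r) ** 2))
    kl = float(np.sum(r * np.log(r / g)))
    return {"mse": mse, "kl_divergence": kl}

if __name__ == "__main__":
    generated = [60, 62, 64, 65, 67, 72]  # toy note sequences, not real data
    real = [60, 62, 64, 60, 67, 69]
    print(diversity_metrics(generated))
    print(fidelity_metrics(generated, real))
```

Under these assumptions, a wider pitch range and more unique pitches indicate greater diversity, while lower MSE and KL divergence indicate that the generated pitch distribution is closer to that of the real data.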
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
