Music and artificial intelligence

Music and artificial intelligence (AI) is the development of music software programs which use AI to generate music.[1] As with applications in other fields, AI in music also simulates mental tasks. A prominent feature is the capability of an AI algorithm to learn based on past data, such as in computer accompaniment technology, wherein the AI is capable of listening to a human performer and performing accompaniment.[2] Artificial intelligence also drives interactive composition technology, wherein a computer composes music in response to a live performance. There are other AI applications in music that cover not only music composition, production, and performance but also how music is marketed and consumed. Several music player programs have also been developed to use voice recognition and natural language processing technology for music voice control. Current research includes the application of AI in music composition, performance, theory and digital sound processing.

Erwin Panofsky proposed that in all art there exist three levels of meaning: primary meaning, or the natural subject; secondary meaning, or the conventional subject; and tertiary meaning, the intrinsic content of the subject.[3][4] AI music explores the first of these, creating music without the "intention" which is usually behind it, leaving composers who listen to machine-generated pieces feeling unsettled by the lack of apparent meaning.[5]

History

Artificial intelligence finds its beginnings in music with the transcription problem: accurately recording a performance into musical notation as it is played. Père Engramelle's schematic of a "piano roll", a mode of automatically recording note timing and duration in a way which could be easily transcribed to proper musical notation by hand, was first implemented by German engineers J.F. Unger and J. Hohlfield in 1752.[6]

In 1957, the ILLIAC I (Illinois Automatic Computer) produced the "Illiac Suite for String Quartet", a completely computer-generated piece of music. The computer was programmed to accomplish this by composer Lejaren Hiller and mathematician Leonard Isaacson.[5]: v–vii In 1960, Russian researcher Rudolf Zaripov published the world's first paper on algorithmic music composition using the Ural-1 computer.[7]

In 1965, inventor Ray Kurzweil developed software capable of recognizing musical patterns and synthesizing new compositions from them. The computer first appeared on the quiz show I've Got a Secret.[8]

By 1983, Yamaha Corporation's Kansei Music System had gained momentum, and a paper was published on its development in 1989. The software utilized music information processing and artificial intelligence techniques to essentially solve the transcription problem for simpler melodies, although higher-level melodies and musical complexities are regarded even today as difficult deep-learning tasks, and near-perfect transcription is still a subject of research.[6][9]

In 1997, an artificial intelligence program named Experiments in Musical Intelligence (EMI) appeared to outperform a human composer at the task of composing a piece of music to imitate the style of Bach.[10] EMI would later become the basis for a more sophisticated algorithm called Emily Howell, developed by the same creator, David Cope.

In 2002, the music research team at the Sony Computer Science Laboratory Paris, led by French composer and scientist François Pachet, designed the Continuator, an algorithm uniquely capable of resuming a composition after a live musician stopped.[11]

Emily Howell would continue to make advancements in musical artificial intelligence, publishing its first album From Darkness, Light in 2009.[12] Since then, many more pieces by artificial intelligence and various groups have been published.

In 2010, Iamus became the first AI to produce a fragment of original contemporary classical music in its own style: "Iamus' Opus 1". Located at the Universidad de Málaga (University of Málaga) in Spain, the computer can generate a fully original piece in a variety of musical styles.[13][5]: 468–481 In August 2019, a large dataset of 12,197 MIDI songs, each paired with its lyrics and melody (https://github.com/yy1lab/Lyrics-Conditioned-Neural-Melody-Generation), was created to investigate the feasibility of neural melody generation from lyrics using a deep conditional LSTM-GAN method.

With progress in generative AI, models capable of creating complete musical compositions (including lyrics) from a simple text description have begun to emerge. Two notable web applications in this field are Suno AI, launched in December 2023, and Udio, which followed in April 2024.[14]

Software applications

ChucK

Developed at Princeton University by Ge Wang and Perry Cook, ChucK is a text-based, cross-platform language.[15] By extracting and classifying the theoretical techniques it finds in musical pieces, the software is able to synthesize entirely new pieces from the techniques it has learned.[16] The technology is used by SLOrk (Stanford Laptop Orchestra)[17] and PLOrk (Princeton Laptop Orchestra).

Jukedeck

Jukedeck was a website that let people use artificial intelligence to generate original, royalty-free music for use in videos.[18][19] The team started building the music generation technology in 2010,[20] formed a company around it in 2012,[21] and launched the website publicly in 2015.[19] The technology used was originally a rule-based algorithmic composition system,[22] which was later replaced with artificial neural networks.[18] The website was used to create over 1 million pieces of music, and brands that used it included Coca-Cola, Google, UKTV, and the Natural History Museum, London.[23] In 2019, the company was acquired by ByteDance.[24][25][26]

MorpheuS

MorpheuS[27] is a research project by Dorien Herremans and Elaine Chew at Queen Mary University of London, funded by a Marie Skłodowska-Curie EU project. The system uses an optimization approach based on a variable neighborhood search algorithm to morph existing template pieces into novel pieces with a set level of tonal tension that changes dynamically throughout the piece. This optimization approach allows a pattern detection technique to be integrated in order to enforce long-term structure and recurring themes in the generated music. Pieces composed by MorpheuS have been performed at concerts in both Stanford and London.
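
As a rough illustration of the variable neighborhood search idea described above, the Python sketch below morphs a short template melody toward a target tension profile. It is not MorpheuS itself: the tension measure (semitone distance from a tonic), the off-scale penalty, the two neighborhood moves, and every name in the code are assumptions chosen to keep the example small.

```python
import random

TONIC = 60                          # MIDI middle C (assumed reference pitch)
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}    # pitch classes treated as "in scale"


def tension(pitch):
    """Toy tension: distance in semitones from the tonic (an assumption)."""
    return abs(pitch - TONIC)


def cost(melody, target):
    """Mismatch with the target tension profile plus an off-scale penalty."""
    mismatch = sum(abs(tension(p) - t) for p, t in zip(melody, target))
    off_scale = sum(1 for p in melody if p % 12 not in C_MAJOR)
    return mismatch + 3 * off_scale


def change_one(melody):
    """Neighborhood 1: move a single note up or down one semitone."""
    for i, pitch in enumerate(melody):
        for step in (-1, 1):
            yield melody[:i] + [pitch + step] + melody[i + 1:]


def swap_two(melody):
    """Neighborhood 2: exchange the pitches at two positions."""
    for i in range(len(melody)):
        for j in range(i + 1, len(melody)):
            neighbor = melody[:]
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            yield neighbor


NEIGHBORHOODS = [change_one, swap_two]


def descend(melody, target):
    """Take the best improving move, switching neighborhood when stuck."""
    k = 0
    while k < len(NEIGHBORHOODS):
        best = min(NEIGHBORHOODS[k](melody), key=lambda m: cost(m, target))
        if cost(best, target) < cost(melody, target):
            melody, k = best, 0      # improvement: restart from neighborhood 1
        else:
            k += 1                   # no improvement: try the next neighborhood
    return melody


def shake(melody, rng, strength):
    """Apply `strength` random pitch changes to escape a local optimum."""
    shaken = melody[:]
    for _ in range(strength):
        i = rng.randrange(len(shaken))
        shaken[i] += rng.choice((-2, -1, 1, 2))
    return shaken


def vns(template, target, k_max=3, rounds=20, seed=0):
    """Shake with growing strength, descend, and keep strict improvements."""
    rng = random.Random(seed)
    best = descend(list(template), target)
    for _ in range(rounds):
        k = 1
        while k <= k_max:
            candidate = descend(shake(best, rng, k), target)
            if cost(candidate, target) < cost(best, target):
                best, k = candidate, 1
            else:
                k += 1
    return best


template = [60, 62, 64, 65, 67, 65, 64, 62]   # hypothetical template melody
target = [0, 1, 3, 5, 7, 8, 5, 2]             # hypothetical tension profile
result = vns(template, target)
print(cost(template, target), "->", cost(result, target))
```

Because a candidate replaces the current melody only when its cost is strictly lower, the result is never worse than the template. A pattern constraint of the kind the paragraph mentions could be added as a further term in `cost`.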

AIVA

Created in February 2016 in Luxembourg, AIVA is a program that produces soundtracks for any type of media. The algorithms behind AIVA are based on deep learning architectures.[28] AIVA has also been used to compose a rock track called "On the Edge",[29] as well as a pop tune, "Love Sick",[30] in collaboration with singer Taryn Southern,[31] for her 2018 album "I am AI".

Google Magenta

20-second music clip generated by MusicLM using the prompt "hypnotic ambient electronic music"

Google's Magenta team has published several AI music applications and technical papers since its launch in 2016.[32] In 2017 the team released the NSynth algorithm and dataset,[33] and an open source hardware musical instrument designed to help musicians use the algorithm.[34] The instrument was used by notable artists such as Grimes and YACHT in their albums.[35][36] In 2018, the team released a piano improvisation app called Piano Genie. This was later followed by Magenta Studio, a suite of five MIDI plugins that allow music producers to elaborate on existing music in their DAW.[37] In 2023, Google's machine learning team published a technical paper on GitHub that described MusicLM, a private text-to-music generator that the team had developed.[38][39]

Riffusion

Generated spectrogram from the prompt "bossa nova with electric guitar" (top), and the resulting audio after conversion (bottom)

Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music using images of sound rather than audio.[40] It was created as a fine-tuning of Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrograms.[40] This results in a model which uses text prompts to generate image files, which can be put through an inverse Fourier transform and converted into audio files.[41] While these files are only several seconds long, the model can also use latent space between outputs to interpolate different files together.[40][42] This is accomplished using a functionality of the Stable Diffusion model known as img2img.[43]
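
The spectrogram-to-audio step can be sketched with NumPy. This is an illustration under stated assumptions, not Riffusion's code: the frame size, hop size, test tone, and function names are all made up for the example. It shows that a short-time Fourier transform and its inverse recover a waveform when both magnitude and phase are known. An image of a spectrogram encodes magnitudes only, so a real pipeline would additionally have to estimate the phase (for example with an iterative method such as Griffin-Lim).

```python
import numpy as np

FRAME = 256   # samples per analysis window (assumed)
HOP = 64      # samples between successive windows (assumed)


def stft(signal):
    """Short-time Fourier transform: one complex spectrum per frame."""
    window = np.hanning(FRAME)
    n_frames = 1 + (len(signal) - FRAME) // HOP
    frames = np.stack([signal[i * HOP:i * HOP + FRAME] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spectra):
    """Inverse transform each frame, then overlap-add back into a waveform."""
    window = np.hanning(FRAME)
    frames = np.fft.irfft(spectra, n=FRAME, axis=1)
    length = FRAME + HOP * (len(spectra) - 1)
    audio = np.zeros(length)
    weight = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * HOP
        audio[start:start + FRAME] += frame * window
        weight[start:start + FRAME] += window ** 2
    covered = weight > 1e-8          # skip samples the window never weights
    audio[covered] /= weight[covered]
    return audio


# A one-second 440 Hz tone at an 8 kHz sample rate stands in for real audio.
rate = 8000
t = np.arange(rate) / rate
signal = np.sin(2 * np.pi * 440 * t)

spectra = stft(signal)
magnitude = np.abs(spectra)    # what a spectrogram image would encode
phase = np.angle(spectra)      # not stored in such an image

# With the true phase available, the inverse transform rebuilds the waveform.
rebuilt = istft(magnitude * np.exp(1j * phase))
```

Away from the first and last frames, where the window tapers to zero, `rebuilt` matches `signal` to within floating-point error.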

The resulting music has been described as "de otro mundo" (otherworldly),[44] although unlikely to replace man-made music.[44] The model was made available on December 15, 2022, with the code also freely available on GitHub.[41] It is one of many models derived from Stable Diffusion.[43]

Riffusion is classified within a subset of AI text-to-music generators. In December 2022, Mubert[45] similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on their own text-to-music generator called MusicLM.[46][47]

Musical applications

Artificial intelligence can change how producers create music by generating iterations of a track that follow a prompt given by the creator. These prompts let the AI follow the style the artist is aiming for.[5]

AI has also been applied to musical analysis, where it has been used for feature extraction, pattern recognition, and musical recommendations.[48]

Composition

Artificial intelligence has had a major impact on composition: it has influenced the ideas of composers and producers and has the potential to make the industry more accessible to newcomers. It is already used in collaboration with producers, who use such software to generate ideas and bring out musical styles by prompting the AI to follow requirements that fit their needs. Anticipated compositional uses of the technology include style emulation and fusion, and revision and refinement.[5] Software such as ChatGPT has been used by producers for these tasks, while other software such as Ozone11 has been used to automate time-consuming and complex activities such as mastering.[49]

Copyright

In the United States, the current legal framework tends to apply traditional copyright laws to AI, despite its differences with the human creative process.[50] However, music outputs solely generated by AI are not granted copyright protection. In the compendium of the U.S. Copyright Office Practices, the Copyright Office has stated that it would not grant copyrights to “works that lack human authorship” and “the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.”[51] In February 2022, the Copyright Review Board rejected an application to copyright AI-generated artwork on the basis that it "lacked the required human authorship necessary to sustain a claim in copyright."[52]

The situation in the European Union (EU) is similar to that in the US, because its legal framework also emphasizes the role of human involvement in a copyright-protected work.[53] According to the European Union Intellectual Property Office and the recent jurisprudence of the Court of Justice of the European Union, the originality criterion requires the work to be the author’s own intellectual creation, reflecting the personality of the author as evidenced by the creative choices made during its production, which requires a distinct level of human involvement.[53] The reCreating Europe project, funded by the European Union’s Horizon 2020 research and innovation program, examines the challenges posed by AI-generated content, including music, and recommends legal certainty and balanced protection that encourages innovation while respecting copyright norms.[53] The recognition of AIVA marks a significant departure from traditional views on authorship and copyright in music composition, allowing an AI artist to release music and earn royalties. This acceptance makes AIVA a pioneering instance of an AI being formally acknowledged within music production.[54]

Recent advancements in artificial intelligence by groups such as Stability AI, OpenAI, and Google have prompted a large number of copyright claims against generative technology, including AI music. Should these lawsuits succeed, the datasets used to train the machine learning models behind these technologies would be restricted to the public domain.[55]

Musical deepfakes

A more nascent development of AI in music is the application of audio deepfakes to cast the lyrics or musical style of a pre-existing song to the voice or style of another artist. This has raised many concerns regarding the legality of the technology, as well as the ethics of employing it, particularly in the context of artistic identity.[56] It has also raised the question of to whom the authorship of these works is attributed. As AI cannot hold authorship of its own, current speculation suggests that there will be no clear answer until further rulings are made regarding machine learning technologies as a whole.[57] More recently, Google and Universal Music Group have begun developing preventative measures, looking into royalty and credit-attribution arrangements that would allow producers to replicate the voices and styles of artists.[58]

"Heart on My Sleeve"

In 2023, an artist known as ghostwriter977 created a musical deepfake called "Heart on My Sleeve" that cloned the voices of Drake and The Weeknd. An assortment of vocal-only tracks from each artist was fed into a deep-learning algorithm to create an artificial model of each voice, which could then be mapped onto original reference vocals with original lyrics.[59] The track was submitted for Grammy consideration for best rap song and song of the year.[60] It went viral, gained traction on TikTok, and received a positive response from the audience, leading to its official release on Apple Music, Spotify, and YouTube in April 2023.[61] Many believed the track was fully composed by AI software, but the producer claimed the songwriting, production, and original vocals (pre-conversion) were still done by him.[59] It was later removed from Grammy consideration because it did not follow the guidelines necessary to be considered for a Grammy award.[61] The track was eventually removed from all music platforms by Universal Music Group.[61] The song was a watershed moment for AI voice cloning, and models have since been created for hundreds, if not thousands, of popular singers and rappers.

"Where That Came From"

In 2013, country music singer Randy Travis suffered a stroke which left him unable to sing. In the meantime, vocalist James Dupré toured on his behalf, singing his songs for him. Travis and longtime producer Kyle Lehning released a new song in May 2024 titled "Where That Came From", Travis's first new song since his stroke. The recording uses AI technology to re-create Travis's singing voice, having been composited from over 40 existing vocal recordings alongside those of Dupré.[62][63]

References

  1. ^ D. Herremans; C.H. Chuan; E. Chew (2017). "A Functional Taxonomy of Music Generation Systems". ACM Computing Surveys. 50 (5): 69:1–30. arXiv:1812.04186. doi:10.1145/3108242. S2CID 3483927.
  2. ^ Dannenberg, Roger. "Artificial Intelligence, Machine Learning, and Music Understanding" (PDF). Semantic Scholar. S2CID 17787070. Archived from the original (PDF) on August 23, 2018. Retrieved August 23, 2018.
  3. ^ Erwin Panofsky, Studies in Iconology: Humanistic Themes in the Art of the Renaissance. Oxford 1939.
  4. ^ Dilly, Heinrich (2020), Arnold, Heinz Ludwig (ed.), "Panofsky, Erwin: Zum Problem der Beschreibung und Inhaltsdeutung von Werken der bildenden Kunst", Kindlers Literatur Lexikon (KLL) (in German), Stuttgart: J.B. Metzler, pp. 1–2, doi:10.1007/978-3-476-05728-0_16027-1, ISBN 978-3-476-05728-0, retrieved 2024-03-03
  5. ^ a b c d e Miranda, Eduardo Reck, ed. (2021). "Handbook of Artificial Intelligence for Music". SpringerLink. doi:10.1007/978-3-030-72116-9. ISBN 978-3-030-72115-2.
  6. ^ a b Roads, Curtis (1985). "Research in music and artificial intelligence". ACM Computing Surveys. 17 (2): 163–190. doi:10.1145/4468.4469. Retrieved 2024-03-06.
  7. ^ Zaripov, Rudolf (1960). "Об алгоритмическом описании процесса сочинения музыки (On algorithmic description of process of music composition)". Proceedings of the USSR Academy of Sciences. 132 (6).
  8. ^ "Ray Kurzweil". National Science and Technology Medals Foundation. Retrieved 2024-09-10.
  9. ^ Katayose, Haruhiro; Inokuchi, Seiji (1989). "The Kansei Music System". Computer Music Journal. 13 (4): 72–77. doi:10.2307/3679555. ISSN 0148-9267. JSTOR 3679555.
  10. ^ Johnson, George (11 November 1997). "Undiscovered Bach? No, a Computer Wrote It". The New York Times. Retrieved 29 April 2020. Dr. Larson was hurt when the audience concluded that his piece -- a simple, engaging form called a two-part invention -- was written by the computer. But he felt somewhat mollified when the listeners went on to decide that the invention composed by EMI (pronounced Emmy) was genuine Bach.
  11. ^ Pachet, François (September 2003). "The Continuator: Musical Interaction With Style". Journal of New Music Research. 32 (3): 333–341. doi:10.1076/jnmr.32.3.333.16861. hdl:2027/spo.bbp2372.2002.044. ISSN 0929-8215.
  12. ^ Lawson, Mark (2009-10-22). "This artificially intelligent music may speak to our minds, but not our souls". The Guardian. ISSN 0261-3077. Retrieved 2024-09-10.
  13. ^ "Iamus: Is this the 21st century's answer to Mozart?". BBC News. 2013-01-02. Retrieved 2024-09-10.
  14. ^ Nair, Vandana (2024-04-11). "AI-Music Platform Race Accelerates with Udio". Analytics India Magazine. Retrieved 2024-04-19.
  15. ^ ChucK => Strongly-timed, On-the-fly Audio Programming Language. Chuck.cs.princeton.edu. Retrieved on 2010-12-22.
  16. ^ Foundations of On-the-fly Learning in the ChucK Programming Language
  17. ^ Driver, Dustin. (1999-03-26) Pro - Profiles - Stanford Laptop Orchestra (SLOrk), pg. 1. Apple. Retrieved on 2010-12-22.
  18. ^ a b "From Jingles to Pop Hits, A.I. Is Music to Some Ears". The New York Times. 22 January 2017. Retrieved 2023-01-03.
  19. ^ a b "Need Music For A Video? Jukedeck's AI Composer Makes Cheap, Custom Soundtracks". techcrunch.com. 7 December 2015. Retrieved 2023-01-03.
  20. ^ "What Will Happen When Machines Write Songs Just as Well as Your Favorite Musician?". motherjones.com. Retrieved 2023-01-03.
  21. ^ Cookson, Robert (7 December 2015). "Jukedeck's computer composes music at touch of a button". Financial Times. Retrieved 2023-01-03.
  22. ^ "Jukedeck: the software that writes music by itself, note by note". Wired UK. Retrieved 2023-01-03.
  23. ^ "Robot rock: how AI singstars use machine learning to write harmonies". standard.co.uk. March 2018. Retrieved 2023-01-03.
  24. ^ "TIKTOK OWNER BYTEDANCE BUYS AI MUSIC COMPANY JUKEDECK". musicbusinessworldwide.com. 23 July 2019. Retrieved 2023-01-03.
  25. ^ "As TikTok's Music Licensing Reportedly Expires, Owner ByteDance Purchases AI Music Creation Startup JukeDeck". digitalmusicnews.com. 23 July 2019. Retrieved 2023-01-03.
  26. ^ "An AI-generated music app is now part of the TikTok group". sea.mashable.com. 24 July 2019. Retrieved 2023-01-03.
  27. ^ D. Herremans; E. Chew (2016). "MorpheuS: Automatic music generation with recurrent pattern constraints and tension profiles". IEEE Transactions on Affective Computing. PP(1). arXiv:1812.04832. doi:10.1109/TAFFC.2017.2737984. S2CID 54475410.
  28. ^ "A New AI Can Write Music as Well as a Human Composer". Futurism. 2017-03-09. Retrieved 2024-04-19.
  29. ^ Technologies, Aiva (2018-10-24). "The Making of AI-generated Rock Music with AIVA". Medium. Retrieved 2024-04-19.
  30. ^ Lovesick | Composed with AIVA Artificial Intelligence - Official Video with Lyrics | Taryn Southern. 2 May 2018.
  31. ^ Southern, Taryn (2018-05-10). "Algo-Rhythms: The future of album collaboration". TechCrunch. Retrieved 2024-04-19.
  32. ^ "Welcome to Magenta!". Magenta. 2016-06-01. Retrieved 2024-04-19.
  33. ^ Engel, Jesse; Resnick, Cinjon; Roberts, Adam; Dieleman, Sander; Eck, Douglas; Simonyan, Karen; Norouzi, Mohammad (2017). "Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders". PMLR. arXiv:1704.01279.
  34. ^ Open NSynth Super, Google Creative Lab, 2023-02-13, retrieved 2023-02-14
  35. ^ "Cover Story: Grimes is ready to play the villain". Crack Magazine. Retrieved 2023-02-14.
  36. ^ "What Machine-Learning Taught the Band YACHT About Themselves". Los Angeleno. 2019-09-18. Retrieved 2023-02-14.
  37. ^ "Magenta Studio". Magenta. Retrieved 2024-04-19.
  38. ^ "MusicLM". google-research.github.io. 2023. Retrieved 2024-04-19.
  39. ^ Sandzer-Bell, Ezra (2024-02-16). "Best Alternatives to Google's AI-Powered MusicLM and MusicFX". AudioCipher. Retrieved 2024-04-19.
  40. ^ a b c Coldewey, Devin (December 15, 2022). "Try 'Riffusion,' an AI model that composes music by visualizing it".
  41. ^ a b Nasi, Michele (December 15, 2022). "Riffusion: creare tracce audio con l'intelligenza artificiale". IlSoftware.it.
  42. ^ "Essayez "Riffusion", un modèle d'IA qui compose de la musique en la visualisant". December 15, 2022.
  43. ^ a b "文章に沿った楽曲を自動生成してくれるAI「Riffusion」登場、画像生成AI「Stable Diffusion」ベースで誰でも自由に利用可能". GIGAZINE. 16 December 2022.
  44. ^ a b Llano, Eutropio (December 15, 2022). "El generador de imágenes AI también puede producir música (con resultados de otro mundo)".
  45. ^ "Mubert launches Text-to-Music interface – a completely new way to generate music from a single text prompt". December 21, 2022.
  46. ^ "MusicLM: Generating Music From Text". January 26, 2023.
  47. ^ "5 Reasons Google's MusicLM AI Text-to-Music App is Different". January 27, 2023.
  48. ^ Zhang, Yifei (December 2023). "Utilizing Computational Music Analysis and AI for Enhanced Music Composition: Exploring Pre- and Post-Analysis". Journal of Advanced Zoology. 44 (S-6): 1377–1390. doi:10.17762/jaz.v44is6.2470. S2CID 265936281.
  49. ^ Sunkel, Cameron (2023-12-16). "New Research Reveals Top AI Tools Utilized by Music Producers". EDM.com - The Latest Electronic Dance Music News, Reviews & Artists. Retrieved 2024-04-03.
  50. ^ "Art created by AI cannot be copyrighted, says US officials – what does this mean for music?". MusicTech. Retrieved 2022-10-27.
  51. ^ "Can (and should) AI-generated works be protected by copyright?". Hypebot. 2022-02-28. Retrieved 2022-10-27.
  52. ^ Re: Second Request for Reconsideration for Refusal to Register A Recent Entrance to Paradise (Correspondence ID 1-3ZPC6C3; SR # 1-7100387071) (PDF) (Report). Copyright Review Board, United States Copyright Office. 2022-02-14.
  53. ^ a b c Bulayenko, Oleksandr; Quintais, João Pedro; Gervais, Daniel J.; Poort, Joost (February 28, 2022). "AI Music Outputs: Challenges to the Copyright Legal Framework". reCreating Europe Report. Retrieved 2024-04-03.
  54. ^ Ahuja, Virendra (June 11, 2021). "Artificial Intelligence and Copyright: Issues and Challenges". ILI Law Review Winter Issue 2020. Retrieved 2024-04-03.
  55. ^ Samuelson, Pamela (2023-07-14). "Generative AI meets copyright". Science. 381 (6654): 158–161. Bibcode:2023Sci...381..158S. doi:10.1126/science.adi0656. ISSN 0036-8075. PMID 37440639.
  56. ^ DeepDrake ft. BTS-GAN and TayloRVC: An Exploratory Analysis of Musical Deepfakes and Hosting Platforms
  57. ^ AI and Deepfake Voice Cloning: Innovation, Copyright and Artists’ Rights
  58. ^ "Google and Universal Music negotiate deal over AI 'deepfakes'". www.ft.com. Retrieved 2024-04-03.
  59. ^ a b Robinson, Kristin (2023-10-11). "Ghostwriter, the Mastermind Behind the Viral Drake AI Song, Speaks For the First Time". Billboard. Retrieved 2024-04-03.
  60. ^ "Drake/The Weeknd deepfake song "Heart on My Sleeve" submitted to Grammys". The FADER. Retrieved 2024-04-03.
  61. ^ a b c "The AI deepfake of Drake and The Weeknd will not be eligible for a GRAMMY". Mixmag. Retrieved 2024-04-03.
  62. ^ Marcus K. Dowling (May 6, 2024). "Randy Travis' shocks music industry with AI pairing for 'Where That Came From.' How the song came together". The Tennessean. Retrieved May 6, 2024.
  63. ^ Maria Sherman (May 6, 2024). "With help from AI, Randy Travis got his voice back. Here's how his first song post-stroke came to be". AP News. Retrieved May 6, 2024.
