But I don't understand why it's wrong. If it's trained on lots of Urdu/Hindi music, no one pronounces those words like that. How does it get the a/e vowels wrong while still singing almost correctly? It's weird.
And it's even cleverly mobile friendly.
-----
STYLE: Earth Circuit Fusion

INSTRUMENTATION:
- Deep analog synth bass with subtle distortion
- Hybrid percussion combining djembe and electronic glitches
- Polytonal synth arpeggios with unpredictable patterns
- Processed field recordings for atmospheric texture
- Circuit-bent toys creating unexpected melodic accents

VOCAL APPROACH:
- Female vocalist with rich mid-range and clear upper register
- Intimate yet confident delivery with controlled vibrato
- Layered whisper-singing technique in verses
- Full-voiced chorus delivery with slight emotional rasp
- Spoken-word elements layered under bridge melodies
- Stacked fifth harmonies creating ethereal chorus quality

PRODUCTION:
- Grainy tape saturation on organic elements
- Juxtaposition of lo-fi and hi-fi within same sections
- Strategic arrangement dropouts for dramatic impact
- Glitch transition effects between sections
---
One thing I have noticed with the new model is that it listens to direction in the lyrics more now, for example [whispered] or [bass drop], etc.
There are clear limits. I have been unsuccessful with spatial arrangement.
EDIT: I realized I didn't specify, this is when you do custom and you specify the lyrics and the style separately.
Still not totally adherent, but if you can steer it with genre, detailed descriptions of genre, and elements of the genre it's way better than v4. Some descriptions work better than others so there's some experimentation to figure out what works for what you're trying to achieve.
You can also provide descriptions in [brackets] in the lyrics that work reasonably well in my experience.
Disclaimer: I work there as a SWE.
Some examples of style descriptions I've used that generated results close to what I had in mind are "romantic comedy intro music, fast and exciting, new york city" (aiming for something like the Sex and the City theme) and "mature adult romance reality tv show theme song, breakbeats, seductive, intimate, saxophones, lots of saxophones" which did indeed produce cheesy porn music.
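To make the custom-mode split concrete, here's an illustrative layout; the style line reuses the description above, while the lyric lines are made-up placeholders and the tags are the kind discussed elsewhere in this thread:

STYLE: mature adult romance reality tv show theme song, breakbeats, seductive, intimate, saxophones

LYRICS:
[Intro] [saxophone riff, no vocals]
[Verse] [whispered]
Midnight in the city, neon on the rain
[Chorus] [full voice]
Say my name like you mean it, say it again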
I keep wanting to save some of the songs I hear. Damn, I don't think I would really be able to tell in a blind test that these were AI.
I really don't like that UI. It's hard to read, and when I find something, it slips away. Too much form over function.
> I keep wanting to save some of the songs I hear.
Just click the title of the song. If you have an account you can add to favorites, download, etc.
Oh, the joys of infinite public domain music!
I guess they are hoping for the Uber outcome where they earn enough money during the illegal phase so they can pay some tiny fine and keep going.
Suno admitted to training their models on copyrighted music and is now defending the position that music copyrights and royalties are bad for the future of music.
Fair use encompasses a lot of possible scenarios involving copyrighted works which may or may not be commercially licensed: transformative derivative works, short excerpts for purposes of commentary, backups of copies obtained legally, playback for face-to-face instruction, etc.
For comparison, here's a song where I forced myself to do everything within Suno (took less than a week):
https://www.youtube.com/watch?v=R6mJcXxoppc
And here's one where I did the manual composition, worked with session artists, and it took a couple months and cost me several hundred dollars:
The first time I heard it, it was incredible. At the second wedding that did it, it started to feel boring. By the third time, everyone hated it.
Similar to image-generation, we're getting tired really fast of cookie-cutter art. I don't know how to feel about it.
Weird, stupid things. Writing theme songs for TV shows that don't exist, finding ways to translate song types from culture A to culture B, BGM for a video game you want to make, a sales song for Shikoku 1889 to sell Iyo railway shares, etc...
Some of us have zero cultural influence and services like Suno mean we aren't listening to the original brainrot (popular music). Sure, you might create garbage but it's your garbage and you aren't stuck waiting for someone to throw you a bone.
I love Suno, it's a rare subscription that is fun.
I'm pretty sure that I actually could, if I really wanted to, create this cover legitimately and even put it on Spotify with royalties going to the original artists (it seems they have a blanket mechanical license for a lot of works). But it was a "gag" song that probably has a market of just me, so hiring a team of people would be a lot of time and money for 3 minutes of a giggle. I also would have to worry about things like if it's changed too much to be a cover and getting sued for putting in extra effort.
That being said, your idea isn't original; there's already a flood of automated AI-generated cover songs being pushed onto Spotify, and they + distributors are (allegedly) starting to actively combat this.
I have a feeling that’s by design. Firstly for computation purposes, secondly to avoid someone making a studio-quality deepfake song.
That's not a tool issue. It just means that working on a raised floor is not the same as being able to reach a higher ceiling.
I don't know. Scrolling the Sora image generations feed is pretty fun.
It's got trendy memes, lots of mashups, and cool art. They've also managed to capture social media outrage bait on the platform: conservatives vs. liberals, Christians vs. atheists, and a whole other host of divisive issues that are interspersed throughout the feed. I think they have a social media play in the bag if they pursue it.
It feels like Sora could replace Instagram.
I can't say anything about autogenerated lyrics.
Which one are you referring to?
I've found some that are okay, but listening to "meaningless" music doesn't sit right with me.
https://www.youtube.com/watch?v=6bNMxWGHlTI
Cuando para mucho mi amore de felice corazón...
it's first in one of the rows, on the left
edm anti-folk is also great: https://suno.com/song/47f0585c-ca41-4002-9d7f-fe71f85e0c62
All of these examples get ruined by the simplest, most boring lyrics imaginable. Poetry is an art, and clearly the model doesn't yet grasp all of its nuances like it does for the rest of the "composition".
At this point the only thing that gives this away as AI generated are the vocals.
>Among the stars, where the dreams and freedom meet.
>Finding the ecstasy of life’s uncharted quest,
>In every pulse of the music, feel the zest.
Like... what?
Only because the bar for music is so low nowadays. Thankfully poetry hasn't been commodified yet like music has.
I-IV-V (e.g., C, F, G in the key of C) with different accents over the music and different drum sounds is fine, but that's not really music. It's pretty bad when you can pick out the chord progression in five seconds. Cue the infamous 4-chord-song skit by Axis of Awesome.
Music is more about the human that made it and their relation to you than the sound properties themselves. Same as other art. The more indirect the music process, and the further you are from the living experience of the human creator, the less it resembles art. I feel art is more of a spectrum than a binary switch, and the metric is how much direct human involvement the audio experience had, in terms you can relate to.
Remove the human completely and you just have sound. It is likely that something like bebop, gabber, or industrial synthwave would have been considered "sounds" rather than art by medieval folks or Mesopotamians if they heard it without knowing whether the source was human. The same goes for us: if we were to hear some music from the year 3200 or 4500, we would likely not consider it music.
However, there is so much badly done human music as well that, for me, it's nearly impossible to tell the difference between badly done human music and high-fidelity AI music (the formulaic chord progressions happen just as often in human music). Moreover, I have put Suno AI on playlist mode before and it's actually been enjoyable, and I am a big AI sceptic! Sometimes even more enjoyable than Spotify's own playlists (although Spotify has also been accused of putting AI music on playlists, but I am fairly sure the weak stuff that put me off was by humans. Did I say I cannot differentiate?).
Especially in some genres, like Japanese Vocaloid, power metal, and some country, where genre-specific elements overwhelm the piece, AI does a very good job of mimicking the best of the best and puts meagre efforts to shame.
Here is one AI song I generated in an earlier version of Suno - let me know if anything stands out as AI: https://www.youtube.com/watch?v=I5JcEnU-x3s
and another I recorded in my studio with an artist: https://www.youtube.com/watch?v=R6mJcXxoppc
Kind of sad, especially for composers (which I am trying to be). Ah well, can only keep moving forward.
Also, as we blur the line between instrument and audio: why can't my piano morph into an organ over the course of a piece? (I'm familiar with the Korg Morpheus and similar; I mean it in a much more real sense.)
And no disrespect towards anyone using AI to create music; it is here and unstoppable, but I don't currently use generative AI in music myself. I think that for works performed for a live audience (at least in classical music), most people want to hear music composed by humans. Hopefully it will stay this way for a while; otherwise, I've been going down a road that goes nowhere. Ah well, it wouldn't be the first time :)
If anyone here has a subscription and they can spare the tokens, I think it would be fun if someone shared a song about Hacker News.
I'm hoping that in the future tools like Suno will allow you to produce / generate songs as projects which you can tweak in greater detail; basically a way of making music by "vibe coding". With 4.0 the annotation capabilities were still a bit limited, and the singer could end up mispronouncing things without any way to fix or specify the correct pronunciation. This blog post mentions that with 4.5 they enhanced prompt interpretations, but it doesn't actually go into any technical details nor does it provide clear examples to get a real sense of the changes.
Your comment inspired me to upgrade it to 4.5 because it did have that AI tinny quality. https://suno.com/s/tbZlkBL7XeLVuuN0
It sounds better but has lost some magic.
Here is the original comment - https://news.ycombinator.com/item?id=39997706
In that spirit, from the same “artist” here is your comment - https://suno.com/s/AumsIqrIovVhT0c9
And
https://suno.com/s/YGlpHptX6yXJVpHq
Not sure which I like more.
We can do better on user instruction for sure, duly noted. In my experience a lot of different stuff works (emotions, some musical direction sometimes, describing parts/layers of the track you want to exist, music-production-ish terminology, genres, stuff like intro/outro/chorus), but I think of it more as steering the space of the generated output rather than working 100% of the time. This can go in the style tags or in [brackets] in the lyrics. Definitely makes a difference in the outputs to be more descriptive with 4.5.
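For instance, a style field written in that spirit might look like the following; the exact phrasing is made up for illustration, and nothing here is an official or guaranteed-to-work incantation:

melancholic melodic techno, 122 bpm, driving sidechained bass, long atmospheric intro with filtered pads, sparse whispered vocal layer, euphoric wide-synth chorus, tight punchy drums, outro dissolving into field recordings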
Someday, I’m sure, Suno will find a way to fix this issue. But today isn’t that day.
My reasoning is that the fact that it was made by another human is really important.
Not only because you might think a piece of music is lame because it was made by AI vs a human.
But also because all the things that bring you back to a piece of art are wrapped up in the person who made it.
People who are immense fans of the Beatles, Taylor Swift or Kanye West illustrate this point.
You keep coming back because you liked this person's music before, and so you can't wait to preorder their music in the future.
Same goes for books, paintings and really all other art I can think of.
An artist develops a following that snowballs into their music being broadly consumed.
There are "AI music artists" that have been around for a decade. Miquela is the one I know about. But in that timespan, hundreds of human artists have developed followings and cultural sway that far outweigh what Miquela has done.
It seems more and more that AI is simply another tool for humans to use, rather than a replacement for humans altogether.
Allow users to creatively engage by providing suggested starting places in the form of BPM, key and chord progressions or as brief audio and/or MIDI sketches. For example, let me give the AI a simple sketch of a couple bars of melody as MIDI notes, then have it give me back several variations of matching rhythm section and harmonies. Then take my textual feedback on adjustments I'd like but let me be specific when necessary, down to per-section or individual instrument. Ideally, the interface should look like a simplified multi-track DAW, making it easy for users to lock or unlock individual tracks so the AI knows what to keep and what to change as we creatively iterate. Once finished, provide output as both full mix and separate audio stems with optional MIDI for tracks with traditional instruments.
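As a sketch of what that contract could look like, purely hypothetical; no vendor exposes an API like this today, and every name, type, and field below is invented for illustration:

    # Purely hypothetical API sketch: every name, type, and field is invented.
    from dataclasses import dataclass, field

    @dataclass
    class Track:
        name: str               # e.g. "drums", "bass", "lead synth"
        midi_path: str = ""     # symbolic data for traditional instruments
        stem_path: str = ""     # rendered audio stem
        locked: bool = False    # locked tracks must pass through unchanged

    @dataclass
    class Session:
        bpm: float
        key: str                # e.g. "A minor"
        chords: list[str]       # e.g. ["Am", "F", "C", "G"]
        tracks: list[Track] = field(default_factory=list)

    def iterate(session: Session, feedback: str, n_variations: int = 3) -> list[Session]:
        """Request variations that respect locked tracks.

        A real implementation would call a generation backend here; this stub
        only shows the contract: locked tracks are kept verbatim, and textual
        feedback can be scoped to a section or a single instrument.
        """
        return [session for _ in range(n_variations)]  # placeholder echo

    if __name__ == "__main__":
        s = Session(bpm=92, key="A minor", chords=["Am", "F", "C", "G"],
                    tracks=[Track("melody", midi_path="sketch.mid", locked=True),
                            Track("drums"), Track("bass")])
        ideas = iterate(s, "three rhythm-section ideas that fit the locked melody")

The point of the lock flag is exactly the iterate-without-destroying workflow: the AI fills in unlocked tracks while whatever I've already approved stays untouched.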
Targeting this use case accomplishes two crucial things. First, it lowers the bar of quality the AI has to match to be useful and compelling. Let's face it, generating lyrics, melodies, instrumental performances and production techniques more compelling than a top-notch team of humans is hard for an AI. Doing it every time and only in the form of a complete, fully produced song is currently nearly impossible. The second thing it does is increase the tangible value the AI can contribute right now. Today it can be the rhythm section I lack, tomorrow it can fill in for the session guitarist I need, next week it can help me come up with new chord progression ideas. It would be useful every time I want to create music, whether I need backing vocals, a tight bass riff, scary viola tremolos or just some musical inspiration. And nothing it did would have to be perfect to be insanely useful - because I can tweak individual audio stems and MIDI tracks far faster than trying to explain a certain swing shuffle feel in text prompts.
Seriously, for a tool anything like what I've described, I'd be all-in for at least $30/mo if it's only half-assed. Once it's 3/4-assed put me down for $50/mo - and I'm not even a pro or semi-pro musician or producer, just a musical hobbyist who screws around making stuff no one else ever hears. Sure, actual music creators are a smaller market than music listeners, but we're loyal, not as price sensitive, and our needs for perfection in collaborators are far lower too. Plus, all those granular interactions as we iterate with your AI step-by-step towards "great" become invaluable training data - yet don't require us creators to surrender rights to our final output. For training data, the journey is (literally) the reward.
So then there's the casual end-user who's making music for themselves to listen to. IMO this is largely a novelty that hasn't worked out. I haven't heard many people regularly listen to Suno because, again, music is already incredibly cheap. Spotify is ~$15/month and it gives you access to the Beatles and Rolling Stones. The novelty of AI-generated "Korean goa psytrance 2-step" is fun for a bit, but how much will people pay for it, how many, and for how long?
I do think there's a lot of potential targeting musicians who incorporate AI-generated elements in their songs. (Disclaimer: I am a musician who has been using vocal synths for many years, and have started incorporating AI-generated samples into my workflows.) However as you point out, the functionality needed for Suno to work here is very different from the "write prompt, get fully complete song" use case.
It'll be interesting to see where it goes from here. In general, AI-based tooling does appear to be pivoting more towards "tools for creators" rather than "magic button that produces content", so I'm hopeful.
[0] One notable one is the artist "009 Sound System", who had a bunch of CC-licensed tracks that became popular due to YouTube's music swapping feature; since the list was sorted alphabetically, their tracks ended up getting used in a ton of videos and gaining popularity. https://en.wikipedia.org/wiki/Alexander_Perls#YouTube
Yeah, AI music gen is super fun to play with for a half-hour or so - and it's great when I need a novelty song made for a friend's wedding or special birthday - like, once a year maybe. But neither of those seems like a use case that leads to sustainable, high-value ARR. I'm starting to wonder if maybe most AI music generation companies ended up here because AI researchers saw a huge pile of awesome produced content that was already partially tagged and it looked like too perfect of a nail not to swing their LLM hammer at. And, until recently, VCs were throwing money at anything "AI" without needing to show product/market fit.
I'm not sure they fully thought through the use case of typical music listeners or considered the entrenched competition offering >95% of all music humans have ever recorded - for around ~$10/mo. As you said, another potential customer is media producers who need background tracks for social media videos, but between the stock music industry offering millions of great tracks for pennies each and the "Fiverr"-type producers who'll make a surprisingly good custom track in a day that you can own for $25, I'm not seeing much ARR for AI music generators there either.
Currently the launch hypothesis of AI music generation puts these companies in direct competition against mature, high-quality alternatives that are already entrenched and cheap - use cases currently served by literally the best-of-the-best content humanity has ever created. Targeting that replacement as the first goal seems as dumb as SpaceX setting "landing on Mars" as the goal of its first launch: there's no way to incrementally iterate toward that value proposition. Sure, targeting more modest incremental goals may be less exciting, but it also doesn't require perfection. Fortunately, music producers have needs that are more tractable but still valuable to solve, and not currently well served by cheap, plentiful, high-quality alternatives. And music producers are generally easier to reach, satisfy, and retain long-term than music listeners or music licensers.
Plus Ableton Live itself has a lot of generative tools these days:
https://www.youtube.com/watch?v=_RNXVfo-oLc
But I honestly don't see the point. The journey is the whole point when making music or any art really. AI doesn't solve a problem here. There never has been one in the first place. There is more music out there than you could ever listen to. Automating something humans deeply enjoy is misguided.
Listening to the studio mix on my headphones at home will always be better sound than being in a crowded concert.
I mean you are right to a certain degree, if it works, it works and if generative tools inspire you to make better music that is great. I am not so sure about that though.
I am forced to vibe code at work and it has not made me more creative. It has made me want to quit my job.
I'm not saying you need to use generative tools, but if it helps you make music you should do it. Ultimately what you're sharing with the world is your taste, not your technical abilities. To slightly expand on a famous quote in the music world -
> I thought using AI was cheating, so I used loops. I then thought using loops was cheating, so I programmed my own using samples. I then thought using samples was cheating, so I recorded real drums. I then thought that programming it was cheating, so I learned to play drums for real. I then thought using bought drums was cheating, so I learned to make my own. I then thought using premade skins was cheating, so I killed a goat and skinned it. I then thought that that was cheating too, so I grew my own goat from a baby goat. I also think that is cheating, but I’m not sure where to go from here. I haven’t made any music lately, what with the goat farming and all.
If you enjoy writing music your way, great. But I strongly disagree that it’s a mistake to enable people to approach it differently.
All my artist friends were criticizing it, and I thought they were following some form of neo-Luddism. Why not embrace progress? No one is forcing them to use it, and if it lowers the barrier to entry, isn't that great? Surely generative AI could be used to enhance an artist's workflow?
Oh, how wrong I have been. In reality it has only been used to replace artists and to devalue their work. It has no place in an artist's pipeline.
https://aftermath.site/ai-video-game-development-art-vibe-co...
I think the use of generative AI, or at least of generalist LLMs, is fundamentally different from artists embracing new media and new processes. Digital drawing is still roughly the same process as drawing on paper: most skills carry over, and you are still in control. Using a prompt to create images is not drawing.
I also recommend:
I started using it to generate songs that reinforce emotional regulation strategies: things like grounding, breathwork, staying present. Not instructional tracks, which would be unbearable, but actual songs with lyrics that reflect real practice and skills.
It started as a way to help me decompress after therapy. I'd listen to a mini-album I made during the drive home. Eventually, I’d catch myself recalling a lyric in stressful moments elsewhere. That was the moment things clicked. The songs weren’t just a way for me to calm down on the way home, they were teaching me real emotional skills I could use in all parts of my life. I wasn’t consciously practicing mindfulness anymore; it was showing up on its own. Since then I’ve been iterating, writing lyrics that reflect emotional-cognitive skills, generating songs with them, and listening while I'm in the car. It's honestly changed my life in a subtle but deep way.
We already have work songs, lullabies, marching music, and religious chants - all music that serves a purpose besides existing to be listened to. Music that exists to teach us ways of interacting is a largely untapped idea.
This kind of functional application is what generative music is perfect for. Songs can be so much more than terminally romantic lyricists trying to speak to the lowest common denominator. They can teach us to be better versions of ourselves.
Still, I'm excited about the product. The composer could probably use some chain of thought, if it doesn't already, to plan larger sequences and how they relate to each other. Suno is also probably the product most ripe for a functional neurosymbolic model. C.P.E. Bach wrote an algorithm for counterpoint hundreds of years ago!
https://www.reddit.com/r/classicalmusic/comments/4qul1b/crea... (Note: the original site has been taken over, but you can access it via the Wayback Machine. Unfortunately I couldn't find a save where the generation demo works... but I swear it did! I used it at the time!)
https://suno.com/playlist/e6c3f3d1-a746-4106-bea1-e36073d227...
Side note: It feels a little vulnerable to be sharing these. They genuinely helped me through difficult times and I wasn't really expecting anyone else to ever listen to them.
https://music.apple.com/au/album/breath-of-the-cosmos/175227...
https://open.spotify.com/track/0mJoJ0XiQZ8HglUdhWhg2F?si=tID...
But I really think they've made a mistake with direction. Realistically it should've been trained on tracker files and should build songs via that method (but generate the vocals and the individual instrument sounds for the MIDI, obviously).
I think the quality would be higher, since the track can essentially be "rendered" out, and only then would it be a useful tool for actual musicians: you could get a skeleton file (MOD, etc.) for a song that you can then tweak and add a human touch to.
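As a toy illustration of what such a skeleton could look like, here's a made-up, MOD-inspired representation (the format and all names are invented; real tracker formats are binary and far richer):

    # Made-up, MOD-inspired toy skeleton: the model emits editable symbolic
    # events per channel, and sound/sample choice stays in the musician's hands.
    NOTE_OFF = "=="

    skeleton = {
        "bpm": 125,
        "rows_per_beat": 4,
        "channels": {
            # channel -> list of (row, note, instrument) events
            "bass":  [(0, "C2", "analog_bass"), (8, "G1", "analog_bass")],
            "lead":  [(0, "E4", "saw_lead"), (4, "G4", "saw_lead"), (12, NOTE_OFF, "")],
            "drums": [(row, "kick", "909_kit") for row in range(0, 16, 4)],
        },
    }

    # A human can move, delete, or re-voice any event before rendering -
    # exactly the "tweak and add a human touch" step a finished mix doesn't allow.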
https://suno.com/playlist/d2886382-bcb9-4d6d-8d7a-78625adcbe...
I'm not sure if this is solvable, but I think it should be a bigger research topic. If anyone knows of any papers on this, I haven't found one yet (not sure what to search for).
You’re right!
> Even jazz uses 6-2-5-1's over and over.
You’re not even wrong! I wonder if jazz does anything else besides that?
What's the basis for this? Unfortunately it's hard to describe, but I've listened to a wide variety of popular and niche genres my whole life with a specific eye toward appreciating all the different ways people appreciate music and I know when something feels new.
Even most (or all?) pop music feels new. If it wasn't, I don't think it would be popular. Sure, it's all derivative, but what makes music enjoyable is when it combines influences in a fresh way.
"French house polka" achieved by doing a statistical blend of the two genres just isn't that interesting—it misses so many ways that things are combined by artists—specific details of the texture of the sound, how it's recorded, cultural references, tons of tiny little influences from specific artists and other genres, etc.
I've tried very specific prompts in Suno and it's not even close to useful for someone who knows what they're doing. The image generators are hardly better—things overwhelmingly trend toward a fixed set of specific styles that are well-represented in the training sets.
This critique falls down in certain areas though. Using tools like Suno to come up with songwriting ideas can be fantastic—as long as you have the taste to curate the outputs and feed it into your own creative process. It's also fantastic for creating vocal samples, and I'm sure it'll crush it for functional music (advertisements, certain types of social media) in short order
That is, you try something new (random) and you, the human, are also the verifier to see whether that random new thing was subjectively good, and then you select based on that.
In this understanding of creativity, creating a new style is a gradual process of evolution (mutation and selection) where your own brain is a necessary part of carrying out the selection. "Did the new idea (mutation) I just tried make me feel good or not (selection)?"
That activity of you being the verifier is effortless and instant, but the AI fundamentally can't tap into that verification, so it has no ability to do the selection step after the randomness, and so creativity cannot emerge no matter the architecture (unless you somehow create a proxy for a human verifier, which seems insanely hard).
The only solution I can see to this is to try to simulate this process, seems possible but hard.
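A minimal sketch of that loop, with the human kept in as the verifier; `generate_variation` is a stand-in for any generative model, and the whole thing is illustrative rather than a real system:

    import random

    def generate_variation(params: dict) -> dict:
        """Random mutation step (stand-in for a real generative model)."""
        mutated = dict(params)
        knob = random.choice(list(mutated))
        mutated[knob] = round(mutated[knob] * random.uniform(0.8, 1.25), 3)
        return mutated

    def evolve(seed: dict, generations: int = 10) -> dict:
        best = seed
        for _ in range(generations):
            candidate = generate_variation(best)
            print(candidate)
            # The selection step: a human listens and judges. This is the
            # part argued above to have no cheap automatic proxy.
            if input("Keep this variation? [y/n] ").strip().lower() == "y":
                best = candidate
        return best

    if __name__ == "__main__":
        evolve({"tempo": 120.0, "brightness": 0.5, "swing": 0.1})

Simulating the process would mean replacing that input() with a learned model of your own taste, which is exactly the hard proxy problem.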
It's great; however, it sucked in other languages, sounding like a foreigner trying to speak like a local.
Or, if I'm listening to music just for the vibe, I really don't care how it's created, as long as it doesn't offend me auditorily. I'm really not listening actively. I suppose that's a bit of an indictment of myself, but I don't think it's a serious character flaw. I should probably just try to pay more attention to the people around me at all times.
I have a lot of fun putting my own poetry in here and mashing it up with the styles that I enjoy listening to, or that I think would work well with the poem. Again, I don't want to like it, but I do.
# Original record scratch contest-style song https://suno.com/s/8MvZmfkDPIPmKLtm
And this is a good example of how the "magic" is lost in a cover of that same contest entry with no attempt to curate:
# v4.5 cover https://suno.com/s/KyCZZNn6PpL4JHbO
Here's another one I put a bit of time into, but with a much simpler structure. What I appreciated about the original were the emotions it stirred up when the notes came together just-so:
# Original ambient synth https://suno.com/s/JtmmbdA2VtgO4drK
New cover, pretty decent but it lost what I liked the most (haven't had great luck with v4.5 remasters yet, but I do a lot of weird things):
# v4.5 cover https://suno.com/s/Gi8wy1QjUaHmYNKy
# Original piano piece https://suno.com/s/yj8rHRRgJEWD83GY
# v4.5 remaster https://suno.com/s/Xx5Y5SNl1MdDrLsO
When you ignore the stuff that humans shouldn't get credit for - e.g. I didn't "make" this song, or play any part in its "production", but I did "curate" it - there's still something left to give credit for, right? It's basically like a DJ digging through a mysterious crate of records.
"Cajun synthpop chant" has no chanting or synths, it sounds more like country music with french woman vocals
Others, such as [Interrupt], will produce a DJ-like fade-out / announcement ("that was <Artist name>, next up...") / fade-in, providing an opportunity to break the AI out of the repetitive loops it obsesses over.
I've used [Bridge] successfully, and [Instrumental] and [No vocals] work reliably as well (there are also dedicated instrumental options, but I still use brackets out of habit, I guess).
For example, I was recently trying to steer a melodic techno prompt in a better direction by putting stuff like this up front:
All of this is just stuff I kind of made up and wanted in the song, but it meaningfully improved the output over tags alone. I think "steering/nudging the generation space" is a decent way to describe how this affects the output. I also often use brackets for song structure, like [intro], [break], and [chorus], and even get more descriptive with these, describing things or moments I'd like to happen. Again, adherence is not perfect, but it seems to help steer things.
One of my favorite tags I've seen is [Suck the entire song through vacuum] and well... I choose to believe, check out 1:29 https://suno.com/s/xdIDhlKQUed0Dp1I
Worth playing around with a bunch, especially if you're not quite getting something interesting or in the direction you want.
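As one more made-up illustration of a lyric-sheet layout in that spirit (tags are nudges, not commands, and adherence varies run to run):

[Intro] [slow build, arpeggios only, no drums]
[Verse] [whispered, close-mic]
(verse lyrics here)
[Break] [bass drop]
[Chorus] [full voice, stacked harmonies]
(chorus lyrics here)
[Outro] [fade into field recordings]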
So, more slop?