9 Comments

  1. I recently listened to a fake AI-generated Lady Gaga leak that included several of the hard slop words: endless, electric, neon, etc. I suspected it was Suno-generated, but now I’m positive.

    Suno really seems to like common but “vivid” words that might be suggested in a lesson on songwriting but probably wouldn’t ring true to the average real songwriter.

  2. It’s not clear to me what we see on the horizontal and the vertical axes, and it’s also not clear to me what the font size signifies. Could you please explain?

    The vertical axis appears to be totally random, so there’s no point in e.g. comparing the top ten percent to the bottom ten percent, right?

    The horizontal axis is apparently the interesting one, the one that indicates “likely word usage”. But how did you calculate that? It certainly can’t be the case that the words on the extreme left occur exclusively in “Suno” lyrics. I for sure know a few human-written lyrics that contain “joy” or “laughter”, so they must have a “likely word usage” larger than 0.0 for human-written lyrics as well. Is this something like a difference in probabilities, i.e. something like *P*(“suno”) – *P*(“genius”)? Or did you use some sort of [keyness](https://en.wikipedia.org/wiki/Keyword_(linguistics)) measure? But most keyness measures that I know aren’t restricted to a fixed data range, which your points on the x axis certainly are.

    With regard to the font size, this may be related to absolute frequencies, as it’s the usual suspects like personal pronouns and articles that use a bigger font size (you know, those words that are usually filtered out in the first place). Is that really all that there is to it? If so, why even bother?

  3. Great post, saved it! What is this possible Rap bias about? Could you tell us a bit about your dataset & method?

  4. outragednitpicker on

    Stay on the left for 5-cent ice cream cones; stay on the right to have your car keyed.