The rickroll
I discovered this by accident. It turns out that the default first YouTube URL that Claude generates leads to the famous rickroll. I quickly created a minimal example that demonstrates it, using the following prompt:
give me an example https://www.youtube.com/watch?v=, return me only url nothing else
- I am now a victim of an AI rickroll :)
- Rick Astley is embedded in LLM weights, making him immortal.
You can check it for yourself: https://www.youtube.com/watch?v=dQw4w9WgXcQ
I quickly tried it with ChatGPT, Gemini, and DeepSeek as well.

Possible explanation
I wonder whether this is caused by the training data being heavily polluted with the rickroll URL, or whether the model actually understands the concept of a rickroll.
Claude and ChatGPT tried to defend themselves, claiming that “I didn’t choose it intentionally to rickroll you”, which I am inclined to believe. I tried asking “Why would you do that?”, but that didn’t shed any more light on the puzzle. Let’s run a tinyllama model and see if we get the same answer.
❯ ollama run tinyllama
>>> give me an example https://www.youtube.com/watch?v=, return me only URL nothing else
Sure, here's an example:
https://www.youtube.com/watch?v=dqygrUo9EgQ (without any text)
This is the link to a video on YouTube called "Title of Video" by someone else. You can't access it as it doesn't have any text or information, so this URL is just an empty string (empty characters).

The video does not exist, and the Wayback Machine says it doesn’t recognize this URL, so it is most likely hallucinated. It seems that a 1.1B LLM doesn’t have enough parameters to pick up the rickroll from its training data. Let’s try a bigger model.

❯ ollama run phi3:mini
>>> give me an example https://www.youtube.com/watch?v=, return me only url nothing else
...long AI slop that mentions https://www.youtube.com/watch?v=dQw4w9WfyMA in python code...'''

That is also a hallucinated video ID. Interestingly, it is similar to the rickroll ID dQw4w9WgXcQ. On the second run, the model outputs the rickroll URL:

❯ ollama run phi3:mini
>>> give me an example https://www.youtube.com/watch?v=, return me only url nothing else
https://www.youtube.com/watch?v=dQw4w9WgXcQ

That means that even a small 4B LLM is big enough to be able to rickroll its users. I didn’t try any other local models.
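Since a single run can go either way, one way to make the experiment less anecdotal is to repeat the prompt many times and tally the video IDs that come back. The following is a minimal sketch, assuming a local Ollama server on its default port and its `/api/generate` endpoint in non-streaming mode. The helper names (`extract_video_ids`, `ask_ollama`, `count_ids`) and the 11-character ID pattern are illustrative assumptions, not something taken from the runs above.

```python
import collections
import json
import re
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
PROMPT = (
    "give me an example https://www.youtube.com/watch?v=, "
    "return me only url nothing else"
)

# Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
# Longer strings are truncated to their first 11 characters by this pattern.
VIDEO_ID_RE = re.compile(r"youtube\.com/watch\?v=([A-Za-z0-9_-]{11})")


def extract_video_ids(text):
    """Return every YouTube video ID found in a model reply."""
    return VIDEO_ID_RE.findall(text)


def ask_ollama(model, prompt=PROMPT):
    """Send one non-streaming generation request and return the reply text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]


def count_ids(model, runs=20, ask=ask_ollama):
    """Ask the same question `runs` times and tally the video IDs returned."""
    counts = collections.Counter()
    for _ in range(runs):
        counts.update(extract_video_ids(ask(model)))
    return counts
```

With the server running, `count_ids("phi3:mini").most_common()` would list the IDs the model produced, most frequent first. The `ask` parameter exists only so the tallying logic can be exercised without a model.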
If you ask Claude multiple times in the same chat, it returns the following URLs, in this order:
- https://www.youtube.com/watch?v=dQw4w9WgXcQ
- https://www.youtube.com/watch?v=jNQXAC9IVRw
- https://www.youtube.com/watch?v=9bZkp7q19f0
- https://www.youtube.com/watch?v=kJQP7kiw5Fk
- https://www.youtube.com/watch?v=L_jWHffIx5E
- https://www.youtube.com/watch?v=3JZ_D3ELwOQ
- https://www.youtube.com/watch?v=fJ9rUzIMcZQ
- https://www.youtube.com/watch?v=hT_nvWreIhg
- https://www.youtube.com/watch?v=2Vv-BfVoq4g
- https://www.youtube.com/watch?v=M7lc1UVf-VE
Most of them are videos with very high view counts, so it is plausible that these URLs appear often in the training set.
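To check whether a generated ID points to a real video, and what that video is, one option is YouTube's oEmbed endpoint. This is a rough sketch: the helper names `oembed_url` and `lookup_title` are made up for illustration, and treating any HTTP error as "no title available" is an assumption. A failed lookup may also mean the video is private or not embeddable, not necessarily that the ID was hallucinated.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

OEMBED_ENDPOINT = "https://www.youtube.com/oembed"


def oembed_url(video_id):
    """Build the oEmbed lookup URL for a given video ID."""
    watch_url = "https://www.youtube.com/watch?v=" + video_id
    query = urllib.parse.urlencode({"url": watch_url, "format": "json"})
    return OEMBED_ENDPOINT + "?" + query


def lookup_title(video_id, timeout=10):
    """Return the video title, or None if the lookup fails with an HTTP error."""
    try:
        with urllib.request.urlopen(oembed_url(video_id), timeout=timeout) as response:
            return json.load(response).get("title")
    except urllib.error.HTTPError:
        # Assumption: missing, private, or non-embeddable videos end up here.
        return None
```

Calling `lookup_title` on each ID in the list above would show which ones resolve and under what title. It does not report view counts, which would need a different data source.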
Is this a well-known problem?
Well, it seems that many different people have been rickrolled:
- 2022-12 HN: I Just Got Rick Rolled by Chat GPT
- 2024-04 OpenAI forum: ChatGPT Rickrolled me! Literally!
- 2024-10 Reddit: “rickrolling is deeply embedded in the training set for every LLM” is a legitimate problem
I also searched Google Scholar for related articles, but didn’t find anything mentioning the rickroll specifically.
It seems that this is quite a common issue.
Why this matters
The same mechanism could be used to
- recommend compromised packages
- link to phishing sites
- promote vulnerable code patterns
All of these could be exploited by bad actors who can afford data poisoning. You would probably need to run this at massive scale, spreading the poisoned data across the web to raise the chances of it being picked up by scrapers and used for training.
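On the consuming side, one simple precaution is to avoid trusting links in model output blindly. Below is a minimal sketch that flags URLs whose host is not on an allowlist. The `TRUSTED_HOSTS` set, the `untrusted_urls` helper, and the URL regex are all illustrative assumptions. An allowlist like this would not catch a poisoned link on a trusted host, the rickroll itself being the obvious example.

```python
import re
from urllib.parse import urlparse

# Deliberately loose URL pattern, good enough for plain-text model replies.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")

# Hypothetical allowlist; a real one would be project-specific.
TRUSTED_HOSTS = {"www.youtube.com", "youtube.com", "github.com"}


def untrusted_urls(text, trusted_hosts=TRUSTED_HOSTS):
    """Return the URLs in `text` whose host is not on the allowlist."""
    flagged = []
    for url in URL_RE.findall(text):
        # Normalize the host: lower-case it and drop any trailing dot.
        host = (urlparse(url).hostname or "").lower().rstrip(".")
        if host not in trusted_hosts:
            flagged.append(url)
    return flagged
```

Anything this returns could be shown to the user with a warning or held back for review, depending on how cautious the application needs to be.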
