

Sunday, August 25, 2024

Ouroboros, an apt symbol for AI model collapse

Engraving of a wyvern-type ouroboros by Lucas Jennis, in the 1625 alchemical tract De Lapide Philosophico

by Ariella Brown


AI hits the ouroboros (sometimes written uroboros) stage. You've likely seen the symbol as a snake in a circle, eating its own tail. The ancient symbol also sometimes showed a dragon or a wyvern, so for my illustration I chose this engraving by Lucas Jennis, intended to represent mercury, from the 1625 alchemical tract "De Lapide Philosophico," instead of just going with something as prosaic as "model collapse."


To get a bit meta and bring generative AI into the picture (pun intended, I'm afraid), here's an ouroboros image made with generative AI, in this case Google Gemini.

Ouroboros image generated by Google Gemini



Model collapse is the name that researchers publishing in Nature gave to the phenomenon of large language models (LLMs) doing the equivalent of eating their own tails: ingesting LLM output as training data for the next generation of models. They stress that "data collected about genuine human interactions" will become increasingly valuable.

From the abstract:
"Here we consider what may happen to GPT-{n} once LLMs contribute much of the text found online. We find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models, in which tails of the original content distribution disappear. We refer to this effect as ‘model collapse’ and show that it can occur in LLMs as well as in variational autoencoders (VAEs) and Gaussian mixture models (GMMs). We build theoretical intuition behind the phenomenon and portray its ubiquity among all learned generative models. We demonstrate that it must be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, the value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of LLM-generated content in data crawled from the Internet."

Shumailov, I., Shumaylov, Z., Zhao, Y. et al. AI models collapse when trained on recursively generated data. Nature 631, 755–759 (2024).
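
For anyone who wants to watch the tail-eating happen, here is a minimal Python sketch of my own. It is not the paper's code, and it assumes the simplest possible "model": a single Gaussian that is refit, generation after generation, only to samples drawn from the previous generation's fit. The function name simulate_collapse and its parameters are hypothetical, chosen just for this illustration. With small samples, the fitted spread tends to shrink over the generations, which is a toy version of the tails of the distribution disappearing.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=20, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviations, one per generation,
    starting with the original ("human") distribution at index 0.
    This is a toy illustration, not the method used in the Nature paper.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: stand-in for genuine human data
    sigmas = [sigma]
    for _ in range(generations):
        # "Train" only on data produced by the previous generation's model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        sigmas.append(sigma)
    return sigmas


if __name__ == "__main__":
    sigmas = simulate_collapse()
    for g in (0, 10, 50, 100, 200):
        print(f"generation {g:3d}: fitted std = {sigmas[g]:.6f}")
```

Raising sample_size slows the shrinkage, since each generation then sees more of the previous one's spread. Mixing some of the original data back in at every step would be the toy analogue of keeping genuine human data in the training set.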

Let me know in the comments which illustration you like more.