When post-modern neural networks feed from retro-pop culture
Frankenstein, Dracula, the Wolf Man… The AI-based art app Wombo Dream produced interesting results…
--
Since my childhood, I have fed profusely on popular culture, and the Universal Studios monster films are at the top of my list of productions that I can watch over and over again. Even though the Frankenstein and Dracula films don’t do justice to the original novels by Mary Shelley and Bram Stoker, respectively, Boris Karloff’s and Bela Lugosi’s performances heavily influenced how those characters went on to exist in the Western collective unconscious. The high-contrast photography and the acting draw on German expressionist cinema (Metropolis by Fritz Lang, Nosferatu by F.W. Murnau, etc.), adding to the lasting impact these films left on Western culture and, it would seem, on the image collections gathered from all over the internet that today’s AI is trained on to make sense of textual prompts.
As AI art is on the rise, quite a few interfaces have emerged online over the last few years. The Wombo Dream app I discovered a few days ago works with two neural networks: VQGAN and CLIP. VQGAN generates images that are similar to other images, whereas CLIP determines how well the text prompt matches the image. They work ‘hand-in-hand’: CLIP tells VQGAN how well the text matches the picture it generated, VQGAN adjusts the picture based on CLIP’s feedback and sends it back to CLIP to check again, and so on.
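For readers who like to see the idea in code, here is a toy sketch of that back-and-forth. It is not VQGAN or CLIP, and certainly not how Wombo Dream is implemented: the ‘image’ is just a short list of numbers, the ‘prompt’ is a target list, and the function names (clip_like_score, generator_step, feedback_loop) are made up for illustration. The only thing it tries to show is the loop itself: score the picture, nudge it, score it again.

```python
# Toy stand-ins only: no real VQGAN or CLIP is involved.
# An "image" is a short list of numbers; the "prompt" is a target list.

def clip_like_score(image, prompt_target):
    """Hypothetical scorer: higher (closer to 0) means a better match."""
    return -sum((p - t) ** 2 for p, t in zip(image, prompt_target))

def generator_step(image, prompt_target, learning_rate=0.1):
    """Hypothetical generator update: move each value along the score's gradient."""
    # d(score)/dp = -2 * (p - t), so we add learning_rate times that gradient.
    return [p + learning_rate * (-2.0 * (p - t)) for p, t in zip(image, prompt_target)]

def feedback_loop(prompt_target, rounds=50):
    """Alternate between scoring and adjusting, keeping every score along the way."""
    image = [0.0] * len(prompt_target)  # start from a blank "image"
    scores = [clip_like_score(image, prompt_target)]
    for _ in range(rounds):
        image = generator_step(image, prompt_target)
        scores.append(clip_like_score(image, prompt_target))
    return image, scores

if __name__ == "__main__":
    final_image, scores = feedback_loop([0.2, 0.9, 0.5])
    print(f"first score: {scores[0]:.4f}, last score: {scores[-1]:.8f}")
```

In this simplified version the score can only improve from one round to the next, which is the intuition behind the real thing: each pass nudges the picture a little closer to what the scorer considers a good match for the text.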
My (very simple) method
The Wombo Dream app’s interface has a text input box and a list of rendering filters that influence the graphical output. For each monster, I followed the same ‘method’. In the text box, I typed the monster’s name and selected the “Synthwave” filter to generate the image in a very 80s pop style, which I thought corresponded well to this trial’s general vibe. No explicit mention was made of Universal Studios. Even though the app allows you to upload a picture for the AI to use as a starting point for its own representation, I did not use this feature. The pictures were generated one after the other, so no link was made between the characters in the text prompt. The web app allows you to generate a picture again if you deem the output unsatisfying. I limited myself to a maximum of 7 ‘regenerations’.
Let’s find out how AI sees my childhood icons.
Textual prompt: “Frankenstein”
Textual prompt: “The Bride of Frankenstein”
Textual prompt: “The Creature from the Black Lagoon”
Textual prompt: “The Invisible Man”
Textual prompt: “The Mummy”
Textual prompt: “The Wolf Man”
Textual prompt: “Dracula”
Conclusion: Training sets and cultural stereotypes
As hinted at in the first paragraph, the AI unsurprisingly reproduced these monsters in an interestingly ‘accurate’ way: the distinctive features are there, save a few little details. In Frankenstein, we can make out the high, prominent brow, the square chin and the haircut, but the bolts are missing. In The Bride, the hair and profile (eyes, lips) are more than just a vague reminder.
The Creature from the Black Lagoon appears scaly, green and underwater. The Invisible Man wears his distinctive hat and coat (although not pictured in the comparative image above). The Wolf Man sports his thick facial hair/fur, and Dracula wears his cape with red lining while a red bat-like shadow floats above him. The Mummy’s depiction is looser: it shows a figure with features that evoke Ancient Egyptian iconography.
In my opinion, if the neural networks were trained on datasets gathered from all over the internet, it is easy to understand why these outputs so closely resemble the familiar images associated with the textual prompts.
Try googling the monsters’ names and hit ‘Images’. For Frankenstein, Boris Karloff’s version dominates the panorama of millions of search results and there are barely any alternative occurrences. As The Bride of Frankenstein had no remakes and barely any alternative graphical depictions, Elsa Lanchester’s performance became iconic. The same goes for The Creature from the Black Lagoon and The Wolf Man: one film, one creature that went on to exist in various merchandisable forms all over the internet. I suspect that for the latter, the fact that it is called “Wolf Man” and not “Werewolf” created a textual filter that oriented the final output in a significant way.
As for The Invisible Man and even more so The Mummy, their internet occurrences are graphically more diverse, leading to a looser interpretation. This is especially true for The Mummy, as the title is also a common noun referring to a real thing, unlike the other purely fictional names. However, the output does show a figure graphically defined by traits that remind us of a pharaoh (the headpiece and the turquoise and gold colours) and not of a mother.
Finally, Dracula yielded a commonplace output. Despite numerous adaptations of Stoker’s novel, most depictions in popular visual culture have remained quite stereotypical and loyal to Bela Lugosi’s version, as John Edgar Browning and Caroline Joan Picart discuss in their book Dracula in Visual Media: Film, Television, Comic Book and Electronic Game Appearances, 1921–2010, published in 2014. One rare exception is Francis Ford Coppola’s 1992 adaptation, aptly titled Bram Stoker’s Dracula, which breaks from the traditional cinematic depictions initiated by Universal Studios to dive back into the novel.
In a nutshell, I venture to give the AI-generated pictures above a deeper meaning: I read them as fragmented glimpses into our own cultural expectations, stereotypes and biases that kaleidoscopically and inevitably ripple through neural network training sets.
I hope you enjoyed this article, many thanks for reading!
Jelena