Creating art with artificial intelligence is not new. It’s as old as AI itself.
What’s new is that a number of tools now allow most people to generate images by entering a text prompt. All you have to do is write “a landscape in the style of van Gogh” in a text box and the AI can create a beautiful image as instructed.
The power of this technology lies in its ability to use human language to control art production. But do these systems accurately translate an artist’s vision? Can incorporating language into art really lead to artistic breakthroughs?
I’ve been working with generative AI as an artist and computer scientist, and I contend that this new kind of tool limits the creative process.
When you write a text prompt to generate an image with AI, the possibilities are endless. If you’re a casual user, you might be happy with what the AI generates for you. And startups and investors have poured billions into this technology, seeing it as an easy way to create graphics for articles, video game characters, and advertisements.
In contrast, an artist may need to write an essay-like prompt in order to create a high-quality image that reflects their vision, with the right composition, lighting, and shading. This lengthy prompt does not necessarily describe the image; it typically uses many keywords to evoke in the system what is in the artist’s mind. There is a relatively new term for this: prompt engineering.
Basically, the role of an artist using these tools is to reverse engineer the system to find the right keywords that will force the system to generate the desired output. It takes a lot of effort and a lot of trial and error to find the right words.
AI is not as smart as it seems
In order to learn how to better control the outputs, it is important to realize that most of these systems are trained using pictures and captions from the Internet.
Consider what a typical caption says about an image. Captions are usually written to complement the visual experience of browsing the web.
For example, the caption could include the name of the photographer and the copyright owner. On some sites like Flickr, a caption typically describes the type of camera and lens used. On other websites, the caption describes the graphics engine and hardware used to render an image.
In order to write a useful text prompt, users must enter many non-descriptive keywords for the AI system to create an appropriate image.
Today’s AI systems are not as intelligent as they seem; they are essentially intelligent retrieval systems that have large memories and work through associations.
Artists are frustrated with a lack of control
Is this really the kind of tool that can help artists create great works?
At Playform AI, a generative AI art platform that I founded, we conducted a survey to better understand artists’ experiences with generative AI. We collected responses from over 500 digital artists, traditional painters, photographers, illustrators, and graphic designers using platforms such as DALL-E, Stable Diffusion, and Midjourney, among others.
Only 46% of respondents found such tools “very useful,” while 32% found them somewhat useful but were unable to integrate them into their workflow. The remaining 22% did not find them useful at all.
The biggest limitation artists and designers highlighted was a lack of control. On a scale of 0 to 10, with 10 being the most control, respondents indicated that their ability to control the outcome was between 4 and 5. Half of the respondents found the results interesting but not of high enough quality to use in their practice.
Asked whether generative AI would impact their practice, 90% of the artists surveyed thought it would: 46% believed the impact would be positive, 7% predicted it would be negative, and 37% believed their practice would be affected but were unsure how.
The best visual art goes beyond language
Are these limitations fundamental or will they just go away as technology improves?
Of course, newer versions of generative AI will give users more control over outputs, as well as higher resolutions and better image quality.
But for me the biggest limitation as far as art is concerned is fundamental: it’s the process of using language as the main driver in generating the image.
Visual artists are by definition visual thinkers. When imagining their work, they typically base it on visual references, not words—a memory, a collection of photographs, or other artwork they encountered.
When language is the focus of image creation, I see an additional barrier between the artist and the digital canvas. Pixels are only rendered through the lens of language. Artists lose the freedom to manipulate pixels outside of the confines of semantics.
There is another fundamental limitation with text-to-image technology.
If two artists enter the exact same prompt, it is very unlikely that the system will generate the same image. It’s not because of anything the artists did; the different results are simply due to the AI starting from different random initial images.
In other words: the artist’s output is reduced to chance.
Nearly two-thirds of the artists we surveyed had concerns that their AI generations might resemble the work of other artists and that the technology might not reflect their identity, or might even replace it entirely.
The issue of artist identity is crucial when it comes to making and recognizing art. In the 19th century, when photography became popular, there was a debate about whether photography is an art form. In 1861, a court case ensued in France to decide whether photography could be copyrighted as an art form. The decision depended on whether an artist’s unique identity could be expressed through photographs.
The same questions arise when looking at AI systems that are trained with the existing images of the Internet.
Before the advent of text-to-image prompting, creating art with AI was a more involved process: artists typically trained their own AI models based on their own images. This allowed them to use their own work as a visual reference and gave them more control over the results, which better reflected their unique style.
Text-to-image tools can be useful for certain writers and casual users who want to create graphics for a work presentation or social media post.
But when it comes to art, I can’t imagine how text-to-image software can adequately reflect the artist’s true intentions or capture the beauty and emotional resonance of works that captivate viewers and let them see the world with new eyes.