Can Artificial Intelligence do Art?
2023 will clearly go down in history as the year in which artificial intelligence suddenly became available to everyone, and in a multitude of applications. AI applications are currently springing up like mushrooms, and it is not easy to keep track of them all. In my blog articles I will therefore take a closer look at a few selected applications and only briefly introduce others.
I want to start with Midjourney, which is used through Discord. Midjourney is a text-to-image AI. It was not among the first of these applications, but it is currently the most popular among designers, gamers and photo enthusiasts because the visual results are absolutely stunning. I can hardly imagine the impact this application will have on the professions of graphic designers, illustrators and photographers, on the issues of copyright, deepfakes and manipulation, or on the future of stock-image databases with their millions of images for sale.
To get good results with Midjourney, the challenge is to develop the right prompt, which significantly influences and controls the image result (see also my blog article on prompt crafting). A prompt is a written instruction to the AI about what appearance the image should have when it is generated.
First of all, you have to register with Discord and find your way around the user interface, which is somewhat confusing. I don't want to go into detail here about how best to find your way around the Midjourney environment, so I'll just refer you to YouTube, where there are countless tutorial videos on how to use the Midjourney AI.
A prompt for a new image always begins with the command /imagine, followed by up to three parts (Midjourney always produces 4 image variations per prompt):
1. Image prompt (URL to an image). This part is optional: if you do not want to use an image as a template, it can be omitted.
2. Text description: a precise description of what the image should contain, e.g. the appearance, clothing or posture of one or more persons, the visible image detail, buildings, landscape, colours, brightness or darkness, and image styles (e.g. hyperrealistic, comic, steampunk, illustration, Disney or Pixar style, references to well-known artists, illustrators, photographers, etc.). This second part is the most difficult one. Professionals guard their prompts like a treasure, having refined them through countless tedious repetitions until the result was satisfactory. The golden path to the perfect picture leads only through a precisely formulated prompt. The first prompt bundles are already being sold on the net for 30 or 40 €, on special themes such as architecture, landscapes, depictions of people, etc.
3. Parameters: There is a huge number of parameter shorthands that are appended to the description and directly influence the generated image, e.g. --ar for aspect ratio (image format), --no to exclude certain elements, --s <number> for an additional aesthetic style, --v 4 (or 5) for the version of Midjourney, and many other values. The Midjourney website lists these parameters and describes them in detail. Of course, there are also countless sources on the net that explain these parameters.
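For readers who like to keep their prompts in a script, the three parts can be sketched as a small Python helper. This is only string assembly under my own assumptions: the function name build_prompt and its argument names are hypothetical, nothing is sent to Midjourney, and only the four parameters named above are covered.

```python
from typing import Optional


def build_prompt(description: str,
                 image_url: Optional[str] = None,
                 aspect_ratio: Optional[str] = None,
                 exclude: Optional[str] = None,
                 stylize: Optional[int] = None,
                 version: Optional[int] = None) -> str:
    """Assemble the text that follows /imagine (hypothetical helper)."""
    parts = []
    if image_url:
        parts.append(image_url)            # part 1: optional image prompt
    parts.append(description.strip())      # part 2: text description
    # part 3: parameters, appended after the description
    if aspect_ratio:
        parts.append(f"--ar {aspect_ratio}")
    if exclude:
        parts.append(f"--no {exclude}")
    if stylize is not None:
        parts.append(f"--s {stylize}")
    if version is not None:
        parts.append(f"--v {version}")
    return " ".join(parts)


print("/imagine " + build_prompt("a lighthouse at dusk, illustration",
                                 aspect_ratio="3:2", version=5))
```

The printed line can then be pasted into Discord by hand. Keeping the description and the parameters as separate arguments makes it easy to repeat the same description with a different version or aspect ratio.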
Since 14 March 2023, version 5 of Midjourney has been available (currently only for paying subscribers). It creates images that are much better than those of version 4, which had problems, for example, in depicting hands, feet or reflections well. This has improved fundamentally with version 5, and sometimes you have the feeling that the AI literally wants to push hands (or feet) into the picture to make this clear. Nevertheless, strange representations still occur now and then, e.g. six fingers on one hand or fingers that look unusually crooked.
I will certainly write more blog articles on Midjourney, because for a graphic artist or designer this programme is a real game changer, and the days are clearly too short to fully engage with all the possibilities this application opens up. Since a picture is worth a thousand words, I will close this article with an example, using the same prompt for V4 and V5 to illustrate the improvements. In principle, Midjourney understands many languages, but internally they are always translated into English. Therefore, to maintain precision, I write my prompts in English. The prompt for the following 2 examples from Midjourney versions 4 and 5 is:
/imagine photography shot through an outdoor window of a coffee shop with neon sign lighting, window glares and reflections, depth of field, [person] sitting at a table, portrait, kodak portra 800, 105 mm f1.8 --ar 2:1
The only difference between the two runs is the parameter --v 4 or --v 5 at the end of the prompt, which addresses the respective version.
As you can see, the prompt consists of various descriptions separated by commas, right down to a specific Kodak film stock (Portra 800), a lens focal length and aperture (105 mm, f1.8) and an aspect ratio of 2:1.
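To make this structure visible, here is a minimal Python sketch that splits the example prompt into its comma-separated descriptions and its trailing parameters. The function split_prompt is my own hypothetical helper, not part of Midjourney. It assumes the parameters are written as --ar and --v at the end of the prompt and that " --" does not occur inside the description itself.

```python
def split_prompt(prompt: str):
    """Split a prompt into descriptive terms and trailing --parameters."""
    # Everything before the first " --" is description; the rest are parameters.
    text, *param_chunks = prompt.split(" --")
    terms = [t.strip() for t in text.split(",") if t.strip()]
    params = {}
    for chunk in param_chunks:
        # "ar 2:1" becomes name "ar" and value "2:1"
        name, _, value = chunk.strip().partition(" ")
        params[name] = value.strip()
    return terms, params


prompt = ("photography shot through an outdoor window of a coffee shop "
          "with neon sign lighting, window glares and reflections, "
          "depth of field, [person] sitting at a table, portrait, "
          "kodak portra 800, 105 mm f1.8 --ar 2:1 --v 5")

terms, params = split_prompt(prompt)
for term in terms:
    print("-", term)
print(params)
```

Listing the terms one per line makes it easier to see which single description you changed between two attempts.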
The result of Midjourney 4 looks like this:
Lighting and blurring look impressive, but the glass reflections are not convincing. It is also noticeable that hands are hardly shown and the faces look more like those in good 3D games.
Midjourney 5, on the other hand, makes a significant leap forward:
Faces look even more 'real', hands and reflections are convincingly rendered, and the way light behaves in the scene, comparable to what ray tracing calculates in 3D rendering, is greatly improved.
Looking at these examples, one can really get anxious about how much this application will turn our work as graphic designers, illustrators or photographers upside down. The potential for discrimination due to faulty data, the partial lack of data protection, the addictive potential of dealing with AI and the amount of energy that AI consumes in this form are also critical points worth raising. Nevertheless, I also see this development as positive and will make this a topic in a later blog article, because creativity is still reserved for us humans. As for the initial question of whether AI can do art, I will hold back, because the answer can only be subjective. Everyone must answer this question for themselves.
And by the way: this article was written by a human being 😉