



Introduction

Workflow

Copyright

Gallery

Tips

Midjourney's Reference Style

Consistent Replication

AI Art

Effect on Human Art

Desired Improvements

Links

Epic Soundtracks









Introduction





If you read my story, you know that I've been creating my own board and card games since I was about 12 years old.

One thing that I've always missed was the art to go with my games, in particular my science fiction card game. Over many years, I've immersed myself in a world of graphical art from various artists, but I was never able to draw or pay artists to create the artwork for the games.

Fast-forward several decades and, today, I can do what I do best: use words to create worlds, through the power of AI prompts. When I saw firsthand the power of AI art, it felt like my creativity could finally be unleashed. As the weeks passed, I realized that this project was turning more and more into a fusion between my previous work (lore and generative art) and the generative visual power of the AI.

After a few weeks of playing with AI art, I felt not like a kid in the candy store, but like a kid who was let loose in the entire chocolate factory.

Visit the gallery to see the generated AI art.







Workflow

I generate AI art with Midjourney, Lexica and Leonardo, sometimes using more than one since each has its limitations. I post-process the images with a photo editor.

I'm not responsible for the aesthetic of the drawing process itself, nor for all the details in an image, just as I wouldn't be in total control if I were to commission an image from a human artist. But, like a landscape photographer, I am there to push the button and bring the magic into the light. I create the text prompts, sometimes with my own lore and generative art (as reference images), I curate the generated images, and I post-process them to produce the aesthetic that feels right to me.

My goal is to bring to light the most imaginative and visually stunning images possible, images that would make the child in me remember that he's still there, and go "Wooow!" The fact that I don't control every bit of the composition is irrelevant to me. I want to be amazed. I want my eyes to sparkle with the joy of a child discovering the world.

For each prompt, I do maybe 2 or 3 generations, so I don't reroll much. However, I do a lot of tests to refine prompts, making a lot of small changes at each generation. This allowed me to extract the essential patterns that are now part of the Prompt Builder tool and Tips.

How much post-processing do I apply to the AI images? It depends on how clean they are and whether it makes sense to alter them dramatically. Also, some images can't handle a lot of post-processing because they start showing artifacts that are hard to clean. I generally clean the images of artifacts and enhance their colors to make them look stunning on wide gamut displays (like TVs).

All images are generated on a display with a wide color gamut (similar to that of Display-P3), and have embedded in them the color profile of the display on which they were generated. This lets image viewer applications adjust the colors so that you can see the intended colors, as closely as possible. If you need the ICC profile of the display, you can download it from here.
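
If you post-process images yourself and want to keep a display profile embedded, most image libraries can do it. Here is a minimal Python sketch with Pillow; the file names are placeholders:

    from PIL import Image

    img = Image.open("input.png")                 # placeholder file name
    profile = open("display.icc", "rb").read()    # placeholder: your display's ICC profile

    # Re-save with the profile embedded, so image viewers can adjust
    # the colors to match the display the image was generated on.
    img.save("output.png", icc_profile=profile)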

Some images are upscaled to 8K resolution with, sometimes, excellent clarity.

If you want to use an image as wallpaper on a Windows computer with a wide gamut display, set it with software that ignores the color profile (like IrfanView). The wallpaper feature of the desktop personalization app (and of the contextual menu of "File Explorer") has an issue with embedded color profiles that makes the images look oversaturated.

For tips on how to print the AI art, read this. You should use an image editor to first convert an image's embedded color profile to a printable one.







Copyright

All the images that I generate with AI may be used freely by anyone, including for commercial purposes.

For some of the AI images I've used reference images from my generative art; for others I've used my own lore. Such AI images can't be generated from prompts (and seeds) alone; they also require the images that I've made with mathematics, or the lore.

My generative art images may be used freely by anyone, as reference images to generate AI images. My lore may be used freely by anyone in text prompts to generate AI images or text. Outside of this scope, they may not be used commercially. The generated AI images may be used in any way, including for commercial purposes, even without attribution to "voidsculptor.art".







Gallery

You can view my AI art gallery here.

For tips on how to view the AI art, read this. For tips on how to print it, read this.









Tips





A prompt can contain three main parts: the visual style (like photograph or illustration), the composition styling (like lighting and type of composition) and the composition itself (describing the entire scene). The order of these parts affects the resulting AI image only in small details, so order them as you prefer. The composition styling can be mixed in with the composition, if it comes more naturally to describe the scene that way.

This allows the separation of the visual style from the composition, so as to be able to carry the same visual style across a body of work. The consistency of the visual style is critical, whereas the composition is different every time.
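
As an illustration of this separation (a toy sketch in Python, not the Prompt Builder tool; all the example strings are made up), the parts can be kept separate and only joined at the end:

    # The visual style stays fixed across a body of work;
    # the composition changes for every image.
    visual_style = "cinematic photograph"
    composition_styling = "golden hour lighting, wide shot"
    composition = "an ancient stone bridge over a misty canyon"

    prompt = ", ".join([visual_style, composition_styling, composition])
    print(prompt)
    # cinematic photograph, golden hour lighting, wide shot, an ancient stone bridge over a misty canyon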

The order of the words from a prompt matters especially in a combination of styles, where the relevance of each style decreases from the start toward the end of the prompt.

The composition should be as concise as possible, described as if talking to another person. If the default visual style of the AI is not desired, the desired visual style should be added.

Words should be added incrementally to the prompt, in order to better shape the desired result and to understand what words are really necessary, so the prompt doesn't end up looking like a word soup where two-thirds of the words are noise.

Advanced users should plan their prompts before using them in an AI art generator; the prompts can then be adjusted in a feedback loop according to the AI's output.

In order to get more control over the result and to feel more involved in the generative process, use (your own art as) reference images, in so-called image prompts.

Reference images are interpreted separately from the text, so it doesn't matter which words come right after the reference images.

A human artist can incrementally improve a commissioned image as you give more detailed textual instructions. AI can't work like that. In order to improve a specific AI image, you have to use that image as a reference image for the next generation, together with an improved text prompt, and so on until you're satisfied, but each generation will be quite different from the initial image.
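
To make that loop concrete, here is a minimal Python sketch of the iteration; the generate() helper is purely hypothetical (Midjourney has no public API), so this only illustrates the flow:

    # generate() is a purely hypothetical stand-in; pretend it submits a
    # prompt (plus an optional reference image) and returns the URL of
    # the image you kept.
    def generate(prompt, reference_image=None):
        return "https://example.com/generated.png"   # placeholder result

    def refine(prompt, rounds=3):
        reference = None                  # first round: text prompt only
        for _ in range(rounds):
            image = generate(prompt, reference_image=reference)
            # Inspect the result, then adjust the wording by hand.
            prompt = input("Improved prompt (blank to keep it): ") or prompt
            reference = image             # the output seeds the next round
        return reference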

A reference image can contain elements stitched together to better convey your ideas, to represent a scene, to direct the composition and to have the AI render the whole scene in a cohesive manner. If the (sharp) edges of the stitched elements aren't blended in the rest of the image, they may show up in the AI images, so you'll have to experiment with what's best.

All parts of a reference image (foreground, background, colors) are used and affect the result, so edit the image (with a photo editor) before you use it. Make sure to remove any text from it, else you'll likely see gibberish text on the AI image. If the reference image has a solid color background, any background described in the text prompt may be ignored.
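
For example, painting over a caption or watermark takes a few lines in Python with Pillow; the file names and coordinates below are made up:

    from PIL import Image, ImageDraw

    img = Image.open("reference.jpg")       # placeholder file name
    draw = ImageDraw.Draw(img)

    # Paint over the text / watermark area so no gibberish text
    # leaks into the generated image (coordinates are made up).
    draw.rectangle((20, 950, 400, 1010), fill=(40, 40, 40))

    img.save("reference_clean.jpg", quality=95)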

Once you post-process an AI image and see that it has visible artifacts due to the fact that you had to push the post-processing hard, you can use the post-processed image as a reference image for a new generation. Use the same prompt, unless you want to (try to) change something; the original seed won't matter since the reference image is now the seed. The new generation should have very good technical quality, with no post-processing artifacts, because the AI is fully reimagining it. The new image should require much less post-processing. Unfortunately, the charm of the original image (with its pose and facial expression) will likely get lost.

Use abstract reference images in order to get the AI to be more creative. Abstract images are nearly impossible to describe in words, because the shapes they contain can usually be named only with words like "curves" and "splotches". However, the AI can detect visual patterns and use them. See here how I used my own art.

The effect of an artist's name in a text prompt may be unexpected because the AI:

  • Uses the artist's visual / graphical style together with the artist's compositions and subjects, while most of the time only the visual style is desired, since the composition is new.

  • Would have to have seen a significant number of images by that artist, which is not the case for the overwhelming majority of artists.

  • Produces an effect whose strength varies in proportion to the number of that artist's images it has seen (if it knows the artist at all).

  • Picks up on the meaning of the words in, or on the nationality of, the artist's name, and changes the scene to match that meaning or nationality. For example, Midjourney doesn't know the style of "Void Sculptor", but picks up on the meaning of the words "void" and "sculptor" and generates a style that's very different from the same prompt without the name.

  • Generates a different image even if it doesn't know the artist, even when the seed in the prompt stays the same. If the AI doesn't know anything about some text in the prompt, it uses that text as extra randomness (on top of the seed).

There are times when the word "photography" may negatively interfere with a prompt, for example when the prompt references an ancient time period (there are no photographs from that period, so the result will look modern), or when the prompt is about interior design (where the word "photography" may cause the AI to focus on one detail instead of the whole space). If this happens to you, avoid the word; if the resulting visual style is then not what you want, use a word like "(hyper)realistic" instead.

If you want a specific composition, you'll have to reroll many times because the AI doesn't follow (textual) instructions well enough.

Finalize an AI image with post-processing in a photo editor.

If you want to generate AI art that is inspired by other people's art, either by their visual style or by their composition, and you're asking yourself whether it's ethical to use names or images of artists in the prompt, then consider doing the following instead:

  • Use reference images in prompts. Note that the AI (normally) uses reference images just as reference and doesn't try to replicate the actual content. If you were to go to a human artist to commission an artwork, you would show that artist some images with what you like and what you would like the result to look like, images made by other people.

  • Convert human-made images to sketches, with an image editor, and use the sketches as reference images. You can also make manual changes yourself, like stitching elements together or removing existing elements.

  • Use an image-to-text AI engine. Such an AI receives an input image and outputs text that describes the image, text that you can then refine and use as a text prompt to get a visual style similar to that of the input image; see the sketch after this list.

  • Use a style mix of at least two artists, preferably three, by using all their names in the text prompt; the resulting style belongs to no single artist. The AI art generator will also mix in a bit of its own style.

  • Ask a language AI (like GPT) to describe the style of an artist. Use a prompt like "As a master of visual arts, describe the art style of artist [artist name] in a list with maximum 30 words, and concatenate the items of the list in a single string separated by commas."
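
One way to run an image-to-text engine locally is a Python sketch like the following, assuming the Hugging Face transformers library and the publicly available BLIP captioning model; the file name is a placeholder:

    from transformers import pipeline

    # BLIP is one publicly available image-to-text (captioning) model.
    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

    result = captioner("reference.jpg")     # placeholder file name
    print(result[0]["generated_text"])      # a rough description, to be refined by hand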

The AI may fine-tune itself based on the AI images that the user generates and likes. In Midjourney, you can see this when using someone else's prompt (including the seed) and getting a very different result. Because of this, it's virtually impossible to make unbiased tests; the only way to do it would be to use a separate account for every generation.

In Midjourney, variations don't usually increase the quality of the AI images, and many times decrease it. This is especially true for variations of variations.

In Midjourney, using a negative prompt, especially with a high weight, may dramatically degrade the quality of the AI images.









Midjourney's Reference Style

The reference style isn't actually replicated, but mixed with a bit of photorealism, cleaned up, sharpened and amplified. The best reference style images are relatively basic illustrations, not ideal / perfect illustrations.

Since the reference style isn't replicated, the resulting style is unpredictable, so a lot of testing is required to see what you like. But if you have an artistic sense and you persist, the results are absolutely mind-blowing.

While the resulting style is unpredictable, the category of aesthetic remains the same. So, a cartoon reference image produces a cartoon, an anime produces an anime, an oil painting produces an oil painting, a render produces a render.

Varying the parameters and combining multiple reference images is predictable (enough). Predictability allows you to understand the potential of other (untested) combinations, and therefore dramatically reduces the number of tests that you have to perform.

Before you start combining multiple reference images, test only single images in a fixed prompt in order to understand the result. Prompts which generate subjects with different scales (like half-body portraits, landscapes, and something in between) are rendered with quite different styles for the same reference image, so you have to optimize for each scene type separately.

To test the reference images, make sure to use prompts which fit most of the images from your project. If your project consists mostly of half-body portraits, use a test prompt which asks for a person's portrait (different prompts for men and women). If your project consists mostly of landscapes, use a test prompt which asks for a landscape. If your project consists of both half-body portraits and landscapes, use two test prompts.

Evaluation sample prompt:

[prompt] --v 6 --seed [fixed seed to minimize randomness] --stylize 200 --sw 100 --sv {1,3,2,4} --sref [URL of an image with an illustration reference style]

Refinement sample prompt:

[prompt] --v 6 --seed [fixed seed to minimize randomness] [--style raw] --stylize {100,200,500} --sw {50,100,200,500} --sv {1,3,2,4} --sref [URL of an image with an illustration reference style]

If you have a lot of reference images to test, use each of them in the evaluation sample prompt, as this is the fastest and cheapest way to evaluate reference images. Once you see a style that you like, you can use it in the refinement sample prompt to get realism-related variations of that style.

Multiple URLs may be separated by a blank space, and each may have a weight: --sref [URL::weight] [URL::weight] [URL::weight]. This averages the style among all the reference images, but all images share the same "style", "stylize" and "sw" parameter values. The more images there are, the more stable the average is, which is great for maintaining style consistency. You can give the resulting style some character by increasing the weight of one URL.

To converge toward the desired result, vary the "style", "stylize" and "sw" parameter values. This produces visual styles that range between bad illustration and photography, although the ends of this range won't always be reachable. The truly aesthetic illustration styles lie somewhere in the middle of these parameter ranges. The parameters overlap a lot, so you'll have to test multiple combinations, which can quickly get out of control (including cost-wise).
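
Because the combinations multiply quickly, it helps to generate the test prompts systematically. Here is a minimal Python sketch that expands the refinement sample prompt above into all of its parameter combinations; the base prompt and the URL are placeholders:

    from itertools import product

    base = "a half-body portrait of a woman --v 6 --seed 12345"   # placeholder prompt
    sref = "https://example.com/style.png"                        # placeholder reference URL

    # Cross-product of the stylize / style-weight values from the
    # refinement sample prompt above (12 combinations).
    for stylize, sw in product((100, 200, 500), (50, 100, 200, 500)):
        print(f"{base} --stylize {stylize} --sw {sw} --sref {sref}")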

The visual style specified in the prompt affects the resulting style to a limited extent, at least when the style weight is low (like 50).

What these parameters do:

  • --style raw: Using the raw style moves the resulting style slightly toward photography. The change is rather subtle, but visible when compared side by side.

  • --stylize 100...1000: Using a low stylize value moves the resulting style toward illustration, while using a high stylize value moves the resulting style toward photography. The value can be as low as 0, but the results are rarely interesting.

  • --sw 30...300 (best: 50...150): Using a low style weight value moves the resulting style away from the reference style (toward the default style, like photography). Using a high style weight value moves the resulting style toward the reference style. The value can be as low as 0 and as high as 1000, but the results are rarely interesting and contain more artifacts.

If the result isn't photorealistic enough even at the maximum stylize value (1000), reduce the style weight in order to reduce the strength of the reference style. You can also use the raw style in order to slightly amplify the photorealistic effect.

If the result is too photorealistic, especially in the background, even at a low stylize value (or the stylize value is high and you don't want to reduce it), increase the style weight in order to increase the strength of the reference style. This usually reduces the photorealism, particularly of the background, because the entire result gets closer to the reference style.

The "style" and "stylize" parameters have the same effect without the reference style.

Other tips:

  • If you like how the (foreground) character created by one reference image and the background created by another look, you can stitch them together, that is, stitch the character from one reference image over the background of the other.

  • If a reference image has significantly large areas with strong and opposed colors, the AI may flip-flop between those colors, creating strong contrasts either between images or even within images.









Consistent Replication





It's currently not possible to consistently replicate a (reference) image in Midjourney while changing only small things in it.

However, it's possible to get relatively close by using the reference image, with the highest image weight, together with the prompt that generated that image (but ignoring the seed).

Here is an example text prompt:

interior design, a single-level house with a deck and a huge courtyard in the middle, warm wood decor, sunlit courtyard, courtyard walkway, contemporary design, ultra sharp, 8k, full color --no water --ar 16:9 --v 5.1

Here is the result of the prompt above:

Here is the result of using the image above with the same prompt, as a reference image, with an image weight of 2:

[url] [prompt from above] --iw 2

The result is a set of images that are relatively close to the original, but, most importantly, close to each other:









AI Art





AI is a very useful tool for the artists who dream and imagine, a tool that can draw for people who otherwise couldn't draw a straight line, or that can remove the tedium of painting digitally in software. With AI, artists can focus on the look and feel, and on the composition of an image, instead of on drawing its every detail.

AI art will create a feedback loop between people and AI, a loop that will revolutionize the world of art. The AI will get its inspiration from other people's work, just like human artists have always done (either to imitate or to avoid), and people will then get their inspiration from the art generated by AI, and so on. This is how people have done it for millennia.

AI is exceptionally good at blending and coloring. The resulting style is consistent across the entire image, in contrast to how images look when disparate textures are used in human art.

AI is also exceptionally creative. Its ability to creatively fill in the gaps (in a prompt) is on a superhuman level, literally.







Effect on Human Art

Here are some important conversations about the effect of AI on human artists.







Future of Art

The future of commercial digital art is create-on-demand. For example, people who need an image for home decoration will go to an AI art generator, search for a little while for what they need, and then generate a great match for their home, with the style and colors they need.

At the same time, art will be integrated (even more) into products and sold as part of those products, not so much individually. Other possibilities: the services that offer AI art generators could allow the generated art to be purchased, and TV manufacturers could allow everyone to publish their art to be purchased and displayed on TVs.







Copyright

Can human-made images be used as reference images for the AI? Would this break copyright law?

Nobody comes up with ideas from The Void. Human artists learn from other artists and have always been inspired by other people's art, either to imitate or to avoid. Some of the greatest human artists of all time have copied the composition of other artists' paintings.

To say that AI, either the current AI generators or the future robots, isn't allowed to learn (without explicit approval) from what humans have created (and made public) is like telling humans the same thing. It's the wrong path to take.

The argument should never have been about "AI art is theft" but about "What do we do next?"

There is no law that forbids anyone from learning from the work of others, and copyright law protects neither the style nor the composition of art, at least so long as the new art is markedly different from the original art. On the contrary, the law explicitly allows the creation of derivative works starting from an existing work.

Laws vary by country, and the rise of AI art may lead to changes in copyright law in the future. If the laws change, any human-made content (images, video, text) should be explicitly allowed to be used for machine learning and as reference content in a human's prompt.

The individual expression of each piece of (AI) art should be protected by copyright, so long as it's markedly different from previously made content, be it made by humans or by AI. So, if there is a visual resemblance between two images, but it can be seen without hesitation that they are different, then they should be protected as two distinct images. If this were not the case, then someone could lie and present AI art as if it were human-made, thereby circumventing the lack of copyright protection for AI images.

This should be done at least for the AI art used in large projects, like graphic novels, at least until the moment when people can use a single prompt to generate the entire project. It would allow people to get extensively involved in the generation process, using AI as a tool that takes human imagination, planning and human-made art and sketches, and renders them in a cohesive manner.

What would happen if someone were to start generating AI art by trying out all combinations of words? Such generations should be excluded from copyright protection. But such an attack on copyright would be practically impossible, since each text prompt uses (internally) a 32- or 64-bit random seed, which means that for every text prompt there are billions of possible results, although many would be very similar.
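
For scale, plain Python puts numbers on that seed space:

    # Number of distinct values for the seed widths mentioned above.
    print(f"{2**32:,}")   # 4,294,967,296 (about 4.3 billion)
    print(f"{2**64:,}")   # 18,446,744,073,709,551,616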

Something else to consider is that the AI fine tunes itself based on a user's generations and likes, which means that different people who use the same prompt (including the seed) get different results. I've seen cases where the resulting images have the same visual style, but the compositions have nothing in common.

But the most important part is that every text prompt can be accompanied by reference images. That means you have to multiply the previous combinations by billions for each reference image used (because there are billions of images in the world), never mind that each reference image can itself be a combination of other images stitched together. So, if the need ever arose to prove how an AI image was generated, the image could be reproduced in a court of law, yet it would be practically impossible for other people to produce the same image independently. As long as the person who created the prompt (with its text and reference images) keeps the prompt or seed secret, nobody else can practically generate the same image.







Opting Out

Many individual artists want to opt out from having their images used for training the AI, for fear of losing their income. But will that stop the AI from producing images? No. Would the AI images look less artistic? Maybe, but even the artists who opt out will still have to compete with the AI art.

Even if the artists whose images are used for training the AI were paid for their own images, the training is performed on billions of images, so the payout to each artist would be minuscule.

Digital artists will have to adapt to the competition because the bar will be set very high. Any artist who feels that a text prompt doesn't allow for enough creativity should use their own art as reference images. See here how I used my own art.







Will People Ever Make Art Again?

People will always make art and show it freely to other people online, although not necessarily at very high resolution. The Internet was already full of such images even before AI became mainstream.

Hardware (tablets), software (image editors) and textures are tools, not art. Art is the imagination that creates a composition. The artists who adapt and use this new tool called AI will make quick sketches (that they keep private) and put them through the AI to render them, keeping the artistic freedom to imagine a composition while eliminating most of the work of rendering the images. The fact that some artists like this time-consuming rendering part will not change how most artists adapt and evolve in the era of AI.







Does AI Rip Off Human Art?

During training with art made by human artists, an AI learns by retaining small bits of relevant information that later allow it to generate somewhat similar images. This is the same way humans learn about anything, and create things inspired by what they've learned from. This is the whole difference between "learning" and "replicating".

From a text prompt, an AI can't generate images that are virtually identical to the images it learned from, because it simply doesn't store the information that's necessary to be able to do so.

However, virtually identical replication may occur occasionally. If this happens and the prompt is generic, then it's a rare occurrence. If a reference image (made by a human) was used and the AI was pushed to generate a replica of that, then this is a human problem, not an AI problem.

Perhaps a more relevant issue is that the people who generate AI art can't be sure whether an AI image is new or virtually identical to something that the AI learned from. In this case, a reverse image search can be made on the Internet to check whether the same image can be traced back to an earlier date and a human artist.







Is AI Art a Remix of Human Art?

Yes, yet the question and the answer are irrelevant, because the remixing of far fewer elements created all of human culture:

  • A few tens of symbols, the alphabet, were remixed into language, something that made it possible for cavemen to evolve into beings capable of sending machines outside the solar system.

  • A handful of primary colors were remixed and used to create all painted human art.

Some people will not be convinced by such simple analogies because they believe that humans have free will, somehow detached from the precise laws of physics, and can create an infinite number of combinations of art. But how would a human brain do that if it had no sensory input from the environment? What signal would it process? How would that signal get inside the mind? It couldn't.

But let's say that it could. Why, then, doesn't it do so in babies? Why can't a baby create a masterpiece? Because the mind must first be formed from the sensory input received from the environment. This means that a human mind can only remix what it knows. The same is true for the AI, but since it has no sensors of its own, people have to feed it signal.







Signature on Images

Some AI images appear to have signatures, logos, or watermarks, similar to human-made art. During training, the AI learned that some images have signatures, so the AI is trying to imitate their presence as well. Nobody is denying that such images are being used for training AIs; copyright law doesn't forbid such derivative use.







Does AI Make Theft Easy?

Some people complain that AI art generation makes art theft easy because anyone can replicate any artist's style and art, and sell the copies.

Technology makes things easier. Printing and photo cameras made it easy to copy paintings. Computers and the Internet made it easy to copy paintings and photographs. AI follows the same pattern. The world has constantly adapted, and the artists who a few centuries ago couldn't sell one painting today have jobs where art has mass commercial applications. It's just that these applications change with the technology, and it's not yet known what the future will look like.







Artists Should Be Paid for Their Art Used to Train the AI

Unfortunately, it wouldn't benefit the artists.

The AI doesn't directly use any image or art to generate an image; it is trained on images to find patterns, and this training retains roughly 1 byte per image (out of the millions of pixels of every image). But suppose it were possible to associate each such byte used in a generation with an artist to pay.

Consider the following completely hypothetical things:

  • Let's say that the average AI user pays 100 dollars per year to the AI company.

  • Let's say that there is an average of 10 million users per year. This means that all the money the AI company gets is 1 billion dollars per year, out of which it has to pay for everything, especially the servers and electricity.

  • The AI is trained on billions of images, roughly 5 billion. To be conservative, round this down to just 1 billion training images.

  • All the money (1 billion dollars) divided by the number of images used in training (1 billion) means that the AI company makes an average of 1 dollar per training image per year.

  • If the AI company pays out 10% of what it makes, that's 10 cents per training image per year.

  • A poll done by Midjourney showed that 50% of users prefer to generate photos, and 50% prefer illustrations.

  • Let's ignore the above poll and presume that AI users focus much more on generating illustrations than photos, so only a small subset of 1% of the training images (the illustrations) is used; multiply the above 10 cents 100-fold. That's 10 dollars per illustration per year. This is what an artist could possibly get.
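
Here is the arithmetic of the list above, spelled out in a few lines of Python (all figures are the hypothetical ones stated above):

    users = 10_000_000                # hypothetical number of users per year
    revenue = users * 100             # 100 dollars per user: 1 billion dollars per year
    training_images = 1_000_000_000   # conservative count of training images
    payout_pool = revenue * 0.10      # the AI company pays out 10% of revenue

    per_image = payout_pool / training_images   # dollars per training image per year
    # If only the 1% of training images that are illustrations mattered,
    # the payout would concentrate 100-fold:
    per_illustration = per_image / 0.01
    print(per_image, per_illustration)          # 0.1 and 10.0 dollars per year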

People think that because it's possible to generate images with a look similar to what an artist makes, that's what millions of AI users do all day long.

Personally, I'm building illustration styles by mixing multiple reference images (out of the tens of thousands of images I've generated). I'm using the reference weights and the stylization parameters to alter the styles incrementally, in order to build styles that never existed and don't copy any artist's style. I can see that this is true because the styles are an incremental, manually driven mixture.

It would be good if artists were paid for the training data, but it wouldn't make a dent because this is a problem of automation of effort, not of art and artistic styles.









Desired Improvements





How AI art generators should improve:

  • Generate high resolution images, at least 4K (3840 pixels wide), with good sharpness.

  • Control and consistency of the visual style, regardless of the rest of the prompt. This requires a separation between the visual style and the composition. This could be done with models (that can be mixed, like "airbrush painting" and "pencil"), with a custom set of images from which the visual style is extracted, or with the text prompt itself. This is clearly possible now since existing out-painting keeps the visual style. Midjourney currently ignores the specified visual style in many cases, or its variation is enormous; presumably, this allows it to be extremely creative. Other AIs are consistent with the visual style, although presumably this results in limitations like less creativity, fewer elements in an image, and few available visual styles.

  • Visual style transfer via a prompt (either text or reference images).

  • In-painting, so that we can replace deformed parts.

  • Out-painting, so that we can expand images.

  • Good quality faces on full body people, in images with a 16:9 aspect ratio (so a relatively small height).

  • High bit-depth colors, at least 30-bit colors, so that we can push the images in post-processing. PNG images support up to 48-bit colors.

  • Consistent (human) faces.

  • AI editing: make changes to existing images, via a prompt (either text or reference image).









Links





Here are my tools, one to help build prompts, the other to view a gallery of prompts.

AI art generators: Midjourney, Lexica, Leonardo.

Image-to-text engines (describe images in words): Midjourney's "describe" command, CLIP, Replicate.

Learn to generate AI art: Christian Heidorn (Midjourney), Future Tech Pilot (Midjourney), Olivio Sarikas (Stable Diffusion), Jenn Mishra (Midjourney).

Visual libraries of artists: Midlibrary.

Prompt marketplaces: Promptbase.

Stock images (many images free to use): Unsplash, Pexels.

AI copyright videos: Shadiversity.









Epic Soundtracks





For soundtracks to listen to, for an epic experience, see this.






