In case you don’t know it, Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images from any text input.
You can try it online on various websites and Discord servers, but a few days ago I stumbled upon this repository and decided I wanted to run Stable Diffusion on my own computer, both to use it with more freedom and to test my GPU, since I don’t really know how much computing power this kind of model needs.
I will gloss over the fact that yesterday morning the installation README had me install conda, Python, WSL, Ubuntu, NVIDIA GPU drivers and Docker; then I went out while the 17 GB Docker image was pulling, and came back to find an updated README with a one-click .exe installer!
So now it’s very simple: you just need to download the zip file and (very important) extract it directly into C:\ or into the shortest path you can, because otherwise Windows will complain about the file path length and abort the installation.
After extracting the folder, double-click the Start Stable Diffusion UI.cmd file and then go out for a walk, because it will take quite a while. The total space occupied after the installation is around 18 GB!
The model itself, in PyTorch’s .ckpt format, is more than 4 GB:
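Out of curiosity, you can peek inside that checkpoint from Python. This is just a sketch: the file name and location here are hypothetical, so adjust ckpt_path to wherever the installer actually put the model.

```python
import torch

# Hypothetical path: adjust to wherever the installer placed the model.
ckpt_path = "stable-diffusion-ui/sd-v1-4.ckpt"

# A .ckpt file is a pickled PyTorch checkpoint; map_location="cpu"
# keeps the 4 GB of weights out of GPU memory while we inspect them.
checkpoint = torch.load(ckpt_path, map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)

# Count the parameters to see where all that disk space goes.
total = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print(f"{len(state_dict)} tensors, ~{total / 1e6:.0f}M parameters")
```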
After the installation is complete, run the .cmd file again and go to localhost:9000 to open the UI, where you can enter the text prompt, image modifiers and other settings. You can have a full view of the UI here.
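The UI handles everything for you, but if you prefer scripting, something roughly equivalent can be done in Python with Hugging Face’s diffusers library. To be clear, this is not what the installer runs under the hood, just a sketch of the same idea; it assumes you have diffusers installed and access to the CompVis/stable-diffusion-v1-4 weights:

```python
import torch
from diffusers import StableDiffusionPipeline

# Half precision (float16) roughly halves VRAM usage, which matters
# a lot on a 6 GB card.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

prompt = ("In the middle of a swamp, A black dragon is resting atop "
          "a tall white marble pillar. The sky is cloudy.")

image = pipe(prompt).images[0]
image.save("dragon.png")
```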
For my first test, I put this:
In the middle of a swamp, A black dragon is resting atop a tall white marble pillar. The sky is cloudy.
After approximately one minute on my GeForce 1060 6 GB, this was the result:
The image is 512×512. You can increase it up to 1024×1024 in the settings, but any resolution higher than 512 gave me a VRAM out-of-memory error 🙁
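If you go the scripting route from the sketch above, diffusers exposes a couple of knobs for exactly this problem: you can pass explicit height and width, and attention slicing trades some speed for lower peak VRAM usage. Whether that is enough to get past 512 on a 6 GB card is something I haven’t tested:

```python
# Continuing the diffusers sketch above.
# Attention slicing computes attention in chunks instead of all at
# once, lowering peak VRAM at the cost of some speed.
pipe.enable_attention_slicing()

# Dimensions must be multiples of 64 for this model.
image = pipe(prompt, height=768, width=768).images[0]
image.save("dragon_768.png")
```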
As you can see from the result, the dragon is there, the pillar is there, the sky is cloudy and... is that a swamp in the background? Let’s try using the same seed and applying a modifier!
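For reference, in the diffusers sketch this is done by passing a seeded torch.Generator: the same seed with the same prompt and settings reproduces the same image, so any change in the output comes only from the modifier (the seed value 42 here is arbitrary):

```python
import torch

# Same seed, same prompt, same settings -> same image.
generator = torch.Generator(device="cuda").manual_seed(42)
base = pipe(prompt, generator=generator).images[0]

# Re-seed identically, then append a style modifier: differences in
# the output are now due to the modifier alone.
generator = torch.Generator(device="cuda").manual_seed(42)
styled = pipe(prompt + ", oil painting", generator=generator).images[0]
```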