The commonly used models right now, Stable Diffusion and DALL·E 2, haven't had any dedicated training for text rendering, so even if you ask for specific words, they tend to produce garbled or distorted text. A reliable workaround is to add the text afterwards with image-editing software (Adobe Photoshop, GIMP, etc.): open the AI-generated image, pick the text tool, set the font, size, color, and so on, then place the desired words on the picture. This sidesteps the problem entirely, and novices can take this shortcut and save themselves a lot of trouble.
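For anyone who would rather script that overlay step than click through an editor, here is a minimal Pillow sketch of the same idea. The file names, font path, and position are placeholders I made up, not anything from the original workflow:

```python
from PIL import Image, ImageDraw, ImageFont

# Load the AI-generated image (path is a placeholder).
img = Image.open("generated.png").convert("RGBA")
draw = ImageDraw.Draw(img)

# Pick any .ttf font you have installed; DejaVuSans ships
# with many Linux distros.
font = ImageFont.truetype("DejaVuSans.ttf", size=48)

# Draw the text at a chosen position, with font, size, and color
# fully under your control -- no risk of the model garbling glyphs.
draw.text((40, 40), "Some Words", font=font, fill=(255, 255, 255, 255))

img.convert("RGB").save("with_text.png")
```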
That said, many people have raised similar needs, such as printing specific words on signs and clothing, or adding a particular tattoo or watermark, so it does have real practical value. Reportedly Google's image-generation AI can handle this, and NVIDIA's published eDiff-I already shows more convincing results in this direction.
My general idea is to use a text-to-image model: feed the words you want displayed into the model and have it generate the corresponding image. There are some open-source text-to-image models (AttnGAN, StackGAN, etc.) that can be used for this task.
For example, in the cases netizens have shared, you can try prompting directly with plain text, something like "a person wearing a T-shirt reading 'Some Words'", but even when that works, it is hard to control the font, size, or style with any precision. For a logo or tattoo, something like Photoshop's Multiply blend mode is probably better, compositing the text layer onto the image.
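That Multiply-style compositing can also be done in code. A rough Pillow sketch, where `ImageChops.multiply` behaves like Photoshop's Multiply blend and the file names are hypothetical:

```python
from PIL import Image, ImageChops

# Base image (e.g. the generated T-shirt or skin area) and a
# logo/tattoo drawn as dark marks on a white background.
base = Image.open("shirt.png").convert("RGB")
logo = Image.open("logo_on_white.png").convert("RGB").resize(base.size)

# Multiply blend: white areas of the logo leave the base untouched,
# dark areas darken it -- the same behavior as Photoshop's Multiply
# mode, which is why it reads as printed onto fabric or skin.
combined = ImageChops.multiply(base, logo)
combined.save("shirt_with_logo.png")
```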
Of course, having Stable Diffusion generate the text automatically would be best; if it can't be built in, we can wait for the experts to ship a plug-in that does something similar. That approach needs the right algorithms and training data, with machine learning doing the work of associating the text with the image.
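For comparison, this is roughly what prompting Stable Diffusion directly looks like through the Hugging Face `diffusers` API. The model id and prompt are just examples I picked, and as noted above, the rendered text will often come out garbled with v1.x models:

```python
import torch
from diffusers import StableDiffusionPipeline

# Plain Stable Diffusion v1.5; the model id is one common choice.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Quoting the desired words in the prompt sometimes helps, but these
# models were not trained for glyph accuracy, so expect garbled text
# more often than not -- hence the post-editing workaround above.
prompt = 'a person wearing a T-shirt that reads "Some Words"'
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("tshirt.png")
```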
Actually, many games' character-creation systems already do something similar. What impressed me most was Yinzi's "loyalty" tattoo, which is really just an image decal superimposed and bound to the character model. As for the ControlNet approach mentioned above, my first reaction was that it might not offer such fine-grained control (or maybe I just haven't studied it thoroughly enough...).
Of the preprocessors and models ControlNet currently provides, the most promising one judging by the descriptions is MLSD, but that is mainly for detecting straight edges on buildings... I still need to experiment. In short, my feeling is that this shouldn't be hard for the experts or model trainers. I did try adding more text-rendering training myself, but my results, for Chinese characters in particular, were not very good.
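One workaround I've seen people use instead of MLSD is to render the target text yourself and feed it to a scribble/edge ControlNet, so the model traces the glyphs rather than inventing them. A hedged `diffusers` sketch of that idea; the model ids, font, position, and prompt are all my own assumptions, not something from this post:

```python
import torch
from PIL import Image, ImageDraw, ImageFont
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

# Render the target text as a white-on-black control image.
control = Image.new("RGB", (512, 512), "black")
draw = ImageDraw.Draw(control)
font = ImageFont.truetype("DejaVuSans.ttf", size=96)  # any .ttf works
draw.text((60, 200), "Some Words", font=font, fill="white")

# The scribble ControlNet treats the white strokes as a sketch
# that the diffusion model should follow.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a storefront sign, photorealistic",
    image=control,
    num_inference_steps=30,
).images[0]
image.save("sign.png")
```

This trades away some of the "painted into the scene" look you get from pure generation, but the glyph shapes stay legible because they come from a real font.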
This problem will probably get a simpler model or plug-in in the future once enough people want it, so the cost of hunting down obscure tricks right now is a bit high.