Introduction
For version 1.0:
This model is based on 'Illustrious XL 1.0' with some minor modifications, and was trained on the Danbooru2023 dataset along with the dataset I previously used to train my LoRA models.
For version 2.0:
This version is intended to let everyone experience a v-pred version of Illustrious XL without having to spend a large amount of STARDUST to unlock the Illustrious XL v3.0 v-pred and v3.5 v-pred versions.
I independently researched and developed this version based on various existing XL model architectures. However, due to the many modifications I made, I’m not sure it can still be considered 'Illustrious XL'.
The model was trained on the danbooru2024 and danbooru_newest-all datasets, as well as a custom dataset that I collected, captioned in natural language with GPT-4.5, and then manually verified.
For version 3.0:
This version was created to adapt to as many styles as possible while keeping the detail in generated images stable. It includes general styles and artist styles (from Danbooru and e621).
Although it is primarily intended as a pre-trained base model, you can use it as-is. For best results, however, I suggest combining it with a LoRA or fine-tuning it to create the style you want.
The model was trained on the danbooru2024, danbooru_newest-all, and e621 datasets, as well as a custom dataset; about 40% of this data was annotated with both tags and natural language.
This is an epsilon-prediction model and is easy to use.
For version 3.1:
This version fixes the issues encountered in version 3.0 and further improves image quality for styles and artist styles (from Danbooru and e621).
This model was trained on the same dataset as version 3.0, but I re-annotated it, added many new anime characters, and improved the quality of existing ones.
The model improves stability when generating images at a resolution of 1536x1536.
This version will have two variants: one for v-pred and one for e-pred (the e-pred version will be released first).
Important Note
This is the first base model I've created, so any feedback is welcome. Feel free to share your thoughts so I can improve it in future versions.
Version 2.0 is a v-prediction model (not epsilon-prediction), so it requires specific sampler settings to work correctly.
For version 3.0, lower the CFG to about 2-4 if the generated images come out with overly high contrast.
Suggested settings:
All example images were generated using the following settings:
Positive prompt: masterpiece, best quality, amazing quality
Negative prompt: bad quality, worst quality, worst detail, sketch, censor, simple background, transparent background
CFG: 5-7
Clip skip: 2
Steps: 20-30
Sampler: Euler a
Note: I did not use any post-processing or LoRA to enhance the example images; they were generated with only these settings, a custom prompt, and the base model.
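For anyone generating with diffusers rather than a UI, the settings above map roughly onto `StableDiffusionXLPipeline.__call__` arguments as follows (a sketch; `pipe` stands in for a pipeline loaded from this checkpoint, and "Euler a" corresponds to `EulerAncestralDiscreteScheduler`):

```python
# Suggested settings expressed as diffusers call arguments (a sketch;
# parameter names follow StableDiffusionXLPipeline.__call__).
generation_kwargs = dict(
    prompt="masterpiece, best quality, amazing quality",  # plus your own tags
    negative_prompt=(
        "bad quality, worst quality, worst detail, sketch, censor, "
        "simple background, transparent background"
    ),
    guidance_scale=6.0,      # CFG 5-7 (drop to ~2-4 for v3.0 high contrast)
    num_inference_steps=25,  # 20-30 steps
    clip_skip=2,             # Clip skip 2
)

# With a loaded pipeline and an Euler a scheduler:
# image = pipe(**generation_kwargs).images[0]
```

The exact values within the suggested ranges are a matter of taste; these are midpoints, not requirements.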
Acknowledgments
Thanks to narugo1992 and Nyanko for sharing such valuable data.
If you'd like to support my work, you can do so through Ko-fi!