Pony models use a score system
A quirk in training means one has to preface the prompt with 'score_9 score_8 score_7 ' to reach the training data for the model. The problem might be fixed in the latest Pony model. Check its docs.
The prefix does not have to be verbatim. Just a kind of 'vibe' is enough.
Illustrious models do not have this issue.
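A minimal sketch of the score prefix, in case you build prompts in a script. The helper name with_pony_scores is made up for illustration and is not part of any Pony tooling; it only does string handling.

```python
# Hypothetical helper for illustration; not part of any Pony model tooling.
PONY_SCORE_PREFIX = "score_9 score_8 score_7"


def with_pony_scores(prompt: str, prefix: str = PONY_SCORE_PREFIX) -> str:
    """Prepend the score tags unless the prompt already starts with one."""
    prompt = prompt.strip()
    if prompt.startswith("score_"):
        # Assume the user already wrote their own score tags; leave them alone.
        return prompt
    return f"{prefix} {prompt}"


print(with_pony_scores("a cat on a windowsill, soft lighting"))
```

Since the prefix does not have to be verbatim, the check only looks for a leading 'score_' tag rather than the exact string.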
See the 'prompt as soundwaves' article: https://tensor.art/articles/913912699476253681
To mix or enforce concepts, it's best to repeat them at different points in the prompt, with a good amount of other kinds of text in between.
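The repetition idea can be sketched like this. It assumes the prompt is a list of comma-separated fragments; the function name spread_concept and the gap value are made up for illustration.

```python
def spread_concept(parts, concept, gap=3):
    """Insert `concept` before every `gap`-th fragment so repeats are spaced apart."""
    out = []
    for i, part in enumerate(parts):
        if i % gap == 0:
            out.append(concept)
        out.append(part)
    return ", ".join(out)


parts = ["forest", "morning fog", "wide shot", "mossy rocks", "soft light", "film grain"]
print(spread_concept(parts, "red fox"))
# red fox, forest, morning fog, wide shot, red fox, mossy rocks, soft light, film grain
```

A larger gap puts more unrelated text between the repeats; what counts as 'enough' separation is something to try out per model.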
CLIP has a context size of 75 tokens (77 including the start and end tokens). You can count tokens here, see the cover photo for this article: sd-tokenizer dot rocker dot boo (Google it)
A prompt that exceeds the 75-token limit but stays below 150 tokens will create two encoding vectors, A and B, and the average (A+B)/2 will be used for input.
You can also blend within each encoding, as long as there is a decent amount of separation; see the article for further details on why that works.
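A toy sketch of the splitting and averaging described above. It follows the (A+B)/2 description from this article, not any specific UI's source code. The token list and the 3-number vectors are stand-ins; real encodings are much larger and come from the CLIP text encoder.

```python
CHUNK = 75  # usable tokens per chunk, as stated in the text above


def split_chunks(tokens, size=CHUNK):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def average(a, b):
    """Element-wise (A+B)/2 for two equal-length vectors."""
    return [(x + y) / 2 for x, y in zip(a, b)]


# A 120-token prompt ends up as one full chunk and one partial chunk.
tokens = [f"tok{i}" for i in range(120)]
print([len(c) for c in split_chunks(tokens)])  # [75, 45]

# Toy stand-ins for the encodings of chunk A and chunk B.
A = [1.0, -0.5, 0.25]
B = [0.5, 0.5, 0.75]
print(average(A, B))  # [0.75, 0.0, 0.5]
```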
//--//
For SDXL especially, avoid prompt lengths slightly above 75 tokens or slightly above 150 tokens. The few overflow tokens create a nearly empty encoding vector B next to the actual 75-token prompt A. The final encoding (A+B)/2 then just adds A to an almost empty vector B, which is essentially the same as prompting at half strength (A:0.5).
//---//
To fix this, repeat the prompt so its total length is slightly below 150 tokens.
Or reduce the word count so the prompt is below 75 tokens.
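A rough sketch of checking for the risky lengths and applying the repeat fix. The names is_risky and repeat_to_fill and the margin of 10 tokens for 'slightly above' are assumptions for illustration. Real token lists come from a CLIP tokenizer; plain lists are used here.

```python
LIMIT = 75   # tokens per chunk, as stated in the text above
MARGIN = 10  # assumed cutoff for "slightly above"; tune to taste


def is_risky(n_tokens, limit=LIMIT, margin=MARGIN):
    """True when the prompt spills only a few tokens into a new chunk."""
    full_chunks, overflow = divmod(n_tokens, limit)
    return full_chunks >= 1 and 0 < overflow <= margin


def repeat_to_fill(tokens, target=2 * LIMIT):
    """Cycle the prompt tokens until exactly `target` tokens are reached."""
    if not tokens:
        return []
    reps = -(-target // len(tokens))  # ceiling division
    return (tokens * reps)[:target]


prompt = [f"tok{i}" for i in range(80)]  # 80 tokens: 5 spill into chunk B
print(is_risky(len(prompt)))             # True
fixed = repeat_to_fill(prompt)
print(len(fixed), is_risky(len(fixed)))  # 150 False
```

Note that the repeat cuts off wherever the target length is reached, so the last repeated phrase may be incomplete. Pass a lower target if you want to stay a bit under 150, or trim the prompt below 75 tokens instead.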
Capital letters A-Z and symbols are counted as single tokens.
You can browse the full token vocabulary (each token and its ID) here: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/tokenizer/vocab.json



