Tag comparison experiment. Every run used the same seed, settings, and prompt. All four tags were present in each run, each weighted "0.0" by default; only the tag being tested was weighted "1.0".
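As a rough illustration of the setup described above, the sketch below builds one weight table per run: the tested tag gets 1.0 and the other three get 0.0. The tag names (`tag_a` through `tag_d`), the helper names, and the `(tag:weight)` prompt syntax are all assumptions made for this example; the actual tags and the tool's weighting syntax are not given here.

```python
# Hypothetical placeholders: the real four tags are not named in the text.
TAGS = ["tag_a", "tag_b", "tag_c", "tag_d"]


def build_runs(tags, on=1.0, off=0.0):
    """Return one weight table per run, keyed by the tag being tested.

    Every tag appears in every run; only the tested tag gets the `on` weight.
    """
    runs = {}
    for tested in tags:
        runs[tested] = {tag: (on if tag == tested else off) for tag in tags}
    return runs


def format_prompt(base_prompt, weights):
    """Append weighted tags to the base prompt.

    The "(tag:weight)" form is an assumed syntax, used only for illustration.
    """
    parts = [f"({tag}:{weight:.1f})" for tag, weight in weights.items()]
    return ", ".join([base_prompt] + parts)


RUNS = build_runs(TAGS)

if __name__ == "__main__":
    # Seed, settings, and base prompt would stay fixed across all four runs.
    for tested, weights in RUNS.items():
        print(tested, "->", format_prompt("base prompt", weights))
```

Keeping all four tags in every prompt and changing only the weights means the prompt's structure stays the same from run to run, so differences between outputs can be attributed to the tested tag.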