So, what happens if you edit forbidden words into the robot's reply and then point out in your next prompt that it violated its own rules?