Vision-Language-Action (VLA) models are the big hot thing in robotics right now. Have any of you tried running one yourself?