>>105963972 (OP)
you need to be able to turn words into tokens
so you could start with straight up a dictionary
then have a layer that accounts for typos, you got a couple options here
idk, a tree or list of acceptable spellings + an edit distance between words
there's a couple ways you could do that too
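rough sketch of the dictionary + typo layer, assuming plain Levenshtein edit distance as the "distance factor" (that's just one of the options). the vocab, the `<unk>` id and the `max_dist` cutoff are all made up for illustration:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]

# Hypothetical toy dictionary: word -> token id. 0 is reserved for unknowns.
VOCAB = {"<unk>": 0, "thread": 1, "reply": 2, "token": 3, "words": 4}

def tokenize(text, vocab=VOCAB, max_dist=1):
    """Map each word to its id; fall back to the nearest known spelling."""
    ids = []
    for word in text.lower().split():
        if word in vocab:
            ids.append(vocab[word])
            continue
        # Typo layer: closest dictionary word, accepted only if close enough.
        candidates = [w for w in vocab if w != "<unk>"]
        best = min(candidates, key=lambda w: edit_distance(word, w))
        close = edit_distance(word, best) <= max_dist
        ids.append(vocab[best] if close else vocab["<unk>"])
    return ids

print(tokenize("thred reply xyzzy"))
```

a linear scan over the vocab is fine for a toy, for a real dictionary you'd probably want a trie or BK-tree so you dont compare against every word.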
and then once the words are numbers
you reformulate the problem as "which numbers elicit what other numbers in response"
you throw a whole lot of maths at the problem and see what sticks
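simplest possible version of "which numbers elicit what other numbers", assuming you already have (post, reply) pairs as lists of token ids. its just counting, the function name and the toy pairs are made up:

```python
from collections import Counter, defaultdict

def response_counts(pairs):
    """pairs: iterable of (post_ids, reply_ids).
    Returns {post_token: Counter of reply tokens seen in response}."""
    table = defaultdict(Counter)
    for post_ids, reply_ids in pairs:
        for p in set(post_ids):        # count each post token once per post
            table[p].update(reply_ids)
    return table

# Hypothetical toy data, ids only.
pairs = [([1, 2], [3, 3]), ([1], [3, 4]), ([2], [4])]
table = response_counts(pairs)
print(table[1].most_common(1))  # the reply token most often elicited by token 1
```

anything fancier (embeddings, a sequence model) is still answering the same question, this is just the baseline to beat.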
you need to define success to see what correlates with it
so you might have to label a whole lot of data
which isn't a problem bc you can define what a successful thread is using the reply count and automate the whole process
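sketch of the automated labeling, assuming "successful" means reply count at or above some cutoff. the cutoff of 10 and the toy threads are arbitrary, pick your own:

```python
from collections import Counter

def label_threads(threads, min_replies=10):
    """threads: list of (token_ids, reply_count).
    Labels a thread successful when it reaches min_replies (assumed cutoff)."""
    return [(ids, n >= min_replies) for ids, n in threads]

def token_success_rate(labeled):
    """Fraction of threads containing each token that were successful."""
    seen, won = Counter(), Counter()
    for ids, ok in labeled:
        for t in set(ids):
            seen[t] += 1
            if ok:
                won[t] += 1
    return {t: won[t] / seen[t] for t in seen}

# Hypothetical toy data: (token ids of the opening post, reply count).
threads = [([1, 2], 25), ([1, 3], 2), ([2], 14), ([3], 0)]
rates = token_success_rate(label_threads(threads))
print(rates)
```

the per-token success rate is the crudest "what correlates with success" you can get, but it gives you labels and a baseline with zero manual work.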