Characterising semantically coherent classes of text through feature discovery