Scalable and Interpretable Approaches for Learning to Follow Natural Language Instructions