Learnability of Autoregressive Transformers