Synthetic pre-training for robustness in information retrieval