Rethinking translation unit size: an empirical study of an English-Japanese newswire corpus