A generative model for evaluating missing data methods in large epidemiological cohorts