An Ensemble of Convolutional Neural Networks for Sound Event Detection