Text this: Frequent-pattern discovering algorithm for large-scale corpus