Parsing of Research Documents into XML Using Formal Grammars