MRG Utils

An Open-Source Tool for Penn Treebank-Style Combined Parses

FAQ

What do I need to use MRG Utils?

You will need the Penn Treebank Release II or a parser that outputs combined-style parses, such as The Bikel Multilingual Statistical Parsing Engine. While POS tags and string values are reversed, a simple script can switch their places using the s-expression parser provided in this toolkit, after converting round brackets to square brackets.
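As a rough illustration of the swap described above, here is a minimal sketch. It does not use the toolkit's s-expression parser, whose interface is not shown on this page; instead it substitutes a plain regular expression, so no bracket conversion is needed. The function name `swap_leaves` and the sample input are hypothetical, and the sketch assumes every leaf node holds exactly two bracket-free tokens.

```python
import re

# A leaf node is "(" + two bracket-free tokens + ")".
# Assumption: literal brackets in the text are escaped (e.g. -LRB-, -RRB-).
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def swap_leaves(parse: str) -> str:
    """Swap the two items in every leaf node, e.g. (The DT) -> (DT The)."""
    return LEAF.sub(r"(\2 \1)", parse)


# Hypothetical input with the word before the POS tag.
reversed_parse = "(S (NP (The DT) (dog NN)) (VP (ate VBD)))"
print(swap_leaves(reversed_parse))
```

Only leaf nodes are touched, because a phrase label such as `S` or `NP` is always followed by an opening bracket rather than a second token.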

What formats does MRG Utils work on?

MRG Utils works on any bracketed parse that looks like this:

  (S (NP (DT The) (NN dog)) (VP (VBD ate) (NP (DT a) (NN sandwich))) (. .))
  

Put this text (or indented text, as in a proper .mrg file) into a file, and you can generate an MRG_Document object from that file's address.
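To make the format concrete, the sketch below reads one bracketed parse into nested Python lists and collects its (POS tag, word) pairs. It is a standalone illustration of the format only, not the MRG_Document interface; `parse_tree` and `leaves` are hypothetical names introduced here.

```python
import re

# Tokens are an opening bracket, a closing bracket, or a run of other characters.
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def parse_tree(text):
    """Parse one bracketed tree into nested lists, e.g. ['NP', ['DT', 'The']]."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read_node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        node = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read_node())
            else:
                node.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume ")"
        return node

    return read_node()


def leaves(node):
    """Yield (tag, word) pairs from left to right."""
    if len(node) == 2 and all(isinstance(item, str) for item in node):
        yield (node[0], node[1])
        return
    for child in node:
        if isinstance(child, list):
            yield from leaves(child)


sentence = "(S (NP (DT The) (NN dog)) (VP (VBD ate) (NP (DT a) (NN sandwich))) (. .))"
tree = parse_tree(sentence)
print([word for _tag, word in leaves(tree)])
```

Because the tokenizer ignores whitespace, the same function accepts the indented, multi-line layout of a .mrg file, including the extra unlabeled outer bracket that wraps each tree there.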

What has this toolkit been used for?

The toolkit has been used successfully in research on discourse connectives, where it was interfaced with a larger package for including information from the Penn Discourse Treebank. It has also been used for experiments in authorship identification. Papers are available at Robert Elwell's website.

Can you give me some examples?

A simple Python script which uses MRG Utils is available here. It is described in the Overview Section.
