Setting up Method 1.


Note: Basic knowledge of GATE is assumed (loading and working with corpora and different processing resources).

1) Linguistic preprocessing and pattern-based extraction:
Build a Corpus Pipeline application and load the following PRs (Processing Resources).
        * Document Reset PR
        * ANNIE English Tokeniser (set annotationSetName = Tokens)
        * ANNIE Sentence Splitter (set inputASName = outputASName = Tokens)
        * ANNIE POS Tagger (set inputASName = outputASName = Tokens)
You can use any other name instead of Tokens, but make sure that all of these resources use the same annotation set.

Then load the following JAPE transducers one by one, in this order, and add them to the pipeline.
Alternatively, you can simply load main.jape, which will load all the transducers for you.

          * 1_NP_ID (set inputASName = outputASName = Tokens)
          * 2_VB_ID (set inputASName = outputASName = Tokens)
          * 3_Funct_ID (set inputASName = Tokens and outputASName = Functionality). You can use any other annotation set name instead of Tokens, but the output annotation set of this last transducer must be Functionality, because the subsequent modules assume it exists.
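
The annotation set settings above can be checked with a small script before running the pipeline. The following is only a sketch in Python and does not use the GATE API: the PIPELINE table and the check_pipeline helper are hypothetical names introduced for illustration, and the table simply restates the settings listed above.

```python
# Hypothetical description of the pipeline as (resource, input set, output set).
# This table is NOT read from GATE; it only mirrors the settings listed above.
PIPELINE = [
    ("Document Reset PR", None, None),
    ("ANNIE English Tokeniser", None, "Tokens"),
    ("ANNIE Sentence Splitter", "Tokens", "Tokens"),
    ("ANNIE POS Tagger", "Tokens", "Tokens"),
    ("1_NP_ID", "Tokens", "Tokens"),
    ("2_VB_ID", "Tokens", "Tokens"),
    ("3_Funct_ID", "Tokens", "Functionality"),
]


def check_pipeline(pipeline):
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    available = set()  # annotation sets written by earlier resources
    for name, input_set, output_set in pipeline:
        # Every resource must read from a set that an earlier resource wrote to.
        if input_set is not None and input_set not in available:
            problems.append(
                f"{name} reads '{input_set}', which no earlier resource writes"
            )
        if output_set is not None:
            available.add(output_set)
    # The ontology learning step expects the Functionality annotation set.
    if not pipeline or pipeline[-1][2] != "Functionality":
        problems.append("the last transducer must write to 'Functionality'")
    return problems


if __name__ == "__main__":
    for problem in check_pipeline(PIPELINE):
        print(problem)
```

If you rename Tokens, change it in every row of the table; the check then reports any resource that still reads from a set nobody writes to.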

Note: save the processed corpus (for example as XML or in a data store) so that you do not have to repeat these steps.

[Figure: Method 1 setup in GATE]


2) Ontology learning

Save rio10.jar in the lib directory of your GATE installation.
Load the VuOntologyBuilder PR (using the supplied creole.xml and VuOntologyBuilder.jar).
Build a Pipeline application that contains only the VuOntologyBuilder PR.

Set the parameters as follows:
    * myCorpus = a corpus that has been processed as above. You can do the processing in the same GATE session or reload processed data that was saved (as XML or in a data store).
    * rdfOntology = the path of the file in which the created ontology will be stored. The ontology is saved as .rdfs, so you can view it with any ontology editor (e.g. Protege). You can also load it immediately in GATE (Language Resources\New\Ontology ..., then choose the path to your ontology).
    * verbLexicon and nounLexicon = the paths to the files containing the verb and noun lexicons supplied in the package.
    * pruning = the pruning selector. If true, pruning is performed; if false, it is not.
    * functionalityStyle = the style used for generating the functionality hierarchy. If the value is "VerbOnly", only the distinct verbs are taken into account (e.g. Calculating). Any other value activates the other functionality style, in which both verbs and their objects are used (e.g. CalculateSum).
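
To make the difference between the two functionality styles concrete, here is a rough sketch in Python. It is not the VuOntologyBuilder code: the functionality_names helper and the sample verb/object pairs are made up for illustration, and the way names are formed (capitalised verb, optionally followed by the capitalised object) is an assumption. The plugin itself may derive names differently, for example by using a different verb form such as Calculating.

```python
def functionality_names(pairs, functionality_style):
    """Build distinct functionality names from (verb, object) pairs.

    Assumed naming scheme, for illustration only:
    "VerbOnly" keeps just the verb; any other value joins verb and object.
    """
    names = []
    for verb, obj in pairs:
        if functionality_style == "VerbOnly":
            name = verb.capitalize()
        else:
            name = verb.capitalize() + obj.capitalize()
        # Keep each name once, in order of first appearance.
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical verb/object pairs, as they might be extracted from a corpus.
    sample = [("calculate", "sum"), ("calculate", "average"), ("convert", "currency")]
    print(functionality_names(sample, "VerbOnly"))
    print(functionality_names(sample, "VerbAndObject"))
```

With "VerbOnly" the two calculate entries collapse into a single name, while any other value keeps one name per distinct verb and object combination.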