java - How to extract noun phrases from the parsed text


I have parsed my text with a constituency parser and copied the result into a text file, as shown below:

    (root (s (np (nn yesterday)) (, ,) (np (prp we)) (vp (vbd went) (pp (to to)....
    (root (frag (sbar (sbar (in while) (s (np (prp i)) (vp (vbd was) (np (np (ex...
    (root (s (np (nn yesterday)) (, ,) (np (prp i)) (vp (vbd went) (pp (to to.....
    (root (frag (sbar (sbar (in while) (s (np (nnp jim)) (vp (vbd was) (np (np (....
    (root (s (s (np (prp i)) (vp (vbd started) (s (vp (vbg talking) (pp.....

I need to extract all the noun phrases (NP) from this text file. I wrote the following code, which extracts only the first NP in each line. However, I need to extract all of the noun phrases. My code is:

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;

    public class NounPhrase {

        // Returns the index of the ')' that matches the '(' at openPos.
        public static int findClosingParen(char[] text, int openPos) {
            int closePos = openPos;
            int counter = 1;
            while (counter > 0) {
                char c = text[++closePos];
                if (c == '(') {
                    counter++;
                } else if (c == ')') {
                    counter--;
                }
            }
            return closePos;
        }

        public static void main(String[] args) throws IOException {
            ArrayList<String> npList = new ArrayList<String>();
            String line;
            String line1;
            int np;
            String input = "/local/input/temp/temp.txt";
            String output = "/local/output/temp/temp-out.txt";

            FileInputStream fis = new FileInputStream(input);
            BufferedReader br = new BufferedReader(new InputStreamReader(fis, "utf-8"));
            while ((line = br.readLine()) != null) {
                char[] lineArray = line.toCharArray();
                // Only the first "(np" on the line is located here.
                np = findClosingParen(lineArray, line.indexOf("(np"));
                line1 = line.substring(line.indexOf("(np"), np + 1);
                System.out.print(line1 + "\n");
            }
            br.close();
        }
    }

The output is:

    (np (nn yesterday))   ... I need the other NPs in this line
    (np (prp i))          ..... I need the other NPs in this line
    (np (nnp jim))        ..... I need the other NPs in this line
    (np (prp i))          ..... I need the other NPs in this line

My code takes only the first NP on each line, up to its closing parenthesis, but I need to extract all the NPs from the text.

While writing your own tree parser is a good exercise (!), if you just want results, the easiest way is to use more of the functionality of the Stanford NLP tools, namely Tregex, which is designed for just such things. You can change the final while loop to this:

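    // Needs edu.stanford.nlp.trees.Tree and, from edu.stanford.nlp.trees.tregex,
    // TregexPattern and TregexMatcher.
    // The pattern is the node label to match, written as it appears in the file.
    TregexPattern tPattern = TregexPattern.compile("np");
    while ((line = br.readLine()) != null) {
        Tree t = Tree.valueOf(line);
        TregexMatcher tMatcher = tPattern.matcher(t);
        // find() advances to each matching node in turn, nested ones included.
        while (tMatcher.find()) {
            System.out.println(tMatcher.getMatch());
        }
    }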
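
If adding the Stanford library is not an option, the bracket-matching idea from the question can also be extended to cover every NP on a line. The following is only a rough sketch using plain string handling, not a replacement for a real tree parser. The class name NounPhrases, the method extractAll and the sample sentence are made up for illustration. It assumes the label appears exactly as "(np " in the file, so labels with suffixes would need a different marker.

```java
import java.util.ArrayList;
import java.util.List;

class NounPhrases {

    // Index of the ')' that closes the '(' at openPos, or -1 if the text ends first.
    static int findClosingParen(String text, int openPos) {
        int depth = 0;
        for (int i = openPos; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }

    // Collects every "(np ...)" subtree in one bracketed parse, nested ones included.
    static List<String> extractAll(String line) {
        List<String> result = new ArrayList<>();
        String marker = "(np "; // assumption: the label is written exactly like this
        for (int start = line.indexOf(marker); start >= 0;
                 start = line.indexOf(marker, start + 1)) {
            int end = findClosingParen(line, start);
            if (end >= 0) { // skip phrases whose closing bracket is missing
                result.add(line.substring(start, end + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical complete parse, not taken from the question's file.
        String line = "(root (s (np (dt the) (nn dog)) (vp (vbd chased) "
                + "(np (np (dt a) (nn cat)) (pp (in with) (np (nns stripes)))))))";
        for (String np : extractAll(line)) {
            System.out.println(np);
        }
    }
}
```

Because the search restarts one character after each match instead of after its closing bracket, an outer phrase is reported first and the phrases nested inside it follow. Inside the question's read loop, the call extractAll(line) would take the place of the single indexOf lookup.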
