Re: Natural Language Parser Wanted

Peter Ludemann <peter.ludemann@quintus.com>
Wed, 3 Mar 1993 18:04:43 GMT

From comp.compilers

Related articles
Natural Language Parser Wanted kit@philabs.philips.com (1993-02-18)
Re: Natural Language Parser Wanted johnm@cory.berkeley.edu (1993-02-19)
Re: Natural Language Parser Wanted peter.ludemann@quintus.com (1993-03-03)

Newsgroups:	comp.compilers
From:	Peter Ludemann <peter.ludemann@quintus.com>
Keywords:	parse
Organization:	Compilers Central
References:	93-02-100
Date:	Wed, 3 Mar 1993 18:04:43 GMT

>[It is my impression that people gave up trying to parse natural language
>using context-free methods in about 1966. ...]

johnm@cory.berkeley.edu (John D. Mitchell) writes:
> Yep. Natural language parsing/understanding/etc. has been a major branch
> of the AI universe for a long time. You should go over to comp.ai.nlang
> (or something like that).
>
> My understanding of the latest developments is that you can get actually
> reasonably good understanding but only over limited knowledge domains.

Natural language processing today is done with both top-down and bottom-up
parsers. A description of some of these can be found in "Logic Grammars"
by Abramson and Dahl (Springer 1989); it also shows how to use simpler
versions of the NL parsers to very easily produce compilers for computer
languages, using something like context sensitive attribute grammars (I
wouldn't dream of going back to yacc/lex after using logic grammar
formalisms).
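
To give a flavour of the compiler claim, here is a minimal sketch in ordinary DCG notation (not an example from the book). It translates a token list for arithmetic expressions into code for a hypothetical stack machine; the instruction names push/1, add and mul and the nonterminal names are assumptions made up for illustration.

```prolog
:- use_module(library(lists)).   % for append/3

% expr(Code): parse a sum of terms, building stack-machine code.
expr(C) --> term(C0), expr_rest(C0, C).

% Left-associative '+': code so far, then the right operand, then add.
expr_rest(C0, C) -->
    [+], term(C1),
    { append(C0, C1, C01), append(C01, [add], C2) },
    expr_rest(C2, C).
expr_rest(C, C) --> [].

% term(Code): parse a product of factors.
term(C) --> factor(C0), term_rest(C0, C).

term_rest(C0, C) -->
    [*], factor(C1),
    { append(C0, C1, C01), append(C01, [mul], C2) },
    term_rest(C2, C).
term_rest(C, C) --> [].

% A factor is a number or a parenthesised expression.
factor([push(N)]) --> [N], { number(N) }.
factor(C) --> ['('], expr(C), [')'].
```

A query such as phrase(expr(C), [2,+,3,*,4]) should bind C to [push(2),push(3),push(4),mul,add]. The attributes (the Code arguments) ride along with the grammar rules, which is the part that takes extra machinery in yacc.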

Many NL systems use "chart parsers", based on Earley's original paper (but
made more efficient, of course). If people are interested, I can dig up
some references (Ross's book on Prolog and D.S. Warren's contain examples,
but I don't have references handy).

IBM has a product called "LanguageAccess" which contains a chart parser
(written in Prolog) plus extensive customization facilities; it can be
used to drive a relational database and interpret the results. There is
also a "reverse parser" for generating paraphrases (these are needed if a
query has more than one possible interpretation).

Finally, for those of you who have been deviled by languages like PL/I
which allow keywords in arbitrary order, here's a simple example from
Abramson and Dahl for parsing free-form Latin (with some matching for case,
courtesy of the "attribute grammar" style). The "skip(G)" construct
introduces context sensitivity and allows free word order. The
"{dict(Usage,Word)}" call looks up a word in the dictionary to get
its part of speech plus case (if applicable).

    sentence(s(N,A,V)) --> noun_phrase(nom,N), noun_phrase(acc,A), verb(V).
    noun_phrase(Case,np(A,N)) --> adjective(Case,A), noun(Case,N).
    noun_phrase(Case,N) --> noun(Case,N).
    noun(Case,n(Word)), skip(G) --> skip(G), [Word], {dict(noun(Case),Word)}.
    adjective(Case,a(Word)), skip(G) --> skip(G), [Word],
                                         {dict(adjective(Case),Word)}.
    verb(v(Word)), skip(G) --> skip(G), [Word], {dict(verb,Word)}.

    dict(verb,amat).
    dict(noun(acc),puerum).
    dict(noun(nom),puella).
    dict(adjective(acc),parvum).
    dict(adjective(nom),bona).

This parses
    [puella,bona,puerum,parvum,amat]
    [amat,parvum,puerum,bona,puella]
    [puerum,puella,parvum,bona,amat]
and the rest of the 5! permutations to:

        sentence
                noun-phrase(nominative)
                        adjective `bona'
                        noun(nominative) `puella'
                noun-phrase(accusative)
                        adjective `parvum'
                        noun(accusative) `puerum'
                verb `amat'

I leave it as an exercise to write this in yacc/lex. :-)

- peter ludemann, quintus