Re: Natural Language Parser

rpw3@rpw3.org (Rob Warnock)
30 Sep 2015 11:05:06 GMT

          From comp.compilers


From: rpw3@rpw3.org (Rob Warnock)
Newsgroups: comp.compilers
Date: 30 Sep 2015 11:05:06 GMT
Organization: Rob Warnock, Consulting Systems Architect
References: 15-09-025 15-09-028
Keywords: parse
Posted-Date: 30 Sep 2015 21:42:26 EDT

George Neuner <gneuner2@comcast.net> wrote:
+---------------
| Seima Rao <seimarao@gmail.com> wrote:
| > I am looking for a C++ API based English Language Parser.
...
| > This is my only requirement. I don't need any semanticizing
| > artefacts.
|
| You need to be aware that natural languages *can't* be parsed
| without semantics - i.e. without considering "parts of speech".
|
| Depending on context, the same word may represent, e.g., a verb, an
| adverb, or even a (type of) noun. What part of speech the word
| represents in context determines the ultimate meaning of the sentence.
| Even which part of speech a word represents may be controversial. Most
| human writing (and speaking) is quite imprecise: most sentences can be
| parsed in more than one way, and the meanings of the various parsings
| may be very different.
+---------------


Exactly. Consider the classic "Buffalo" example:


        https://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffalo_buffalo_buffalo_Buffalo_buffalo
        "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"
        is a grammatically correct sentence in the English language,
        used as an example of how homonyms and homophones can be used
        to create complicated linguistic constructs.
        ...
        Thomas Tymoczko has pointed out that there is nothing special
        about eight "buffalos"; any sentence consisting solely of the
        word "buffalo" repeated any number of times is grammatically
        correct.
        ...
        Versions of the linguistic oddity can be constructed with
        other words which similarly simultaneously serve as collective
        noun, adjective, and verb, some of which need no capitalization
        (such as "police").
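
To make the ambiguity concrete, here is a minimal sketch (not from the
Wikipedia article, and not a real English grammar) that counts parse trees
with a CYK-style chart. The toy grammar, the names LEXICON, BINARY_RULES,
and count_parses, and the choice to ignore capitalization are all
assumptions made up for illustration: "buffalo" may be a noun (N), an
adjective (A, the city), or a verb (V), and a noun phrase may carry a
reduced relative clause (RC), as in "bison [that] bison bully".

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form, for illustration only.
# One word may carry several categories; that is the source of ambiguity.
LEXICON = {
    # N = noun, A = adjective (the city), V = verb,
    # NP = bare-noun phrase, VP = intransitive verb, S = one-word imperative
    "buffalo": ["N", "A", "V", "NP", "VP", "S"],
}

# (parent, left child, right child)
BINARY_RULES = [
    ("NP", "A", "N"),    # "Buffalo buffalo" = bison from Buffalo
    ("NP", "NP", "RC"),  # noun phrase + reduced relative clause
    ("RC", "NP", "V"),   # "[that] bison bully"
    ("VP", "V", "NP"),   # transitive verb + object
    ("S", "NP", "VP"),   # declarative sentence
    ("S", "V", "NP"),    # imperative with an object
]


def count_parses(words, start="S"):
    """Return how many distinct parse trees derive `words` from `start`."""
    n = len(words)
    # chart[i][j][X] = number of derivations of words[i:j] from symbol X
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        # Capitalization is ignored here (a simplifying assumption).
        for symbol in LEXICON.get(word.lower(), []):
            chart[i][i + 1][symbol] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # every split point of words[i:j]
                left, right = chart[i][k], chart[k][j]
                for parent, b, c in BINARY_RULES:
                    if b in left and c in right:
                        chart[i][j][parent] += left[b] * right[c]
    return chart[0][n].get(start, 0)


if __name__ == "__main__":
    for length in range(1, 9):
        print(length, count_parses(["buffalo"] * length))
```

Even under this tiny grammar every length from one to eight has at least
one parse, and the count grows with length. Choosing among those trees is
where the part-of-speech and semantic information George mentions has to
come in; a purely syntactic parser can only enumerate them.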




-Rob


-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <http://rpw3.org/>
San Mateo, CA 94403

