Re: Languages with optional spaces

"Ev. Drikos" <drikosev@gmail.com>
Sun, 1 Mar 2020 19:41:49 +0200

          From comp.compilers

From: "Ev. Drikos" <drikosev@gmail.com>
Newsgroups: comp.compilers
Date: Sun, 1 Mar 2020 19:41:49 +0200
Organization: Aioe.org NNTP Server
References: 20-02-015 20-02-017 20-02-033 20-02-034
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="54570"; mail-complaints-to="abuse@iecc.com"
Keywords: lex
Posted-Date: 01 Mar 2020 12:52:57 EST
Content-Language: en-US

On 29/02/2020 21:38, Ev. Drikos wrote:
> On 29/02/2020 11:48, Christopher F Clark wrote:
> ...
> Obviously, those who coded such BASIC parsers had some simpler rules,
> ie the position of the first 'TO' might be used for the statement 50.
>


Dear Mr. Clark,


I'll elaborate a little, if you don't mind. IMHO, one problem here is
that an identifier immediately followed by a keyword is quite difficult
for a scanner to delimit. One could possibly let the parser recognize
such obscure identifiers instead.
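
To illustrate the difficulty, here is a minimal Python sketch of a
conventional greedy scanner. It is not taken from any actual BASIC
implementation; the token patterns and the name `greedy_tokens` are
assumptions made for illustration only. Once spaces are dropped, the
identifier pattern swallows the keyword that follows it:

```python
import re

# Hypothetical token patterns: keywords are tried first, then a greedy
# identifier, a number, and the '=' sign.
TOKEN_RE = re.compile(
    r"(?P<kw>FOR|LET|TO)"
    r"|(?P<id>[A-Z][A-Z0-9$]*)"
    r"|(?P<num>[0-9]+)"
    r"|(?P<op>=)"
)

def greedy_tokens(stmt):
    """Tokenize a statement (label removed), ignoring spaces and case."""
    text = "".join(stmt.split()).upper()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

# The lower bound, the keyword TO and the upper bound come back as one
# identifier, so the parser never sees a TO token.
print(greedy_tokens("FOR N = ITOJTOK"))
```

In this sketch the text after the '=' is returned as the single
identifier ITOJTOK, which is why the split has to be decided somewhere
other than a plain longest-match scanner.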


Let's assume that the first 'TO' found in a for-statement after an '='
is the keyword that separates the lower bound from the upper bound of
the loop index.
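
A minimal Python sketch of that assumption follows. The function name
`split_for` is hypothetical, and the sketch adds one assumption that is
not stated above: the lower bound must be non-empty, so the search for
'TO' starts one character after the '='. It treats spaces as
insignificant and input as case-insensitive, and it is not the
grammar-driven approach of the Simulator.

```python
def split_for(stmt):
    """Split a FOR statement (label removed) into (index, lower, upper).

    Assumptions of this sketch: spaces and case are insignificant, and
    the lower bound has at least one character.
    """
    text = "".join(stmt.split()).upper()
    if not text.startswith("FOR"):
        raise ValueError("not a FOR statement")
    eq = text.find("=")
    if eq < 0:
        raise ValueError("missing '='")
    index = text[3:eq]
    # First 'TO' after the '=' that leaves a non-empty lower bound.
    to = text.find("TO", eq + 2)
    if to < 0:
        raise ValueError("missing TO")
    lower = text[eq + 1:to]
    upper = text[to + 2:]
    if not index or not upper:
        raise ValueError("missing loop index or upper bound")
    return index, lower, upper

print(split_for("FOR N = ITOJTOK"))    # index N, lower I, upper JTOK
print(split_for("FORFORJ=IIITONTOJ"))  # index FORJ, lower III, upper NTOJ
```

Under the added non-empty-lower-bound assumption, an input such as
FORFORJ=TOTON splits into lower bound TO and upper bound N. Without
that assumption, the first 'TO' would sit directly after the '=' and
leave no lower bound at all.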


As a demo, the simplified grammar at the end of this message fails to
parse only the last statement (999) below, which looks like a known
limitation of my Simulator. Actually generating a C++ parser would
take time and effort that I don't currently plan to spend:


10 let i=1
11 letleti=1
20 i=1
30 for forj=1 to n
40 FORFORJ=1TON
50 FOR N = ITOJTOK
60 FORFORJ=IIITONTOJ
70 FORFORJ=TO TO TO
80 FORFORJ=TTTON
999 FORFORJ=TOTON
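
How the leading keyword of each statement above can be separated from
the identifier that follows it may be sketched in Python as below. This
is only an illustration of the idea behind the <idl> rule in the
grammar at the end of this message (an implicit assignment target must
not begin with a keyword); the name `classify` and the result format
are hypothetical.

```python
import re

# Same keyword set as the 'key' rule of the grammar.
KEYWORDS = ("END", "FOR", "LET", "NEXT", "PRINT")

def classify(line):
    """Return (label, kind, rest) for one numbered statement.

    kind is the leading keyword, or "ASSIGN" when the statement starts
    with an identifier that does not begin with a keyword.
    """
    m = re.match(r"\s*(\d+)\s*(.*)$", line)
    if m is None:
        raise ValueError("missing label")
    label, body = m.group(1), m.group(2)
    text = "".join(body.split()).upper()  # spaces are ignored
    for kw in KEYWORDS:
        if text.startswith(kw):
            return label, kw, text[len(kw):]
    return label, "ASSIGN", text

for line in ("10 let i=1", "11 letleti=1", "20 i=1", "40 FORFORJ=1TON"):
    print(classify(line))
```

Because the keyword check comes first, `letleti=1` is read as LET
followed by the identifier LETI, and a statement without LET can only
assign to an identifier that does not start with one of the keywords.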




Of course, this is far from a working solution, because one needs to
know a lot of details, e.g. whether any spaces read are significant.


What would you suggest apart from a hand-coded lexer?


Ev. Drikos




----------------------------------------------------------------------
        SYNTAX RULES
----------------------------------------------------------------------


#sma <id>
#sma <idl>
#sma TO


<grm> ::=
              <statements>

<statements> ::=
              <statements> <statement>
            | <statement>

<statement> ::=
              <for-stmt>
            | <assignment>
            | <empty>

<for-stmt> ::=
              <label> FOR <id> = <lbound> TO <ubound> <;>

<lbound> ::=
              <number>
            | <id-p>

<id-p> ::=
              <id-p> <id-part>
            | <id-start>

<id-part> ::=
              <letter>
            | <digit>
            | $

<id-start> ::=
              <letter>

<ubound> ::=
              <rhs>

<rhs> ::=
              <id>
            | <number>

<assignment> ::=
              <label> LET <id> = <rhs> <;>
            | <label> <idl> = <rhs> <;>

<empty> ::=
              <;>






----------------------------------------------------------------------
    LEXICAL CONVENTIONS
----------------------------------------------------------------------


#ignore spaces


#sma <id>
#sma <idl>


token ::=
              spaces
            | END
            | FOR
            | LET
            | NEXT
            | PRINT
            | TO
            | <digit>
            | <label>
            | <number>
            | <letter>
            | <idl>
            | <id>
            | $
            | =
            | <;>

spaces ::=
              { \t | \s }...

END ::=
              E N D

FOR ::=
              F O R

LET ::=
              L E T

NEXT ::=
              N E X T

PRINT ::=
              P R I N T

TO ::=
              T O

<digit> ::=
              0 .. 9

<label> ::=
              { 0 .. 9 }...

<number> ::=
              { 0 .. 9 }...

<letter> ::=
              A .. Z

<idl> ::=
              { A .. Z [{$|A .. Z|0..9}...]} -= {key[{$|A..Z|0..9}...]}

key ::=
              END
            | FOR
            | LET
            | NEXT
            | PRINT

<id> ::=
              A .. Z [ { $ | A .. Z | 0 .. 9 }... ]

<;> ::=
              ;
            | \n

