Re: Languages with optional spaces

"Ev. Drikos" <drikosev@gmail.com>
Sun, 1 Mar 2020 19:41:49 +0200

From comp.compilers

From:	"Ev. Drikos" <drikosev@gmail.com>
Newsgroups:	comp.compilers
Date:	Sun, 1 Mar 2020 19:41:49 +0200
Organization:	Aioe.org NNTP Server
References:	20-02-015 20-02-017 20-02-033 20-02-034
Injection-Info:	gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="54570"; mail-complaints-to="abuse@iecc.com"
Keywords:	lex
Posted-Date:	01 Mar 2020 12:52:57 EST
Content-Language:	en-US

On 29/02/2020 21:38, Ev. Drikos wrote:
> On 29/02/2020 11:48, Christopher F Clark wrote:
> ...
> Obviously, those who coded such BASIC parsers had some simpler rules,
> ie the position of the first 'TO' might be used for the statement 50.
>

Dear Mr. Clark,

I'll elaborate a little if you don't mind. IMHO, one problem here is
that an identifier that runs directly into a keyword, with no space
in between, is quite difficult for a scanner to delimit. One could
possibly let the parser recognize such obscure identifiers instead.

Let's assume that the first 'TO' found in a for-statement after a '='
is the keyword that separates the lower bound from the upper bound of
the loop index.
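
The rule above can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not the grammar at the end of
this message: the function name `split_for` is hypothetical, all spaces
are assumed to be insignificant, the search starts one character after
the '=' so that the lower bound is never empty, and the two bounds are
not validated any further.

```python
import re

def split_for(line):
    """Split a BASIC FOR statement with the 'first TO after =' rule.

    Returns (label, var, lbound, ubound), or None if the rule fails.
    """
    # Assumption: spaces carry no meaning, so remove them all.
    text = re.sub(r"\s+", "", line).upper()
    # Label, the keyword FOR, the loop variable, '=', then the bounds.
    m = re.match(r"(\d+)FOR([A-Z][A-Z0-9$]*)=(.*)$", text)
    if m is None:
        return None
    label, var, bounds = m.groups()
    # Start at index 1 so that the lower bound has at least one character.
    pos = bounds.find("TO", 1)
    if pos < 0 or pos + 2 >= len(bounds):
        return None  # no TO found, or nothing left for the upper bound
    return label, var, bounds[:pos], bounds[pos + 2:]
```

Under these assumptions, `40 FORFORJ=1TON` splits into the loop
variable FORJ and the bounds 1 and N, and `50 FOR N = ITOJTOK` yields
the bounds I and JTOK. Because the search skips the first character
after the '=', the sketch also splits TOTON into TO and N.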

As a demo, the simplified grammar at the end of this message parses
all the statements below except the last one (999), which looks like
a known limitation of my Simulator. Actually generating a C++ parser
would take time and effort that I don't currently plan to spend.

10 let i=1
11 letleti=1
20 i=1
30 for forj=1 to n
40 FORFORJ=1TON
50 FOR N = ITOJTOK
60 FORFORJ=IIITONTOJ
70 FORFORJ=TO TO TO
80 FORFORJ=TTTON
999 FORFORJ=TOTON

Of course, this is far from a working solution, because one needs to
know a lot of details, e.g. whether any spaces read are significant.

What would you suggest apart from a hand-coded lexer?

Ev. Drikos

----------------------------------------------------------------------
        SYNTAX RULES
----------------------------------------------------------------------

#sma <id>
#sma <idl>
#sma TO

<grm> ::=
                        <statements>

<statements> ::=
                        <statements> <statement>
            | <statement>

<statement> ::=
                        <for-stmt>
            | <assignment>
            | <empty>

<for-stmt> ::=
                        <label> FOR <id> = <lbound> TO <ubound> <;>

<lbound> ::=
                        <number>
            | <id-p>

<id-p> ::=
                        <id-p> <id-part>
            | <id-start>

<id-part> ::=
                        <letter>
            | <digit>
            | $

<id-start> ::=
                        <letter>

<ubound> ::=
                        <rhs>

<rhs> ::=
                        <id>
            | <number>

<assignment> ::=
                        <label> LET <id> = <rhs> <;>
            | <label> <idl> = <rhs> <;>

<empty> ::=
                        <;>

----------------------------------------------------------------------
    LEXICAL CONVENTIONS
----------------------------------------------------------------------

#ignore spaces

#sma <id>
#sma <idl>

token ::=
                        spaces
            | END
            | FOR
            | LET
            | NEXT
            | PRINT
            | TO
            | <digit>
            | <label>
            | <number>
            | <letter>
            | <idl>
            | <id>
            | $
            | =
            | <;>

spaces ::=
                        { \t | \s }...

END ::=
                        E N D

FOR ::=
                        F O R

LET ::=
                        L E T

NEXT ::=
                        N E X T

PRINT ::=
                        P R I N T

TO ::=
                        T O

<digit> ::=
                        0 .. 9

<label> ::=
                        { 0 .. 9 }...

<number> ::=
                        { 0 .. 9 }...

<letter> ::=
                        A .. Z

<idl> ::=
                        { A .. Z [{$|A .. Z|0..9}...]} -= {key[{$|A..Z|0..9}...]}

key ::=
                        END
            | FOR
            | LET
            | NEXT
            | PRINT

<id> ::=
                        A .. Z [ { $ | A .. Z | 0 .. 9 }... ]

<;> ::=
                        ;
            | \n
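
The `<idl>` rule above can be approximated with a regular expression.
This is a sketch that reads `-=` as a set difference (identifiers minus
those that begin with one of the `key` words), which is an assumption
about the notation; the names `KEYWORDS`, `IDL` and `is_idl` are
hypothetical.

```python
import re

# The words listed under the `key` rule; TO is deliberately not included.
KEYWORDS = ("END", "FOR", "LET", "NEXT", "PRINT")

# An identifier that does not start with any of the keywords.
IDL = re.compile(r"(?!%s)[A-Z][A-Z0-9$]*\Z" % "|".join(KEYWORDS))

def is_idl(name):
    """Return True if `name` could be an <idl> under the reading above."""
    return IDL.match(name) is not None
```

With this reading, `is_idl("I")` and `is_idl("TOTAL")` hold, while
`is_idl("LETI")` and `is_idl("FORJ")` do not, so a statement such as
`11 letleti=1` can only be an assignment that starts with the LET
keyword.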
