Wrestling with phase 1 of a C compiler

luser droog <luser.droog@gmail.com>
Wed, 7 Sep 2022 09:47:29 -0700 (PDT)

          From comp.compilers

From: luser droog <luser.droog@gmail.com>
Newsgroups: comp.compilers
Date: Wed, 7 Sep 2022 09:47:29 -0700 (PDT)
Organization: Compilers Central
Keywords: C, parse, question
Posted-Date: 07 Sep 2022 17:01:00 EDT

At my tedious glacial pace, I have rewritten my parser library
for the umpteen-plus-one'th time only to stall out at an earlier
step than where I stalled out the last time around.


I'm trying to do translation phase 1 of C compilation, which (trigraphs
aside) is just recognizing newlines in the input.


The input is modeled as a lazy list which calls fgetc() to produce
integers as needed. Right now I have a tiny parser to recognize
the possible line termination sequences and normalize them to
a single newline.


static parser
position_grammar( void ){
    return either( bind( ANY( str("\r\n"),
                              chr('\r'),
                              chr('\n') ),
                         Operator( NIL_, new_line ) ),
                   item() );
}


static object
new_line( list env, object input ){
    return Int('\n');
}
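
For comparison, the same normalization can be sketched in plain C without
the parser library. This is only an illustration under assumptions: the
`source` struct, `raw_next`, `normalized_next` and `string_next` are
hypothetical names that stand in for the lazy list, and one slot of
pushback plays the role of the lookahead that the "\r\n" alternative needs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pull-style input: a callback that yields the next byte
   (or EOF), plus one slot of pushback for lookahead. */
typedef struct {
    int (*next)(void *ctx);
    void *ctx;
    int   pending;      /* byte that was read ahead but not consumed */
    int   has_pending;  /* nonzero while `pending` holds a value */
} source;

/* Read one raw byte, preferring the pushed-back one if present. */
static int raw_next(source *s){
    if (s->has_pending){
        s->has_pending = 0;
        return s->pending;
    }
    return s->next(s->ctx);
}

/* Read one byte with "\r\n", "\r" and "\n" all collapsed to '\n'. */
int normalized_next(source *s){
    assert(s != NULL);
    int c = raw_next(s);
    if (c == '\r'){
        int d = raw_next(s);
        if (d != '\n'){          /* lone '\r': keep what followed it */
            s->pending = d;
            s->has_pending = 1;
        }
        return '\n';
    }
    return c;
}

/* Example producer that walks a NUL-terminated string; a FILE-backed
   producer would call fgetc() on its context instead. */
int string_next(void *ctx){
    const char **p = ctx;
    return **p ? (unsigned char)*(*p)++ : EOF;
}
```

A caller would initialize a `source` with a producer and its context, then
call `normalized_next` until it returns EOF.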


So, I can run this parser and peel out the integer from the result.
And then I'm wrapping the result with this function to couple
each byte with its (row,col) information.




static list
position( object item ){
    static int row = 0,
               col = 0;
    if( valid( eq_int( '\n', item ) ) )
        return cons( item, cons( Int( ++ row ), Int( col = 0 ) ) );
    else
        return cons( item, cons( Int( row ), Int( ++ col ) ) );
}


But ... I guess my problem is the lack of functional programming
tools in the C language, which I already knew about, and which is
nobody's fault but my own. Still, I'm not happy with the static
variables for row and col. I don't have monadic sequencing to help
route state through my function graphs.
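
One possible way around the statics, short of a full state monad, is to
make the state an explicit value that the caller owns and passes in. The
sketch below is plain C and does not use the parser library; `pos`,
`located`, `pos_step` and `locate` are hypothetical names, and the
numbering follows position() above (a newline is reported at column 0 of
the row it starts).

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row, col; } pos;
typedef struct { int ch; pos at; } located;

/* Pure transition: given the previous position and the next character,
   return the position assigned to that character. No hidden state. */
pos pos_step(pos prev, int ch){
    pos next = prev;
    if (ch == '\n'){
        next.row++;
        next.col = 0;
    } else {
        next.col++;
    }
    return next;
}

/* Convenience wrapper: advance the caller's state and pair the
   character with its position. */
located locate(pos *state, int ch){
    assert(state != NULL);
    *state = pos_step(*state, ch);
    located out = { ch, *state };
    return out;
}
```

Because each input stream carries its own `pos`, two streams can be
tracked at the same time, which the static version cannot do.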


But ... can I extract the "position counting" part out and do it by
zipping the input stream with an iota stream to provide counting?
This feels like the right direction, but I'm not sure how to reset the
column counter when a newline is recognized. Has anyone navigated
these weeds before and blazed any trails?
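
One sketch of how the iota idea might be kept: zip each item with a running
index that never resets, and carry only the index of the most recent
newline alongside it. The column is then the difference of the two, so
nothing in the counting stream itself has to be reset. This is plain C
under assumptions, not the parser library: `indexed`, `line_state`,
`located`, `relocate` and `locate_all` are hypothetical names, and the
numbering matches position() above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int ch; long index; } indexed;          /* zip(input, iota) */
typedef struct { int row; long line_start; } line_state; /* carried state */
typedef struct { int ch; int row; long col; } located;

/* One step of a scan over the zipped stream: the only state is the row
   count and the index at which the current line began. */
located relocate(line_state *st, indexed in){
    assert(st != NULL);
    if (in.ch == '\n'){
        st->row++;
        st->line_start = in.index;   /* the newline sits at column 0 */
    }
    located out = { in.ch, st->row, in.index - st->line_start };
    return out;
}

/* Drive the scan over a string; the loop counter plays the iota stream,
   starting at 1. Returns the number of items written to `out`. */
size_t locate_all(const char *text, located *out, size_t cap){
    assert(text != NULL && out != NULL);
    line_state st = { 0, 0 };
    size_t n = 0;
    for (long i = 1; text[n] != '\0' && n < cap; i++, n++){
        indexed in = { (unsigned char)text[n], i };
        out[n] = relocate(&st, in);
    }
    return n;
}
```

The state has not disappeared, since a plain zip cannot depend on the data
it is zipped with, but it shrinks to a small value threaded through a scan
rather than living in static variables.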

