From: | Christopher F Clark <christopher.f.clark@compiler-resources.com> |
Newsgroups: | comp.compilers |
Date: | Tue, 7 Jun 2022 19:40:11 +0300 |
Organization: | Compilers Central |
References: | 22-06-006 22-06-007 22-06-008 22-06-013 22-06-015 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="50232"; mail-complaints-to="abuse@iecc.com" |
Keywords: | lex, comment |
Posted-Date: | 07 Jun 2022 13:05:09 EDT |
Yes, as our moderator explained, I was talking about things like
FORTRAN Hollerith strings, but more importantly network packets, where
they give the size of the "field" within a packet and then you simply
take that many characters (or bytes or bits or some other quantum) as
the "token". This is quite important for parsing "binary" data. And
sometimes the numbers are text, like I showed, but in many protocols the
numbers are "binary", e.g. something like
\xAHabcdefghij, where \xA is a single 8-bit character (octet) whose
bits are "0000 1010" (or maybe four 8-bit characters, i.e. 4 octets,
that represent a 32-bit integer).
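To make the "binary" case concrete, here is a minimal sketch in Python,
not taken from any particular protocol. The function name
read_counted_field, the big-endian byte order for the 4-octet case, and
the layout (length immediately followed by the payload, with no "H"
marker in between) are all assumptions made for illustration.

```python
import struct

def read_counted_field(data: bytes, pos: int = 0, length_size: int = 1):
    """Return (payload, next_pos) for a length-prefixed field at data[pos:].

    Hypothetical helper: length_size is 1 (a single octet) or 4 (a
    32-bit unsigned integer, assumed big-endian here).
    """
    if length_size == 1:
        fmt = "B"    # one unsigned octet
    elif length_size == 4:
        fmt = ">I"   # four octets, big-endian unsigned 32-bit integer
    else:
        raise ValueError("length_size must be 1 or 4")

    payload_start = pos + length_size
    if payload_start > len(data):
        raise ValueError("input ends inside the length prefix")

    # Decode the binary length prefix into an integer.
    (count,) = struct.unpack_from(fmt, data, pos)

    payload_end = payload_start + count
    if payload_end > len(data):
        raise ValueError("input ends inside the counted field")

    # The next `count` octets are the "token".
    return data[payload_start:payload_end], payload_end
```

For instance, read_counted_field(b"\x0aabcdefghij") would take the
single octet 0000 1010 as the count and return the ten octets that
follow it, together with the position just past them.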
And, as our moderator pointed out, this makes a terrible regular
expression, NFA, or DFA, but it is actually quite easy in nearly any
programming language. You read the length in, convert it to an integer,
and then loop reading that many characters from the input and call
that a "token".
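As a rough illustration of that loop for the text form (a Hollerith
constant such as 10Habcdefghij), here is a minimal sketch in Python.
The function name read_hollerith and its error handling are my own
assumptions, and a real Fortran lexer has to deal with much more
(blanks, continuation lines, and so on).

```python
def read_hollerith(text: str, pos: int = 0):
    """Return (token, next_pos) for a counted string like 10Habcdefghij.

    Hypothetical helper: reads a decimal count, expects an 'H' (or 'h'),
    then takes exactly that many characters as the token.
    """
    # Read the length in: scan the run of decimal digits.
    start = pos
    while pos < len(text) and text[pos] in "0123456789":
        pos += 1
    if pos == start:
        raise ValueError("expected a decimal count")

    # Convert it to an integer.
    count = int(text[start:pos])

    if pos >= len(text) or text[pos] not in "Hh":
        raise ValueError("expected 'H' after the count")
    pos += 1

    # Take that many characters, whatever they are, as the token.
    end = pos + count
    if end > len(text):
        raise ValueError("input ends before the counted characters")
    return text[pos:end], end
```

Note that the characters inside the counted region are never inspected,
which is exactly why quotes, blanks, or further digits in the payload
cause no trouble for this approach.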
Kind regards,
Chris
--
******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------
[Right. When I was writing Fortran lexers, Hollerith strings were among the
simplest of the kludges I had to use. -John]