Independent Study - Assistance?

Alexander Morou <>
Sun, 15 Feb 2009 22:17:36 -0600

          From comp.compilers

Related articles
Independent Study - Assistance? (Alexander Morou) (2009-02-15)
Re: Independent Study - Assistance? (Chris F Clark) (2009-02-16)
Re: Independent Study - Assistance? (Alexander Morou) (2009-02-21)
Re: Independent Study - Assistance? (Chris F Clark) (2009-02-27)
Re: Independent Study - Assistance? (Alexander Morou) (2009-02-28)
| List of all articles for this month |

From: Alexander Morou <>
Newsgroups: comp.compilers
Date: Sun, 15 Feb 2009 22:17:36 -0600
Organization: Compilers Central
Keywords: lex, question
Posted-Date: 16 Feb 2009 04:36:23 EST
X-Google-Sender-Auth: 371b8da9465241f5

I have a question for the language experts, if they don't mind. I've
been interested in language theory for quite some time, and I've done
'OK' in studying it on my own for the past year+ or so. I'm
interested in generative programming, and one of the things I want to
create in the future is a Compiler Compiler (of which I posted about
here recently).

Great goal, except there's a few things I don't yet know. One is how
to automate state machine generation for regular expressions, I just
haven't been able to get this part down (close, but not quite,
certainly not the least state version, I forgot to consider repeat
cycles -_-). I'm thinking a large part is due to my educational
background (or lack thereof). I don't know set theory, which I think
large parts of regular expression talk is based upon. I've read
articles that speak about regular languages, and creating NFAs to
transition into a DFA of the same functional nature. What I don't
know is starting to catch up to me, and it's been making it difficult
to even read most of the articles of late interest. They define a
series of symbols that equal theoretical sets and other stuff and use
mathematical operations I've never seen, so the comprehension just
isn't there. It's not that I'm incapable of getting it I just don't
know what they're saying (it's greek to me, if you'll pardon the pun).

( )
Here's an example, I read an article on minimal DFA Regex generation
(from comp.compilers to be exact, circa 1993). I understand what he's
saying but I can't see the practical side to things:
"0x" [0-9A-Fa-f]+ ([Uu] [Ll]? | [Ll] [Uu]?)? |
[0-9]+ (([Uu] [Ll]? | [Ll] [Uu]? | [Mm]) |
('.' [0-9]+ ([Ee] [+-]? [0-9]+)?)? ([Dd] | [FF] | [Mm]))?

[0-9] = x1
[Uu] = x2
[Ll] = x3
[Mm] = x4
[Dd] = x5
[FF] = x6
'0' = x7
[1-9] = x8
'x' = x9
[A-Fa-f] = x10
(x1|x10) = x11
x2 x3? = x12
x3 x2? = x13
x12|x13 = x14
x14|x4 = x15
'.' = x16
x1 x1* = x17
[Ee] = x18
[+-] = x19
x5 | x6 = x20
x20|x4 = x21
x7|x8 = x22

x7 x9 x11 x11* x14? |
x22 x1* (x15 | (x16 x17 (x18 x19? x16)?)? x21)?

Given the concept of taking a regular expression and lifting from it
the used symbols and sets, and using replacements for them, you should
obtain the least-state result regular expression. The problem as I
see it is how do you handle that from a practical standpoint,
generating useful information out of the above. Even assuming you
could write a program to handle it, how would you prevent a faux pas,
or since the replacements are at times a replacement of a replacement,
how would you ensure the movements you're making don't lead from one
area to another that you're not supposed to be in (ie. you follow the
-wrong- path)?

Mind you as far as my education goes, I have an Associate's of Science
in Computer Information Systems. That might give you perspective as
to what I'm lacking as far as traditional theory/math goes. If you're
thinking I should wait until I can study this formally, I don't know
if that's plausible because I don't know when I can afford to go back
to school.

I turned to this newsgroup because the further I go in my own
independent research, the less others around can assist (understand)
me. I'm in a relatively tech-free area, so there's very few tech
jobs, and programming jobs: there are none to speak of.

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.