Thu, 8 Dec 1994 21:51:37 GMT

Related articles |
---|

Extending REG-EXP to 2-Dimension. mosh@ramanujan.cs.albany.edu (1994-11-21) |

Re: Extending REG-EXP to 2-Dimension. steve@cegelecproj.co.uk (1994-11-30) |

Re: Extending REG-EXP to 2-Dimension. ruiter@ruls41.fsw.leidenuniv.nl (1994-12-01) |

Re: Extending REG-EXP to 2-Dimension. rockwell@nova.umd.edu (1994-12-03) |

Re: Extending REG-EXP to 2-Dimension. lloyd@kauri.vuw.ac.nz (1994-12-05) |

Re: Extending REG-EXP to 2-Dimension. pwk@eb.ele.tue.nl (1994-12-05) |

Re: Extending REG-EXP to 2-Dimension. steve@cegelecproj.co.uk (1994-12-05) |

Re: Extending REG-EXP to 2-Dimension. bakul@netcom.com (1994-12-08) |

Newsgroups: | comp.compilers |

From: | bakul@netcom.com (Bakul Shah) |

Keywords: | lex |

Organization: | NETCOM On-line Communication Services (408 261-4700 guest) |

References: | 94-11-137 94-12-033 |

Date: | Thu, 8 Dec 1994 21:51:37 GMT |

*>Jan-Peter de Ruiter:*

*>: [ Request for regexp systems that work on "boxes" of text ]*

*>: As far as I know it has not been solved in any way. The problem is*

*>: that you need to extend the notion of linearity (characters*

*>: following other characters) in 2 dimensions.*

Even if we forget about regular expressions and whether this

problem is polynomial time, the first problem we face is how to

*formally* specify an arbitrary 2D pattern of text.

Even something as simple as a fixed 2D pattern (i.e. described

by a matrix, where non-zero elements are chars we are looking

for) can be a bit hairy. We know how to look for a fixed 2D

pattern (match the first row of pattern against a line, move down

to the appropriate column and exactly match second row and so

forth). But how do we specify it?

How about a box like

XX...XX

X X

. .

. .

X X

XX...XX

Where ... denotes an arbitrary number of Xes but the *shape* must

be a rectangle? Even here we know what to do (i.e. we can write

a program that handles exactly this problem), but how to specify.

Next, how do we specify a square box? A box that is half as long

as it is wide but the length of the wide side is defined by the

number of Xes? And remember that such a pattern may be

*anywhere* in the quarter plane defined by a text file.

One idea that comes to mind is to use modified back references.

Recall that in regexp "(a)(bb)\1\1\2\1", back reference \1 matches

the first submatch, that is "a" and \2 matches "bb". The whole

RE matches "abbaabba". We can define a new kind of back ref.

"\l1" (that is, backslash, ell, one) to denote the length of the

first submatch and so on.

Give this, we can specify an arbitrarily wide and long box of

Xes as

^(.*) (XX+) .*\n

(.{\l1} X.{\l2-2}X .*\n)*

.{\l1} X{\l2} .*\n

(expression broken up in lines and by spaces to make it somewhat

easier to read. Spaces are not significant and "." matches

anything but a newline, {} contains the repetition count)

This says, in the first line we have any number of chars followed

by atleast two Xes, followed by any number of chars. This line

can be followed by any number of lines that have the same number

chars as the first sub match, followed by an X, followed by chars

two less than the number of consecutive Xes in the first line,

followed by another X, followed by anything and finally the last

line has Xes in the same position as the first line.

I don't know if this highly irregular expression can be mapped to

any kind of finite automation.

Bakul Shah <bakul@netcom.com>

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.