Thu, 1 Dec 1994 12:35:36 GMT

Newsgroups: | comp.compilers |

From: | ruiter@ruls41.fsw.leidenuniv.nl (Jan-Peter de Ruiter) |

Summary: | Extending REG-EXP to figures. |

Keywords: | lex, DFA |

Organization: | Compilers Central |

References: | 94-11-137 |

Date: | Thu, 1 Dec 1994 12:35:36 GMT |

[ Request for regexp systems that work on "boxes" of text ]

This is a really hard problem that has been discussed once
in the Icon newsgroup.
As far as I know it has not been solved in any way. The problem
is that you need to extend the notion of linearity (characters
following other characters) to two dimensions.

This could perhaps be done by using a 'circular' approach,

for instance like this:

CCCCC
CBBBC
CBABC
CBBBC
CCCCC

So in the expression "ABC", A, B and C are all regexps that
each describe the properties of one 'circle' of text. These expressions
would themselves have to be modified so that they can describe circular
structures, and the relations between these circular expressions
would have to be formalized in some way.
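
To make the 'circular' idea a bit more concrete, here is a rough Python sketch. It is only one possible reading of the scheme, not a worked-out formalism: each square 'circle' around a centre cell is unrolled clockwise into an ordinary string, starting at its top-left corner, and an ordinary regexp is matched against that string. The names ring, match_rings and find_centres are made up for this illustration, and fixing a starting corner sidesteps the question of what a truly circular regexp should mean.

```python
import re

def ring(grid, cy, cx, r):
    """Return the cells at square distance r from (cy, cx) as a string,
    read clockwise from the ring's top-left corner.
    Returns None if the ring does not fit inside the grid."""
    top, bottom = cy - r, cy + r
    left, right = cx - r, cx + r
    if top < 0 or left < 0 or bottom >= len(grid) or right >= len(grid[0]):
        return None
    if r == 0:
        return grid[cy][cx]
    chars = []
    # top edge, left to right (includes both top corners)
    chars += [grid[top][x] for x in range(left, right + 1)]
    # right edge, downward (includes the bottom-right corner)
    chars += [grid[y][right] for y in range(top + 1, bottom + 1)]
    # bottom edge, right to left (includes the bottom-left corner)
    chars += [grid[bottom][x] for x in range(right - 1, left - 1, -1)]
    # left edge, upward (no corners left to visit)
    chars += [grid[y][left] for y in range(bottom - 1, top, -1)]
    return "".join(chars)

def match_rings(grid, cy, cx, patterns):
    """True if patterns[r] fully matches ring r around (cy, cx) for every r."""
    for r, pattern in enumerate(patterns):
        text = ring(grid, cy, cx, r)
        if text is None or re.fullmatch(pattern, text) is None:
            return False
    return True

def find_centres(grid, patterns):
    """Return all (row, column) centres where the ring patterns match."""
    return [(y, x)
            for y in range(len(grid))
            for x in range(len(grid[0]))
            if match_rings(grid, y, x, patterns)]

# The figure from this post, and the expression "ABC" written as one
# regexp per ring (innermost first).
figure = ["CCCCC",
          "CBBBC",
          "CBABC",
          "CBBBC",
          "CCCCC"]
centres = find_centres(figure, ["A", "B+", "C+"])
```

Trying every cell as a possible centre already costs one regexp match per ring per cell, which hints at why the complexity could get out of hand for anything less regular than this example.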

Even if these formal problems could be solved, the complexity

of this kind of text analysis will in all probability be huge.

I'd be interested if anyone has any comments or 'pointers'

regarding this idea.

Jan

