25 Aug 2004 14:53:51 -0400

Related articles |
---|

Implementing set theory operations similar to SQL's select and update barry_j_kelly@hotmail.com (2004-08-23) |

Re: Implementing set theory operations similar to SQL's select and upd gneuner2@comcast.net (George Neuner) (2004-08-25) |

Re: Implementing set theory operations similar to SQL's select and upd dmclean@stsci.edu (Donald F. McLean) (2004-08-25) |

Re: Implementing set theory operations similar to SQL's select and upd kwheinri@bsr2.uwaterloo.ca (Kenn Heinrich) (2004-09-03) |

From: | George Neuner <gneuner2@comcast.net> |

Newsgroups: | comp.compilers |

Date: | 25 Aug 2004 14:53:51 -0400 |

Organization: | Compilers Central |

References: | 04-08-117 |

Keywords: | analysis |

Posted-Date: | 25 Aug 2004 14:53:51 EDT |

On 23 Aug 2004 12:07:59 -0400, barry_j_kelly@hotmail.com (Barry Kelly)

wrote:

*>I'm writing an engine designed to implement behaviour which may be*

*>customized along multiple dimensions.*

*>*

*>1. My problem is, in essence, implementing a set-based language,*

*>although written through a GUI and probably using predicate expression*

*>evaluation trees rather than machine code for implementation (although*

*>that may be very slow, and that's what this post is about).*

*>*

*>I have a sequential list of multiple (1000+), overlapping set*

*>definitions. I want to know, quickly enough to be usable for a*

*>developer using a GUI front end, does the set they are defining*

*>completely encompass a previously defined set? Is it only a partial*

*>subset? What is that subset? Can that subset be converted easily into*

*>its own representation rather than as the product of set difference*

*>and/or intersection?*

*>*

*>Does anybody know of concrete papers, books, articles dealing with*

*>concrete implementation details and algorithms that I should be*

*>tracking down rather than the (thus far unsuccessful) general searches*

*>for "set theory"?*

*>*

It's not clear from your problem description whether there is an

actual database or whether the rules define abstract sets. It really

doesn't matter - in order to evaluate the sets for equality you need

to have some kind of concrete representation for them.

The rulebase will be easier to evaluate if each rule stands alone as a

logical predicate. If this is not true for the human readable

representation (very likely 8-), you should internally reduce rule

chains to a single rule that describes the result directly. If

partial chain application is allowed, you should produce a separate

direct rule for each possible partial chain. In application, you

identify and apply the most specific rule.

Regarding the coverage problem - you will definitely want to avoid

creating a cartesian product on scads of dimensions ... instead you'll

want to order evaluation of the predicates in such a way as to produce

the smallest possible result set at each step. If each rule stands

alone, the evaluation order will not affect the result. Identify and

compute intersections before unions or joins to keep the intermediate

results as small as possible.

Using the database analogy, you don't need to actually create the

intermediate table at each step ... you only need to compute the

columns and estimate the number of rows (ie. a virtual table). At

each step, you want to identify and perform the operation that will

produce the table with the least number of result rows. You continue

until either a single table remains or all the remaining intermediate

tables are empty (a null query). The final table, when filled in with

data, is your optimized query.

For practical matters, you should look for papers on database query

optimization, solving equation and logic predicate systems, etc. You

should also read up on dynamic programming techniques if you aren't

familiar with them. I'm sorry I don't have any good cites to give you

- my work has taken me in other directions and I don't follow research

in these fields very closely any more.

AFA implementation - these kinds of problems are best solved in logic

or functional languages (or in languages that easily accomodate those

techniques). I personally like Lisp, but ML, Prolog or one of their

close variants would also work well. This is not the kind of problem

I would like to tackle in C++, Java or the language du jour.

George

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.