Implementing set theory operations similar to SQL's select and update



From: barry_j_kelly@hotmail.com (Barry Kelly)
Newsgroups: comp.compilers
Date: 23 Aug 2004 12:07:59 -0400
Organization: http://groups.google.com
Keywords: question, analysis

I'm writing an engine designed to implement behaviour which may be
customized along multiple dimensions.


This might seem a bit off-topic, but in my defense I say:


1. My problem is, in essence, implementing a set-based language, albeit
one written through a GUI and probably implemented with
predicate-expression evaluation trees rather than machine code (which
may be very slow, and that's what this post is about).


2. comp.db.theory isn't moderated, has a low signal/noise ratio, and is
mostly about normalization and the use of databases, rather than
specifically about implementing the mechanics of (say) SQL.


The Problem
-----------
(For those interested in why this is the problem, the background
overview below has more information).


I have a sequential list of multiple (1000+), overlapping set
definitions. I want to know, quickly enough to be usable for a
developer using a GUI front end, does the set they are defining
completely encompass a previously defined set? Is it only a partial
subset? What is that subset? Can that subset be converted easily into
its own representation rather than as the product of set difference
and/or intersection?


If that seems a little vague, the details in the overview below flesh
it out a bit more. The format of the set definitions looks like (dimX,
dimY, dimZ, ...) where each dimN is either an explicit set "{a, b, c}"
or an 'any' rule "*".
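Since each set definition is a Cartesian product of per-dimension sets
(a "box"), the containment and intersection questions above can be
answered dimension by dimension: one box contains another iff it
contains it in every dimension, and the intersection of two boxes is
the box of per-dimension intersections. A minimal sketch, assuming a
representation where None stands for '*' (all names are illustrative,
not from this post):

```python
def dim_subset(a, b):
    """Is per-dimension set a contained in b? None means '*' (any)."""
    if b is None:
        return True
    if a is None:
        return False  # '*' only fits inside another '*'
    return a <= b

def scope_subset(sa, sb):
    """A box is contained in another iff it is contained
    dimension by dimension."""
    return all(dim_subset(a, b) for a, b in zip(sa, sb))

def scope_intersection(sa, sb):
    """Intersect two boxes dimension-wise; None result means disjoint."""
    out = []
    for a, b in zip(sa, sb):
        if a is None:
            out.append(b)
        elif b is None:
            out.append(a)
        else:
            c = a & b
            if not c:
                return None  # disjoint in one dimension => disjoint overall
            out.append(c)
    return tuple(out)

# Using the (app, field, supervisor, state) dimensions from below:
r1 = (None, {'f_A'}, None, None)         # (*, f_A, *, *)
r2 = ({'app_0'}, {'f_A'}, {True}, None)  # (app_0, f_A, true, *)
print(scope_subset(r2, r1))              # True: rule 2's scope lies inside rule 1's
print(scope_intersection(r1, r2))
```

Note that while the intersection of two boxes is again a box, the set
difference of two boxes generally is not, which bears on the question
of whether a partial overlap can be given "its own representation".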


Does anybody know of papers, books, or articles dealing with concrete
implementation details and algorithms that I should be tracking down,
rather than my (thus far unsuccessful) general searches for "set
theory"?


A Brief Background Overview
---------------------------
Example: given a list of business requirements such as:


1: Field f_A has validation v_X for all applications.
2: Field f_A has override validation v_Y in application app_0 for
supervisors.
3: Field f_A has override validation v_Z in application app_0 in state
TX.


Object-oriented techniques have been rejected because they constrain
me to using a fixed order of dimensions for reuse and overriding. A
possible solution presents itself:


1. Creating a notional Cartesian-product of all dimensions of
customization.


2. Overriding based on the application of rules whose scope is defined
by sets described in a simple language: for example, '{a,b}' represents
the set containing a and b, while '*' represents the set containing
all the values for that dimension.
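Parsing such scope expressions into a machine-usable form is
straightforward. A small sketch, assuming a syntax where each dimension
is '*', an explicit set '{a, b}', or a bare value (the exact grammar
here is my own guess, not something specified in this post):

```python
import re

def parse_scope(text):
    """Parse '(dim1, dim2, ...)' into a tuple of per-dimension sets,
    with None standing for '*'."""
    inner = text.strip()
    if inner.startswith('(') and inner.endswith(')'):
        inner = inner[1:-1]
    dims = []
    # Split on commas that are not inside a '{...}' set literal.
    for part in re.split(r',\s*(?![^{]*\})', inner):
        part = part.strip()
        if part == '*':
            dims.append(None)  # None means 'any'
        elif part.startswith('{') and part.endswith('}'):
            dims.append({v.strip() for v in part[1:-1].split(',')})
        else:
            dims.append({part})  # bare value = singleton set
    return tuple(dims)

print(parse_scope('(*, f_A, *, *)'))   # (None, {'f_A'}, None, None)
print(parse_scope('(app_0, f_A, *, {TX, MS})'))
```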


Using the business requirements above:


The dimensions: (app, field, supervisor: bool, state)
The value: the validation rule


The requirements:


1: (*, f_A, *, *) => v_X
2: (app_0, f_A, true, *) => v_Y
3: (app_0, f_A, *, TX) => v_Z


The table generated from these dimensions:


          Location | Rule
-------------------|------
app_0 f_A true MS | v_Y
app_0 f_A true TX | v_Z
app_0 f_A false MS | v_X
app_0 f_A false TX | v_Z
// other applications all use v_X
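The notional table above can be materialized mechanically: enumerate
the Cartesian product of the (finite) dimension domains and, for each
point, take the value of the last requirement whose scope matches. A
minimal sketch, assuming illustrative domain values (the state domain
is restricted to {MS, TX} purely for the example):

```python
from itertools import product

domains = {
    'app':        ['app_0', 'app_1'],
    'field':      ['f_A'],
    'supervisor': [True, False],
    'state':      ['MS', 'TX'],
}

# Rules as (scope, value); None in a scope position means '*'.
rules = [
    ((None,    'f_A', None, None), 'v_X'),  # 1: (*, f_A, *, *)
    (('app_0', 'f_A', True, None), 'v_Y'),  # 2: (app_0, f_A, true, *)
    (('app_0', 'f_A', None, 'TX'), 'v_Z'),  # 3: (app_0, f_A, *, TX)
]

def matches(scope, point):
    return all(s is None or s == p for s, p in zip(scope, point))

table = {}
for point in product(*domains.values()):
    for scope, value in reversed(rules):  # later requirements override
        if matches(scope, point):
            table[point] = value
            break

print(table[('app_0', 'f_A', True, 'MS')])   # v_Y
print(table[('app_0', 'f_A', True, 'TX')])   # v_Z
print(table[('app_1', 'f_A', True, 'TX')])   # v_X
```

This reproduces the table above; the point of the question, of course,
is to avoid this brute-force enumeration and instead reason about the
rule scopes directly.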


Essentially, when an application needs to know which rule to apply to a
particular field, it works out its location in multi-dimensional space,
in the form of a tuple corresponding to the location column:
(app_0, f_A, true, TX), for example.


This seems to lead to combinatorial explosion, but that is mitigated by
the fact that only a subset of this *notional* table will be selected
on the basis of version, state, etc., and then precalculated and
stored. It will thus grow no faster than any 'normal' application where
each part is specified by hand or by limited object-oriented
inheritance. The list of requirements from which the notional table is
generated forms the real backing store, and it grows linearly. However,
the interaction within this list, or stack of rule applications, needs
to be calculated with at least some level of efficiency.


That's where set theory steps in, hopefully.


P.S.: The above model is simplified. In reality, the table holds more
than just validation rules, and rather than a single rule, an ordered
list of rules is stored. Each requirement needs to specify whether it
overrides the previous requirement, comes before it, inserts in the
middle, etc. That's all easy enough once the set of applications is
worked out.


-- Barry Kelly

