Representing unions of Cartesian products

markjan@cs.kun.nl (Mark-Jan Nederhof)
Wed, 26 Aug 1992 15:29:02 GMT

          From comp.compilers

Related articles
Representing unions of Cartesian products markjan@cs.kun.nl (1992-08-26)
Re: Representing unions of Cartesian products graham@maths.su.oz.au (1992-08-30)
Representing unions of Cartesian products hws@sagitta.csis.dit.csiro.au (1992-09-03)
| List of all articles for this month |
Newsgroups: comp.compilers,comp.compression
From: markjan@cs.kun.nl (Mark-Jan Nederhof)
Organization: University of Nijmegen, The Netherlands
Date: Wed, 26 Aug 1992 15:29:02 GMT
Keywords: theory, design, question

I am looking for an efficient way to store and manipulate unions of
Cartesian products.


The application domain is abstract evaluation of certain programs (e.g.
datalog programs) and analysis of formal grammars (context-free grammars
extended with features/parameters over a finite domain).


Our goal is to determine the set of parameters with which a nonterminal
can generate a terminal string. If this is done by means of bottom-up
analysis then we need the following operations:


-For two or more sets of tuples of the form A_1 * A_2 * ... * A_n, for
fixed n, we have to compute the union in an efficient form. (The operator
* denotes the Cartesian product.) Here the sets A_i consist of atomic
values.


-For two unions of Cartesian products in such an efficient form, call them
C_1 and C_2, we have to compute a third set C_3 in the same form which
represents C_1 - C_2. (The operator - denotes the subtraction of sets of
tuples.)


-Let us consider two unions of Cartesian products in such an efficient
form, call them again C_1 and C_2. We now have to compute the set C_3
which represents the set of tuples


{(a_1, a_2, ..., a_n) | (b_1, ..., b_m) <- C_1 AND
                                                (c_1, ..., c_k) <- C_2 AND
                                                  R (a_1, ..., a_n, b_1, ..., b_m, c_1, ..., c_k)},


where R is some relation which for some fixed pairs of its arguments says
that they should have the same value. (The operator <- denotes inclusion.)
This operation would be needed in case of a grammar rule with two members
(with m and k parameters respectively; R is determined by ``consistent
substitution''). You can generalise this to arbitrary grammar rules.


This query is inspired by two things:


-The sets of atomic values are typically too big to represent sets of
tuples in a naive way by one element for every tuple.


-When the union of sets of the form A_1 * A_2 * ... * A_n is computed then
very often these sets overlap, and are even ``almost the same but not
quite''.


At this moment I am using a representation where a union of Cartesian
products is transformed into another such that the Cartesian products do
no longer overlap (i.e. they have no tuples in common). If possible I
merge two Cartesian products into one. These transformations are however
rather expensive. Does anyone have a better suggestion? references to
papers?


I suggest follow-up at comp.compression.


Mark-Jan Nederhof
University of Nijmegen
The Netherlands
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.