Re: Generic sequence matching in Java

Burak Emir <Burak.Emir@epfl.ch>
15 Aug 2004 22:19:00 -0400

          From comp.compilers

Related articles
Generic sequence matching in Java rob@ricardis.tudelft.nl (Rob van der Leek) (2004-08-13)
Re: Generic sequence matching in Java Burak.Emir@epfl.ch (Burak Emir) (2004-08-15)
Re: Generic sequence matching in Java liekweg@ipd.info.uni-karlsruhe.de (F. Liekweg) (2004-08-23)
Re: Generic sequence matching in Java robvanderleek@yahoo.com (Rob van der Leek) (2004-08-23)
From: Burak Emir <Burak.Emir@epfl.ch>
Newsgroups: comp.compilers
Date: 15 Aug 2004 22:19:00 -0400
Organization: EPFL
References: 04-08-089
Keywords: lex, Java
Posted-Date: 15 Aug 2004 22:19:00 EDT

Hello Rob,


How about something that is 100% Java-compatible, but not itself a Java solution?


The Scala programming language offers regular pattern matching on sequences:
http://scala.epfl.ch/intro/regexppat.html


Rob van der Leek wrote:
>
> I'm looking for a Java library that provides generic sequence matching.
> For example, let's say I have a sequence of Java objects:
>
> Object ol[] = { new A(), new A(), new B(), new C() };
>
> and would like to extract all subsequences from this sequence that start
> with one or more objects of type A followed by an object of type B. I
> consider this analogue to a textual regular expression "(a+)b" on an
> input of "aabc".


Since you speak of "extraction", I assume you want to match a regular
expression with variables? The Scala syntax for that is "v @ r", where
r is any pattern (or regular expression) and v is a variable that gets
bound to whatever r matched.


If you want a type test for some Java class A, you would write _:A in
patterns. _ alone is a wildcard, and _:T is a type pattern. The operators
*, +, ? and | mean what they usually mean in regexps.
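For readers who think in Java, the two element patterns can be approximated as predicates. This is only a sketch: the class name ElementPatterns, the helpers any() and ofType(), and the empty classes A and B are all hypothetical stand-ins, not part of Scala or of any library.

```java
import java.util.function.Predicate;

final class ElementPatterns {
    // Hypothetical stand-ins for the classes A and B from the question.
    static class A {}
    static class B {}

    // Rough counterpart of the wildcard pattern "_": accepts any element.
    static Predicate<Object> any() {
        return o -> true;
    }

    // Rough counterpart of the type pattern "_:T": accepts instances of T
    // (and, like a type test, rejects null).
    static Predicate<Object> ofType(Class<?> type) {
        return type::isInstance;
    }
}
```

With these, ElementPatterns.ofType(ElementPatterns.A.class) accepts a new A() and rejects a new B(), while any() accepts both.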


> I could think of a very simple syntax for such a library, say X, as:
>
> x.X matcher = new x.X();
> matcher.expression(
>     new Object[] {
>         new x.OneOrMore(new A()),
>         new B()
>     }
> );
> matcher.match(ol);
>
That syntax would just tell you whether the whole sequence matches or not.
In Scala, you would write "case Seq((_:A)+,_:B) => ..." for that. But I took
from your description that those subsequences could appear anywhere in the
input sequence.
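The difference between matching the whole sequence and finding a match somewhere inside it can be sketched in plain Java by taking the "(a+)b" analogy literally: encode each element as one letter and hand the string to java.util.regex. The class EncodedMatch, its methods, and the empty classes A, B and C are hypothetical, and the encoding assumes that A and B are unrelated classes.

```java
import java.util.List;
import java.util.regex.Pattern;

final class EncodedMatch {
    // Hypothetical stand-ins for the classes from the question.
    static class A {}
    static class B {}
    static class C {}

    // Encode each element as one letter: 'a' for A, 'b' for B, 'x' otherwise.
    static String encode(List<?> items) {
        StringBuilder sb = new StringBuilder();
        for (Object o : items) {
            sb.append(o instanceof A ? 'a' : o instanceof B ? 'b' : 'x');
        }
        return sb.toString();
    }

    // True only if the entire sequence has the shape (a+)b.
    static boolean matchesWhole(List<?> items) {
        return Pattern.matches("a+b", encode(items));
    }

    // True if some contiguous part of the sequence has the shape (a+)b.
    static boolean containsMatch(List<?> items) {
        return Pattern.compile("a+b").matcher(encode(items)).find();
    }
}
```

For the sequence A, A, B, C from the question, matchesWhole is false because of the trailing C, while containsMatch is true.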


The following would extract all subsequences of the form (a+)b, in order
of their appearance.


object MyMatcher {
      def javamatch(inp:Array[Foo]) = {
          val inp2: Seq[Foo] = inp; // inserts a conversion ("view")
          val list = mymatch(inp2);
          // ... turn into some Java object and return ...
      }


      def mymatch(inp2:Seq[Foo]) = inp2 match {
            case Seq( _*, res @((_:A)+,_:B), more @ _* ) =>
                res :: mymatch(more)
            case _ => Nil
      }
}


... and once compiled, you can call it from Java with


MyMatcher$.javamatch( myJavaArray );
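For comparison, here is a rough sketch of the same extraction written directly in Java as a single left-to-right scan. It is not a translation of what the Scala compiler generates. It assumes leftmost, longest, non-overlapping runs (the Scala matcher may resolve the leading _* differently) and assumes that A and B are unrelated classes. RunExtractor, extract and the empty classes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class RunExtractor {
    // Hypothetical stand-ins for the classes from the question.
    static class A {}
    static class B {}
    static class C {}

    // Collects every run "one or more A, then one B", in order of appearance.
    static List<List<Object>> extract(List<?> input) {
        List<List<Object>> result = new ArrayList<>();
        int i = 0;
        while (i < input.size()) {
            int start = i;
            // Consume the longest run of A's starting at this position.
            while (i < input.size() && input.get(i) instanceof A) {
                i++;
            }
            if (i > start && i < input.size() && input.get(i) instanceof B) {
                // At least one A followed by a B: record the run including the B.
                result.add(new ArrayList<Object>(input.subList(start, i + 1)));
                i++;
            } else if (i == start) {
                // No A here: skip this element.
                i++;
            }
            // Otherwise the A's were not followed by a B; i already points
            // past them, so the next iteration continues from there.
        }
        return result;
    }
}
```

On the sequence A, A, B, C this returns a single run containing the first three elements.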


hope this helps,
Burak

