|PDF grammar and PostScript grammar email@example.com (psobkiew) (2002-01-30)|
|Re: PDF grammar and PostScript grammar firstname.lastname@example.org (2002-02-06)|
|Re: PDF grammar and PostScript grammar email@example.com (David Z Maze) (2002-02-06)|
|Re: PDF grammar and PostScript grammar firstname.lastname@example.org (2002-02-16)|
|Re: PDF grammar and PostScript grammar email@example.com (2002-02-28)|
|From:||firstname.lastname@example.org (Derek B. Noonburg)|
|Date:||16 Feb 2002 01:13:36 -0500|
|Organization:||Prodigy Internet http://www.prodigy.com|
|Posted-Date:||16 Feb 2002 01:13:36 EST|
> [PDF is basically tarted up Postscript, and Postscript has a trivial
> token stack syntax like that of Forth. -John]
A PDF page content stream is simplified PostScript -- no control flow,
no real stack. It's a sequence of operations, where each operation is
zero or more operands followed by an operator, e.g., "10 20 m 100 200
l" means move to the point (10, 20), and then draw a line to (100,
200). Each operator completely consumes its operands and leaves
nothing on the stack (unlike Forth and PostScript).
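
To make the "operands, then operator" shape concrete, here is a minimal sketch in Python. It assumes every operand is a plain number and every token is separated by whitespace; real content streams also carry names, strings, arrays, dictionaries, and inline images, which this ignores. The function name parse_operations is made up for illustration.

```python
def parse_operations(stream):
    """Group whitespace-separated tokens into (operator, operands) pairs.

    Assumption: operands are plain numbers, so any token that does not
    parse as a number is treated as an operator.
    """
    operations = []
    operands = []
    for token in stream.split():
        try:
            operands.append(float(token))
        except ValueError:
            # An operator consumes every pending operand; nothing is left over.
            operations.append((token, operands))
            operands = []
    return operations


ops = parse_operations("10 20 m 100 200 l")
print(ops)  # [('m', [10.0, 20.0]), ('l', [100.0, 200.0])]
```

Because each operator takes all the pending operands, the "stack" is really just a buffer that is emptied after every operation.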
PDF files are more complex. A PDF file consists of a sequence of
numbered objects. Examples of objects are fonts, images, hyperlinks,
page content streams, and lots more. There's a cross-reference
("xref") table near the end of the file that maps each object number to
its position in the file (byte offset from the beginning of the file).
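
As a rough illustration of that mapping, the sketch below reads a classic text-format xref table into a Python dict. The layout it assumes (an "xref" keyword, a "first count" subsection header, then one "offset generation n|f" entry per object) comes from the PDF specification rather than from this post. The function name parse_xref_table and the sample offsets are invented for the example.

```python
def parse_xref_table(text):
    """Return {object number: byte offset} for the in-use entries.

    Assumption: 'text' starts at the 'xref' keyword and uses the classic
    text layout; compressed cross-reference streams are not handled.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "xref":
        raise ValueError("expected 'xref' keyword")
    offsets = {}
    i = 1
    while i < len(lines) and lines[i] != "trailer":
        # Subsection header: first object number and number of entries.
        first, count = (int(x) for x in lines[i].split())
        i += 1
        for obj_num in range(first, first + count):
            offset, _generation, kind = lines[i].split()
            if kind == "n":  # "n" marks an in-use object, "f" a free one
                offsets[obj_num] = int(offset)
            i += 1
    return offsets


sample = """xref
0 3
0000000000 65535 f
0000000017 00000 n
0000000081 00000 n
trailer
"""
print(parse_xref_table(sample))  # {1: 17, 2: 81}
```

A reader would then seek to the recorded offset to load an object, which is why the file cannot simply be tokenized front to back.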
It's actually even messier - a file can be "updated": you tack some
more objects on the end, some of which can logically replace existing
objects, and then append a new xref table with offsets for the new
objects and a pointer to the previous xref table.
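
The update mechanism can be sketched as a walk from the newest xref table back through the chain of "previous" pointers, keeping the first offset seen for each object number. The data layout here is hypothetical: each table is represented as an (offsets, prev) pair keyed by its own position in the file, and free (deleted) entries are ignored.

```python
def merge_xref_chain(sections, newest):
    """Resolve object offsets across a chain of xref tables.

    sections: {table position: (offsets dict, previous table position or None)}
    newest:   position of the most recently appended table
    """
    merged = {}
    seen = set()
    pos = newest
    while pos is not None:
        if pos in seen:
            raise ValueError("cycle in xref chain")
        seen.add(pos)
        offsets, prev = sections[pos]
        for obj_num, offset in offsets.items():
            # Tables are visited newest first, so an existing entry wins.
            merged.setdefault(obj_num, offset)
        pos = prev
    return merged


sections = {
    500: ({1: 17, 2: 81}, None),    # original table
    900: ({2: 600, 3: 700}, 500),   # update: replaces object 2, adds object 3
}
resolved = merge_xref_chain(sections, 900)
print(sorted(resolved.items()))  # [(1, 17), (2, 600), (3, 700)]
```

Object 2 resolves to the copy appended by the update, while the old copy at offset 81 stays in the file but is no longer reachable.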
PDF really isn't something you want to attack with lex and yacc.