Re: Different string format options, benefits?

henry@zoo.toronto.edu (Henry Spencer)
Fri, 25 Oct 1991 17:59:15 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: henry@zoo.toronto.edu (Henry Spencer)
Keywords: C, optimize
Organization: U of Toronto Zoology
References: 91-10-079 91-10-089 91-10-098
Date: Fri, 25 Oct 1991 17:59:15 GMT

In article 91-10-098 buzzard@eng.umd.edu (Sean Barrett) writes:
>Using a known size, you can unroll the loop--I don't think it's possible
>to unroll the loop of a zero-terminated string, since you have to test
>every byte.


Oh yes you can. See my paper in the next Usenix for all sorts of fun
details on the tricks you can play. (Yes, you're going to have to wait
a few months to see this. No, I'm not handing out preprints, sorry.)
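
As a rough illustration only (this is a generic, well-known form of unrolling, not necessarily any of the tricks from the paper mentioned above), a terminated-string length loop can be unrolled by hand. Every byte is still tested, but the pointer update and the loop branch are paid once per four bytes instead of once per byte. The name unrolled_strlen is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: length of a NUL-terminated string, with the
 * loop body unrolled four times.  Bytes are examined strictly in
 * order, so nothing past the terminator is ever read. */
size_t unrolled_strlen(const char *s)
{
    const char *p = s;

    for (;;) {
        if (p[0] == '\0') return (size_t)(p - s);
        if (p[1] == '\0') return (size_t)(p - s) + 1;
        if (p[2] == '\0') return (size_t)(p - s) + 2;
        if (p[3] == '\0') return (size_t)(p - s) + 3;
        p += 4;   /* one pointer update and one branch per four bytes */
    }
}
```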


However, Sean's general point is valid. Having more information in hand
is always potentially useful in code generation. Other things being
equal, counted strings clearly win over terminated strings here.
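
To make the comparison concrete, here is a minimal sketch of the same kind of unrolling applied to a counted string. The struct cstr layout and the cstr_copy name are assumptions made up for this example, not any particular system's format. Because the length is known before the loop starts, the unrolled body needs no per-byte terminator test at all.

```c
#include <stddef.h>

/* Hypothetical counted-string representation: length plus bytes. */
struct cstr {
    size_t      len;
    const char *data;
};

/* Copy src->len bytes to dst.  The trip count is known up front, so
 * the main loop moves four bytes per iteration with a single test. */
void cstr_copy(char *dst, const struct cstr *src)
{
    const char *s = src->data;
    size_t n = src->len;

    while (n >= 4) {
        dst[0] = s[0];
        dst[1] = s[1];
        dst[2] = s[2];
        dst[3] = s[3];
        dst += 4;
        s += 4;
        n -= 4;
    }
    while (n > 0) {   /* at most three leftover bytes */
        *dst++ = *s++;
        n--;
    }
}
```

In practice one would probably just call memcpy, which is free to use wider moves for the same reason: the size is in hand.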


I'd guess that the decision to use terminated strings in Unix, and later
in C (note that the system-call conventions of Unix pre-date C), was mostly
the result of the "no control blocks" philosophy: don't force the programmer
to build non-trivial data structures just to talk to the system. Notably,
counted strings would have meant building knowledge of this multi-component
data structure into assembler and compiler, if string literals were to be
available in a convenient way. Terminated strings are also a lot closer
to the Unix filesystem's approach to storing text, which explicitly eschews
having a notion of "records" with predefined internal structure.
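
The string-literal point can be sketched in present-day C. A terminated literal is just a run of bytes, while a counted literal needs the translator to compute the length and lay out a two-part object. The struct cstr type and the CSTR_LIT macro below are hypothetical names invented for this illustration; the macro leans on the compiler already knowing the literal's size via sizeof.

```c
#include <stddef.h>

/* Hypothetical counted-string representation. */
struct cstr {
    size_t      len;
    const char *data;
};

/* Hypothetical helper: build a counted string from a literal.
 * sizeof includes the trailing NUL, hence the - 1. */
#define CSTR_LIT(s) { sizeof(s) - 1, (s) }

/* Terminated: the literal is the whole story. */
static const char *plain_greeting = "hello";

/* Counted: length and bytes must be assembled together. */
static const struct cstr counted_greeting = CSTR_LIT("hello");
```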
--
Henry Spencer @ U of Toronto Zoology, henry@zoo.toronto.edu utzoo!henry
--

