Re: Different string format options, benefits?

buzzard@eng.umd.edu (Sean Barrett)
Fri, 25 Oct 91 06:09:40 GMT

          From comp.compilers

Related articles
Different string format options, benefits? coxs2@rpi.edu (Sean C. Cox) (1991-10-16)
Re: Different string format options, benefits? pardo@cs.washington.edu (1991-10-17)
Re: Different string format options, benefits? pk@cs.tut.fi (1991-10-18)
Re: Different string format options, benefits? agulbra@Siri.Unit.NO (1991-10-18)
Re: Different string format options, benefits? db@dcs.ed.ac.uk (Dave Berry) (1991-10-20)
Re: Different string format options, benefits? tm@well.sf.ca.us (1991-10-22)
Re: Different string format options, benefits? buzzard@eng.umd.edu (1991-10-25)
Re: Different string format options, benefits? henry@zoo.toronto.edu (1991-10-25)
Re: Different string format options, benefits? sdm7g@aemsun.med.virginia.edu (1991-11-01)
Re: Different string format options, benefits? bliss@sp64.csrd.uiuc.edu (1991-11-05)
| List of all articles for this month |

Newsgroups: comp.compilers
From: buzzard@eng.umd.edu (Sean Barrett)
Summary: 6502 strcpy
Keywords: code, C
Organization: Fraternity of Avian Deists
References: 91-10-079 91-10-089
Date: Fri, 25 Oct 91 06:09:40 GMT

So tm@well.sf.ca.us (Toshi Morita) writes:
>On a 6502 I think (size, bytes) is faster:
...
>.loop1 lda (source),y ; 5 cycles
> sta (dest),y ; 5 cycles
> iny ; 2 cycles
> bne .loop1 ; 3 cycles (when taken)


Using a known size, you can unroll the loop--I don't think it's possible
to unroll the loop of a zero-terminated string, since you have to test
every byte.


>the null-terminated version looks like:
>.loop lda (source),y ; 5 cycles
> sta (dest),y ; 5 cycles
> beq .exit ; 2 cycles (if not token)
> iny ; 2 cycles
> bne .loop ; 3 cycles (if taken)


Except for the problem with needing to increment the high bytes
when y turns over, this could be implemented as also 15 cycles:


dey
loop iny ; 2
lda (source),y ; 5
sta (dest),y ; 5
bne loop ; 3


Now this code doesn't really work, since it breaks if y turns over. But
we can unroll the original code and ignore the iny testing if we unroll
by a factor that divides 256 evenly.


loop lda (source),y
sta (dest),y
beq done
iny
lda (source),y
sta (dest),y
beq done
iny
...
iny
bne loop
inc source+1
...


You might be able to speed this up more if you have a lot of free zero
page addresses by keeping multiple pointers into the spaces and
incrementing y by, say, 16 bytes at a time; I'm not sure if this would
really work since you'd have to update a lot more pointers when y turns
over.


Anyway, this approach can be made to approach 15 cycles, while the other
can approach 12. So the advantage is still definitely with the explicit
size. Of course with the 6502 you can do nasty things with self modifying
code to muck with those cycle counts and make them shorter, but I think
that basic win is always going to be there-- you have to test every load
with length terminated, and you don't with explicit size.


Back to the original question, though--one might note that there are a few
C functions that expect an explicit length field in some sense, e.g.
strncpy, strncmp. I have a suspicion that this may be optimal in some
ways: to store a string with an "independent" length field whose location
isn't necessarily related to the character data, e.g.


struct string
{
char *data;
unsigned length;
};


or simply keeping two data items in hand. This is because it allows you
to play with substrings without doing any copying. As long as you're not
writing in place in strings, I can get the substring n characters starting
at i in string foo with:


bar.data = foo.data + i;
bar.length = n;


etc.


Of course, the "no-write to your strings" restriction might not be easily
enforcable, so it may make more sense to provide library routines that
accept explicit lengths and don't expect trailing nulls or leading
lengths.


buzzard@eng.umd.edu
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.