Subject: | Re: Different string format options, benefits? |
Newsgroups: | comp.compilers |
From: | tm@well.sf.ca.us (Toshi Morita) |
Keywords: | code, C |
Organization: | Whole Earth 'Lectronic Link, Sausalito, CA |
References: | 91-10-061 91-10-072 91-10-079 |
Date: | 22 Oct 91 05:56:19 GMT |
agulbra@Siri.Unit.NO (Arnt Gulbrandsen) writes:
>On machines with very few registers it might be much faster to copy
><bytes,null> strings; 8 or 12-bit machines might require two registers to
>store the length, and I know of one processor which only *has* three
>registers. (The 6502.)
On a 6502 I think (size, bytes) is faster:

        ldy #0
        lda (source),y  ; low byte of length
        clc
        adc #2          ; count the two prefix bytes too
        pha             ; save (length + 2) mod 256
        iny
        lda (source),y  ; high byte of length
        adc #0          ; add carry from the low byte
        beq .Less_Than_256
        tax             ; Loop for multiples of 256 bytes
        dey             ; y = 0
.loop1  lda (source),y  ; 5 cycles
        sta (dest),y    ; 6 cycles
        iny             ; 2 cycles
        bne .loop1      ; 3 cycles (when taken)
        inc source+1
        inc dest+1
        dex
        bne .loop1
.Less_Than_256
        pla             ; Loop for mod 256
        beq .exit
        tay
        dey
        beq .one
.loop2  lda (source),y  ; 5 cycles
        sta (dest),y    ; 6 cycles
        dey             ; 2 cycles
        bne .loop2      ; 3 cycles (when taken)
.one    lda (source),y  ; (last byte cleanup)
        sta (dest),y
.exit   rts

This is code for a two-byte little-endian length prefix; the prefix itself
is copied along with the data. A single-byte length prefix would make the
loop setup much shorter... but anyway, ignoring loop setup overhead, it's
16 cycles per byte (5 + 6 + 2 + 3). The code for (bytes, null) looks like:

        ldy #0
.loop   lda (source),y  ; 5 cycles
        sta (dest),y    ; 6 cycles
        beq .exit       ; 2 cycles (if not taken)
        iny             ; 2 cycles
        bne .loop       ; 3 cycles (if taken)
        inc source+1
        inc dest+1
        jmp .loop
.exit   rts

So (size, bytes) is 16 cycles per byte, and (bytes, null) is 18 cycles
per byte (excluding loop setup time). The difference is the beq that tests
every byte for the terminator.
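
For readers more at home in C, the structural difference between the two
loops can be sketched roughly as follows. This is only an illustration, not
a translation of the assembly: the names copy_counted and copy_terminated
are made up here, and the counted format is assumed to be a two-byte
little-endian length followed by the bytes, as in the 6502 code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy a (size, bytes) string whose size is a
   two-byte little-endian prefix. The prefix is copied too.
   Returns the total number of bytes copied, prefix included. */
size_t copy_counted(uint8_t *dest, const uint8_t *source)
{
    assert(dest != NULL && source != NULL);
    size_t total = (size_t)source[0] + ((size_t)source[1] << 8) + 2;
    for (size_t i = 0; i < total; i++)
        dest[i] = source[i];        /* exit test looks only at the index */
    return total;
}

/* Hypothetical helper: copy a (bytes, null) string, terminator included.
   Returns the number of bytes copied, counting the terminator. */
size_t copy_terminated(uint8_t *dest, const uint8_t *source)
{
    assert(dest != NULL && source != NULL);
    size_t i = 0;
    for (;;) {
        uint8_t c = source[i];
        dest[i] = c;
        i++;
        if (c == 0)                 /* extra test on every byte copied */
            break;
    }
    return i;
}
```

The counted loop decides when to stop from the index alone, while the
terminated loop must also inspect each byte it copies; on the 6502 that
inspection is the extra beq, which costs 2 cycles per byte when not taken.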
tm@well.sf.ca.us