Re: Different string format options, benefits?

tm@well.sf.ca.us (Toshi Morita)
22 Oct 91 05:56:19 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: tm@well.sf.ca.us (Toshi Morita)
Keywords: code, C
Organization: Whole Earth 'Lectronic Link, Sausalito, CA
References: 91-10-061 91-10-072 91-10-079
Date: 22 Oct 91 05:56:19 GMT

agulbra@Siri.Unit.NO (Arnt Gulbrandsen) writes:


>On machines with very few registers it might be much faster to copy
><bytes,null> strings; 8 or 12-bit machines might require two registers to
>store the length, and I know of one processor which only *has* three
>registers. (The 6502.)


On a 6502 I think (size, bytes) is faster:


                ldy #0
                lda (source),y      ; low byte of length
                clc
                adc #2              ; add 2 so the length prefix is copied too
                pha                 ; save low byte of the total count
                iny
                lda (source),y      ; high byte of length
                adc #0              ; propagate carry from the low byte
                beq .Less_Than_256  ; no full 256-byte pages to copy


                tax                 ; Loop for multiples of 256 bytes
                dey                 ; Y = 0
.loop1          lda (source),y      ; 5 cycles
                sta (dest),y        ; 5 cycles
                iny                 ; 2 cycles
                bne .loop1          ; 3 cycles (when taken)
                inc source+1        ; advance both pointers to the next page
                inc dest+1
                dex
                bne .loop1


.Less_Than_256


                pla                 ; Loop for mod 256
                beq .exit
                tay
                dey
                beq .one
.loop2          lda (source),y      ; 5 cycles
                sta (dest),y        ; 5 cycles
                dey                 ; 2 cycles
                bne .loop2          ; 3 cycles (when taken)


.one            lda (source),y      ; (last byte cleanup)
                sta (dest),y


.exit           rts


This is code for a two-byte little-endian length prefix; it copies the prefix
along with the data, hence the adc #2. A single-byte length prefix would make
the loop setup much shorter. Ignoring loop setup overhead, each inner loop
costs 15 cycles per byte (5 + 5 + 2 + 3). The code for (bytes, null)
looks like:


                ldy #0
.loop           lda (source),y      ; 5 cycles
                sta (dest),y        ; 5 cycles
                beq .exit           ; 2 cycles (if not taken)
                iny                 ; 2 cycles
                bne .loop           ; 3 cycles (if taken)
                inc source+1        ; crossed a page: advance both pointers
                inc dest+1
                jmp .loop


.exit           rts
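
For readers who do not read 6502 assembly, the two copy strategies can be
sketched in C. This is only an illustration, not part of the original post:
the names copy_counted and copy_terminated are made up here, and the counted
version assumes the same two-byte little-endian prefix as the first listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a (size, bytes) string: a two-byte little-endian length prefix
   followed by the data. Like the 6502 routine, the prefix is copied too.
   Returns the total number of bytes written. */
size_t copy_counted(uint8_t *dest, const uint8_t *source)
{
    assert(dest != NULL && source != NULL);
    size_t total = (size_t)source[0] | ((size_t)source[1] << 8);
    total += 2;                   /* include the prefix itself */
    memcpy(dest, source, total);  /* length known up front: no per-byte test */
    return total;
}

/* Copy a (bytes, null) string, terminator included. Every byte has to be
   compared against zero, which corresponds to the extra beq in the loop.
   Returns the total number of bytes written. */
size_t copy_terminated(uint8_t *dest, const uint8_t *source)
{
    assert(dest != NULL && source != NULL);
    size_t i = 0;
    for (;;) {
        uint8_t b = source[i];
        dest[i] = b;              /* store first, as the assembly does */
        i++;
        if (b == 0)
            return i;
    }
}
```

The counted copy knows its trip count before the loop starts; the terminated
copy has to inspect each byte as it goes.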


So (size, bytes) costs 15 cycles per byte and (bytes, null) costs 17 cycles
per byte (excluding loop setup time). The 2-cycle difference is the beq that
tests for the terminator on every byte.
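
To see what the two-cycle gap amounts to over a whole string, the arithmetic
can be sketched in C. This is an illustration rather than anything from the
thread: the function names are made up here, the per-byte figures are taken
from the listings above as given, and both the setup cost and the cheaper
not-taken branch on the final iteration are ignored.

```c
#include <assert.h>
#include <limits.h>

/* Per-byte inner-loop costs as quoted in the post (setup excluded). */
enum { CYCLES_COUNTED = 15, CYCLES_TERMINATED = 17 };

/* Estimated inner-loop cycles to move n bytes with the (size, bytes) loop. */
unsigned long counted_cycles(unsigned long n)
{
    assert(n <= ULONG_MAX / CYCLES_COUNTED);     /* guard against overflow */
    return n * CYCLES_COUNTED;
}

/* Estimated inner-loop cycles to move n bytes with the (bytes, null) loop. */
unsigned long terminated_cycles(unsigned long n)
{
    assert(n <= ULONG_MAX / CYCLES_TERMINATED);  /* guard against overflow */
    return n * CYCLES_TERMINATED;
}

/* Cycles saved by the (size, bytes) loop when moving n bytes. */
unsigned long cycles_saved(unsigned long n)
{
    return terminated_cycles(n) - counted_cycles(n);
}
```

For 256 bytes this gives 3840 cycles against 4352, a saving of 512 cycles.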


tm@well.sf.ca.us
--

