Re: Internal Representation of Strings

"Armel" <armelasselin@hotmail.com>
Thu, 26 Feb 2009 17:13:12 +0100

          From comp.compilers


From: "Armel" <armelasselin@hotmail.com>
Newsgroups: comp.compilers
Date: Thu, 26 Feb 2009 17:13:12 +0100
Organization: les newsgroups par Orange
References: 09-02-051 09-02-068 09-02-078 09-02-120 09-02-125
Keywords: storage
Posted-Date: 26 Feb 2009 20:14:16 EST

> Simply a length and the character data immediately following, probably.
> Reallocation in memory is going to have to be done for dynamic strings of
> course/maybe depending on the application. I think that will be where I'll
> start and with a 32-bit length.


This is a common implementation, and it works rather well when strings are
immutable or copy-on-write (in which case you need to add a reference count
alongside the length).
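
To make the layout concrete, here is a minimal sketch in C of a
length-prefixed, reference-counted string. The names (Str, str_new,
str_share, str_release, str_make_unique) are made up for illustration and
are not from any particular library; it assumes a single-threaded program
and a 32-bit length as discussed above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: header followed immediately by the character data. */
typedef struct {
    uint32_t refcount;  /* number of owners sharing this buffer */
    uint32_t length;    /* length of the character data in bytes */
    char     data[];    /* flexible array member (C99) */
} Str;

/* Allocate the header and the characters in one block. */
static Str *str_new(const char *bytes, uint32_t length)
{
    Str *s = malloc(sizeof(Str) + (size_t)length);
    if (s == NULL)
        return NULL;
    s->refcount = 1;
    s->length = length;
    if (length > 0)
        memcpy(s->data, bytes, length);
    return s;
}

/* Sharing only bumps the counter; no bytes are copied. */
static Str *str_share(Str *s)
{
    assert(s != NULL);
    s->refcount++;
    return s;
}

/* Drop one owner; free the block when the last owner is gone. */
static void str_release(Str *s)
{
    if (s != NULL && --s->refcount == 0)
        free(s);
}

/* Copy on write: return a buffer that is safe to modify.
 * Returns NULL (and leaves s untouched) if the copy cannot be allocated. */
static Str *str_make_unique(Str *s)
{
    assert(s != NULL);
    if (s->refcount == 1)
        return s;               /* sole owner, modify in place */
    Str *copy = str_new(s->data, s->length);
    if (copy != NULL)
        s->refcount--;          /* give up our share of the old buffer */
    return copy;
}
```

A writer would call str_make_unique() before touching the bytes; readers
just call str_share() and str_release(). With immutable strings the
str_make_unique() step disappears entirely.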


> Then if it shows to have a null terminator
> on there to pass to 3rd party libs, I'll add that functionality.


It is indeed _very_ useful to keep a few zero bytes at the end. Whenever you
call a C-like API, this avoids a memory allocation/deallocation and a copy at
call time. malloc/free are extremely time-consuming compared with simply
putting a zero at the end of the string, and the need for a temporary
zero-terminated copy is clearly not something you think of as "a big task"
when calling very simple APIs.
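
As a rough sketch of the difference, in C: the first two functions keep one
zero byte after the data so the buffer can be handed to a C API as is, and
the last one shows the temporary copy that would otherwise be needed on
every call. ZStr, zstr_new, zstr_cstr and temp_cstr_copy are hypothetical
names used only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t length;  /* bytes of character data, terminator not counted */
    char     data[];  /* length bytes, then one extra zero byte */
} ZStr;

/* Reserve one extra byte at creation time and keep it zero. */
static ZStr *zstr_new(const char *bytes, uint32_t length)
{
    ZStr *s = malloc(sizeof(ZStr) + (size_t)length + 1);
    if (s == NULL)
        return NULL;
    s->length = length;
    if (length > 0)
        memcpy(s->data, bytes, length);
    s->data[length] = '\0';
    return s;
}

/* Calling a C API: no allocation and no copy, just a pointer. */
static const char *zstr_cstr(const ZStr *s)
{
    assert(s != NULL);
    return s->data;
}

/* Without the stored terminator, every call needs something like this:
 * a malloc, a copy, and later a free by the caller. */
static char *temp_cstr_copy(const char *bytes, uint32_t length)
{
    char *tmp = malloc((size_t)length + 1);
    if (tmp == NULL)
        return NULL;
    if (length > 0)
        memcpy(tmp, bytes, length);
    tmp[length] = '\0';
    return tmp;
}
```

One caveat: if the string itself may contain zero bytes, the C API will
only see the part before the first one, whichever approach is used.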


NB: I wrote 'a few' above because it depends on the encoding: UTF-16 and
UTF-32 need more than one zero byte (two and four respectively). Of course,
dealing with everything in UTF-8 is so much simpler.
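
A small sketch in C of how the terminator size could follow the encoding.
The Encoding enum and both function names are invented for this example;
the only facts relied on are the code unit sizes (1, 2 and 4 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding tag; each value is the code unit size in bytes. */
typedef enum { ENC_UTF8 = 1, ENC_UTF16 = 2, ENC_UTF32 = 4 } Encoding;

/* A terminator is one zero code unit, so its size is the unit size. */
static size_t terminator_bytes(Encoding enc)
{
    return (size_t)enc;
}

/* Copy byte_len bytes and append a zero terminator of the right width.
 * The caller frees the result. */
static unsigned char *alloc_terminated(const void *bytes, size_t byte_len,
                                       Encoding enc)
{
    size_t term = terminator_bytes(enc);
    assert(byte_len % term == 0);  /* whole code units only */
    unsigned char *buf = malloc(byte_len + term);
    if (buf == NULL)
        return NULL;
    if (byte_len > 0)
        memcpy(buf, bytes, byte_len);
    memset(buf + byte_len, 0, term);
    return buf;
}
```

A simpler alternative is to always reserve four zero bytes, which covers
all three encodings at the cost of up to three wasted bytes per string.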


HIH
Armel


