Discussion:
[perl #56262] [RFC] chr opcode
NotFound
2008-06-23 16:50:16 UTC
# New Ticket Created by NotFound
# Please include the string: [perl #56262]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=56262 >


The chr opcode is documented in docs/ops/string.pod as:

chr(out STR, in INT)
The character specified by codepoint integer $2 in the
current character set is returned in string $1.

But the implementation just calls string_chr, which returns an ascii,
iso_8859_1 or utf8 string depending on the codepoint value.
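
To make the behavior described above concrete, here is a minimal Python sketch of a codepoint-dependent charset choice. It is not the Parrot source: the function name chr_sketch and the cutoffs 0x80 and 0x100 are assumptions chosen to match "ascii, iso_8859_1 or utf8 depending on the codepoint value".

```python
from typing import Tuple


def chr_sketch(codepoint: int) -> Tuple[str, bytes]:
    """Return (charset name, raw bytes) for one codepoint.

    Hypothetical model of the behavior discussed in this thread;
    the thresholds below are assumptions, not taken from string_chr.
    """
    if codepoint < 0 or codepoint > 0x10FFFF:
        raise ValueError("codepoint out of range")
    if 0xD800 <= codepoint <= 0xDFFF:
        # Surrogates cannot be encoded as UTF-8 by Python's strict codec.
        raise ValueError("surrogate codepoint")
    if codepoint < 0x80:
        # Assumed: 7-bit values are stored as one ascii byte.
        return "ascii", bytes([codepoint])
    if codepoint < 0x100:
        # Assumed: 0x80-0xFF is stored as one iso-8859-1 byte.
        return "iso-8859-1", bytes([codepoint])
    # Everything else needs a multi-byte utf8 sequence.
    return "utf8", chr(codepoint).encode("utf-8")


if __name__ == "__main__":
    for cp in (65, 233, 0x20AC):
        print(cp, chr_sketch(cp))
```

Under these assumptions every value in 0-255 comes back as exactly one byte equal to the argument, and only the charset label changes at 0x80.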

If we change that, we break some code, like
examples/shotout/mandelbrot.pir, that seems to expect that the result
will always be a single-byte string whose byte has the numeric value
of the argument, for every value in 0-255.

Maybe we need another opcode, or a way to tell the desired character
set and encoding.
--
Salu2
Patrick R. Michaud
2008-06-23 21:47:54 UTC
Post by NotFound
chr(out STR, in INT)
The character specified by codepoint integer $2 in the
current character set is returned in string $1.
But the implementation just calls string_chr, which returns an ascii,
iso_8859_1 or utf8 string depending on the codepoint value.
If we change that, we break some code, like
examples/shotout/mandelbrot.pir, that seems to expect that the result
will always be a single-byte string whose byte has the numeric value
of the argument, for every value in 0-255.
Maybe we need another opcode, or a way to tell the desired character
set and encoding.
I don't think there's a notion of "current character set", so
we should remove that from the documentation. It should simply
say "The character specified by codepoint integer $2 is returned
in string $1." I think the existing behavior (returning ascii,
iso8859-1, or unicode depending on the codepoint) is reasonable.

Mandelbrot.pir only sends values 0-255 to C<chr>, so it always
gets back a single-character string in fixed8 encoding.

I think the C<chr> opcode as it currently exists is fine for most
purposes. If we need a special-purpose operation to "build a fixed8
string" or "generate a codepoint in a given charset" then perhaps
a method or function to do this would be more appropriate than an
opcode.
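
As a rough illustration of what such a function might look like, here is a Python sketch. The names chr_in_charset and build_fixed8 are hypothetical and do not correspond to any Parrot API; Python's codec names stand in for Parrot charsets.

```python
def chr_in_charset(codepoint: int, charset: str) -> bytes:
    """Encode one codepoint in an explicitly requested charset.

    Hypothetical helper: raises UnicodeEncodeError when the charset
    cannot represent the codepoint, instead of silently switching.
    """
    return chr(codepoint).encode(charset)


def build_fixed8(values) -> bytes:
    """Build a one-byte-per-element string from integers in 0-255.

    Hypothetical helper: bytes() raises ValueError for any value
    outside 0-255, so the result is always fixed-width.
    """
    return bytes(values)


if __name__ == "__main__":
    print(chr_in_charset(233, "iso-8859-1"))  # one byte
    print(chr_in_charset(233, "utf-8"))       # two bytes
    print(build_fixed8([0, 127, 255]))
```

The point of the design is that the caller states the charset, so the same codepoint (233 here) yields one byte in iso-8859-1 and two in utf-8, rather than depending on what the opcode happens to choose.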

Pm
NotFound
2009-02-06 01:09:19 UTC
I'll apply this patch in two days if I hear no objection -- or sooner if
the other contributors to this thread approve.
+1
--
Salu2