Christoph Otto
2008-10-11 21:53:53 UTC
# New Ticket Created by Christoph Otto
# Please include the string: [perl #59810]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=59810 >
Calling string_hash with a seed value other than the one used in src/hash.c
(3793) can cause strange and wonderful failures if the STRING is reused by imcc.
What happens is that after the STRING's hash is computed, it's cached in
s->hashval. This works fine unless the first caller of string_hash on a
given STRING uses a seed other than 3793 *and* the STRING is reused by imcc as
a hash key. When this happens, the second call to string_hash sees a cached
hash which was computed with an unexpected seed. When this STRING is used as
a hash key, parrot_hash_get_bucket looks in the wrong bucket and fails to find
the associated value.
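To make the failure mode concrete, here is a minimal standalone sketch. It does
not use Parrot's real types or hash algorithm: toy_string, toy_compute_hash,
toy_string_hash and toy_stale_cache_demo are all hypothetical names invented for
illustration, and the only detail taken from Parrot is the seed 3793 mentioned
above. It only shows how a cache that ignores the seed hands back a stale value.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_SEED 3793u  /* the seed this ticket says src/hash.c uses */

/* Hypothetical stand-in for a STRING; only the fields needed here. */
typedef struct toy_string {
    const char *text;
    size_t      hashval;    /* 0 means "not computed yet" */
} toy_string;

/* Toy multiplicative hash, NOT Parrot's real algorithm. */
static size_t toy_compute_hash(const char *text, size_t seed)
{
    size_t h = seed;
    while (*text)
        h = h * 33 + (unsigned char)*text++;
    return h ? h : 1;       /* keep 0 free as the "not cached" marker */
}

/* Caches the first result and ignores the seed on every later call. */
static size_t toy_string_hash(toy_string *s, size_t seed)
{
    assert(s != NULL && s->text != NULL);
    if (s->hashval == 0)
        s->hashval = toy_compute_hash(s->text, seed);
    return s->hashval;
}

/* Returns 1 if a second caller using DEFAULT_SEED gets a stale hash. */
static int toy_stale_cache_demo(void)
{
    toy_string key = { "do_stuff", 0 };
    size_t expected = toy_compute_hash(key.text, DEFAULT_SEED);

    (void)toy_string_hash(&key, 42);  /* first caller, unexpected seed */
    return toy_string_hash(&key, DEFAULT_SEED) != expected;
}
```

In this sketch the second call returns the value computed with seed 42, so any
bucket index derived from it would not match the one a hash table built with
the default seed expects.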
This leads to various levels of badness. In Pipp's case, it means that with
the following PIR code, the lookup of the hypothetical do_stuff METHOD would
fail because the STRING 'do_stuff' would be hashed by
Parrot_PhpArray_get_string_keyed_str with an unexpected seed.
    p = new 'PhpArray'
    p['do_stuff'] = 'x'  # caches an unexpected hash in s->hashval
    p.'do_stuff'()       # method lookup fails with reused STRING 'do_stuff'
Because string_hash is marked PARROT_API and presumably intended for external
use, a caller should be able to call it with an arbitrary seed without messing
up the rest of Parrot. The attached patch allows this by adding a seedval
field to parrot_string_t and checking both the seedval and the hashval before
returning the cached hashval of a STRING.
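The idea can be sketched in isolation as follows. This is not the attached
patch and does not use parrot_string_t: seeded_string, seeded_compute_hash and
seeded_string_hash are hypothetical names, and the hash function is a toy.
Only the seedval/hashval check mirrors what is described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a STRING with the extra seedval field. */
typedef struct seeded_string {
    const char *text;
    size_t      hashval;    /* 0 means "not computed yet" */
    size_t      seedval;    /* seed that produced hashval */
} seeded_string;

/* Toy multiplicative hash, NOT Parrot's real algorithm. */
static size_t seeded_compute_hash(const char *text, size_t seed)
{
    size_t h = seed;
    while (*text)
        h = h * 33 + (unsigned char)*text++;
    return h ? h : 1;       /* keep 0 free as the "not cached" marker */
}

/* Reuse the cached hash only if it was computed with the same seed. */
static size_t seeded_string_hash(seeded_string *s, size_t seed)
{
    assert(s != NULL && s->text != NULL);
    if (s->hashval == 0 || s->seedval != seed) {
        s->hashval = seeded_compute_hash(s->text, seed);
        s->seedval = seed;
    }
    return s->hashval;
}
```

With this check, a caller passing seed 42 followed by one passing 3793 each get
the hash for their own seed. The cost in this sketch is that callers which
alternate seeds on the same string recompute the hash on every call.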
I'd appreciate comments on whether this is the right approach and if there is
anything the patch misses.