Electronics & Programming

develissimo

Open Source electronics development and programming

  • You are not logged in.
  • Root
  • » PHP
  • » [PHP-DEV] Zend engine's hashtable performance tweaks [RSS Feed]

#1 Dec. 31, 2010 16:09:11

Pierre J.
Registered: 2009-11-02
Reputation: +  0  -
Profile   Send e-mail  

[PHP-DEV] Zend engine's hashtable performance tweaks


hi,

Did you forget to attach the patch? Attach it as .txt so the list
won't reject it.

Cheers,

On Fri, Dec 31, 2010 at 3:40 PM, Marcin Babij
<marcin.ba...@nasza-klasa.pl> wrote:
> Hello,
> I work for social network company, where we were running optimization
> project. One of it's results is patch to Zend engine's Hashtable, which we
> want to share and ask you for comments and improvements.
>
> Why we do this?
> We run profiling on our production servers and found out that zend_hash_*
> functions take 10-20% CPU time of request. So there is some room for easy
> improvements.
>
> What was done?
> - Hash function in zend_hash.h was rebuilt and became much faster, without
> losing the most important properties.
> - Hashtable implementation was changed from Simple chaining to Open
> addressing with linear probing, but with linked bucket, not included in hash
> array, which causes:
> -- Bucket structure to lose 2 pointers.
> -- Searching works similar, but don't have to jump with pointers stored in
> different memory locations, inserting, deleting and rehashing don't need to
> update linked list, but must search for first empty bucket, which is fast,
> because it scans continuous memory.
> -- Load factor decreases from 1.0 to 0.5-0.75 to make less collisions and
> faster hashtable, which in turn increases memory footprint a little.
> - Open addressing doesn't change significantly performance, but next thing
> was to create new array (arEmpty), which is of size nTableSize bytes, which
> keeps track of used/empty buckets and makes inserting and rehashing much
> faster. In future it can be tested as bit-array with size of nTableSize/8
> bytes.
> - More macros were added to replace repetitive constructs.
> - New constants were added to allow:
> -- Creating new hashtables of size at least X (where 4 and 8 are
> reasonable), which makes no rehashing and reallocing memory while changing
> size to 2 and then to 4.
> -- For small tables it's better to extend them by a factor of 4 times, not
> 2, to make rehashing cost smaller for most hashtables, of cost of little
> higher memory consumption.
> -- For large tables it's better to have other load factor, closer to 1,
> while for small tables it's better to use load factor closer to 0.5.
> - APC was patched to take changes in Bucket structure into account.
>
> How was it tested?
> It was tested with make test, where one more (comparing to original sources)
> test fails, but it's most probably because
>http://bugs.php.net/bug.php?id=48858- IMO test is badly constructed (is too
> simple) and any change of hashing function makes it fail. Also it was tested
> on our testing environment and production servers against >30mln requests to
> our site, with 120requests/s at peak on Xeon @ 2.50GHz with 8GB RAM running
> Debian Linux.
>
> What is the gain?
> After tests CPU usage dropped by about 4% -6%.
> Memory footprint goes up just by few percent.
>
> What can be done in future?
> - Make new constants configurable by php.ini.
> - Test if changing arEmpty from byte-array to bit-array helps on
> performance.
> - Tweak default constants' values using some real-live benchmarks.
> - Prove (or modify and prove) hash function to have property, that it has no
> collisions if two keys don't differ on no more than 6 bytes, which will lead
> to memcmp omit first (or last) 6 bytes of key. Also simpler thing may be
> proven, that is it has no collisions if two keys are not longer than 6
> bytes, which will make most string keys omit memcpy at all.
>
> The patch was created and tested against php-5.3.0, apc-3.1.3p1, then merged
> with php-5.3.4, apc-3.1.6 without conflicts, and for these last versions
> patches are attached. Also, it shouldn't conflict with
>http://wiki.php.net/rfc/performanceimprovements.
>
> --
> Marcin Babij
> nasza-klasa.pl
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit:http://www.php.net/unsub.php>



--
Pierre

@pierrejoye |http://blog.thepimp.net|http://www.libgd.org--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit:http://www.php.net/unsub.php

Offline

#2 Dec. 31, 2010 16:17:07

Marcin B.
Registered: 2009-11-02
Reputation: +  0  -
Profile   Send e-mail  

[PHP-DEV] Zend engine's hashtable performance tweaks


Sorry, I've attached .patch files, I'm attaching them as .txt now.hi,

Did you forget to attach the patch? Attach it as .txt so the list
won't reject it.

Cheers,

On Fri, Dec 31, 2010 at 3:40 PM, Marcin Babij
<marcin.ba...@nasza-klasa.pl> wrote:Hello,
I work for social network company, where we were running optimizationproject. One of it's results is patch to Zend engine's Hashtable, whichwewant to share and ask you for comments and improvements.

Why we do this?We run profiling on our production servers and found out thatzend_hash_*functions take 10-20% CPU time of request. So there is some room foreasyimprovements.

What was done?- Hash function in zend_hash.h was rebuilt and became much faster,withoutlosing the most important properties.
- Hashtable implementation was changed from Simple chaining to Openaddressing with linear probing, but with linked bucket, not included inhasharray, which causes:
-- Bucket structure to lose 2 pointers.-- Searching works similar, but don't have to jump with pointers storedindifferent memory locations, inserting, deleting and rehashing don'tneed toupdate linked list, but must search for first empty bucket, which isfast,because it scans continuous memory.-- Load factor decreases from 1.0 to 0.5-0.75 to make less collisionsandfaster hashtable, which in turn increases memory footprint a little.- Open addressing doesn't change significantly performance, but nextthingwas to create new array (arEmpty), which is of size nTableSize bytes,whichkeeps track of used/empty buckets and makes inserting and rehashing muchfaster. In future it can be tested as bit-array with size ofnTableSize/8bytes.
- More macros were added to replace repetitive constructs.
- New constants were added to allow:
-- Creating new hashtables of size at least X (where 4 and 8 arereasonable), which makes no rehashing and reallocing memory whilechangingsize to 2 and then to 4.-- For small tables it's better to extend them by a factor of 4 times,not2, to make rehashing cost smaller for most hashtables, of cost of little
higher memory consumption.
-- For large tables it's better to have other load factor, closer to 1,
while for small tables it's better to use load factor closer to 0.5.
- APC was patched to take changes in Bucket structure into account.

How was it tested?It was tested with make test, where one more (comparing to originalsources)test fails, but it's most probably becausehttp://bugs.php.net/bug.php?id=48858- IMO test is badly constructed(is toosimple) and any change of hashing function makes it fail. Also it wastestedon our testing environment and production servers against >30mlnrequests toour site, with 120requests/s at peak on Xeon @ 2.50GHz with 8GB RAMrunningDebian Linux.

What is the gain?
After tests CPU usage dropped by about 4% -6%.
Memory footprint goes up just by few percent.

What can be done in future?
- Make new constants configurable by php.ini.
- Test if changing arEmpty from byte-array to bit-array helps on
performance.
- Tweak default constants' values using some real-live benchmarks.- Prove (or modify and prove) hash function to have property, that ithas nocollisions if two keys don't differ on no more than 6 bytes, which willleadto memcmp omit first (or last) 6 bytes of key. Also simpler thing may be
proven, that is it has no collisions if two keys are not longer than 6
bytes, which will make most string keys omit memcpy at all.The patch was created and tested against php-5.3.0, apc-3.1.3p1, thenmergedwith php-5.3.4, apc-3.1.6 without conflicts, and for these last versions
patches are attached. Also, it shouldn't conflict withhttp://wiki.php.net/rfc/performanceimprovements.

--
Marcin Babij
nasza-klasa.plIndex: Zend/zend_hash.c
===================================================================
--- Zend/zend_hash.c 2010-12-30 17:09:14.000000000 +0100
+++ Zend/zend_hash.c 2010-12-31 01:45:14.000000000 +0100
@@ -5,7 +5,7 @@
| Copyright (c) 1998-2010 Zend Technologies Ltd. (http://www.zend.com) |
+----------------------------------------------------------------------+
| This source file is subject to version 2.00 of the Zend license, |
- | that is bundled with this package in the file LICENSE, and is |
+ | that is bundled with this package in the file LICENSE, and is |
| available through the world-wide-web at the following url: |
|http://www.zend.com/license/2_00.txt. |
| If you did not receive a copy of the Zend license and are unable to |
@@ -21,12 +21,16 @@

#include "zend.h"

-#define CONNECT_TO_BUCKET_DLLIST(element, list_head) \
- (element)->pNext = (list_head);
\
- (element)->pLast = NULL;
\
- if ((element)->pNext) {
\
- (element)->pNext->pLast = (element);
\
- }
+#define ZEND_HASH_SMALL_TABLE_SIZE 64
+#define ZEND_HASH_MIN_TABLE_SIZE 4
+#define ZEND_HASH_SMALL_LOAD_FACTOR_TABLE_SIZE 16
+#define ZEND_HASH_SMALL_LOAD_FACTOR_INV 1.75
+#define ZEND_HASH_LOAD_FACTOR_INV 1.25
+
+#define EQ_INDEX(ptr,index) \
+ (EXPECTED(((ptr)->nKeyLength == 0) && (ptr)->h == index))
+
+#define CONNECT_TO_BUCKET_DLLIST(element, list_head)

#define CONNECT_TO_GLOBAL_DLLIST(element, ht) \
(element)->pListLast = (ht)->pListTail;
\
@@ -91,7 +95,9 @@


#define ZEND_HASH_IF_FULL_DO_RESIZE(ht) \
- if ((ht)->nNumOfElements > (ht)->nTableSize) { \
+ if ((ht)->nTableSize <= ZEND_HASH_SMALL_LOAD_FACTOR_TABLE_SIZE ? \
+ ((ht)->nNumOfElements*ZEND_HASH_SMALL_LOAD_FACTOR_INV >
(ht)->nTableSize) : \
+ ((ht)->nNumOfElements*ZEND_HASH_LOAD_FACTOR_INV >
(ht)->nTableSize)) { \
zend_hash_do_resize(ht);
\
}

@@ -102,7 +108,6 @@
return zend_inline_hash_func(arKey, nKeyLength);
}

-
#define UPDATE_DATA(ht, p, pData, nDataSize)
\
if (nDataSize == sizeof(void*)) {
\
if ((p)->pData != &(p)->pDataPtr) {
\
@@ -136,11 +141,10 @@
}


-
ZEND_API int _zend_hash_init(HashTable *ht, uint nSize, hash_func_t
pHashFunction, dtor_func_t pDestructor, zend_bool persistent ZEND_FILE_LINE_DC)
{
uint i = 3;
- Bucket **tmp;
+ void *tmp;

SET_INCONSISTENT(HT_OK);

@@ -151,7 +155,10 @@
while ((1U << i) < nSize) {
i++;
}
- ht->nTableSize = 1 << i;
+ ht->nTableSize = 1 << (i+1);
+ if (ht->nTableSize < ZEND_HASH_MIN_TABLE_SIZE) {
+ ht->nTableSize = ZEND_HASH_MIN_TABLE_SIZE;
+ }
}

ht->nTableMask = ht->nTableSize - 1;
@@ -165,21 +172,15 @@
ht->persistent = persistent;
ht->nApplyCount = 0;
ht->bApplyProtection = 1;
-
- /* Uses ecalloc() so that Bucket* == NULL */
- if (persistent) {
- tmp = (Bucket **) calloc(ht->nTableSize, sizeof(Bucket *));
- if (!tmp) {
- return FAILURE;
- }
- ht->arBuckets = tmp;
- } else {
- tmp = (Bucket **) ecalloc_rel(ht->nTableSize, sizeof(Bucket *));
- if (tmp) {
- ht->arBuckets = tmp;
- }
+
+ tmp = pemalloc_rel(ht->nTableSize * (sizeof(Bucket*)+1),
ht->persistent);
+ if (!tmp) {
+ return FAILURE;
}
-
+ ht->arBuckets = (Bucket**)tmp;
+ ht->arEmpty = ((uchar*)tmp) + (ht->nTableSize * (sizeof(Bucket*)));
+ memset(ht->arEmpty, 0, ht->nTableSize * 1);
+
return SUCCESS;
}

@@ -216,38 +217,34 @@
}

h = zend_inline_hash_func(arKey, nKeyLength);
- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == nKeyLength)) {
- if (!memcmp(p->arKey, arKey, nKeyLength)) {
- if (flag & HASH_ADD) {
- return FAILURE;
- }
- HANDLE_BLOCK_INTERRUPTIONS();
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
+ if (flag & HASH_ADD) {
+ return FAILURE;
+ }
+ HANDLE_BLOCK_INTERRUPTIONS();
#if ZEND_DEBUG
- if (p->pData == pData) {
- ZEND_PUTS("Fatal error in
zend_hash_update: p->pData == pData\n");
- HANDLE_UNBLOCK_INTERRUPTIONS();
- return FAILURE;
- }
-#endif
- if (ht->pDestructor) {
- ht->pDestructor(p->pData);
- }
- UPDATE_DATA(ht, p, pData, nDataSize);
- if (pDest) {
- *pDest = p->pData;
- }
+ if (p->pData == pData) {
+ ZEND_PUTS("Fatal error in zend_hash_update:
p->pData == pData\n");
HANDLE_UNBLOCK_INTERRUPTIONS();
- return SUCCESS;
+ return FAILURE;
+ }
+#endif
+ if (ht->pDestructor) {
+ ht->pDestructor(p->pData);
+ }
+ UPDATE_DATA(ht, p, pData, nDataSize);
+ if (pDest) {
+ *pDest = p->pData;
}
+ HANDLE_UNBLOCK_INTERRUPTIONS();
+ return SUCCESS;
}
- p = p->pNext;
}
-
- p = (Bucket *) pemalloc(sizeof(Bucket) - 1 + nKeyLength,
ht->persistent);
+
+ p = (Bucket *)pemalloc_rel(sizeof(Bucket) - 1 + nKeyLength,
ht->persistent);
if (!p) {
return FAILURE;
}
@@ -262,11 +259,11 @@

HANDLE_BLOCK_INTERRUPTIONS();
CONNECT_TO_GLOBAL_DLLIST(p, ht);
- ht->arBuckets = p;
+ ZEND_HASH_FILL_BUCKET(ht, nIndex, p);
HANDLE_UNBLOCK_INTERRUPTIONS();

ht->nNumOfElements++;
- ZEND_HASH_IF_FULL_DO_RESIZE(ht); /* If the Hash table is
full, resize it */
+ ZEND_HASH_IF_FULL_DO_RESIZE(ht);
return SUCCESS;
}

@@ -281,38 +278,34 @@
return zend_hash_index_update(ht, h, pData, nDataSize, pDest);
}

- nIndex = h & ht->nTableMask;
-
- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == nKeyLength)) {
- if (!memcmp(p->arKey, arKey, nKeyLength)) {
- if (flag & HASH_ADD) {
- return FAILURE;
- }
- HANDLE_BLOCK_INTERRUPTIONS();
+ nIndex = ZEND_HASH_BUCKET(ht, h);
+
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
+ if (flag & HASH_ADD) {
+ return FAILURE;
+ }
+ HANDLE_BLOCK_INTERRUPTIONS();
#if ZEND_DEBUG
- if (p->pData == pData) {
- ZEND_PUTS("Fatal error in
zend_hash_update: p->pData == pData\n");
- HANDLE_UNBLOCK_INTERRUPTIONS();
- return FAILURE;
- }
-#endif
- if (ht->pDestructor) {
- ht->pDestructor(p->pData);
- }
- UPDATE_DATA(ht, p, pData, nDataSize);
- if (pDest) {
- *pDest = p->pData;
- }
+ if (p->pData == pData) {
+ ZEND_PUTS("Fatal error in zend_hash_update:
p->pData == pData\n");
HANDLE_UNBLOCK_INTERRUPTIONS();
- return SUCCESS;
+ return FAILURE;
}
+#endif
+ if (ht->pDestructor) {
+ ht->pDestructor(p->pData);
+ }
+ UPDATE_DATA(ht, p, pData, nDataSize);
+ if (pDest) {
+ *pDest = p->pData;
+ }
+ HANDLE_UNBLOCK_INTERRUPTIONS();
+ return SUCCESS;
}
- p = p->pNext;
}
-
- p = (Bucket *) pemalloc(sizeof(Bucket) - 1 + nKeyLength,
ht->persistent);
+
+ p = (Bucket *)pemalloc_rel(sizeof(Bucket) - 1 + nKeyLength,
ht->persistent);
if (!p) {
return FAILURE;
}
@@ -321,7 +314,7 @@
p->nKeyLength = nKeyLength;
INIT_DATA(ht, p, pData, nDataSize);
p->h = h;
-
+
CONNECT_TO_BUCKET_DLLIST(p, ht->arBuckets);

if (pDest) {
@@ -329,12 +322,12 @@
}

HANDLE_BLOCK_INTERRUPTIONS();
- ht->arBuckets = p;
+ ZEND_HASH_FILL_BUCKET(ht, nIndex, p);
CONNECT_TO_GLOBAL_DLLIST(p, ht);
HANDLE_UNBLOCK_INTERRUPTIONS();

ht->nNumOfElements++;
- ZEND_HASH_IF_FULL_DO_RESIZE(ht); /* If the Hash table is
full, resize it */
+ ZEND_HASH_IF_FULL_DO_RESIZE(ht);
return SUCCESS;
}

@@ -357,11 +350,10 @@
if (flag & HASH_NEXT_INSERT) {
h = ht->nNextFreeElement;
}
- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->nKeyLength == 0) && (p->h == h)) {
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_INDEX(p, h)) {
if (flag & HASH_NEXT_INSERT || flag & HASH_ADD) {
return FAILURE;
}
@@ -386,7 +378,6 @@
}
return SUCCESS;
}
- p = p->pNext;
}
p = (Bucket *) pemalloc_rel(sizeof(Bucket) - 1, ht->persistent);
if (!p) {
@@ -402,7 +393,7 @@
CONNECT_TO_BUCKET_DLLIST(p, ht->arBuckets);

HANDLE_BLOCK_INTERRUPTIONS();
- ht->arBuckets = p;
+ ZEND_HASH_FILL_BUCKET(ht, nIndex, p);
CONNECT_TO_GLOBAL_DLLIST(p, ht);
HANDLE_UNBLOCK_INTERRUPTIONS();

@@ -414,27 +405,28 @@
return SUCCESS;
}

-
static int zend_hash_do_resize(HashTable *ht)
{
- Bucket **t;
+ void *t;
+ uint dest_size = ht->nTableSize <= ZEND_HASH_SMALL_TABLE_SIZE ?
(ht->nTableSize << 2) : (ht->nTableSize << 1);

IS_CONSISTENT(ht);

- if ((ht->nTableSize << 1) > 0) { /* Let's double the table size
*/
- t = (Bucket **) perealloc_recoverable(ht->arBuckets,
(ht->nTableSize << 1) * sizeof(Bucket *), ht->persistent);
- if (t) {
- HANDLE_BLOCK_INTERRUPTIONS();
- ht->arBuckets = t;
- ht->nTableSize = (ht->nTableSize << 1);
- ht->nTableMask = ht->nTableSize - 1;
- zend_hash_rehash(ht);
- HANDLE_UNBLOCK_INTERRUPTIONS();
- return SUCCESS;
+ if (dest_size > 0) { /* Let's change the table size */
+ t = perealloc(ht->arBuckets, dest_size * (sizeof(Bucket *) +
1), ht->persistent);
+ if (!t) {
+ return FAILURE;
}
- return FAILURE;
+ HANDLE_BLOCK_INTERRUPTIONS();
+ ht->nTableSize = dest_size;
+ ht->nTableMask = ht->nTableSize - 1;
+ ht->arBuckets = (Bucket**)t;
+ ht->arEmpty = ((uchar*)t) + (ht->nTableSize *
(sizeof(Bucket*)));
+ zend_hash_rehash(ht);
+ HANDLE_UNBLOCK_INTERRUPTIONS();
+ return SUCCESS;
}
- return SUCCESS;
+ return FAILURE;
}

ZEND_API int zend_hash_rehash(HashTable *ht)
@@ -444,17 +436,41 @@

IS_CONSISTENT(ht);

- memset(ht->arBuckets, 0, ht->nTableSize * sizeof(Bucket *));
+ memset(ht->arEmpty, 0, ht->nTableSize * 1);
+
p = ht->pListHead;
while (p != NULL) {
- nIndex = p->h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, p->h);
+ ZEND_HASH_FIND_FREE(ht, nIndex);
+ ZEND_HASH_FILL_BUCKET(ht, nIndex, p);
CONNECT_TO_BUCKET_DLLIST(p, ht->arBuckets);
- ht->arBuckets = p;
p = p->pListNext;
}
return SUCCESS;
}

+inline void zend_hash_free_bucket(HashTable *ht, uint nIndex) {
+ uint nIndexSlot = nIndex;
+ uint nSourceIndex;
+
+ ZEND_HASH_MAKE_ELEMENT_EMPTY(ht, nIndexSlot);
+ for(;;) {
+ ZEND_HASH_NEXT_INDEX(ht, nIndexSlot);
+ if (!ZEND_HASH_BUCKET_IS_OCCUPIED(ht, nIndexSlot)) {
+ break;
+ }
+ nSourceIndex = ZEND_HASH_BUCKET(ht,
ht->arBuckets->h);
+ if ((nIndex < nIndexSlot) ?
+ ((nIndex < nSourceIndex) && (nSourceIndex <=
nIndexSlot)) :
+ ((nIndex < nSourceIndex) || (nSourceIndex <=
nIndexSlot))) {
+ continue;
+ }
+ ZEND_HASH_FILL_BUCKET(ht, nIndex, ht->arBuckets);
+ nIndex = nIndexSlot;
+ ZEND_HASH_MAKE_ELEMENT_EMPTY(ht, nIndexSlot);
+ }
+}
+
ZEND_API int zend_hash_del_key_or_index(HashTable *ht, const char *arKey, uint
nKeyLength, ulong h, int flag)
{
uint nIndex;
@@ -465,27 +481,17 @@
if (flag == HASH_DEL_KEY) {
h = zend_inline_hash_func(arKey, nKeyLength);
}
- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h)
- && (p->nKeyLength == nKeyLength)
- && ((p->nKeyLength == 0) /* Numeric index (short
circuits the memcmp() check) */
- || !memcmp(p->arKey, arKey, nKeyLength))) { /*
String index */
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
HANDLE_BLOCK_INTERRUPTIONS();
- if (p == ht->arBuckets) {
- ht->arBuckets = p->pNext;
- } else {
- p->pLast->pNext = p->pNext;
- }
- if (p->pNext) {
- p->pNext->pLast = p->pLast;
- }
+
+ zend_hash_free_bucket(ht, nIndex);
+
if (p->pListLast != NULL) {
p->pListLast->pListNext = p->pListNext;
- } else {
- /* Deleting the head of the list */
+ } else {
ht->pListHead = p->pListNext;
}
if (p->pListNext != NULL) {
@@ -503,11 +509,11 @@
pefree(p->pData, ht->persistent);
}
pefree(p, ht->persistent);
+
HANDLE_UNBLOCK_INTERRUPTIONS();
ht->nNumOfElements--;
return SUCCESS;
}
- p = p->pNext;
}
return FAILURE;
}
@@ -559,7 +565,7 @@
}
pefree(q, ht->persistent);
}
- memset(ht->arBuckets, 0, ht->nTableSize*sizeof(Bucket *));
+ memset(ht->arEmpty, 0, ht->nTableSize*1);
ht->pListHead = NULL;
ht->pListTail = NULL;
ht->nNumOfElements = 0;
@@ -571,31 +577,31 @@

/* This function is used by the various apply() functions.
* It deletes the passed bucket, and returns the address of the
- * next bucket. The hash *may* be altered during that time, the
+ * next bucket. The hash *may* be altered during that time, the
* returned value will still be valid.
*/
static Bucket *zend_hash_apply_deleter(HashTable *ht, Bucket *p)
{
Bucket *retval;

- HANDLE_BLOCK_INTERRUPTIONS();
- if (p->pLast) {
- p->pLast->pNext = p->pNext;
- } else {
- uint nIndex;
+ uint nIndex;

- nIndex = p->h & ht->nTableMask;
- ht->arBuckets = p->pNext;
- }
- if (p->pNext) {
- p->pNext->pLast = p->pLast;
- } else {
- /* Nothing to do as this list doesn't have a tail */
+ nIndex = ZEND_HASH_BUCKET(ht, p->h);
+ for (; ht->arBuckets != p; ZEND_HASH_NEXT_INDEX(ht, nIndex)) {
+#ifdef ZEND_DEBUG
+ if (!ZEND_HASH_BUCKET_IS_OCCUPIED(ht, nIndex)) {
+ zend_output_debug_string(1, "No element to delete");
+ }
+#endif
}

+ HANDLE_BLOCK_INTERRUPTIONS();
+
+ zend_hash_free_bucket(ht, nIndex);
+
if (p->pListLast != NULL) {
p->pListLast->pListNext = p->pListNext;
- } else {
+ } else {
/* Deleting the head of the list */
ht->pListHead = p->pListNext;
}
@@ -655,9 +661,9 @@
SET_INCONSISTENT(HT_DESTROYED);
}

-/* This is used to recurse elements and selectively delete certain entries
- * from a hashtable. apply_func() receives the data and decides if the entry
- * should be deleted or recursion should be stopped. The following three
+/* This is used to recurse elements and selectively delete certain entries
+ * from a hashtable. apply_func() receives the data and decides if the entry
+ * should be deleted or recursion should be stopped. The following three
* return codes are possible:
* ZEND_HASH_APPLY_KEEP - continue
* ZEND_HASH_APPLY_STOP - stop iteration
@@ -674,7 +680,7 @@
p = ht->pListHead;
while (p != NULL) {
int result = apply_func(p->pData TSRMLS_CC);
-
+
if (result & ZEND_HASH_APPLY_REMOVE) {
p = zend_hash_apply_deleter(ht, p);
} else {
@@ -698,7 +704,7 @@
p = ht->pListHead;
while (p != NULL) {
int result = apply_func(p->pData, argument TSRMLS_CC);
-
+
if (result & ZEND_HASH_APPLY_REMOVE) {
p = zend_hash_apply_deleter(ht, p);
} else {
@@ -878,17 +884,12 @@
IS_CONSISTENT(ht);

h = zend_inline_hash_func(arKey, nKeyLength);
- nIndex = h & ht->nTableMask;
-
- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == nKeyLength)) {
- if (!memcmp(p->arKey, arKey, nKeyLength)) {
- *pData = p->pData;
- return SUCCESS;
- }
+ nIndex = ZEND_HASH_BUCKET(ht, h);
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
+ *pData = p->pData;
+ return SUCCESS;
}
- p = p->pNext;
}
return FAILURE;
}
@@ -905,17 +906,13 @@

IS_CONSISTENT(ht);

- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == nKeyLength)) {
- if (!memcmp(p->arKey, arKey, nKeyLength)) {
- *pData = p->pData;
- return SUCCESS;
- }
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
+ *pData = p->pData;
+ return SUCCESS;
}
- p = p->pNext;
}
return FAILURE;
}
@@ -930,16 +927,12 @@
IS_CONSISTENT(ht);

h = zend_inline_hash_func(arKey, nKeyLength);
- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == nKeyLength)) {
- if (!memcmp(p->arKey, arKey, nKeyLength)) {
- return 1;
- }
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
+ return 1;
}
- p = p->pNext;
}
return 0;
}
@@ -956,16 +949,12 @@

IS_CONSISTENT(ht);

- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == nKeyLength)) {
- if (!memcmp(p->arKey, arKey, nKeyLength)) {
- return 1;
- }
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_HASH_KEYS(p, arKey, h, nKeyLength)) {
+ return 1;
}
- p = p->pNext;
}
return 0;

@@ -979,15 +968,13 @@

IS_CONSISTENT(ht);

- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == 0)) {
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_INDEX(p, h)) {
*pData = p->pData;
return SUCCESS;
}
- p = p->pNext;
}
return FAILURE;
}
@@ -1000,14 +987,12 @@

IS_CONSISTENT(ht);

- nIndex = h & ht->nTableMask;
+ nIndex = ZEND_HASH_BUCKET(ht, h);

- p = ht->arBuckets;
- while (p != NULL) {
- if ((p->h == h) && (p->nKeyLength == 0)) {
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
+ if (EQ_INDEX(p, h)) {
return 1;
}
- p = p->pNext;
}
return 0;
}
@@ -1041,13 +1026,12 @@
Bucket *p;

IS_CONSISTENT(ht);
- p = ht->arBuckets;
- while (p != NULL) {
+ uint nIndex = ZEND_HASH_BUCKET(ht, ptr->h);
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
if (p == ptr->pos) {
ht->pInternalPointer = p;
return 1;
}
- p = p->pNext;
}
return 0;
}
@@ -1065,7 +1049,7 @@
}


-/* This function will be extremely optimized by remembering
+/* This function will be extremely optimized by remembering
* the end of the list
*/
ZEND_API void zend_hash_internal_pointer_end_ex(HashTable *ht, HashPosition
*pos)
@@ -1106,7 +1090,7 @@
}


-/* This function should be made binary safe */
+/* This function should be made binary safe */
ZEND_API int zend_hash_get_current_key_ex(const HashTable *ht, char
**str_index, uint *str_length, ulong *num_index, zend_bool duplicate,
HashPosition *pos)
{
Bucket *p;
@@ -1176,6 +1160,8 @@
ZEND_API int zend_hash_update_current_key_ex(HashTable *ht, int key_type,
const char *str_index, uint str_length, ulong num_index, int mode, HashPosition
*pos)
{
Bucket *p;
+ Bucket *q;
+ uint nIndex;

p = pos ? (*pos) : ht->pInternalPointer;

@@ -1184,18 +1170,18 @@
if (p) {
if (key_type == HASH_KEY_IS_LONG) {
str_length = 0;
- if (!p->nKeyLength && p->h == num_index) {
+ if (EQ_INDEX(p, num_index)) {
return SUCCESS;
}

if (mode != HASH_UPDATE_KEY_ANYWAY) {
- Bucket *q = ht->arBuckets;
+ nIndex = ZEND_HASH_BUCKET(ht, num_index);
int found = 0;

- while (q != NULL) {
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, q) {
if (q == p) {
found = 1;
- } else if (!q->nKeyLength && q->h ==
num_index) {
+ } else if (EQ_INDEX(q, num_index)) {
if (found) {
if (mode &
HASH_UPDATE_KEY_IF_BEFORE) {
break;
@@ -1220,28 +1206,26 @@
}
}
}
- q = q->pNext;
}
}

zend_hash_index_del(ht, num_index);
} else if (key_type == HASH_KEY_IS_STRING) {
if (p->nKeyLength == str_length &&
- memcmp(p->arKey, str_index, str_length) == 0) {
+ memcmp(p->arKey, str_index, str_length)
== 0) {
return SUCCESS;
}

if (mode != HASH_UPDATE_KEY_ANYWAY) {
ulong h = zend_inline_hash_func(str_index,
str_length);
- Bucket *q = ht->arBuckets;
+ nIndex = ZEND_HASH_BUCKET(ht, h);
int found = 0;

- while (q != NULL) {
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, q) {
if (q == p) {
found = 1;
- } else if (q->h == h && q->nKeyLength
== str_length &&
- memcmp(q->arKey, str_index,
str_length) == 0) {
- if (found) {
+ } else if (EQ_HASH_KEYS(q, str_index,
h, str_length)) {
+ if (found) {
if (mode &
HASH_UPDATE_KEY_IF_BEFORE) {
break;
} else {
@@ -1265,7 +1249,6 @@
}
}
}
- q = q->pNext;
}
}

@@ -1276,13 +1259,12 @@

HANDLE_BLOCK_INTERRUPTIONS();

- if (p->pNext) {
- p->pNext->pLast = p->pLast;
- }
- if (p->pLast) {
- p->pLast->pNext = p->pNext;
- } else{
- ht->arBuckets = p->pNext;
+ nIndex = ZEND_HASH_BUCKET(ht, p->h);
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, q) {
+ if (p == q) {
+ zend_hash_free_bucket(ht, nIndex);
+ break;
+ }
}

if (p->nKeyLength != str_length) {
@@ -1324,8 +1306,10 @@
p->h = zend_inline_hash_func(str_index, str_length);
}

- CONNECT_TO_BUCKET_DLLIST(p, ht->arBuckets);
- ht->arBuckets = p;
+ nIndex = ZEND_HASH_BUCKET(ht, p->h);
+ CONNECT_TO_BUCKET_DLLIST(p, ht->arBuckets);
+ ZEND_HASH_FIND_FREE(ht, nIndex);
+ ZEND_HASH_FILL_BUCKET(ht, nIndex, p);
HANDLE_UNBLOCK_INTERRUPTIONS();

return SUCCESS;
@@ -1406,13 +1390,13 @@
IS_CONSISTENT(ht1);
IS_CONSISTENT(ht2);

- HASH_PROTECT_RECURSION(ht1);
- HASH_PROTECT_RECURSION(ht2);
+ HASH_PROTECT_RECURSION(ht1);
+ HASH_PROTECT_RECURSION(ht2);

result = ht1->nNumOfElements - ht2->nNumOfElements;
if (result!=0) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return result;
}

@@ -1423,29 +1407,29 @@

while (p1) {
if (ordered && !p2) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return 1; /* That's not supposed to happen */
}
if (ordered) {
if (p1->nKeyLength==0 && p2->nKeyLength==0) { /*
numeric indices */
result = p1->h - p2->h;
if (result!=0) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return result;
}
} else { /* string indices */
result = p1->nKeyLength - p2->nKeyLength;
if (result!=0) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return result;
}
result = memcmp(p1->arKey, p2->arKey,
p1->nKeyLength);
if (result!=0) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return result;
}
}
@@ -1453,22 +1437,22 @@
} else {
if (p1->nKeyLength==0) { /* numeric index */
if (zend_hash_index_find(ht2, p1->h,
&pData2)==FAILURE) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return 1;
}
} else { /* string index */
if (zend_hash_quick_find(ht2, p1->arKey,
p1->nKeyLength, p1->h, &pData2)==FAILURE) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return 1;
}
}
}
result = compar(p1->pData, pData2 TSRMLS_CC);
if (result!=0) {
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return result;
}
p1 = p1->pListNext;
@@ -1476,9 +1460,9 @@
p2 = p2->pListNext;
}
}
-
- HASH_UNPROTECT_RECURSION(ht1);
- HASH_UNPROTECT_RECURSION(ht2);
+
+ HASH_UNPROTECT_RECURSION(ht1);
+ HASH_UNPROTECT_RECURSION(ht2);
return 0;
}

@@ -1538,9 +1522,8 @@

for (i = 0; i < ht->nTableSize; i++) {
p = ht->arBuckets;
- while (p != NULL) {
+ if (ZEND_HASH_BUCKET_IS_OCCUPIED(ht, i)) {
zend_output_debug_string(0, "%s <==> 0x%lX\n",
p->arKey, p->h);
- p = p->pNext;
}
}

Index: Zend/zend_hash.h
===================================================================
--- Zend/zend_hash.h 2010-12-30 17:09:14.000000000 +0100
+++ Zend/zend_hash.h 2010-12-31 01:03:46.000000000 +0100
@@ -48,18 +48,17 @@
typedef void (*dtor_func_t)(void *pDest);
typedef void (*copy_ctor_func_t)(void *pElement);
typedef void (*copy_ctor_param_func_t)(void *pElement, void *pParam);
+typedef unsigned char uchar;

struct _hashtable;

typedef struct bucket {
- ulong h; /* Used for
numeric indexing */
+ ulong h;
uint nKeyLength;
void *pData;
void *pDataPtr;
struct bucket *pListNext;
struct bucket *pListLast;
- struct bucket *pNext;
- struct bucket *pLast;
char arKey; /* Must be last element */
} Bucket;

@@ -72,6 +71,7 @@
Bucket *pListHead;
Bucket *pListTail;
Bucket **arBuckets;
+ uchar *arEmpty;
dtor_func_t pDestructor;
zend_bool persistent;
unsigned char nApplyCount;
@@ -124,6 +124,31 @@

ZEND_API int zend_hash_add_empty_element(HashTable *ht, const char *arKey,
uint nKeyLength);

+#define ZEND_HASH_BUCKET(ht, index) \
+ (((index) + ((index)<<4)) & (ht)->nTableMask)
+
+#define EQ_KEYS(a,b,n) (!memcmp((a), (b), (n)))
+#define EQ_HASH_KEYS(ptr,key,hash,len) (EXPECTED(((ptr)->h == (hash)) && \
+ ((ptr)->nKeyLength == (len)) && \
+ ((len) == 0 || EQ_KEYS((ptr)->arKey,(key),(len)))))
+
+#define ZEND_HASH_NEXT_INDEX(ht, index) \
+ (index) = (((index)+1) & (ht)->nTableMask)
+#define ZEND_HASH_BUCKET_IS_OCCUPIED(ht, index) \
+ ((ht)->arEmpty != 0)
+// ((ht)->arBuckets != NULL)
+#define ZEND_HASH_GET_BUCKET_AND_CHECK_OCCUPIED(ht, index, bucket) \
+ (bucket) = (ht)->arBuckets, ZEND_HASH_BUCKET_IS_OCCUPIED((ht),
(index))
+#define ZEND_HASH_ITERATE_UNTIL_FREE(ht, index, bucket) \
+ for (; ZEND_HASH_GET_BUCKET_AND_CHECK_OCCUPIED((ht), (index),
(bucket)); ZEND_HASH_NEXT_INDEX((ht), (index)))
+#define ZEND_HASH_FIND_FREE(ht, index) \
+ for (; ZEND_HASH_BUCKET_IS_OCCUPIED((ht), (index));
ZEND_HASH_NEXT_INDEX((ht), (index)))
+#define ZEND_HASH_FILL_BUCKET(ht, index, element) \
+ (ht)->arBuckets = element; \
+ (ht)->arEmpty = 1;
+#define ZEND_HASH_MAKE_ELEMENT_EMPTY(ht, index) \
+ (ht)->arEmpty = 0;
+// (ht)->arBuckets = 0;

#define ZEND_HASH_APPLY_KEEP 0
#define ZEND_HASH_APPLY_REMOVE 1<<0
@@ -225,54 +250,12 @@

ZEND_API int zend_hash_rehash(HashTable *ht);

-/*
- * DJBX33A (Daniel J. Bernstein, Times 33 with Addition)
- *
- * This is Daniel J. Bernstein's popular `times 33' hash function as
- * posted by him years ago on comp.lang.c. It basically uses a function
- * like ``hash(i) = hash(i-1) * 33 + str''. This is one of the best
- * known hash functions for strings. Because it is both computed very
- * fast and distributes very well.
- *
- * The magic of number 33, i.e. why it works better than many other
- * constants, prime or not, has never been adequately explained by
- * anyone. So I try an explanation: if one experimentally tests all
- * multipliers between 1 and 256 (as RSE did now) one detects that even
- * numbers are not useable at all. The remaining 128 odd numbers
- * (except for the number 1) work more or less all equally well. They
- * all distribute in an acceptable way and this way fill a hash table
- * with an average percent of approx. 86%.
- *
- * If one compares the Chi^2 values of the variants, the number 33 not
- * even has the best value. But the number 33 and a few other equally
- * good numbers like 17, 31, 63, 127 and 129 have nevertheless a great
- * advantage to the remaining numbers in the large set of possible
- * multipliers: their multiply operation can be replaced by a faster
- * operation based on just one shift plus either a single addition
- * or subtraction operation. And because a hash function has to both
- * distribute good _and_ has to be very fast to compute, those few
- * numbers should be preferred and seems to be the reason why Daniel J.
- * Bernstein also preferred it.
- *
- *
- * -- Ralf S. Engelschall <r***@*ngelschall.com>
- */
-
static inline ulong zend_inline_hash_func(const char *arKey, uint nKeyLength)
{
- register ulong hash = 5381;
+ ulong hash = 5381;
+ const ulong* ar = (ulong*)arKey;

- /* variant with the hash unrolled eight times */
- for (; nKeyLength >= 8; nKeyLength -= 8) {
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- hash = ((hash << 5) + hash) + *arKey++;
- }
+ if (nKeyLength < sizeof(ulong)) {
switch (nKeyLength) {
case 7: hash = ((hash << 5) + hash) + *arKey++; /*
fallthrough... */
case 6: hash = ((hash << 5) + hash) + *arKey++; /*
fallthrough... */
@@ -284,10 +267,17 @@
case 0: break;
EMPTY_SWITCH_DEFAULT_CASE()
}
- return hash;
+ } else {
+ const ulong* arEnd = (ulong*)(arKey + nKeyLength -
sizeof(ulong));
+// memcpy(&hash, arKey, sizeof(ulong)); // hash = *arEnd, but this
could be from unaligned address
+ hash = *arEnd;
+ for (; ar < arEnd; ar++) {
+ hash = ((hash << 5) + hash) + *ar;
+ }
+ }
+ return (hash<<16)|(hash%65165);
}

-
ZEND_API ulong zend_hash_func(const char *arKey, uint nKeyLength);

#if ZEND_DEBUG

Index: ext/spl/spl_array.c
===================================================================
--- ext/spl/spl_array.c 2010-12-30 17:07:19.000000000 +0100
+++ ext/spl/spl_array.c 2010-12-30 17:11:02.000000000 +0100
@@ -115,12 +115,11 @@
/* IS_CONSISTENT(ht);*/

/* HASH_PROTECT_RECURSION(ht);*/
- p = ht->arBuckets;
- while (p != NULL) {
+ uint nIndex = ZEND_HASH_BUCKET(ht, intern->pos_h);
+ ZEND_HASH_ITERATE_UNTIL_FREE(ht, nIndex, p) {
if (p == intern->pos) {
return SUCCESS;
}
- p = p->pNext;
}
/* HASH_UNPROTECT_RECURSION(ht); */
spl_array_rewind(intern TSRMLS_CC);Index: apc_bin.c
===================================================================
--- apc_bin.c 2010-11-30 11:18:31.000000000 +0100
+++ apc_bin.c 2010-12-30 17:11:03.000000000 +0100
@@ -412,12 +412,6 @@
if((*bp_prev)->pListLast) {
apc_swizzle_ptr(bd, ll, &(*bp_prev)->pListLast);
}
- if((*bp_prev)->pNext) {
- apc_swizzle_ptr(bd, ll, &(*bp_prev)->pNext);
- }
- if((*bp_prev)->pLast) {
- apc_swizzle_ptr(bd, ll, &(*bp_prev)->pLast);
- }
apc_swizzle_ptr(bd, ll, bp_prev);
}
for(i=0; i < ht->nTableSize; i++) {
Index: apc_compile.c
===================================================================
--- apc_compile.c 2010-11-30 11:18:31.000000000 +0100
+++ apc_compile.c 2010-12-30 17:12:28.000000000 +0100
@@ -800,6 +800,7 @@
Bucket* newp = NULL;
int first = 1;
apc_pool* pool = ctxt->pool;
+ void* t;

assert(src != NULL);

@@ -810,14 +811,17 @@
memcpy(dst, src, sizeof(src));

/* allocate buckets for the new hashtable */
- CHECK((dst->arBuckets = apc_pool_alloc(pool, (dst->nTableSize *
sizeof(Bucket*)))));
+ CHECK((t = apc_pool_alloc(pool, (dst->nTableSize * (sizeof(Bucket*) +
1)))));

- memset(dst->arBuckets, 0, dst->nTableSize * sizeof(Bucket*));
+ memset(t, 0, dst->nTableSize * (sizeof(Bucket*) + 1));
+
+ dst->arBuckets = (Bucket**)t;
+ dst->arEmpty = ((uchar*)t) + (src->nTableSize * (sizeof(Bucket*)));
dst->pInternalPointer = NULL;
dst->pListHead = NULL;

for (curr = src->pListHead; curr != NULL; curr = curr->pListNext) {
- int n = curr->h % dst->nTableSize;
+ int n = ZEND_HASH_BUCKET(dst, curr->h);

if(check_fn) {
va_list args;
@@ -863,16 +867,8 @@
#endif

/* insert 'newp' into the linked list at its hashed index */
- if (dst->arBuckets) {
- newp->pNext = dst->arBuckets;
- newp->pLast = NULL;
- newp->pNext->pLast = newp;
- }
- else {
- newp->pNext = newp->pLast = NULL;
- }
-
- dst->arBuckets = newp;
+ ZEND_HASH_FIND_FREE(dst, n);
+ ZEND_HASH_FILL_BUCKET(dst, n, newp);

/* copy the bucket data using our 'copy_fn' callback function */
CHECK((newp->pData = copy_fn(NULL, curr->pData, ctxt TSRMLS_CC)));
@@ -1917,11 +1913,9 @@
uint i;

for (i = 0; i < ht->nTableSize; i++) {
- if(!ht->arBuckets) break;
- p = ht->arBuckets;
- while (p != NULL) {
+ if (ZEND_HASH_BUCKET_IS_OCCUPIED(ht, i)) {
+ p = ht->arBuckets;
fixup(p, src, dst);
- p = p->pNext;
}
}
}--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit:http://www.php.net/unsub.php

Offline

#3 Dec. 31, 2010 17:58:16

Gustavo L.
Registered: 2009-11-02
Reputation: +  0  -
Profile   Send e-mail  

[PHP-DEV] Zend engine's hashtable performance tweaks


On Fri, 31 Dec 2010 14:40:44 -0000, Marcin Babij<marcin.ba...@nasza-klasa.pl> wrote:Why we do this?
We run profiling on our production servers and found out that zend_hash_*
functions take 10-20% CPU time of request. So there is some room for easy
improvements.

What was done?- Hash function in zend_hash.h was rebuilt and became much faster,without losing the most important properties.- Hashtable implementation was changed from Simple chaining to Open
addressing with linear probing, but with linked bucket, not included in
hash array, which causes:
-- Bucket structure to lose 2 pointers.-- Searching works similar, but don't have to jump with pointers storedin different memory locations, inserting, deleting and rehashing don'tneedto update linked list, but must search for first empty bucket, which is
fast, because it scans continuous memory.
-- Load factor decreases from 1.0 to 0.5-0.75 to make less collisions and
faster hashtable, which in turn increases memory footprint a little.- Open addressing doesn't change significantly performance, but nextthing was to create new array (arEmpty), which is of size nTableSizebytes,which keeps track of used/empty buckets and makes inserting and rehashing
much faster. In future it can be tested as bit-array with size of
nTableSize/8 bytes.Nice work. Only two comments:This is much faster than just testing the value of the Bucket pointer forNULL (BTW, NULL pointer != 0 bit pattern, but I guess this is a lost causein PHP's codebase...)? I'm a bit surprised; though perhaps this couldallow more cache hits (and indicates changing it into a bit array couldvery well be beneficial).A patch against trunk would be nicer since that's where there's a chancethis will be merged to. Also, since you're changing the hash function,html_table_gen.php must changed and run to regenerate html_tables.h. Seehttp://lxr.php.net/xref/PHP_TRUNK/ext/standard/html_tables/html_table_gen.php#722(ignore the comment; there's obviously nothing unrolled).--
Gustavo Lopes

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit:http://www.php.net/unsub.php

Offline

  • Root
  • » PHP
  • » [PHP-DEV] Zend engine's hashtable performance tweaks [RSS Feed]

Board footer

Moderator control

Enjoy the 19th of August
PoweredBy

The Forums are managed by develissimo stuff members, if you find any issues or misplaced content please help us to fix it. Thank you! Tell us via Contact Options
Leave a Message
Welcome to Develissimo Live Support