
#1 Nov. 25, 2010 17:52:30

Ilia A.

[PHP-DEV] Performance of buffer based functionality (JSON, AES, serialize())


I think there is much to gain by improving the serialization speed in
PHP. It is used everywhere from caches like memcache, to sessions or
manual data input into DB. I would say that there are very few
non-trivial apps that would not benefit from a more compact and faster
serializer.

In our specific use case at work, switching to igbinary improved the
overall page generation speed by 2-3%.

On Thu, Nov 25, 2010 at 12:47 PM, Andi Gutmans <a***@*end.com> wrote:
> Hi,
>
> Completely different topic :)
>
> I've been looking a bit into performance around json encoding,
> hashing+encryption (aes) and serialize()/unserialize(). Data that is
> marshaled and often transmitted over the wire.
>
> I know there have been some high-end apps that have benefited from some
> custom serializers, etc... (typically platform dependent).
> I wonder if people here think improvements in these areas would move the
> needle for the majority of mainstream apps or not.
>
> Thanks,
>
> Andi
>

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php


#2 Nov. 25, 2010 19:15:12

Pierre J.


hi,

For the record here, igbinary is a very good example of such optimization:
http://opensource.dynamoid.com/

On Thu, Nov 25, 2010 at 6:47 PM, Andi Gutmans <a***@*end.com> wrote:
> Hi,
>
> Completely different topic :)
>
> I've been looking a bit into performance around json encoding,
> hashing+encryption (aes) and serialize()/unserialize(). Data that is
> marshaled and often transmitted over the wire.
>
> I know there have been some high-end apps that have benefited from some
> custom serializers, etc... (typically platform dependent).
> I wonder if people here think improvements in these areas would move the
> needle for the majority of mainstream apps or not.
>
> Thanks,
>
> Andi
>



--
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org

#3 Nov. 25, 2010 19:47:55

Jonah H.


On Thu, Nov 25, 2010 at 2:14 PM, Pierre Joye <pierre.***@*mail.com> wrote:

> For the record here, igbinary is a very good example of such optimization:
>
> http://opensource.dynamoid.com/

igbinary is a nice extension indeed. However, for those of us who have
environments which include multiple programming languages, custom
serializations become a PITA. As such, we generally go with something more
portable such as Avro or straight JSON. Awhile back, I had done some work
rewriting the JSON serialization functions to use the fast (and BSD
licensed) yajl JSON parser (https://github.com/lloyd/yajl). Initial
benchmarks showed a 4-7% performance improvement in
serialization/deserialization.

I'll see if I can dig it up--hopefully it's not on my dead computer.

--
Jonah H. Harris
Blog: http://www.oracle-internals.com/

#4 Nov. 25, 2010 19:52:32

Pierre J.


On Thu, Nov 25, 2010 at 8:47 PM, Jonah H. Harris <jonah.har***@*mail.com> wrote:
> On Thu, Nov 25, 2010 at 2:14 PM, Pierre Joye <pierre.***@*mail.com> wrote:
>>
>> For the record here, igbinary is a very good example of such optimization:
>>
>> http://opensource.dynamoid.com/
>
> igbinary is a nice extension indeed.  However, for those of us who have
> environments which include multiple programming languages, custom
> serializations become a PITA.  As such, we generally go with something more
> portable such as Avro or straight JSON.  Awhile back, I had done some work
> rewriting the JSON serialization functions to use the fast (and BSD
> licensed) yajl JSON parser (https://github.com/lloyd/yajl).  Initial
> benchmarks showed a 4-7% performance improvement in
> serialization/deserialization.

Good point indeed. That makes me think about bson
(http://bsonspec.org/), which is used by mongodb for example.

--
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org

#5 Nov. 26, 2010 00:16:30

Ilia A.


I just read over the BSON spec, and it looks fairly interesting. The only
bit that appears to be missing for PHP purposes is object support: we
would need to introduce a custom type on top of standard BSON. However,
from a compactness and consistency standpoint it looks fairly appealing.



#6 Nov. 27, 2010 07:14:37

Andi G.


It's nice, but as long as browsers don't implement it natively, it's
less useful for server-to-client communication.
Of course, it can still be quite useful with custom I/O or data sources that
implement it natively, e.g. MongoDB.



#7 Nov. 28, 2010 16:16:21

Jonathan B.


On Thu Nov 25 12:47 PM, Andi Gutmans wrote:
>
> I know there have been some high-end apps that have benefited from
> some custom serializers, etc... (typically platform dependent).
> I wonder if people here think improvements in these areas would move
> the needle for the majority of mainstream apps or not.
>

Like people have mentioned, improving (un)serialize speed would be a huge
benefit, especially for caching data sets or large objects.

From experience, it would seem valuable to have:
1) serialize_text($var)

The existing serialize() minus the NULL bytes on private properties. These
have been a source of problems for developers who serialize an object with
private properties and store it in a database (the string may get cut off).

I'm not sure why there's a NULL byte in 'zend_mangle_property_name'; the
char "_" could be used instead to mark a private property in the serialized
text.
unserialize() could stay backward compatible by accepting both NULL and "_"
around a private property.
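
The NUL bytes in question can be made visible with a short sketch. The class name `Account` and its property are made up for illustration, and the `strstr()` call only simulates a consumer that treats the first NUL byte as the end of the string; it is not how any particular database driver behaves.

```php
<?php
// Hypothetical class, used only to show how serialize() mangles property names.
class Account {
    private $balance = 10;
}

$raw = serialize(new Account());

// Private properties are stored as "\0ClassName\0property", so the
// payload contains raw NUL bytes.
$nulCount = substr_count($raw, "\0");

// Escape the NUL bytes so the payload can be printed safely.
$printable = addcslashes($raw, "\0");

// Simulate a C-string style consumer that stops at the first NUL byte.
$truncated = strstr($raw, "\0", true);

echo $printable, "\n";
echo $truncated, "\n";
```

The second line printed is what would survive if some layer cut the value at the first NUL byte, which is no longer something unserialize() can read.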

2) serialize_binary($var)

An efficient and compact serialization using techniques from igbinary.

A potential problem with igbinary I've noticed is it packs a double as a 64
bit integer.
That could be a problem if you serialize on a platform that has an IEEE 754
binary representation and unserialize on a non-IEEE platform but I don't
know if php compiles on architectures that are non-IEEE.
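
To make the portability concern concrete, here is a small sketch that writes a double as its raw 8 bytes with `pack()`. It is not igbinary's actual code, only an illustration of the idea, and it assumes the `E` format code (big-endian double), which needs PHP 7.0.15 / 7.1.1 or later.

```php
<?php
// Write a float as its raw 8-byte big-endian representation.
$value = 1.5;
$bytes = pack('E', $value);

$hex  = bin2hex($bytes);          // hex dump of the 8 bytes
$back = unpack('E', $bytes)[1];   // read the same bytes back as a float

echo strlen($bytes), " bytes: ", $hex, "\n";
var_dump($back === $value);
```

The round trip only works because both sides agree that those 8 bytes are an IEEE 754 double, which is exactly the assumption being questioned here.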

It could also be interesting to pack integers as varints:
http://code.google.com/apis/protocolbuffers/docs/encoding.html#varints
http://protobuf-c.googlecode.com/svn/trunk/src/google/protobuf-c/protobuf-c.c

That's most likely slower, though, than what igbinary does with integers.
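
For readers who have not met varints, a minimal sketch of the base-128 encoding described in the Protocol Buffers documentation follows. The function names `varint_encode` and `varint_decode` are hypothetical, only non-negative integers are handled, and nothing here is taken from igbinary or protobuf-c.

```php
<?php
// Encode a non-negative integer as a base-128 varint:
// 7 payload bits per byte, high bit set on every byte except the last.
function varint_encode(int $n): string {
    if ($n < 0) {
        throw new InvalidArgumentException('only non-negative integers are handled in this sketch');
    }
    $out = '';
    while ($n > 0x7F) {
        $out .= chr(($n & 0x7F) | 0x80);
        $n >>= 7;
    }
    return $out . chr($n);
}

// Decode a varint produced by varint_encode().
function varint_decode(string $bytes): int {
    $n = 0;
    $shift = 0;
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($bytes[$i]);
        $n |= ($b & 0x7F) << $shift;
        if (($b & 0x80) === 0) {
            return $n;   // high bit clear: this was the last byte
        }
        $shift += 7;
    }
    throw new UnexpectedValueException('truncated varint');
}

echo bin2hex(varint_encode(300)), "\n";   // prints "ac02"
```

Small values take a single byte, but every byte costs a mask, a shift and a branch, which is where the speed concern comes from.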




#8 Nov. 30, 2010 08:28:19

Julien P.


I guess the serialize mechanism can't use any char that can be part of a
PHP variable name, and "_" can. As property names respect binary
compatibility, the only char that can be used to mark private
properties is actually the NULL byte. Ping me if I'm wrong.

But I'm +1 for improving the serialize() speed, I had problems
recently with it, and igbinary came to save me as well :)

Julien.Pauli



#9 Dec. 2, 2010 17:12:15

Jonathan B.


On Tue Nov 30 03:26 AM, Julien Pauli wrote:
> I guess serialize mechanism cant use any char that can be part of a
> PHP variable. And "_" can. As property names respect binary
> compatibility, the only char that can be used to mark private
> properties is actually the NULL byte. Ping me if I'm wrong.
>

Right, what I was proposing didn't make sense. After digging through the
source, say we have:
class Foo {
    public $a = 1;
    protected $b = 2;
    private $c = 3;
}

Currently this is (with \0 standing for the NUL byte):
O:3:"Foo":3:{s:1:"a";i:1;s:4:"\0*\0b";i:2;s:6:"\0Foo\0c";i:3;}

An alternative could be:

O:3:"Foo":3:{s:1:"a";i:1;*;s:4:"b";i:2;_;s:6:"c";i:3;}

Where "*;" is a marker for protected and "_;" is a marker for private.

It would involve some trickery in ext/standard/var_unserializer.re:
"*;" {
    /* prepend \0*\0 to the next key so that we have
       zend_symtable_find("\0*\0b") */
}

"_;" {
    /* prepend \0Foo\0 to the next key so that we have
       zend_symtable_find("\0Foo\0c") */
}

Just a thought if someone wants to refactor it / look into performance, I
believe that approach would support both:

O:3:"Foo":3:{s:1:"a";i:1;*;s:4:"b";i:2;_;s:6:"c";i:3;}
O:3:"Foo":3:{s:1:"a";i:1;s:4:"\0*\0b";i:2;s:6:"\0Foo\0c";i:3;}
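
The mangled keys can be inspected from userland, which gives a rough picture of what the unserializer would have to reconstruct from a "*;" or "_;" marker. The sketch below is PHP rather than re2c, and `describe_property` is a hypothetical helper, not part of the engine.

```php
<?php
class Foo {
    public $a = 1;
    protected $b = 2;
    private $c = 3;
}

// Split a mangled property name into [visibility, plain name].
// "\0*\0name" marks protected, "\0Class\0name" marks private.
function describe_property(string $mangled): array {
    if ($mangled === '' || $mangled[0] !== "\0") {
        return ['public', $mangled];
    }
    $end = strpos($mangled, "\0", 1);
    if ($end === false) {
        return ['public', $mangled];   // not a well-formed mangled name
    }
    $marker = substr($mangled, 1, $end - 1);
    $name = substr($mangled, $end + 1);
    return [$marker === '*' ? 'protected' : 'private', $name];
}

// Casting an object to array exposes the same mangled keys that serialize() writes.
$result = [];
foreach ((array) new Foo() as $key => $value) {
    [$visibility, $name] = describe_property($key);
    $result[$name] = $visibility;
    echo $visibility, ' ', $name, ' = ', $value, "\n";
}
```

Going the other way (marker plus plain name back to the mangled key) is the prepend step that the comments in the unserializer rules above describe.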


