#1 Jan. 18, 2011 09:18:28

Stas M.

[PHP-DEV] Experiments with a threading library for Zend: spawning a new executor


Hi!

> 1) any hints or clues from people familiar with the Zend subsystems -
> such as memory management, and the various stacks, to provide hints as
> to how to set them up "correctly"

Zend Engine keeps all state (including memory manager state, etc.)
separate in each thread, which means once you've created a new thread
it has to run initializations for the data structures. It should
happen automatically when you build the engine in threaded mode
(--enable-maintainer-zts).

You can not share any data between the engine threads - unless you
communicate it through some channel external to the engine - and even
in this case you should use a copy, never the original pointer.
This also means you can not use PHP functions, classes, etc. from one
thread in another one.

I'm not sure what you tried to do in your code, so hard to say what
exactly went wrong there.
Another caveat: while Zend Engine makes a lot of effort to keep the
state localized and thus be thread-safe, not all libraries PHP is
using do so, so running multithreaded PHP with these libraries may
cause various trouble.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php


#2 Jan. 18, 2011 21:17:14

Sam V.

On 18/01/11 22:17, Stas Malyshev wrote:
>> 1) any hints or clues from people familiar with the Zend subsystems -
>> such as memory management, and the various stacks, to provide hints as
>> to how to set them up "correctly"
>
> Zend Engine keeps all state (including memory manager state, etc.)
> separate in each thread, which means once you've created a new thread
> it has to run initializations for the data structures. It should
> happen automatically when you build the engine in threaded mode
> (--enable-maintainer-zts).

Yes, I expected the two functions - tsrm_new_interpreter() and
init_executor() to do that, as it is the function called in
php_request_startup() in main/main.c

It seems to do a lot of the work, and as far as I could tell there is no
TSRM function to reap an individual thread etc.

There is also zend_startup() - which seems to do a bit more. If anyone
knowledgeable would care to give or point to an overview, that would be
very useful.

> You can not share any data between the engine threads - unless you
> communicate it through some channel external to the engine - and even
> in this case you should use a copy, never the original pointer.

Sure, I'm expecting to have to pass in all data as deep copies as well
as the return value from the function. This is useful for
array_map-like functions. The parallel_for API, while it worked in the
context of HipHop, is unlikely to work with Zend; there doesn't seem to
be an interpreter under the sun which has successfully pulled off
threading with shared data.
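
A minimal sketch of the copy-in / copy-out map described above. It is
written in Python purely for illustration and does not use any Zend or
HipHop API; `parallel_map`, `add_total` and the `workers` parameter are
hypothetical names, and the thread pool only stands in for separate
executors.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, workers=4):
    """Apply func to every item; workers only ever see private copies."""
    # Copy on the way in, before any worker starts running.
    private_items = [copy.deepcopy(item) for item in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order.
        results = list(pool.map(func, private_items))
    # Copy on the way out, so the caller never aliases worker-owned data.
    return [copy.deepcopy(result) for result in results]


def add_total(row):
    # Mutating the argument is safe here because it is a private copy.
    row["total"] = sum(row["values"])
    return row


rows = [{"values": [1, 2, 3]}, {"values": [10, 20]}]
totals = parallel_map(add_total, rows)
```

After the call, `rows` is unchanged and `totals` holds independent
dictionaries. The sketch shows the data flow only; whether it yields any
speed-up depends entirely on the runtime underneath.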

Another possible application would be a parallel_include() type call,
which would call a given PHP file for each member of an array (or a PDO
result set), buffering the output from each, and inserting into the
output stream in sequence once each fragment is done (hopefully
interacting well with normal output buffering, if you didn't want the
results sent yet). This would allow a large number of results to be
rendered in parallel on multicore systems.
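
The buffering and in-order emission could look roughly like the sketch
below. It is Python for illustration only; `parallel_include` here is a
hypothetical function (not an existing PHP call), a format string stands
in for the included PHP file, and an in-memory stream stands in for the
output buffer.

```python
import io
from concurrent.futures import ThreadPoolExecutor


def render_fragment(template, record):
    """Render one record into its own private buffer and return the text."""
    buffer = io.StringIO()
    buffer.write(template.format(**record))
    return buffer.getvalue()


def parallel_include(template, records, out, workers=4):
    """Render all records in parallel, then emit fragments in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(render_fragment, template, r) for r in records]
        # Walk the futures in submission order: a fragment that finishes
        # early still waits for its turn in the output stream.
        for future in futures:
            out.write(future.result())


page = io.StringIO()
records = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
parallel_include("<li>{id}: {name}</li>\n", records, page)
```

Because each fragment is written to a private buffer first, the order of
completion does not affect the order of the final output.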

> This also means you can not use PHP functions, classes, etc. from one
> thread in another one.

I hope it will be possible to share already compiled code between
threads; this may mean disabling "eval" inside the thread or otherwise
hobbling the compiler to avoid separate threads trying to modify the
optree at once. If a shared optree cannot be achieved, then I guess it
would have to go back to the APC, but it would be good to avoid
overheads where possible to keep the thread startup cost low.

Even extremely restricted parallelism can help speed up some types of
work, so limitations I am happy to accept.

> I'm not sure what you tried to do in your code, so hard to say what
> exactly went wrong there.
> Another caveat: while Zend Engine makes a lot of effort to keep the
> state localized and thus be thread-safe, not all libraries PHP is
> using do so, so running multithreaded PHP with these libraries may
> cause various trouble.

Yes, currently I am not looking at calling individual module startup
functions to avoid this problem (and save time on thread startup). It
seems that there is a facility for limiting the available functions
visible to the created executor, too, which may make this easy to make
"safe".

Thanks for your feedback,
Sam


#3 Jan. 18, 2011 21:52:06

Stefan M.

Hi Sam:

I am following the discussion with great interest, but just a question for
clarification:

On 18 Jan 2011, at 22:16, Sam Vilain wrote:
> there doesn't seem to
> be an interpreter under the sun which has successfully pulled off
> threading with shared data.
Could you explain what you mean by that statement?

Sorry, but that's my topic, and the most well-known interpreters that 'pulled
off' threading with shared data are for Java. The interpreter I am working on
is for manycore systems (running on a 64-core Tilera chip) and executes
Smalltalk (https://github.com/smarr/RoarVM).

Best regards
Stefan


--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax: +32 2 629 3525



#4 Jan. 18, 2011 22:11:03

Stas M.

Hi!

> Sorry, but that's my topic, and the most well-known interpreters that
> 'pulled off' threading with shared data are for Java. The interpreter

Given the lengths Java programmers have to go to in order to make their
threaded code work, I have a lot of doubt that 95% of PHP users would be
able to write correct threaded programs. Reasoning about threaded
programs is very hard, and IMHO putting it into the beginners language
would be a mistake.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227


#5 Jan. 18, 2011 22:37:12

Hannes L.

Hello,

I don't think a language becomes a "beginners language" just because many
new programmers use it. And it's still not a good argument for not including
new features.

As long as the new thread doesn't share any memory/variables with the
spawning context, no "reasoning" is required at all. It's when you start
sharing objects that things get complex. Just a simple threading
implementation with a strictly defined IPC mechanism would be very helpful.
It's not super useful in web application programming, as handling web
requests is already packaged into small units of work: web requests. So in
that sense a web application is already "multi threaded". However, it's
interesting for CGI scripts. The other week I wrote a PHP CGI proxy, for
example. Because PHP didn't have threading, I had to bother with select
polling.
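
A sketch of what "no shared variables, one strictly defined channel"
might look like, in Python for illustration only. The names `send`,
`worker`, `inbox` and `outbox` are hypothetical, and the `None` sentinel
used for shutdown is an assumption of this sketch, not part of any
proposed PHP API.

```python
import copy
import queue
import threading


def send(channel, value):
    # Deep-copy at the boundary so neither side keeps a shared reference.
    channel.put(copy.deepcopy(value))


def worker(inbox, outbox):
    """Sees none of the parent's variables, only the two channels."""
    while True:
        message = inbox.get()        # blocks until a message arrives
        if message is None:          # sentinel: the parent asked us to stop
            break
        send(outbox, sorted(message))


inbox, outbox = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

data = [3, 1, 2]
send(inbox, data)
reply = outbox.get()                 # blocks until the worker answers
inbox.put(None)
thread.join()
```

The only things the two sides have in common are the queues, so the
programmer has nothing to reason about beyond the messages themselves.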

Hannes


#6 Jan. 19, 2011 02:55:37

Ben S.

Strongly second this. PHP is not a toy language restricted to beginners.
If it has advanced features, beginners simply don't need to use them.

If anything, I would argue that PHP is a language unsuited to beginners
(and other scripting languages), as it is so flexible it doesn't enforce
good programming practice. Java is much more a 'beginner language'
because it has much stricter syntax, type checking, exception handling,
etc., which force and even teach people to program well in some regards
(or at least do something to raise their awareness that they're
programming sloppily!). Mind you, it's pretty easy to write bad code in
any language....

Ben.




#7 Jan. 19, 2011 03:15:32

Sam V.

On 19/01/11 10:50, Stefan Marr wrote:
> On 18 Jan 2011, at 22:16, Sam Vilain wrote:
>> there doesn't seem to
>> be an interpreter under the sun which has successfully pulled off
>> threading with shared data.
> Could you explain what you mean by that statement?
>
> Sorry, but that's my topic, and the most well-known interpreters that 'pulled
> off' threading with shared data are for Java. The interpreter I am working on
> is for manycore systems (running on a 64-core Tilera chip) and executes
> Smalltalk (https://github.com/smarr/RoarVM).

You raise a very good point. My statement is too broad and should
probably apply only to dynamic languages, executed on reference counted
VMs. Look at some major ones - PHP, Python, Ruby, Perl, most JS engines
- none of them actually thread properly. Well, Perl's "threading" does
run full speed, but actually copies every variable on the heap for each
new thread, massively bloating the process.

So the question is why should this be so, if C++ and Java, even
interpreted on a JVM, can do it?

In general, Java's basic types typically correspond with types that can
be dealt with atomically by processors, or are small enough to be passed
by value. This already makes things a lot easier.

I've had another reason for the differences explained to me. I'm not
sure I understand it fully enough to be able to re-explain it, but I'll
try anyway. As I grasped the concept, the key to making VMs fully
threadable with shared state, is to first allow reference addresses to
change, such as via generational garbage collection. This allows you to
have much clearer "stack frames", perhaps even really stored on the
thread-local/C stack, as opposed to most dynamic language interpreters
which barely use the C stack at all. Then, when the long-lived objects
are discovered at scope exit time they can be safely moved into the next
memory pool, as well as letting access to "old" objects be locked (or
copied, in the case of Software Transactional Memory). Access to
objects in your own frame can therefore be fast, and the number of locks
that have to be held reduced.

Perhaps to support/refute this argument, in your VM, how do you handle:

- memory allocation: object references' timeline and garbage collection
- call stack frames and/or return continuations - the C stack or the heap?
- atomicity of functions (that's the "synchronized" keyword?)
- timely object destruction

I put it forward that the overall design of the interpreter, and
therefore what is possible in terms of threading, is highly influenced
by these factors.

When threading in C or C++ for instance (and this includes HipHop-TBB),
the call stack frame is on the C stack, so shared state is possible so
long as you pass heap pointers around and synchronise appropriately.
The "virtual" machine is of a different nature, and it can work. For
JVMs, as far as I know references are temporary and again the nature of
the execution environment is different.
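
For contrast with the copy-only approach, the shared-state style
mentioned above (pass a reference around and synchronise every access)
can be sketched as follows. This is Python for illustration only;
`SharedCounter` and `hammer` are hypothetical names, and the lock stands
in for whatever synchronisation primitive the host language provides.

```python
import threading


class SharedCounter:
    """A heap object that several threads reach through the same reference."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is only safe while the lock is held.
        with self._lock:
            self.value += 1


def hammer(counter, times):
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place the final value is always 4 x 10,000. The burden
is that every access path to the shared object must remember to take it.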

For VMs where there is basically nothing on the stack, and everything on
the heap, it becomes a lot harder. To talk about a VM I know better,
Perl has about 6 internal stacks all represented on the heap; a function
call/return stack, a lexical scope stack to represent what is in scope,
a variable stack (the "tmps" stack) for variables declared in those
scopes and for timely destruction, a stack to implement local($var)
called the "save" stack, a "mark" stack used for garbage collection, ok
well only 5 but I think you get my point. From my reading of the PHP
internals so far there is a similar set there too, so comparisons are
quite likely to be instructive. It's a bit hard figuring out everything
that is going on internally (all these internal void* types don't help
either), and whether or not there is some inherent property of reference
counting, or whether it just makes a shared state model harder, is a
question I'm not sure is easy to answer.

In any case, full shared state is not required for a large set of useful
parallelism APIs, and in fact contains a number of pitfalls which are
difficult to explain, debug and fix. I'm far more interested in simple
acceleration of tight loops - to make use of otherwise idle CPU cores
(perhaps virtual as in hyperthreading) to increase throughput - and APIs
like "map" express this well. The idea is that the executor can start
up with no variables in scope, though hopefully shared code segments,
call some function on the data it is passed in, and pass the answers
back to the main thread and then set about cleaning itself up.

Sam


#8 Jan. 19, 2011 04:48:03

Stas M.

Hi!

> Yes, I expected the two functions - tsrm_new_interpreter() and
> init_executor() to do that, as it is the function called in
> php_request_startup() in main/main.c

As far as I remember, you need to run the whole request startup for the
thread, otherwise there will be uninitialized pieces. TSRM magic will
create needed per-thread structures and call ctors, but ctors usually
just null out stuff, you'd still need to fill it in.

> Another possible application would be a parallel_include() type call,
> which would call a given PHP file for each member of an array (or a PDO
> result set), buffering the output from each, and inserting into the
> output stream in sequence once each fragment is done (hopefully
> interacting well with normal output buffering, if you didn't want the
> results sent yet). This would allow a large number of results to be
> rendered in parallel on multicore systems.

That's what webservers do already, don't they? :)

> I hope it will be possible to share already compiled code between
> threads; this may mean disabling "eval" inside the thread or otherwise

The main problems you will be facing are the following:

1. All ZE structures are per-thread. This means using one thread's
   structures in another will be a non-trivial task, as all code assumes
   that the current thread's structures are used.
2. Even if you manage to hack around it by always passing the tsrm_ls
   pointers, etc. - memory managers are per-thread too. Which means you
   will be using data in one thread that is controlled by MM residing in
   another thread. Without locking.
3. You may think this is not very bad, since you'll be using stuff
   that's quite static, like classes and functions - they don't get
   deallocated inside request, so who cares which MM uses them? However,
   while classes themselves don't, structures containing them -
   hashtables - can change, be rebuilt, etc. and if it happens in a wrong
   moment, you're in trouble.
4. Next problem with using classes/functions is that they can contain
   variables - zvals, as default properties, static variables, etc. Since
   ZE is refcounting, these zvals may be modified by anybody who uses
   these variables - even just for reading. Again, no locking. Which,
   again, means trouble.
5. Then come resources and module globals. Imagine some function touches
   in some way some resource - connection, file, etc. - that another
   thread is using at the same time, without locking? Modules generally
   assume resources belong to their respective threads, so you'll need to
   run module initializations for each thread separately.

> hobbling the compiler to avoid separate threads trying to modify the
> optree at once. If a shared optree cannot be achieved, then I guess it
> would have to go back to the APC, but it would be good to avoid
> overheads where possible to keep the thread startup cost low.

Because of the things described above, it will be very challenging to
avoid those startup costs.

> Even extremely restricted parallelism can help speed up some types of
> work, so limitations I am happy to accept.

If you restrict it to using only copied data and never running any PHP
code, it might work. Alternatively, you might launch independent engine
instances that don't share structures and have them communicate, like
Erlang does. Though, unlike Erlang, the PHP engine would not help you
much in this, I'm afraid.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
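
Point 4 above (a reader of a refcounted value still writes to it) can be
illustrated with a toy model. This is a Python sketch under loud
assumptions: `ToyZval` is a hypothetical stand-in and does not reflect
the real zval layout or any Zend macro; it only shows where the hidden
writes happen.

```python
class ToyZval:
    """Toy stand-in for a refcounted value; NOT the real zval layout."""

    def __init__(self, value):
        self.value = value
        self.refcount = 1            # the owner's reference

    def addref(self):
        self.refcount += 1           # plain, unlocked read-modify-write
        return self

    def delref(self):
        self.refcount -= 1
        return self.refcount == 0    # True would mean "free me now"


def read_only_use(zv):
    """Never changes zv.value, yet writes to zv.refcount twice."""
    observed = []
    ref = zv.addref()                # taking a reference mutates shared state
    observed.append(ref.refcount)
    text = str(ref.value)
    ref.delref()                     # releasing it mutates shared state again
    return text, observed


default_prop = ToyZval(42)           # e.g. a class's default property
text, seen = read_only_use(default_prop)
```

Run from a single thread, the count returns to 1 and nothing is lost. If
two threads interleaved the unlocked `addref`/`delref` steps on the same
object, an update could be lost, leaving the count too high (a leak) or
too low (a premature free).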


#9 Jan. 19, 2011 04:52:04

Stas M.

Hi!

> If anything, I would argue that PHP is a language unsuited to beginners
> (and other scripting languages), as it is so flexible it doesn't enforce
> good programming practice. Java is much more a 'beginner language'
> because it has much stricter

Contrary to popular belief, people usually don't start with programming
to be taught good practices and become enlightened in the ways of Art.
They usually start because they need their computers to do something for
them. And scripting languages are often the easiest way to make that
happen.

Java, on the other hand, forces you to deal with exceptions, patterns,
interfaces, generics, covariants and contravariants, locking, etc. which
you neither want nor need to know, only because somebody somewhere
decided that it's right for you.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227


#10 Jan. 19, 2011 05:02:44

Stas M.

Hi!

> like "map" express this well. The idea is that the executor can start
> up with no variables in scope, though hopefully shared code segments,

For that you would probably need to put some severe restrictions on your
code, such as:

1. No usage of default properties or statics in classes or functions.
2. No assigning of constants to any variable (comparison and operators
   may be ok, not sure how refcounts work out)
3. No defining new functions or classes or including new files

This probably could still do something useful - such as run 3 SQL
queries in parallel and return the result - but I'm not sure how you
could enforce such conditions... If you do not, you'll have some
"interesting" race conditions leading to variables disappearing,
leaking, being assigned wrong values, etc.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
