Creating reactive applications is hard. As a community, we PHP developers have been writing blocking code behind web servers like Apache and Nginx for ages. We’re just not used to using the event loop and reducing/removing blocking IO.

When we design our scripts to be volatile, living only as long as a single request, we don’t have to think [as much] about high memory and CPU usage. Sure, we still need to optimise the odd bottleneck, but it’s nothing compared to running our own web server.

High memory consumption is one of the main arguments against building reactive applications. Yet other platforms, like Node, suffer the same problems. They’ve just had them for longer, so their ecosystem is more mature and stable.

Fortunately, there are tools to help us reduce memory and CPU usage where it matters. We already have React and Ratchet to give us non-blocking HTTP and socket operations. We can combine these with threaded/concurrent processing extensions.

Trying Gearman

Gearman is a mature extension for deferring intensive processing to workers. These workers run in parallel with the server/manager that spawns them.
As I’m working on a MacBook, I need to run this command to install Gearman:

    → brew install php56-gearman

This installs a PHP extension, and the Gearman daemon as a dependency.
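If you want to confirm the install worked before going further, you can ask PHP whether the extension is loaded. This is only a small sketch: the hasGearman() helper is a name I made up for illustration, while extension_loaded() is a standard PHP function.

```php
<?php
// Hypothetical helper (not part of the original post): reports whether the
// gearman extension is available to the PHP binary running this script.
function hasGearman()
{
    return extension_loaded("gearman");
}

print hasGearman() ? "gearman is loaded\n" : "gearman is missing\n";
```

If it reports that the extension is missing, the CLI is probably using a different PHP build or ini file than the one the extension was installed for.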
Before we look at any Gearman code, let’s think about a situation in which it would be useful. Imagine we wanted to ping a remote server, once a second, for 5 seconds. We might do something like:

    print "starting\n";
    …

This code prints ping x every second. There’s an artificial delay of 1 second, and it’s not even pinging an actual server. This process is blocking because nothing can happen while the sleep() function is running. Imagine how much worse it would be if the ping operation took longer than a second.
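To make the blocking behaviour concrete, here is a rough sketch of what such a script could look like. It is an assumption based on the description above, not the original listing: the blockingPing() name, the optional $pause parameter (there so the delay can be swapped out) and the choice to count from 1 are all mine.

```php
<?php
// Hypothetical sketch of the blocking ping loop described in the text.
// $pause defaults to sleep(1), the artificial one-second delay.
function blockingPing($count = 5, $pause = null)
{
    if ($pause === null) {
        $pause = function () {
            sleep(1);
        };
    }

    print "starting\n";

    for ($i = 1; $i <= $count; $i++) {
        $pause();            // nothing else can run while this call blocks
        print "ping {$i}\n"; // pretend we pinged a remote server
    }
}
```

Calling blockingPing() with no arguments holds the script for roughly five seconds, which is exactly the kind of work we want to hand to something else.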
So let’s move this processing outside of the main script. We need to split the code into two files: gearman-client.php and gearman-worker.php. First the client:

    $client = new GearmanClient();
    …

We begin by creating a GearmanClient. This connects to the local Gearman service, thanks to the addServer() method. We attach a data callback, since we need a way for the worker to communicate with the server. Then we add a ping task and call the runTasks() method. This sends the task to any running Gearman workers.

The worker is slightly more complicated:

    $worker = new GearmanWorker();
    …

We create a new GearmanWorker and add the local server, just like we did for the client. Then we add a ping function, which receives a job. The callback is like the blocking code we had before, except that we also call the sendData() method. This triggers the data callback we added in gearman-client.php.

We need to run both of these scripts at the same time:

    tab 1 → php gearman-worker.php
    …

You should see these messages, in tab 1:

    waiting
    …

…and these messages, in tab 2:

    sending
    …
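Putting the two descriptions together, the pair of scripts might look roughly like the sketch below. Treat it as an illustration under assumptions rather than the original code: the function names ping(), runWorker() and runClient(), the "go" workload, the "done" return value and the placement of the print statements are mine. The Gearman calls themselves (addServer(), addFunction(), work(), setDataCallback(), addTask(), runTasks(), sendData(), data()) are the ones named above or documented for the extension. I have kept both halves in one file so they are easy to compare.

```php
<?php
// Hypothetical job handler for gearman-worker.php. $job only needs a
// sendData() method, which GearmanJob provides.
function ping($job, $count = 5, $pause = null)
{
    if ($pause === null) {
        $pause = function () {
            sleep(1);
        };
    }

    for ($i = 1; $i <= $count; $i++) {
        $pause();
        $job->sendData("ping {$i}"); // triggers the client's data callback
    }

    return "done"; // assumed completion value
}

// Worker side: call this at the bottom of gearman-worker.php.
function runWorker()
{
    $worker = new GearmanWorker();
    $worker->addServer(); // no arguments: the local Gearman daemon

    // Wrap ping() so Gearman's extra callback arguments are ignored.
    $worker->addFunction("ping", function ($job) {
        return ping($job);
    });

    print "waiting\n";

    while ($worker->work()) {
        // keep handling jobs until work() reports a failure
    }
}

// Client side: call this at the bottom of gearman-client.php.
function runClient()
{
    $client = new GearmanClient();
    $client->addServer();

    $client->setDataCallback(function ($task) {
        print $task->data() . "\n"; // whatever the worker passed to sendData()
    });

    print "sending\n";

    $client->addTask("ping", "go"); // the workload is unused in this sketch
    $client->runTasks();            // blocks until the task has finished
}
```

Nothing runs until runWorker() or runClient() is called, so each script can include this file and call the half it needs.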
Since we’re using a data callback, it might look like this process is asynchronous. If we print something extra, we’ll see a problem:

    print "sending\n";
    …

The gearman-client.php output will be:

    sending
    …

Callback or not, this process is blocking. Gearman does background tasks, but it’s harder to track their progress. We need to introduce another (albeit temporary) loop:

    print "sending\n";
    …

This time gearman-client.php will output:

    [code listing]

Ok, so this is now kind of asynchronous, but still blocking. We’ll remove the loop later. First we need to find a way to fix/replace the data callback…

Sending Data

Turns out it’s not possible (eek!) to send data from a background worker task to the client that initiated that task. Fortunately we already know an elegant way to communicate with these background tasks: sockets. Let’s recap some basic structure, beginning with creating a server.php file:

    [code listing]

If this is unfamiliar to you, check out the previous socket post, where I explain how it all works. We also need a stripped-down version of the server:

    [code listing]

If we want to use sockets for communicating with the workers, we need to store their status and/or completion values:

    [code listing]

We can simulate the worker with the following console JS:

    [code listing]

Now we need to plug the gearman-client.php code into the server. First we need to hold onto the client instance:

    [code listing]

Then we need to add a new message type:

    [code listing]

You’ll immediately notice that this is just as blocking as it was before. If you connect to the socket (from the console) and send a ping message type, you’ll wait 5–6 seconds for any feedback. That’s because the sending is blocking for all that time. We’re also not seeing what the final value is. That is, until we change the worker to communicate the final value through a socket connection. We’ll use a custom web socket client:

    [code listing]
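As an aside, the temporary loop from the start of this section, the one that waits on a background task, can be sketched like this. It is an assumption about the shape of that code, not the original: waitForJob(), runBackgroundPing(), the quarter-second pause and the "go" workload are names and values I picked. doBackground() and jobStatus() are real GearmanClient methods; jobStatus() returns an array whose first element says whether the Gearman server still knows about the job.

```php
<?php
// Hypothetical helper: polls until the Gearman server no longer knows the job.
// $client only needs a jobStatus($handle) method, as GearmanClient has.
function waitForJob($client, $handle, $pause = null)
{
    if ($pause === null) {
        $pause = function () {
            usleep(250000); // a quarter of a second between polls
        };
    }

    $polls = 0;

    do {
        $pause();
        // array(known, running, numerator, denominator)
        $status = $client->jobStatus($handle);
        $polls++;
    } while ($status[0]); // "known" turns false once the job has finished

    return $polls;
}

// Sketch of a client that starts a background ping and then waits for it.
function runBackgroundPing()
{
    $client = new GearmanClient();
    $client->addServer();

    print "sending\n";

    $handle = $client->doBackground("ping", "go"); // returns immediately
    print "job: {$handle}\n";

    waitForJob($client, $handle); // still blocking, just in our own loop
    print "done\n";
}
```

The loop only learns when the job has gone away; it never sees the value the worker returned, which is why the sockets are needed.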
Then we need to change gearman-worker.php to send completion messages to the socket server:

    [code listing]
The worker script should now output:

    [code listing]

The worker is running the loop, connecting to the web socket to send a completion value. It’s also returning the confirmation message. Next, we need to remove the blocking loop from the onMessage() method. We need to store the event loop:

    [code listing]

We’ll need to provide this, back in server.php:

    [code listing]

Then we can add timers, which are like the setTimeout() function from JS:

    [code listing]

Restart all the things, and send another ping message (from the console). You should see before the ‘received’ message. Then nothing for 5 seconds. Then job: H:[network id]:[task id] followed by value: done. How exciting is that?!

Wrapping Up

There’s a lot more we can do to streamline this process. We could abstract much of the worker logic, so creating new worker classes requires less boilerplate. We could abstract the timer logic, so deferring tasks to Gearman requires less boilerplate. But I think we’ll leave it there, for now.

If you’re stuck with any of this, send me a tweet or make a GitHub issue. I’ll be glad to help!