787
votes

I can find lots of information on how Long Polling works (For example, this, and this), but no simple examples of how to implement this in code.

All I can find is cometd, which relies on the Dojo JS framework, and a fairly complex server system..

Basically, how would I use Apache to serve the requests, and how would I write a simple script (say, in PHP) which would "long-poll" the server for new messages?

The example doesn't have to be scaleable, secure or complete, it just needs to work!

17

17 Answers

517
votes

It's simpler than I initially thought.. Basically you have a page that does nothing, until the data you want to send is available (say, a new message arrives).

Here is a really basic example, which sends a simple string after 2-10 seconds. 1 in 3 chance of returning an error 404 (to show error handling in the coming Javascript example)

msgsrv.php

<?php
if(rand(1,3) == 1){
    /* Fake an error */
    header("HTTP/1.0 404 Not Found");
    die();
}

/* Send a string after a random number of seconds (2-10) */
sleep(rand(2,10));
echo("Hi! Have a random number: " . rand(1,10));
?>

Note: With a real site, running this on a regular web-server like Apache will quickly tie up all the "worker threads" and leave it unable to respond to other requests.. There are ways around this, but it is recommended to write a "long-poll server" in something like Python's twisted, which does not rely on one thread per request. cometD is an popular one (which is available in several languages), and Tornado is a new framework made specifically for such tasks (it was built for FriendFeed's long-polling code)... but as a simple example, Apache is more than adequate! This script could easily be written in any language (I chose Apache/PHP as they are very common, and I happened to be running them locally)

Then, in Javascript, you request the above file (msg_srv.php), and wait for a response. When you get one, you act upon the data. Then you request the file and wait again, act upon the data (and repeat)

What follows is an example of such a page.. When the page is loaded, it sends the initial request for the msgsrv.php file.. If it succeeds, we append the message to the #messages div, then after 1 second we call the waitForMsg function again, which triggers the wait.

The 1 second setTimeout() is a really basic rate-limiter, it works fine without this, but if msgsrv.php always returns instantly (with a syntax error, for example) - you flood the browser and it can quickly freeze up. This would better be done checking if the file contains a valid JSON response, and/or keeping a running total of requests-per-minute/second, and pausing appropriately.

If the page errors, it appends the error to the #messages div, waits 15 seconds and then tries again (identical to how we wait 1 second after each message)

The nice thing about this approach is it is very resilient. If the clients internet connection dies, it will timeout, then try and reconnect - this is inherent in how long polling works, no complicated error-handling is required

Anyway, the long_poller.htm code, using the jQuery framework:

<html>
<head>
    <title>BargePoller</title>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.2.6/jquery.min.js" type="text/javascript" charset="utf-8"></script>

    <style type="text/css" media="screen">
      body{ background:#000;color:#fff;font-size:.9em; }
      .msg{ background:#aaa;padding:.2em; border-bottom:1px #000 solid}
      .old{ background-color:#246499;}
      .new{ background-color:#3B9957;}
    .error{ background-color:#992E36;}
    </style>

    <script type="text/javascript" charset="utf-8">
    function addmsg(type, msg){
        /* Simple helper to add a div.
        type is the name of a CSS class (old/new/error).
        msg is the contents of the div */
        $("#messages").append(
            "<div class='msg "+ type +"'>"+ msg +"</div>"
        );
    }

    function waitForMsg(){
        /* This requests the url "msgsrv.php"
        When it complete (or errors)*/
        $.ajax({
            type: "GET",
            url: "msgsrv.php",

            async: true, /* If set to non-async, browser shows page as "Loading.."*/
            cache: false,
            timeout:50000, /* Timeout in ms */

            success: function(data){ /* called when request to barge.php completes */
                addmsg("new", data); /* Add response to a .msg div (with the "new" class)*/
                setTimeout(
                    waitForMsg, /* Request next message */
                    1000 /* ..after 1 seconds */
                );
            },
            error: function(XMLHttpRequest, textStatus, errorThrown){
                addmsg("error", textStatus + " (" + errorThrown + ")");
                setTimeout(
                    waitForMsg, /* Try again after.. */
                    15000); /* milliseconds (15seconds) */
            }
        });
    };

    $(document).ready(function(){
        waitForMsg(); /* Start the inital request */
    });
    </script>
</head>
<body>
    <div id="messages">
        <div class="msg old">
            BargePoll message requester!
        </div>
    </div>
</body>
</html>
41
votes

I've got a really simple chat example as part of slosh.

Edit: (since everyone's pasting their code in here)

This is the complete JSON-based multi-user chat using long-polling and slosh. This is a demo of how to do the calls, so please ignore the XSS problems. Nobody should deploy this without sanitizing it first.

Notice that the client always has a connection to the server, and as soon as anyone sends a message, everyone should see it roughly instantly.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Copyright (c) 2008 Dustin Sallings <[email protected]> -->
<html lang="en">
  <head>
    <title>slosh chat</title>
    <script type="text/javascript"
      src="http://code.jquery.com/jquery-latest.js"></script>
    <link title="Default" rel="stylesheet" media="screen" href="style.css" />
  </head>

  <body>
    <h1>Welcome to Slosh Chat</h1>

    <div id="messages">
      <div>
        <span class="from">First!:</span>
        <span class="msg">Welcome to chat. Please don't hurt each other.</span>
      </div>
    </div>

    <form method="post" action="#">
      <div>Nick: <input id='from' type="text" name="from"/></div>
      <div>Message:</div>
      <div><textarea id='msg' name="msg"></textarea></div>
      <div><input type="submit" value="Say it" id="submit"/></div>
    </form>

    <script type="text/javascript">
      function gotData(json, st) {
        var msgs=$('#messages');
        $.each(json.res, function(idx, p) {
          var from = p.from[0]
          var msg = p.msg[0]
          msgs.append("<div><span class='from'>" + from + ":</span>" +
            " <span class='msg'>" + msg + "</span></div>");
        });
        // The jQuery wrapped msgs above does not work here.
        var msgs=document.getElementById("messages");
        msgs.scrollTop = msgs.scrollHeight;
      }

      function getNewComments() {
        $.getJSON('/topics/chat.json', gotData);
      }

      $(document).ready(function() {
        $(document).ajaxStop(getNewComments);
        $("form").submit(function() {
          $.post('/topics/chat', $('form').serialize());
          return false;
        });
        getNewComments();
      });
    </script>
  </body>
</html>
32
votes

Tornado is designed for long-polling, and includes a very minimal (few hundred lines of Python) chat app in /examples/chatdemo , including server code and JS client code. It works like this:

  • Clients use JS to ask for an updates since (number of last message), server URLHandler receives these and adds a callback to respond to the client to a queue.

  • When the server gets a new message, the onmessage event fires, loops through the callbacks, and sends the messages.

  • The client-side JS receives the message, adds it to the page, then asks for updates since this new message ID.

25
votes

I think the client looks like a normal asynchronous AJAX request, but you expect it to take a "long time" to come back.

The server then looks like this.

while (!hasNewData())
    usleep(50);

outputNewData();

So, the AJAX request goes to the server, probably including a timestamp of when it was last update so that your hasNewData() knows what data you have already got. The server then sits in a loop sleeping until new data is available. All the while, your AJAX request is still connected, just hanging there waiting for data. Finally, when new data is available, the server gives it to your AJAX request and closes the connection.

17
votes

Here are some classes I use for long-polling in C#. There are basically 6 classes (see below).

  1. Controller: Processes actions required to create a valid response (db operations etc.)
  2. Processor: Manages asynch communication with the web page (itself)
  3. IAsynchProcessor: The service processes instances that implement this interface
  4. Sevice: Processes request objects that implement IAsynchProcessor
  5. Request: The IAsynchProcessor wrapper containing your response (object)
  6. Response: Contains custom objects or fields
15
votes

This is a nice 5-minute screencast on how to do long polling using PHP & jQuery: http://screenr.com/SNH

Code is quite similar to dbr's example above.

12
votes

Here is a simple long-polling example in PHP by Erik Dubbelboer using the Content-type: multipart/x-mixed-replace header:

<?

header('Content-type: multipart/x-mixed-replace; boundary=endofsection');

// Keep in mind that the empty line is important to separate the headers
// from the content.
echo 'Content-type: text/plain

After 5 seconds this will go away and a cat will appear...
--endofsection
';
flush(); // Don't forget to flush the content to the browser.


sleep(5);


echo 'Content-type: image/jpg

';

$stream = fopen('cat.jpg', 'rb');
fpassthru($stream);
fclose($stream);

echo '
--endofsection
';

And here is a demo:

http://dubbelboer.com/multipart.php

11
votes

I used this to get to grips with Comet, I have also set up Comet using the Java Glassfish server and found lots of other examples by subscribing to cometdaily.com

9
votes

Take a look at this blog post which has code for a simple chat app in Python/Django/gevent.

9
votes

Below is a long polling solution I have developed for Inform8 Web. Basically you override the class and implement the loadData method. When the loadData returns a value or the operation times out it will print the result and return.

If the processing of your script may take longer than 30 seconds you may need to alter the set_time_limit() call to something longer.

Apache 2.0 license. Latest version on github https://github.com/ryanhend/Inform8/blob/master/Inform8-web/src/config/lib/Inform8/longpoll/LongPoller.php

Ryan

abstract class LongPoller {

  protected $sleepTime = 5;
  protected $timeoutTime = 30;

  function __construct() {
  }


  function setTimeout($timeout) {
    $this->timeoutTime = $timeout;
  }

  function setSleep($sleep) {
    $this->sleepTime = $sleepTime;
  }


  public function run() {
    $data = NULL;
    $timeout = 0;

    set_time_limit($this->timeoutTime + $this->sleepTime + 15);

    //Query database for data
    while($data == NULL && $timeout < $this->timeoutTime) {
      $data = $this->loadData();
      if($data == NULL){

        //No new orders, flush to notify php still alive
        flush();

        //Wait for new Messages
        sleep($this->sleepTime);
        $timeout += $this->sleepTime;
      }else{
        echo $data;
        flush();
      }
    }

  }


  protected abstract function loadData();

}
8
votes

Thanks for the code, dbr. Just a small typo in long_poller.htm around the line

1000 /* ..after 1 seconds */

I think it should be

"1000"); /* ..after 1 seconds */

for it to work.

For those interested, I tried a Django equivalent. Start a new Django project, say lp for long polling:

django-admin.py startproject lp

Call the app msgsrv for message server:

python manage.py startapp msgsrv

Add the following lines to settings.py to have a templates directory:

import os.path
PROJECT_DIR = os.path.dirname(__file__)
TEMPLATE_DIRS = (
    os.path.join(PROJECT_DIR, 'templates'),
)

Define your URL patterns in urls.py as such:

from django.views.generic.simple import direct_to_template
from lp.msgsrv.views import retmsg

urlpatterns = patterns('',
    (r'^msgsrv\.php$', retmsg),
    (r'^long_poller\.htm$', direct_to_template, {'template': 'long_poller.htm'}),
)

And msgsrv/views.py should look like:

from random import randint
from time import sleep
from django.http import HttpResponse, HttpResponseNotFound

def retmsg(request):
    if randint(1,3) == 1:
        return HttpResponseNotFound('<h1>Page not found</h1>')
    else:
        sleep(randint(2,10))
        return HttpResponse('Hi! Have a random number: %s' % str(randint(1,10)))

Lastly, templates/long_poller.htm should be the same as above with typo corrected. Hope this helps.

8
votes

This is one of the scenarios that PHP is a very bad choice for. As previously mentioned, you can tie up all of your Apache workers very quickly doing something like this. PHP is built for start, execute, stop. It's not built for start, wait...execute, stop. You'll bog down your server very quickly and find that you have incredible scaling problems.

That said, you can still do this with PHP and have it not kill your server using the nginx HttpPushStreamModule: http://wiki.nginx.org/HttpPushStreamModule

You setup nginx in front of Apache (or whatever else) and it will take care of holding open the concurrent connections. You just respond with payload by sending data to an internal address which you could do with a background job or just have the messages fired off to people that were waiting whenever the new requests come in. This keeps PHP processes from sitting open during long polling.

This is not exclusive to PHP and can be done using nginx with any backend language. The concurrent open connections load is equal to Node.js so the biggest perk is that it gets you out of NEEDING Node for something like this.

You see a lot of other people mentioning other language libraries for accomplishing long polling and that's with good reason. PHP is just not well built for this type of behavior naturally.

4
votes

Why not consider the web sockets instead of long polling? They are much efficient and easy to setup. However they are supported only in modern browsers. Here is a quick reference.

3
votes

The WS-I group published something called "Reliable Secure Profile" that has a Glass Fish and .NET implementation that apparently inter-operate well.

With any luck there is a Javascript implementation out there as well.

There is also a Silverlight implementation that uses HTTP Duplex. You can connect javascript to the Silverlight object to get callbacks when a push occurs.

There are also commercial paid versions as well.

2
votes

For a ASP.NET MVC implementation, look at SignalR which is available on NuGet.. note that the NuGet is often out of date from the Git source which gets very frequent commits.

Read more about SignalR on a blog on by Scott Hanselman

2
votes

You can try icomet(https://github.com/ideawu/icomet), a C1000K C++ comet server built with libevent. icomet also provides a JavaScript library, it is easy to use as simple as

var comet = new iComet({
    sign_url: 'http://' + app_host + '/sign?obj=' + obj,
    sub_url: 'http://' + icomet_host + '/sub',
    callback: function(msg){
        // on server push
        alert(msg.content);
    }
});

icomet supports a wide range of Browsers and OSes, including Safari(iOS, Mac), IEs(Windows), Firefox, Chrome, etc.

0
votes

Simplest NodeJS

const http = require('http');

const server = http.createServer((req, res) => {
  SomeVeryLongAction(res);
});

server.on('clientError', (err, socket) => {
  socket.end('HTTP/1.1 400 Bad Request\r\n\r\n');
});

server.listen(8000);

// the long running task - simplified to setTimeout here
// but can be async, wait from websocket service - whatever really
function SomeVeryLongAction(response) {
  setTimeout(response.end, 10000);
}

Production wise scenario in Express for exmaple you would get response in the middleware. Do you what you need to do, can scope out all of the long polled methods to Map or something (that is visible to other flows), and invoke <Response> response.end() whenever you are ready. There is nothing special about long polled connections. Rest is just how you normally structure your application.

If you dont know what i mean by scoping out, this should give you idea

const http = require('http');
var responsesArray = [];

const server = http.createServer((req, res) => {
  // not dealing with connection
  // put it on stack (array in this case)
  responsesArray.push(res);
  // end this is where normal api flow ends
});

server.on('clientError', (err, socket) => {
  socket.end('HTTP/1.1 400 Bad Request\r\n\r\n');
});

// and eventually when we are ready to resolve
// that if is there just to ensure you actually 
// called endpoint before the timeout kicks in
function SomeVeryLongAction() {
  if ( responsesArray.length ) {
    let localResponse = responsesArray.shift();
    localResponse.end();
  }
}

// simulate some action out of endpoint flow
setTimeout(SomeVeryLongAction, 10000);
server.listen(8000);

As you see, you could really respond to all connections, one, do whatever you want. There is id for every request so you should be able to use map and access specific out of api call.