Multiple CGI Perl scripts

Question

This is kind of a theoretical question. I'm trying to develop a Perl application based on the producer-consumer paradigm. One of the scripts creates a file with data, while the other reads the data and has to present it in an HTML page. There's also a third file, an HTML form, that starts the producer Perl file.

What I don't know is how to run both the producer and the consumer at the same time using CGI, and I couldn't find information about it online (at least not how I searched for it).

I would like to know if you could tell me where to find this kind of information so I could test the app in the Apache server.

Thanks in advance

If you are concerned about separation of concerns, don't go with CGI. Watch this lightning talk and then use a PSGI/Plack webframework. It's more modern, easier to implement and maintain, and you can still do your separation. The key point is you separate using modules that produce and consume. Then you bundle them in one application. If you still want it to be CGI, bundle those modules in one CGI script. But don't try to do multiple programs. That's going to be hell to set up and maintain. It's not how stuff is done. — simbabque
@simbabque, I agree that CGI is an obsolete technology (and btw thanks for the video), but I think my issue can't be solved only by approaching CGI, but instead the HTML form and the Perl scripts. I just wondered if there was a way of starting both scripts at the same time. The HTML form only starts the producer because there I could only fill the <a> tag with one URL, but is there a way of starting the other script also? — tulians
No, there is not. Only if you put them behind a single point of entry. Essentially, turn them into modules. Or make one script that forks out to one, waits for it to complete, and then forks out to the other. But that would (to say it with Sawyer's words) really require double-gloving. ;) — simbabque
What I'm trying to say is, you can have a single point of entry that serves all things, including the HTML. That can be a CGI app, but it could also be a PSGI app. In any case, the idea that one web request triggers two processes, where one waits for the other to finish and then sends the response is wrong. Web doesn't work that way. If you insist on working with patterns, work with MVC instead. — simbabque
You need one feature to store data. You need another feature to view the stored data. Why wouldn't you use some sort of database? Then your database engine could handle all the thorny stuff about file locking, concurrent access, etc. SQLite might be appropriate. — Mitch Jackson

simbabque · Accepted Answer · 2016-08-31T22:24:09

Disclaimer: I think what this question boils down to is how to have two different components of a program interact with each other to create one application that is accessible from the web. If that is not what you want, just treat this as food for thought.

Common Gateway Interface

You are talking about CGI scripts in your question. (Emphasis mine).

I'm trying to develop a Perl application based on the producer-consumer paradigm. One of the scripts creates a file with data, while the other reads the data and has to present it in an HTML page.

In general, CGI works in a way that a request goes through a web server, and is passed on to an application. That application might be written in Perl. If it is a Perl script, then that script is run by the perl interpreter. The web server starts that process. It can access the request information through the CGI, which is mostly environment variables. When the process is done, it writes data to STDOUT, which the web server takes as a response and sends back.

+-----------+        +-------------+                     +----------------+
|           | +----> |             | +-----Request-----> |                |
|  Browser  |        | Web server  |                     |  perl foo.cgi  |
|           | <----+ |             | <-----Response----+ |                |
+-----------+        +-------------+                     +----------------+
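
As a rough illustration of that cycle, here is a minimal sketch of a CGI program that uses no modules at all. The helper name build_response is made up for this example; REQUEST_METHOD and QUERY_STRING are standard CGI environment variables that the web server sets before it starts the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: builds the complete response (headers plus body)
# from a copy of the environment the web server set up for this request.
sub build_response {
    my (%env) = @_;

    my $method = $env{REQUEST_METHOD} // 'unknown';
    my $query  = $env{QUERY_STRING}   // '';

    my $body = "method: $method\nquery: $query\n";

    # A blank line separates the headers from the body.
    return "Content-Type: text/plain\r\n\r\n" . $body;
}

# Whatever goes to STDOUT is what the web server sends back to the browser.
print build_response(%ENV);
```

The point is only that one process reads the request from its environment and writes one response to STDOUT. In a real script you would normally let a module handle the details.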

Because the web server starts exactly one process per request and reads the response from that process's STDOUT, a single request cannot be served by two separate scripts. The server has no way to talk to two programs for the same request. That's not how CGI works.

A combined approach

Instead, you need to wrap your two scripts into a single point of entry and turn them into some kind of components. Then you can have them talk to each other internally, while on the outside the web server is only waiting for one program to finish.

+-----------+        +-------------+                     +-----------------+
|           | +----> |             | +-----Request-----> |                 |
|  Browser  |        | Web server  |                     |  perl foo.cgi   |
|           | <----+ |             | <-----Response----+ |                 |
+-----------+        +-------------+                     | +-------------+ |
                                                         | |  Producer   | |
                                                         | +-----+-------+ |
                                                         |       |         |
                                                         |       |         |
                                                         |       V         |
                                                         | +-------------+ |
                                                         | | Consumer    | |
                                                         | +-------------+ |
                                                         |                 |
                                                         +-----------------+

To translate this into Perl, let's first determine some terminology.

script: a Perl program that is in a .pl file and that does not have its own package
module: a Perl module that is in a .pm file and that has a package whose namespace matches the file name

Let's assume you have these two Perl scripts that we call producer.pl and consumer.pl. They are heavily simplified and do not take any arguments into account.

producer.pl

#!/usr/bin/perl
use strict;
use warnings 'all';
use CGI;

open my $fh, '>', 'product.data' or die $!;
print $fh "lots of data\n";
close $fh;

consumer.pl

#!/usr/bin/perl
use strict;
use warnings 'all';
use CGI;

my $q = CGI->new;
print $q->header('text/plain');

open my $fh, '<', 'product.data' or die $!;
while (my $line = <$fh>) {
    print $line;
}

exit;

This is as simplified as it gets. There is one script that creates data and one that consumes it. Now we need to make these two interact without actually running them.

Let's jump ahead and assume that we have already refactored both of these scripts and turned them into modules. We'll see how that works a bit later. We can now use those modules in our new foo.pl script. It will process the request, ask the producer for the data and let the consumer turn the data into the format the reader wants.

foo.pl

#!/usr/bin/perl
use strict;
use warnings 'all';
use Producer; # this was producer.pl
use Consumer; # this was consumer.pl
use CGI;

my $q = CGI->new;

my $params; # those would come from $q and are the parameters for the producer

my $product = Producer::produce($params);
my $output = Consumer::consume($product);

print $q->header;
print $output;

exit;

This is very straightforward. We read the parameters from the CGI, pass them to the producer, and pass the product to the consumer. That gives us output, which we print out so it goes back to the server, which sends a response.

Let's take a look at how we turned the two scripts into simple modules. Those do not need to be object oriented, though that might be preferred. Note that the spelling of the file names is now different. Module names conventionally start with capital letters.

Producer.pm

package Producer;
use strict;
use warnings 'all';

sub produce {
    my @args = @_;

    return "lots of data\n";
}

1;

Consumer.pm

package Consumer;
use strict;
use warnings 'all';

sub consume {
    my ($data) = @_;

    return $data; # this is really simple
}

1;

Now we have two modules that do the same as the scripts if you call the right function. All I did was put a namespace (package) at the top and wrap the code in a sub. I also removed the CGI part.

In our example, it's not necessary for the producer to write to a file. It can just return the data structure. The consumer in turn doesn't need to read from a file. It just takes a variable with the data structure and does stuff to it to present it.
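
To make that concrete, here is a small sketch where the producer hands back an array reference and the consumer formats it. The field names (name, value) and the sample rows are invented for illustration, and both functions live in one file only to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical producer: returns a data structure instead of writing a file.
sub produce {
    return [
        { name => 'alpha', value => 1 },
        { name => 'beta',  value => 2 },
    ];
}

# Hypothetical consumer: turns the structure into plain text for display.
sub consume {
    my ($rows) = @_;
    return join '', map { "$_->{name}=$_->{value}\n" } @$rows;
}

print consume( produce() );
```

No file is written or read; the data only ever exists in memory for the duration of the request.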

If you stick to consistent function names (like produce and consume, just better), you can even write multiple producers or consumers. We have basically defined an interface here. That gives us the possibility to refactor the internals of the code without breaking compatibility, but also to stick in completely different producers or consumers. You can switch from the one-line-string producer to one that looks up stuff in a database in a heartbeat, as long as you stick to your interface.
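
A sketch of what swapping components could look like, assuming every producer offers a produce function and every consumer a consume function. The package names (Producer::Static, Producer::Clock, Consumer::Upper) and run_pipeline are hypothetical, and the packages are kept in one file for brevity rather than in separate .pm files. The package block syntax needs Perl 5.14 or newer.

```perl
use strict;
use warnings;

package Producer::Static {
    sub produce { return "lots of data\n" }
}

package Producer::Clock {
    # A different producer behind the same interface.
    sub produce { return "generated at " . localtime() . "\n" }
}

package Consumer::Upper {
    sub consume { my ($data) = @_; return uc $data }
}

package main;

# Look up produce/consume in whichever packages were named and chain them.
sub run_pipeline {
    my ( $producer, $consumer, @args ) = @_;

    my $product = $producer->can('produce')->(@args);
    return $consumer->can('consume')->($product);
}

print run_pipeline( 'Producer::Static', 'Consumer::Upper' );
print run_pipeline( 'Producer::Clock',  'Consumer::Upper' );
```

The calling code never changes. Only the package name passed to run_pipeline decides which implementation does the work.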

Essentially, what we just did can also be shown like this:

+--foo.pl---------------------------+
|                                   |
|  +------+        +-------------+  |
|  |      | +----> |             |  |
|  |      |        |  Producer   |  |
|  |      | <----+ |             |  |
|  | main |        +-------------+  |
|  | foo  |                         |
|  | body |        +-------------+  |
|  |      | +----> |             |  |
|  |      |        |  Consumer   |  |
|  |      | <----+ |             |  |
|  +------+        +-------------+  |
|                                   |
+-----------------------------------+

This might look slightly familiar. It is essentially the Model-View-Controller (MVC) pattern. In a web context, the model and the view often only talk to each other through the controller, but it's pretty much the same.

Our producer is a data model. The consumer turns the data into a website that the user can see, so it's the view. The main program inside foo.pl that glues both of them together controls the flow of data. It's the controller.

The initial website that triggers the whole thing could be either part of the program, and be shown if no parameters are passed, or could be a stand-alone .html file. That's up to you.
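
If you go with the first option, the decision could look roughly like this sketch. The render_page function, the topic parameter, and the stand-in produce and consume functions are all assumptions made up for the example. In a real script the parameters would come from the request and you would still need to print a header.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Producer::produce and Consumer::consume.
sub produce { my ($params) = @_; return "data about $params->{topic}\n" }

sub consume {
    my ($data) = @_;

    # Escape characters that are special in HTML before embedding the data.
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $data =~ s/([&<>"])/$entity{$1}/g;
    return "<pre>$data</pre>\n";
}

# No parameters: show the form. Otherwise: run producer and consumer.
sub render_page {
    my ($params) = @_;

    if ( !%$params ) {
        return <<'HTML';
<form method="get" action="foo.pl">
  <input type="text" name="topic">
  <input type="submit" value="Produce">
</form>
HTML
    }

    return consume( produce($params) );
}

print render_page( {} );                     # prints the form
print render_page( { topic => 'perl' } );    # prints the consumer's output
```

This keeps the form and the result behind the same single point of entry, so the form's action can point at the script itself.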

All of this is possible with plain old CGI. You don't need to use any web frameworks for it. But as you grow your application, you'll see that a modern framework makes your life easier.

The diagrams were created with http://asciiflow.com/
