1
votes

What I'm Trying To Achieve

Given a top level URL such as http://www.myapp.com/

If somebody requests:

http://www.myapp.com/questions/id/1

I want to get the "questions/id/1" part and pass it to the appropriate Perl script to retrieve the required resource.

What I've Achieved

Currently I know how to do it if there's an extra level in the url, like so:

http://www.myapp.com/model/questions/id/1

Where "model" is simply a typical Perl CGI script, except that it has no ".pl" extension. The Apache Perl handler is configured to run it, and the "questions/id/1" part is interpreted as the path info (available via CGI->path_info()) passed to "model".
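The path-info mechanism described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual "model" script: it reads the PATH_INFO environment variable directly (the value that CGI->path_info() exposes), and the helper name path_segments is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a path-info string such as "/questions/id/1"
# into its non-empty segments.
sub path_segments {
    my ($path_info) = @_;
    $path_info = '' unless defined $path_info;
    return grep { length } split m{/}, $path_info;
}

# For a request to /model/questions/id/1, where "model" is the CGI script,
# the server is expected to set PATH_INFO to "/questions/id/1".
my @segments = path_segments($ENV{PATH_INFO});

print "Content-Type: text/plain\r\n\r\n";
print "Segments: @segments\n";
```

The segments could then be used to pick the resource type ("questions") and the lookup key ("id", "1").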

I got this idea from Andrew Hanenkamp's article on the subject on onlamp.com: Developing RESTful Web Services in Perl

What's Still Wrong

This approach, however, doesn't work for top-level URLs, since there's no place to put "model" or whatever the handler is called. I tried changing the default document setting in the config so that http://www.myapp.com/ defaults to http://www.myapp.com/model, in the hope that a request for http://www.myapp.com/questions/id/1 would be handled as http://www.myapp.com/model/questions/id/1

However, Apache thinks this is a 404 error.

How can I accomplish this? Is this even a job for Perl or better handled at the Apache config level or maybe tapping into some Apache API to catch 404 errors and extract the "/questions/id/1" part from there? I don't know. Maybe someone here knows. ;-)

Note: Apologies for the long-winded and roundabout way of asking this question. I'm not quite sure what terminology is required to elicit an on-topic answer. Mentioning a four-letter word starting with R and ending with T while asking this question previously, here and elsewhere, has resulted in heated discussions on the nature of said four-letter word, and also in well-meaning but unhelpful suggestions to use some framework or another. I'm just curious to know how these frameworks implement this feature, and I want to implement one in Perl to learn how to do it.

Update: One of the suggested answers has led me to try mod_rewrite. It works. However, it's rather impractical to edit Apache's conf manually for each mapping. Further searching has led me to the O'Reilly book "Practical mod_perl". In it, Bekman and Cholet write about accessing mod_rewrite programmatically via the Apache CPAN package to set the rewrite (Appendix 10: "mod_rewrite in Perl"). I'm writing some code now to test the book's idea. In the meantime, if there are better ways to accomplish this, please do share.

Conclusion

After prototyping two URI routing / dispatcher (or whatchamacallit) scripts, one relying on mod_rewrite to call it and the other on ErrorDocument, I've come to the conclusion that mod_rewrite is the correct way to implement this.

The ErrorDocument method is wrong because it does not preserve POST data from the redirected request. That makes it a no-go: sending a POST request to http://www.myapp.com/questions to create a new resource is impossible, as the POST data is unavailable to the custom error handler script called by ErrorDocument. Furthermore, since each request by definition does not map to a physical file, Apache's error log will fill up with spurious 404 "file not found" errors. This can be a huge problem if the transaction volume of the server is high. Thanks to everyone for pointing me in the right direction. Now my curiosity is satiated.

3
If anyone thinks the question is not worded correctly, do suggest changes. I'm stumped about the correct terminology to use. Thanks. - GeneQ
you could add 'CGI' to the title - 'using Perl' is a bit vague! - plusplus
Both your answers (yko, evil otto) are fine and valid approaches to getting this done. :-) But, all things being equal, I'm leaning towards rewrite at this point because I think anticipating an error in order to reroute is an aesthetically less appealing method. Let's let the community vote on these different approaches before deciding which answer to choose. - GeneQ

3 Answers

1
votes

I strongly advise using a web development framework. This will require a bit of learning but, by the time you have finished, you will realise that your current approach is far from ideal. There are many simple frameworks out there that will let you get up and running quickly. I suggest using Dancer (www.perldancer.org). In Dancer, setting routes is as simple as the example below.

get '/' => sub { 'Hello world!' };

As an aside, it is very unlikely that you want to be using CGI. CGI typically has very poor performance and is basically defunct. I suggest you look into FastCGI and PSGI, both of which are available in Apache.

1
votes

You can catch the 404 error and handle it with a CGI script. In your top-level .htaccess file (docroot/.htaccess):

ErrorDocument 404 /cgi/my-script.sh

Then in your handler script (my-script.sh) the original path will be passed as the environment variable REDIRECT_URL. But make sure your handler script sets the HTTP status correctly by adding a Status: 200 OK header. Here's a sample CGI script (/cgi/my-script.sh):

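#!/bin/sh

# Override the 404 status that triggered this handler.
echo "Status: 200 OK"
echo "Content-type: text/plain"
# A blank line ends the CGI headers.
echo
# REDIRECT_URL holds the path that was originally requested.
echo "Your url was $REDIRECT_URL"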

(I'm not recommending using Bourne shell for writing CGI scripts, but it illustrates the method.)
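Since the question is about Perl, the same handler might look roughly like this in Perl. This is a sketch under the same assumptions as the shell version (the script is invoked through ErrorDocument 404 and REDIRECT_URL is set); the function name error_document_response is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the CGI response for a request that reached this script through
# "ErrorDocument 404". REDIRECT_URL holds the originally requested path.
sub error_document_response {
    my ($env) = @_;
    my $url = defined $env->{REDIRECT_URL} ? $env->{REDIRECT_URL} : '(unknown)';
    return "Status: 200 OK\r\n"            # override the 404 status
         . "Content-Type: text/plain\r\n"
         . "\r\n"                          # blank line ends the headers
         . "Your url was $url\n";
}

print error_document_response(\%ENV);
```

Taking the environment as a hash reference keeps the response logic separate from %ENV, so it can be exercised without a web server.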

You can put the ErrorDocument setting in your httpd.conf file also, but any change then requires a server restart; putting it in .htaccess makes it much easier to tweak at runtime.

Apache reference on custom error handling: http://httpd.apache.org/docs/2.1/custom-error.html

If you are using mod_perl, it's easier and cleaner (not to mention better performing) to set it up as a perl handler; again in the top-level .htaccess file (or httpd.conf):

SetHandler perl-script
PerlHandler MyThing::HandlerPackage

where MyThing::HandlerPackage is the perl module that defines an appropriate handler sub.

1
votes

You can define rewrite rules in an .htaccess file or in the Apache configuration file. See the mod_rewrite documentation for details.

In your case the .htaccess file might look like this:

AddHandler cgi-script .cgi
Options +ExecCGI

RewriteEngine on

RewriteRule ^(questions/id/.*)$ model.cgi/$1 [L]

This should work if you only need to handle /questions/id/ requests. But you may want to handle all requests via your Perl application, so passing every request to your app could be a good idea. Something like:

RewriteRule ^(.*)$ app.cgi/$1 [L]

And let your application dispatch the request then. Except... there's always some exception. You need to let Apache serve static files for you, and you need Apache not to serve your application files (executables and configs) as static files. So most modern Perl frameworks store static files in a separate directory, 'public' for example. You put all your images, CSS, JS, etc. in /public and configure Apache in the following way:

 RewriteCond %{DOCUMENT_ROOT}/public/%{REQUEST_URI} -f
 RewriteRule ^(.*) public/$1 [L]

 RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_URI} !-f
 RewriteRule ^(.*) myapp.cgi [L] 

This example is taken from the Mojolicious documentation.

Now Apache sends a static file to the user if the request matches something that exists in 'public', and passes all other requests to your app.

In your app you need to dispatch requests to actions. In practice you'll want a module that does this job for you. Routes::Tiny (available on CPAN), written by Viacheslav Tykhanovskyi, is a nice choice.
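To show what such dispatching amounts to, here is a hand-rolled sketch in plain Perl. It is not the Routes::Tiny API; the route table, the handler bodies, and the dispatch function are all hypothetical and only illustrate the idea of matching a method and path against a list of patterns.

```perl
use strict;
use warnings;

# Hypothetical route table: each entry pairs an HTTP method and a path
# regex with a handler. Captured groups are passed to the handler.
my @routes = (
    [ GET  => qr{^/questions/id/(\d+)$}, sub { "show question $_[0]" } ],
    [ GET  => qr{^/questions/?$},        sub { "list questions" } ],
    [ POST => qr{^/questions/?$},        sub { "create question" } ],
);

# Return the result of the first matching handler, or undef if no route
# matches (the caller would then send a 404 response).
sub dispatch {
    my ($method, $path) = @_;
    for my $route (@routes) {
        my ($route_method, $pattern, $handler) = @$route;
        next unless $method eq $route_method;
        # In list context a successful match returns the captures
        # (or a single 1 when the pattern has no capture groups).
        if (my @captures = $path =~ $pattern) {
            return $handler->(@captures);
        }
    }
    return;
}
```

In a CGI script reached through a rewrite to app.cgi/$1, this could be called as dispatch($ENV{REQUEST_METHOD}, $ENV{PATH_INFO}), assuming the server populates PATH_INFO with the part after the script name.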

After all, if you want to know how frameworks work, why not dive into one of them? Mojolicious and Dancer are too big, but you could look into something smaller, like Lamework, which works over PSGI, a protocol that is becoming standard nowadays.