Here is a possible example of how to proxy outbound HTTP requests from an App Engine Standard application on the Node.js runtime through a Compute Engine VM running Squid. It is based on a slight modification of the available Google Cloud Platform documentation 1 2 and Quickstarts 3.
1. Create a Serverless VPC Access connector: Basically follow 2 to create the connector. After updating the gcloud components and enabling the Serverless VPC Access API on your project, running the following command should suffice:
gcloud compute networks vpc-access connectors create [CONNECTOR_NAME] \
--network [VPC_NETWORK] \
--region [REGION] \
--range [IP_RANGE]
2. Create a Compute Engine VM to use as a proxy: Basically follow 1 to set up a Squid proxy server:
a. Reserve a static external IP address and assign it to a Compute Engine VM.
b. Add a Firewall rule to allow traffic on Squid's default port: 3128. This command should work if you are using the default VPC network: gcloud compute firewall-rules create [FIREWALL_RULE_NAME] --network default --allow tcp:3128
c. Install Squid on the VM with the following command: sudo apt-get install squid3
d. Enable the acl localnet src entries in the Squid config file for the VPC Access connector:
sudo sed -i 's:#\(http_access allow localnet\):\1:' /etc/squid/squid.conf
sudo sed -i 's:#\(acl localnet src [IP_RANGE]/28.*\):\1:' /etc/squid/squid.conf
For example, if you used 10.8.0.0 as the value for the [IP_RANGE] field when creating the connector, the second command should look something like: sudo sed -i 's:#\(acl localnet src 10.8.0.0/28.*\):\1:' /etc/squid/squid.conf
e. Start the server with sudo service squid start
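To make it clearer which source addresses the acl localnet src [IP_RANGE]/28 entry from step 2.d lets through, here is a minimal sketch that expands a CIDR block into its first and last address. The function names (ipToInt, intToIp, cidrRange, inCidr) are made up for illustration and are not part of Squid or gcloud; the 10.8.0.0/28 value is the example range used above.

```javascript
'use strict';

// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  const parts = ip.split('.');
  const valid = parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  return parts.reduce((acc, p) => acc * 256 + Number(p), 0);
}

// Convert an unsigned 32-bit integer back to dotted-quad notation.
function intToIp(n) {
  return [24, 16, 8, 0]
    .map((shift) => Math.floor(n / 2 ** shift) % 256)
    .join('.');
}

// Return the first address, last address and size of a CIDR block.
function cidrRange(cidr) {
  const [base, prefixText] = cidr.split('/');
  const prefix = Number(prefixText);
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new Error(`Invalid prefix length in: ${cidr}`);
  }
  const size = 2 ** (32 - prefix);
  // Round the base address down to the start of the block.
  const start = Math.floor(ipToInt(base) / size) * size;
  return { first: intToIp(start), last: intToIp(start + size - 1), size };
}

// True if `ip` falls inside the block described by `cidr`.
function inCidr(ip, cidr) {
  const { first, last } = cidrRange(cidr);
  const n = ipToInt(ip);
  return n >= ipToInt(first) && n <= ipToInt(last);
}

console.log(cidrRange('10.8.0.0/28'));
```

A /28 block covers 16 addresses, so for the example range the ACL matches clients from 10.8.0.0 through 10.8.0.15. Requests arriving from the connector should show up in the Squid log with a client address inside that block.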
3. Modifications to the App Engine application: Based on the Quickstart for Node.js, modify the following files to create an application that crawls a webpage using the request-promise library and displays the HTML of the webpage. With the changes to the app.yaml and app.js files, the request is sent to the webpage through the VPC Access connector, using the VM as a proxy.
a. package.json
...
"test": "mocha --exit test/*.test.js"
},
"dependencies": {
"express": "^4.16.3",
"request": "^2.88.0",
"request-promise": "^4.2.5"
},
"devDependencies": {
"mocha": "^7.0.0",
...
b. app.js
'use strict';
// [START gae_node_request_example]
const express = require('express');
const app = express();
app.get('/', (req, res) => {
  res
    .status(200)
    .send('Hello, world!')
    .end();
});
// Add a handler to test the web crawler
app.get('/test', (req, res) => {
  var request = require('request-promise');
  request('http://www.input-your-awesome-website.com')
    .then(function (htmlString) {
      res.send(htmlString)
        .end();
    })
    .catch(function (err) {
      res.send("Crawling Failed...")
        .end();
    });
});
// Start the server
const PORT = process.env.PORT || 8080;
app.listen(PORT, () => {
  console.log(`App listening on port ${PORT}`);
  console.log('Press Ctrl+C to quit.');
});
// [END gae_node_request_example]
c. app.yaml
runtime: nodejs10
vpc_access_connector:
  name: "projects/[PROJECT]/locations/[REGION]/connectors/[CONNECTOR_NAME]"
env_variables:
  HTTP_PROXY: "http://[Compute-Engine-IP-Address]:3128"
  HTTPS_PROXY: "http://[Compute-Engine-IP-Address]:3128"
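The app.js above never mentions the proxy; it relies on the HTTP client picking up the HTTP_PROXY and HTTPS_PROXY variables set in app.yaml. As a rough illustration of what that means for a plain HTTP request, here is a sketch using only Node's built-in http module. It is a simplified model and not the request library's actual code: the helper names (proxyFor, proxiedGetOptions, fetchViaProxy) are hypothetical, NO_PROXY handling is left out, HTTPS targets (which need a CONNECT tunnel) are not covered, and the address 10.128.0.2 is a placeholder.

```javascript
'use strict';
const http = require('http');

// Pick the proxy URL for a target from the environment.
// Simplified assumption: http targets use HTTP_PROXY, https targets use
// HTTPS_PROXY, and NO_PROXY is ignored.
function proxyFor(targetUrl, env = process.env) {
  const { protocol } = new URL(targetUrl);
  if (protocol === 'http:') {
    return env.HTTP_PROXY || env.http_proxy || null;
  }
  if (protocol === 'https:') {
    return env.HTTPS_PROXY || env.https_proxy || null;
  }
  return null;
}

// Build http.request options for a GET sent through a forward proxy.
// The connection goes to the proxy, and the request line carries the
// absolute URL of the target so the proxy knows where to forward it.
function proxiedGetOptions(targetUrl, proxyUrl) {
  const target = new URL(targetUrl);
  const proxy = new URL(proxyUrl);
  if (target.protocol !== 'http:') {
    throw new Error('This sketch only handles plain http targets');
  }
  return {
    host: proxy.hostname,
    port: Number(proxy.port) || 80,
    method: 'GET',
    path: target.href,
    headers: { Host: target.host },
  };
}

// Fetch a page through the proxy configured in the environment.
function fetchViaProxy(targetUrl, env = process.env) {
  const proxyUrl = proxyFor(targetUrl, env);
  if (!proxyUrl) {
    return Promise.reject(new Error('No proxy configured for ' + targetUrl));
  }
  return new Promise((resolve, reject) => {
    const req = http.request(proxiedGetOptions(targetUrl, proxyUrl), (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve({ status: res.statusCode, body }));
    });
    req.on('error', reject);
    req.end();
  });
}

// Placeholder environment, mirroring the env_variables block in app.yaml.
const exampleEnv = { HTTP_PROXY: 'http://10.128.0.2:3128' };
console.log(proxiedGetOptions('http://www.example.com/',
  proxyFor('http://www.example.com/', exampleEnv)));
```

Calling fetchViaProxy('http://www.example.com/') inside the deployed app would then send the request to Squid on port 3128, assuming the VM address in HTTP_PROXY is reachable through the connector.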
To verify the setup, run sudo tail -f /var/log/squid/access.log on the VM and visit the /test handler: each visit should add a new entry to the log, which confirms that the requests go through the proxy.
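If you would rather check the log programmatically than read it by eye, the following sketch splits an access.log line into named fields. It assumes Squid is still using its default native log format (timestamp, elapsed time, client address, result/status, size, method, URL, and so on); the function names (parseAccessLogLine, clientsSeen) and the sample line are made up for illustration.

```javascript
'use strict';

// Parse one line of Squid's native access.log format into an object.
// Returns null when the line does not have the expected number of fields.
function parseAccessLogLine(line) {
  const fields = line.trim().split(/\s+/);
  if (fields.length < 10) {
    return null;
  }
  const [time, elapsed, client, resultStatus, bytes, method, url] = fields;
  const [result, status] = resultStatus.split('/');
  return {
    time: Number(time),        // Unix timestamp with milliseconds
    elapsedMs: Number(elapsed),
    client,                    // source address seen by Squid
    result,                    // e.g. TCP_MISS
    status: Number(status),    // HTTP status returned to the client
    bytes: Number(bytes),
    method,
    url,
  };
}

// Collect the distinct client addresses found in a list of log lines.
function clientsSeen(lines) {
  const clients = new Set();
  for (const line of lines) {
    const entry = parseAccessLogLine(line);
    if (entry) {
      clients.add(entry.client);
    }
  }
  return clients;
}

// Hypothetical sample line, not taken from a real log.
const sample = '1579712345.678     52 10.8.0.3 TCP_MISS/200 1270 GET ' +
  'http://www.example.com/ - HIER_DIRECT/93.184.216.34 text/html';
console.log(parseAccessLogLine(sample));
```

If the client field of the new entries falls inside the connector's /28 range, the request most likely came from the App Engine application through the Serverless VPC Access connector.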
Notes: The connector, application, and VM need to be in the same region to work, and these are the supported regions for the connector.
gcloud app deploy instead of gcloud beta app deploy. – Daniel Ocando
sudo sed -i 's:#\(acl localnet src [SERVERLESS_VPC_ACCESS_CONNECTOR_IP]/28.*\):\1:' /etc/squid/squid.conf to add your Serverless VPC Access connector to the Squid server's ACL, and then restart the server for the changes to take effect. – Daniel Ocando
gcloud compute firewall-rules create NAME --network default --allow tcp:3128 (change NAME to the name you want for the firewall rule). – Daniel Ocando