1
votes

I'm iterating over 400,000 JSON messages that need to be sent from my Node.js Azure Function to Azure Service Bus. The Function is able to create the topic and start publishing messages.

It starts to go through the loop and publish messages. I see a couple thousand land in the topic before the publish fails with the following error:

{ 
    Error: getaddrinfo ENOTFOUND ABC.servicebus.windows.net ABC.servicebus.windows.net:443 at errnoException (dns.js:53:10) at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:95:26)
    code: 'ENOTFOUND',
    errno: 'ENOTFOUND',
    syscall: 'getaddrinfo',
    hostname: 'ABC.servicebus.windows.net',
    host: 'ABC.servicebus.windows.net',
    port: '443' 
}

My code to publish the messages composes a message, and pushes it through the JS API. The body of the message is a small JSON object:

{
    "adult":false,
    "id":511351,
    "original_title":"Nossa Carne de Carnaval",
    "popularity":0,
    "video":false
}

The method in my Azure Function that is pushing this message into the Service Bus is as follows:

function publishNewMovies(context, movies) {
    var azure = require('azure');
    var moment = require('moment');

    var topic = process.env["NewMovieTopic"];
    var connectionString = process.env["AZURE_SERVICEBUS_CONNECTION_STRING"];

    context.log("Configuring Service Bus.");
    return new Promise((resolve, reject) => {
        var serviceBusService = azure.createServiceBusService(connectionString);
        serviceBusService.createTopicIfNotExists(topic, function(error) {
            if (error) {
                context.log(`Failed to get the Service Bus topic`);
                reject(error);
                // Stop here so no messages are sent after a failed topic setup.
                return;
            }

            context.log("Service Bus setup.");

            // Delay the receiving of these messages by 5 minutes on any subscriber.
            var scheduledDate = moment.utc();
            scheduledDate.add(5, 'minutes');
            context.log("Sending new movie messages.");

            var message = {
                body: '',
                customProperties: { 
                    messageNumber: 0
                },
                brokerProperties: {
                    ScheduledEnqueueTimeUtc: scheduledDate.toString()
                }
            }

            // Declare the loop counters locally instead of leaking implicit globals.
            for (let index = 0; index < movies.length; index += 40) {
                message.brokerProperties.ScheduledEnqueueTimeUtc = scheduledDate.add(11, 'seconds').toString();
                for (let batchIndex = 0; batchIndex < 40; batchIndex++) {
                    var currentIndex = index + batchIndex;
                    if (currentIndex >= movies.length) {
                        break;
                    }

                    message.customProperties.messageNumber = currentIndex;
                    message.body = JSON.stringify(movies[currentIndex]);

                    serviceBusService.sendTopicMessage(topic, message, function(error) {
                        if (error) {
                            context.log(`Failed to send topic message for ${message.body}: ${error}`);
                            reject(error);
                        }
                    })
                }
            }
        });
    });
}

This creates messages that become visible starting 5 minutes after the first Service Bus push. I send a batch of 40 messages for that scheduled time. Once the first batch is done, I schedule the next 40 messages 11 seconds further into the future, and so on. This is because another Azure Function, still to be written, will listen to this Service Bus topic and make third-party API requests. I'm rate limited to 40 requests every 10 seconds.
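To make the schedule described above concrete, here is a minimal sketch of the arithmetic. It uses plain Date values instead of moment so it stands alone, and the names scheduledTimeFor, INITIAL_DELAY_MS, BATCH_SIZE and BATCH_SPACING_MS are made up for illustration; only the numbers (5 minutes, 40 messages, 11 seconds) come from the code above.

```javascript
// Values taken from the question: 5 minute initial delay,
// 40 messages per batch, 11 seconds between batches.
const INITIAL_DELAY_MS = 5 * 60 * 1000;
const BATCH_SIZE = 40;
const BATCH_SPACING_MS = 11 * 1000;

// Returns the Date at which message number `index` (0-based) becomes visible,
// given the time (in ms since the epoch) at which publishing started.
// The first batch is already offset by one spacing interval, matching the
// loop above, which adds 11 seconds before sending batch 0.
function scheduledTimeFor(startMs, index) {
    const batchNumber = Math.floor(index / BATCH_SIZE);
    return new Date(startMs + INITIAL_DELAY_MS + (batchNumber + 1) * BATCH_SPACING_MS);
}

// Example: when does the last of 400,000 messages become visible?
const lastVisible = scheduledTimeFor(Date.now(), 400000 - 1);
console.log(lastVisible.toISOString());
```

With these numbers, 400,000 messages form 10,000 batches, so the last batch becomes visible 110,300 seconds (roughly 30.6 hours) after publishing starts.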

I've read through the Azure Functions documentation on quotas and it looks like I might be hitting the 10,000 topic/queue limit. With this limit, how is someone supposed to push large quantities of messages into the bus? Setting up multiple namespaces to get around this seems backwards when I'm sending the same kind of message, just with different content; it belongs in the same namespace. The error that I'm receiving doesn't indicate that I'm hitting a limit. It sounds like it's failing to find the actual Service Bus endpoint for some reason.

Is the answer to this to handle partitioning? I can't find documentation on how to handle partitioning with Node.js. Most of their API documentation is in C#, which hasn't translated to Node.js well for me.

Edit to show Bus metrics

Bus Metrics: (image not available)

Topic Metrics: (image not available)

Have a look at the Azure Service Bus portal for details about the number of topics, size of the topics, number of messages, etc. You can also monitor your resource using its Metrics. It looks like you are hitting the limit of 10,000 topic entities per namespace at the Basic and Standard tiers. - Roman Kiss
You are sending all the messages to the same topic, so the 10,000 limit doesn't apply to you. A single topic can contain millions of messages (limited by max size, 1 GB by default). Isn't your function running for more than 5 minutes? - Mikhail Shilkov
@JohnathonSullinger, I would like to correct my previous comment. Mikhail is correct: based on your code, your Azure Function is using a single topic name configured in the app settings. Your JSON object is very small (let's assume ~2 KB), so 400,000 messages x 2 KB is less than 1 GB, which is the default size for a topic/queue entity. I think monitoring the metrics can give you more details about what happened and when. Try changing the topic name and running it again. - Roman Kiss
Azure Metrics doesn't show me any details other than that the error happened. The function isn't timing out, as it runs for less than 10 seconds before the error is thrown. The metrics on the topic don't show any failures, while the bus metrics show a single error; no details are provided. - Johnathon Sullinger

1 Answer

-1
votes

Can you elaborate on why you are creating so many topics? I'm referring to this line: var topic = process.env["NewMovieTopic"];

You can have one topic that receives millions of messages, which are then forwarded to the individual subscriptions, where you add filter criteria. There should be no need for so many topics.

Usually topics, subscriptions, and queues are created in the management plane (portal, ARM, PowerShell, or CLI), while runtime or data operations are performed by Functions, cloud apps, or VMs. Service Bus can likely handle your volume easily, unless you have a very specific reason for creating that many topics.
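One thing that may be worth trying, purely as an assumption on my part: the loop in the question starts every send without waiting for earlier ones to finish, so all 400,000 requests are issued at once, and an ENOTFOUND from getaddrinfo after a couple thousand sends could be a symptom of that rather than of a Service Bus quota. The sketch below caps the number of sends in flight. sendWithLimit is a hypothetical helper, not part of the azure package, and sendOne stands for any callback-style function such as a wrapper around serviceBusService.sendTopicMessage.

```javascript
// Hypothetical helper: sends every item through sendOne(item, callback),
// keeping at most `limit` calls in flight at any time.
// Calls done(error, completedCount) on the first failure,
// or done(null, completedCount) once every item has been sent.
function sendWithLimit(items, limit, sendOne, done) {
    let next = 0;        // index of the next item to start
    let inFlight = 0;    // sends started but not yet completed
    let completed = 0;   // sends completed successfully
    let finished = false;

    function finish(error) {
        if (finished) return;   // report the outcome only once
        finished = true;
        done(error, completed);
    }

    function pump() {
        // Start new sends until the limit is reached or no items remain.
        while (!finished && inFlight < limit && next < items.length) {
            const item = items[next++];
            inFlight++;
            sendOne(item, function (error) {
                inFlight--;
                if (error) return finish(error);
                completed++;
                if (completed === items.length) return finish(null);
                pump();   // a slot has freed up, so start the next item
            });
        }
    }

    if (items.length === 0) return finish(null);
    pump();
}

// Possible usage inside the function from the question (assumed wiring, untested):
// sendWithLimit(messages, 40,
//     (message, cb) => serviceBusService.sendTopicMessage(topic, message, cb),
//     (error) => (error ? reject(error) : resolve()));
```

Each item passed in would need to be its own message object rather than one shared object that is mutated in the loop, and the done callback gives the surrounding Promise a place to resolve once everything has been sent.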