0
votes

We have a use case where on any action from UI we need to read messages from google pub/sub Topic A synchronously and move those messages to Topic B.

Below is the code that has been written to handle this behavior and this is from Google Pub Sub docs to access a Topic synchronusly.

   public static int subscribeSync(String projectId, String subscriptionId, Integer numOfMessages, int count, String acknowledgementTopic) throws IOException {
    SubscriberStubSettings subscriberStubSettings =
            SubscriberStubSettings.newBuilder()
            .setTransportChannelProvider(
                    SubscriberStubSettings.defaultGrpcTransportProviderBuilder()
                    .setMaxInboundMessageSize(20 * 1024 * 1024) // 20MB (maximum message size).
                    .build())
            .build(); 

    try (SubscriberStub subscriber = GrpcSubscriberStub.create(subscriberStubSettings)) {
        String subscriptionName = ProjectSubscriptionName.format(projectId, subscriptionId);
        PullRequest pullRequest =
                PullRequest.newBuilder()
                .setMaxMessages(numOfMessages)
                .setSubscription(subscriptionName)
                .build(); 

        // Use pullCallable().futureCall to asynchronously perform this operation.
        PullResponse pullResponse = subscriber.pullCallable().call(pullRequest);
        List<String> ackIds = new ArrayList<>();
        for (ReceivedMessage message : pullResponse.getReceivedMessagesList()) {
            // START - CODE TO PUBLISH MESSAGE TO TOPIC B
            **publishMessage(message.getMessage(),acknowledgementTopic,projectId);**
            // END - CODE TO PUBLISH MESSAGE TO TOPIC B
            ackIds.add(message.getAckId());
        }
        // Acknowledge received messages.
        AcknowledgeRequest acknowledgeRequest =
                AcknowledgeRequest.newBuilder()
                .setSubscription(subscriptionName)
                .addAllAckIds(ackIds)
                .build();
        
        // Use acknowledgeCallable().futureCall to asynchronously perform this operation.
        subscriber.acknowledgeCallable().call(acknowledgeRequest);
        count=pullResponse.getReceivedMessagesList().size();
    }catch(Exception e) {
        log.error(e.getMessage());
    }
    return count;
}

Below is the sample code to publish messages to Topic B

public static void publishMessage(PubsubMessage pubsubMessage,String Topic,String projectId) {
    Publisher publisher = null;
    ProjectTopicName topicName =ProjectTopicName.newBuilder().setProject(projectId).setTopic(Topic).build();
    try {
        // Publish the messages to normal topic.
        publisher = Publisher.newBuilder(topicName).build();
    } catch (IOException e) {
        log.error(e.getMessage());
    }

    publisher.publish(pubsubMessage);

}

Is this the right way of handling this use case or this can be handled in someother way. We do not want to use Cloud Dataflow. Can someone let us know if this is fine or there is an issue. The code works but sometimes messages stay on Topic A even after hey are consumed synchronously. Thanks'

1
Can you explain more your use case and why you need to get the message from A and to publish to B then?guillaume blaquiere
@guillaume blaquiere This usecase is a requirement in which the messages have to moved from Topic A to B based on UI action like a button click.vikeng21

1 Answers

2
votes

There are some issues with the code as presented.

  1. You should really only use synchronous pull if there are specific reasons why you need to do so. In general, it is much better to use asynchronous pull via the client libraries. It will be more efficient and reduce the latency of moving messages from one topic to the other. You do not show how you call subscribeSync, but in order to process messages efficiently and ensure that you actually process all messages, you'd need to be calling it many times in parallel continuously. If you are going to stick with synchronous pull, then you should reuse the SubscriberStub object as recreating it for every call will be inefficient.

  2. You don't reuse your Publisher object. As a result, you are not able to take advantage of the batching that the publisher client can do. You should create the Publisher once and reuse it across your calls for publishes to the same topic. If the passed-in topic can differ across messages, then keep a map from topic to publisher and retrieve the right one from the map.

  3. You don't wait for the result of the call to publish. It is possible that this call fails, but you do not handle that failure. As a result, you could acknowledge the message on the first topic without it having actually been published, resulting in message loss.

With regard to your question about duplicates, Pub/Sub offers at-least-once delivery guarantees, so even with proper acking, it is still possible to receive messages again (typical duplicate rates are around 0.1%). There can be many different reasons for duplicates. In your case, since you are processing messages sequentially and recreating a publisher for every call, it could be that later messages are not acked before the ack deadline expires, which results in redelivery.