Cloud Run supports WebSocket connections. They aren't permanent, long-lived connections (the timeout is 15 minutes in GA and up to 60 minutes in beta), but they do prevent Google from terminating a container instance as long as at least one WebSocket connection is alive in that container. Each container can hold up to 250 WebSocket connections (more generally, 250 concurrent HTTP connections).
This means your Java application can subscribe to a Google Cloud Pub/Sub topic as soon as it starts up and wait for messages, which it then relays to any (or all) WebSocket clients connected to that particular Cloud Run instance.
Google Cloud Pub/Sub supports a one-to-many pattern: a single message published to a topic is delivered to every subscription attached to that topic. If each Cloud Run container instance with active WebSocket connections has its own subscription, every instance receives every message. (Subscribers that share one subscription split the messages between them instead.)
- The Java app connects to the Pub/Sub topic when it starts up.
- The Java app accepts WebSocket connections.
- The Java app relays messages from its Pub/Sub subscription to the matching clients, applying your filtering logic to the message body.
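The relay step in the list above could be sketched roughly as follows. This is only a minimal illustration in plain Java: `LocalRelay`, the room ids, and the `Consumer<String>` standing in for "send a text frame to this WebSocket" are all hypothetical names of mine, and the actual Pub/Sub subscriber and WebSocket server wiring is left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Routes messages received by this instance to the WebSocket clients connected to it. */
class LocalRelay {
    // room id -> send callbacks of the clients connected to THIS container instance
    private final Map<String, Set<Consumer<String>>> clientsByRoom = new ConcurrentHashMap<>();

    /** Call when a WebSocket client connects and joins a room. */
    void register(String room, Consumer<String> client) {
        clientsByRoom.computeIfAbsent(room, r -> ConcurrentHashMap.newKeySet()).add(client);
    }

    /** Call when the WebSocket closes; drops the room entry once it is empty. */
    void unregister(String room, Consumer<String> client) {
        clientsByRoom.computeIfPresent(room, (r, clients) -> {
            clients.remove(client);
            return clients.isEmpty() ? null : clients;
        });
    }

    /**
     * Call from the Pub/Sub message callback after extracting the target room
     * from the message body. Returns how many local clients were reached;
     * 0 means the message was not relevant to this instance.
     */
    int relay(String room, String payload) {
        Set<Consumer<String>> clients = clientsByRoom.getOrDefault(room, Set.of());
        clients.forEach(client -> client.accept(payload));
        return clients.size();
    }
}
```

In a real app you would call `register` and `unregister` from your WebSocket open/close handlers and `relay` from the subscriber's message handler. How the room id is encoded in the message body is up to your filtering logic.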
So your design is feasible with Google Cloud Run (now that it supports WebSockets) and Google Cloud Pub/Sub. I do have some concerns, so I am listing them here.
My first concern is the 15-minute (60 minutes in beta) HTTP timeout that Google imposes on Cloud Run. Your clients' WebSocket connections will be dropped once they reach that limit, and you will need to handle reconnection. During that brief reconnection window some messages can be lost, so 100% guaranteed message delivery would be difficult to achieve.
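For the reconnection part, one common approach is to retry with a capped exponential backoff so that many clients dropped at the same moment don't hammer the service. The sketch below only computes the wait time; `ReconnectBackoff` and the 500 ms / 30 s values are assumptions of mine, and the same arithmetic applies if your clients are written in another language. Note that this does nothing about messages missed while disconnected.

```java
import java.time.Duration;

/** Computes how long a client should wait before reconnect attempt N (0-based). */
class ReconnectBackoff {
    private final Duration base; // delay before the first retry
    private final Duration cap;  // upper bound for any single delay

    ReconnectBackoff(Duration base, Duration cap) {
        this.base = base;
        this.cap = cap;
    }

    Duration delayFor(int attempt) {
        // Double the delay on each attempt; clamp the exponent so the shift cannot overflow.
        int exponent = Math.max(0, Math.min(attempt, 20));
        long millis = Math.min(base.toMillis() * (1L << exponent), cap.toMillis());
        return Duration.ofMillis(millis);
    }
}
```

With a base of 500 ms and a cap of 30 s, the delays would be 500 ms, 1 s, 2 s, 4 s and so on, levelling off at 30 s. Reset the attempt counter once a connection succeeds.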
My second concern (which you can probably defer until much later) is that, because of Pub/Sub's one-to-many fan-out architecture, a single message published to the topic is relayed to all subscribers, which means every Cloud Run container instance receives it.
If that message is meant for just one WebSocket in one of many containers, this wastes CPU and network resources (and therefore money), and the problem only grows as more Cloud Run containers run at the same time and message volume increases. Of course, you can create a topic for each container or each "chatroom", but that increases complexity, and I believe Google limits the number of topics you can have as well as the TPS of admin operations.
You might also want to take a look at Redis Pub/Sub, which lets you subscribe to specific topics (Redis calls them channels) with no topic create/destroy overhead. You could create a channel for each user or each "chatroom" and have your Java app subscribe based on what its connected WebSockets are interested in. This may solve the second concern above, since each container instance receives only the messages relevant to it. The tradeoff is that your Redis instance can become a bottleneck.
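To make "subscribe based on the connected WebSockets' interest" a bit more concrete, here is a rough sketch that counts local clients per channel, subscribing when the first one joins and unsubscribing when the last one leaves. `ChannelSubscriber` is a hypothetical stand-in for whatever Redis client you use (it would wrap that client's SUBSCRIBE and UNSUBSCRIBE commands); it is not a real library interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper around a Redis client's SUBSCRIBE / UNSUBSCRIBE commands. */
interface ChannelSubscriber {
    void subscribe(String channel);
    void unsubscribe(String channel);
}

/** Keeps this instance subscribed only to channels that local WebSocket clients care about. */
class ChannelInterest {
    private final Map<String, Integer> localClients = new HashMap<>();
    private final ChannelSubscriber redis;

    ChannelInterest(ChannelSubscriber redis) {
        this.redis = redis;
    }

    /** Call when a local WebSocket client joins a channel. */
    synchronized void clientJoined(String channel) {
        int count = localClients.merge(channel, 1, Integer::sum);
        if (count == 1) {
            redis.subscribe(channel); // first interested client on this instance
        }
    }

    /** Call when a local WebSocket client leaves a channel or disconnects. */
    synchronized void clientLeft(String channel) {
        Integer count = localClients.get(channel);
        if (count == null) {
            return; // nobody was registered for this channel
        }
        if (count == 1) {
            localClients.remove(channel);
            redis.unsubscribe(channel); // last interested client is gone
        } else {
            localClients.put(channel, count - 1);
        }
    }
}
```

The methods are `synchronized` to keep the count and the subscribe/unsubscribe call consistent. That is fine for a sketch, but you may want something finer-grained if joins and leaves are very frequent.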