I'm trying to design an IoT system that will have many IoT devices sending different types of sensor data to a front-end load balancing server that then sends the messages to an ingestion system (currently thinking Google Cloud PubSub). Then there are consumers that consume the messages and write them to different databases and tables. Each sensor data type has its own database.
Where should fanout happen?
BEFORE pubsub system: If the frontend does the fanout, then it has to be scaled up enough to have the processing power to look at the content of each message and figure out which topic to send it to. I would then have a separate topic for each message type and a consumer for each topic.
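A rough sketch of the routing step the frontend would have to run per message in this option. It assumes messages are JSON with a `type` field; the topic names and the `route` helper are hypothetical and not part of any Pub/Sub API.

```python
import json

# Hypothetical mapping from sensor data type to topic name.
TOPIC_BY_TYPE = {
    "temperature": "sensor-temperature",
    "humidity": "sensor-humidity",
}

# Hypothetical catch-all topic for types the frontend does not recognize.
UNKNOWN_TOPIC = "sensor-unknown"


def route(raw: bytes) -> str:
    """Parse a raw JSON message and return the topic it should be published to."""
    message = json.loads(raw)  # this parse is the per-message cost the frontend pays
    return TOPIC_BY_TYPE.get(message.get("type"), UNKNOWN_TOPIC)
```

The point is that the frontend has to parse every payload before it can publish, which is the extra processing I'm worried about scaling.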
AFTER pubsub system: If I only have a single topic that the frontend just shoves all messages into regardless of their type, then that topic's consumer needs to be scaled to consume and process each message to determine which database to write to. It would also mean that this one consumer's code needs access to all the databases.
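A minimal sketch of what that single consumer would look like, again assuming JSON messages with a `type` field. The in-memory lists are stand-ins for the real databases, and `handle` is a hypothetical name.

```python
import json
from typing import Callable, Dict, List

# In-memory stand-ins for the per-type databases (hypothetical).
temperature_db: List[dict] = []
humidity_db: List[dict] = []

# The single consumer needs a writer (and therefore credentials) for every database.
WRITERS: Dict[str, Callable[[dict], None]] = {
    "temperature": temperature_db.append,
    "humidity": humidity_db.append,
}


def handle(raw: bytes) -> bool:
    """Dispatch one message to the writer for its type; return False if the type is unknown."""
    message = json.loads(raw)
    writer = WRITERS.get(message.get("type"))
    if writer is None:
        return False
    writer(message)
    return True
```

Here the `WRITERS` table is exactly the "access to all the databases" problem: one process holds a connection to every store.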
INSIDE pubsub system: Have pubsub do the fanout, so that even though publishers only publish to one topic, there are several subscriptions for that topic (one for each data type), and each consumer consumes from its own subscription and drops all the messages that are not the data type it is meant to consume. It seems like Kafka might be a better fit for this.
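To make the cost of this option concrete, here is a small in-memory simulation (not real Pub/Sub code). The `Topic` class and `consume` function are hypothetical; they only model the behavior described above, where every subscription receives a copy of every message and each consumer discards what it does not want.

```python
from typing import Dict, List


class Topic:
    """Toy model of a topic whose subscriptions each receive every published message."""

    def __init__(self) -> None:
        self.subscriptions: Dict[str, List[dict]] = {}

    def subscribe(self, name: str) -> None:
        self.subscriptions[name] = []

    def publish(self, message: dict) -> None:
        # Fanout: one copy of the message per subscription.
        for queue in self.subscriptions.values():
            queue.append(message)


def consume(queue: List[dict], wanted_type: str) -> List[dict]:
    """Keep only messages of the wanted type; everything else is dropped."""
    return [m for m in queue if m["type"] == wanted_type]


topic = Topic()
topic.subscribe("temperature-sub")
topic.subscribe("humidity-sub")
for msg in (
    {"type": "temperature", "value": 21.5},
    {"type": "humidity", "value": 40},
    {"type": "temperature", "value": 22.0},
):
    topic.publish(msg)

kept = consume(topic.subscriptions["humidity-sub"], "humidity")
```

With N data types, each message is delivered N times and N-1 of those deliveries are thrown away, so each consumer only needs access to its own database but pays for traffic it never uses.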