
In my use case, I am hitting the 150-value limit on an SNS subscription filter policy, as described under "Filter policy constraints" in the SNS filter policy documentation. In total, I expect to have roughly 500 to 1,500 values that I would like to use as inclusion criteria in my filter policy.

There also appears to be a limit of one filter policy per SNS subscription: applying a second filter policy JSON via the CLI set-subscription-attributes command overwrites the first one. Finally, per my reading of the sns subscribe documentation, there appears to be a limit of one subscription to an SNS topic per subscribing resource (such as an SQS queue). Running the CLI subscribe command multiple times for the same topic and queue returned the same subscription ARN each time.

So my only options seem to be: (1) add another SQS queue whenever I hit the 150-value limit, with each queue getting its own subscription to the SNS topic, or (2) switch to a coarser filter policy that stays below the 150-value limit, which would be less precise for my use case, and do the remaining filtering inside my subscriber app.

I did not find any SO threads on this. Am I missing something, or has anyone found a better way around the 150-value filter policy limit via the AWS CLI or SDK?

Additional background info: The subscriber app is an existing prod service that produces data quality metrics on newly arriving instances of Parquet datasets. These datasets live in an enterprise S3 data lake and have been on-boarded to this division-level service. As part of on-boarding a lake dataset, we add it to the filter policy of our subscription to a data lake SNS topic. For every put of a lake dataset instance, this topic publishes a list of dataset attributes (S3 bucket, key, dataset name, timestamp, etc.) to subscribers, spanning thousands of datasets and a large number of buckets. We do not control this enterprise-level SNS topic, but we can subscribe to it.

Currently, our subscriber app sees one message per day per on-boarded dataset. The app runs in an auto-scaling group that scales on the number of messages visible in our SQS queue, and it has some functionality to discard non-conforming messages. We recently hit the filter policy limit when we attempted to extend the service to additional datasets in the lake. I am leaning toward changing our filter policy to include only messages for puts to our division-level S3 buckets, then doing dataset-level filtering inside the app. I still have to see how this impacts the auto-scaling.
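The in-app filtering idea described above could look roughly like the following Python sketch. It assumes raw message delivery is disabled on the subscription, so each SQS message body is the SNS JSON envelope with a MessageAttributes map; the attribute names dataset and _SUCCESS are taken from the filter policy in this question, while ONBOARDED_DATASETS and is_onboarded are hypothetical names used only for illustration.

```python
import json

# Hypothetical in-app allow list; in practice this could be loaded from
# configuration and hold far more than 150 dataset names.
ONBOARDED_DATASETS = {"datasetname_1", "datasetname_5", "datasetname_256"}


def is_onboarded(sqs_body, onboarded=ONBOARDED_DATASETS):
    """Return True if the SNS envelope in an SQS body is for an on-boarded dataset.

    Assumes raw message delivery is disabled, so the body is a JSON envelope
    whose MessageAttributes entries look like {"Type": ..., "Value": ...}.
    """
    envelope = json.loads(sqs_body)
    attrs = envelope.get("MessageAttributes", {})
    dataset = attrs.get("dataset", {}).get("Value")
    success = attrs.get("_SUCCESS", {}).get("Value")
    return dataset in onboarded and success == "True"
```

Messages that fail this check would be deleted from the queue without further processing. They still count toward the visible-message metric that drives scaling, which is the auto-scaling impact mentioned above.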

SUBSCRIPTION_ARN=$(aws sns --profile myProfile subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:mySNS --protocol sqs --notification-endpoint arn:aws:sqs:us-east-1:999999999999:mySQS --return-subscription-arn --query SubscriptionArn --output text)
aws sns --profile myProfile set-subscription-attributes --subscription-arn "$SUBSCRIPTION_ARN" --attribute-name FilterPolicy --attribute-value file:///myUser/github/repo/filter_policy1.json

where filter_policy1.json is limited to 150 dataset values and takes the form:

{
    "dataset": [
      "datasetname_1",
      "datasetname_5",
      "datasetname_256"
    ],
    "_SUCCESS": [
      "True"
    ]
}
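For either multi-subscription workaround (more queues, or some other set of subscribers), the full dataset list has to be split into policies of at most 150 values each. A minimal Python sketch of that split follows; it assumes the limit is governed by the number of dataset values because _SUCCESS contributes a single value, and build_filter_policies is a hypothetical helper name.

```python
import json

MAX_VALUES_PER_POLICY = 150  # limit cited in the question


def build_filter_policies(datasets, max_values=MAX_VALUES_PER_POLICY):
    """Split dataset names into filter policies of at most max_values each."""
    unique = sorted(set(datasets))  # drop duplicates, keep a stable order
    return [
        {"dataset": unique[i:i + max_values], "_SUCCESS": ["True"]}
        for i in range(0, len(unique), max_values)
    ]


# Example: 500 on-boarded datasets need four policies (150 + 150 + 150 + 50).
names = [f"datasetname_{n}" for n in range(1, 501)]
policies = build_filter_policies(names)
print(len(policies))
print(json.dumps(policies[-1])[:60])
```

Each resulting policy would then be written to its own JSON file and attached to a separate subscription with set-subscription-attributes, as in the commands above.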
What is your use-case that requires so many filters? Can you explain what you are wanting to achieve, without specifically referring to how you want to achieve it? For example, are you wanting to send messages to large groups of users, or individual users? How often are you sending these messages? What subscription types are they (app, SMS)? Please edit your question to add these details rather than replying in a comment. - John Rotenstein
@JohnRotenstein - I added some additional context, hoping it is helpful for a response. - BilboC

1 Answer


Just to close this off for now...

Current:

SNS Topic => SQS Subscriber (150 value limit)

Solution: we decided to insert multiple Lambda subscribers between the SNS topic and SQS. Each Lambda has its own subscription (and therefore its own 150-value filter policy) and runs identical code that writes the SNS events it receives to our SQS queue. It is not great, but it allows our app to support several hundred datasets and keep the existing app architecture until we need something more scalable.

SNS Topic => lambda1 Subscriber (1st 150 datasets) => SQS
SNS Topic => lambda2 Subscriber (2nd 150 datasets) => SQS
...
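For reference, the forwarding code each Lambda runs could be sketched roughly as below. This is not our exact code: the QUEUE_URL environment variable and the forward_records helper are hypothetical names, and the sketch assumes the standard SNS-to-Lambda event shape (a Records list whose items carry an Sns object) and that boto3 is available in the Lambda Python runtime.

```python
import json
import os


def forward_records(event, send):
    """Pass each SNS record in a Lambda event to send() and return the count.

    send is any callable that accepts a message body string, which keeps the
    forwarding logic independent of the SQS client.
    """
    count = 0
    for record in event.get("Records", []):
        sns = record["Sns"]
        body = json.dumps({
            "Message": sns["Message"],
            "MessageAttributes": sns.get("MessageAttributes", {}),
        })
        send(body)
        count += 1
    return count


def handler(event, context):
    """Lambda entry point: forward SNS events to the queue named in QUEUE_URL."""
    import boto3  # imported lazily; assumed present in the Lambda runtime

    sqs = boto3.client("sqs")
    queue_url = os.environ["QUEUE_URL"]  # hypothetical environment variable
    return forward_records(
        event,
        lambda body: sqs.send_message(QueueUrl=queue_url, MessageBody=body),
    )
```

Because every Lambda writes to the same queue, the downstream app and its auto-scaling stay unchanged; only the per-Lambda filter policies differ.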