1
votes

So currently i have an application where i'm storing location data (lat,lng) along with other fields and who not. So what i love about mysql or sql in general is that i can get geospatial queries easily. e.g. select all rows that fall within a given radius and center point.

What i love about dynamodb is that it's damn near infinitely scalable on AWS, which is the service i'll be using, and fast. I would love to move all my data over to dynamodb and even insert new data there But i wouldn't be able to use those geospatial queries which is the most important part of my application. It's required.

I know about the geolibrary for dynamodb but its written in java and my backend is written in php so thats a no go, plus they don't seem to update or maintain that library.

One solution i was thinking of was to store just the coordinates in mysql and store the corresponding id along with the other data (including the lat and long values) in dynamodb.

With this i could achieve the geospatial query functionality i want while being able to scale everything well on amazon specifically because thats the host i'm using.

So basically i'd query all POIs within a given radius from mysql and with all the ids i'd use that to get all results from dynamodb. Sounds crazy or what?

But the potential downside of this is having to query one data source and then querying another one immediately after with the result from the first query. Maybe I'm over thinking and underestimating how fast these technologies have become.

So to sum up my requirements:

Must be on AWS

Must be able to perform geospatial queries

Must be able to connect to dynamodb and MySQL in PHP

Any help or suggestions would be greatly appreciated.

2

2 Answers

1
votes

My instinct says, don't use 2 datasources, only if you have a really specific case.

How much data do you have? Is MySQL ( or Aurora) really can't handle it? If your application is read heavy, it can easily scale with read replicas.

I have a few ideas for you which may brings you at least a bit closer:

  1. Why don't you implement your own geo-library in php? :D
  2. You can do a dummy search in the DB, where you not filtering by actual distance, but with an upper and lower boundary in lat. and long. ( So you not searching in a circle, but in a square. Then it's on you if your application is fine with it, or it filters the result, but that would be a much smaller dataset and an easy filter.
0
votes

Maybe CloudSearch can help you out. It offers geo spatial queries on lat long fields. It works well together with DynamoDB, and it has a PHP SDK (never tried that though, I use nodejs)

You write the items that have lat,long fields to DynamoDB. Each item (or item update/deletion) is uploaded to CloudSearch automatically via a DynamoDB stream. So now you have "automatic copies" of your DynamoDB items in CloudSearch, and you can use all query capabilities of CloudSearch, including geo queries (one limitation, it only queries in boxes, not in circles, so you will need some extra math)

You will need to create a DynamoDB stream that triggers a Lambda function that uploads every item to CloudSearch. You set this up once, and it will do its magic "forever".

This approach will only work if you accept a small delay between the moment you are writing in DynamoDB and the moment it is available in CloudSearch.

With this approach you still have 2 datasources, but they are entirely separated from the perspective of your app. One datasource is for querying and the other one for writing. Keeping them in sync is done automatically for you in the AWS cloud. Your app writes to DynamoDB, and queries from CloudSearch. And you have the scalability advantages that these AWS services offer.