I'm planning to host a website and want to use HBase as my DB. The website is a photo sharing/hosting site, and I do not want to use an RDBMS. I want to gain experience with hosting, learn HBase, and see the issues that web developers and backend designers face and fix.

In short, I want to create and host a website in Python + HBase in order to learn them.

I have experience with EC2 and S3, and I will be using AWS as the infrastructure.

What I'm thinking of reserving:

  • 3 default (1.7 GB) instances for HBase
  • 3 more for webserver + memcached if necessary

I want to figure out whether this is enough to start with. Over time I will run benchmarks, optimise the code, and buy larger instances (if I can afford them) as required.

For now, do the above specs look okay for a target of 1000 users?

Besides uploading their own pictures, users will be viewing photographs and adding comments. Assume that each user uploads 20 photos per week on average.

I'm looking for answers like: "No, HBase can run on just one medium-size instance for a thousand users..." or "Oh my god, only 3 default servers for 1000 users..."

1 Answer

1000 users total, or 1000 users concurrent? Your setup should be fine for 1000 users total.
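As a rough back-of-envelope check of the write load, here is a minimal sketch using the figures from the question (1000 users, 20 uploads per user per week). The average photo size of 2 MB and the function name are assumptions made purely for illustration, not numbers from the question.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def weekly_upload_load(users, photos_per_user_per_week, avg_photo_mb):
    """Estimate average upload volume; ignores peaks, reads, and comments."""
    photos_per_week = users * photos_per_user_per_week
    return {
        "photos_per_week": photos_per_week,
        "photos_per_second": photos_per_week / SECONDS_PER_WEEK,
        "storage_gb_per_week": photos_per_week * avg_photo_mb / 1024,
    }


# avg_photo_mb=2.0 is an assumed value; plug in your own measurement.
load = weekly_upload_load(users=1000, photos_per_user_per_week=20, avg_photo_mb=2.0)
print(load)
```

That works out to 20,000 uploads per week, or roughly one upload every 30 seconds on average. Real traffic will be bursty and reads will outnumber writes, so treat this as a lower bound rather than a benchmark.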

As for HBase on EC2: I would highly recommend running HBase on Elastic MapReduce (EMR) (http://aws.typepad.com/aws/2012/06/apache-hbase-on-emr.html). Doing it that way will save you hours of configuring your own EC2 cluster. Believe me, I've done it both ways and can't recommend the EMR approach enough. :)