I'm new to this Big data world. As a course project, I'm working on sensor networks and want to store sensor data on HBase. Currently data are stored in MySQL database. I'm trying to load this data into HBase. But data are growing so fast and querying on that is getting very slow.
Here is the MYSQL table schema: SensorLog(sensorID, userID,time,date).
So this tables saves a sensor firing logs. For each user (45 users in total) there are 25 motion sensors in his apartment. Every time a user moves in his apartment a sensor will fire, and this event will be logged into this table. The main question is what sensors fired for a specific user in a specific time interval and day.
I came up with threeHBase schema, and I'm just wondering to ask your opinion about them. In these schema, I present time as seconds in a day, i.e. an integer number in the range of 0-86400.
Schema1: Rowkey: Date; Column-Family: Time { cq:(t0-t86400); cv:(userID,sensorID)}
Schema2: Rowkey: (Date,userID); Column-Family: Time { cq:(t0-t86400); cv:(sensorID)}
Schema3: Rowkey: (Date,userID); Column-Family: Time { cq:(s1-s25); cv:(time)}
Would you please let me know which schema is better and more efficient? I appreciate any help in advance.