I am currently working on a recommender application, and I am using Cassandra with Hadoop and Pig for map/reduce jobs. To take advantage of the properties of column names, our team has decided to store data using valueless columns and aggregate column names. For example, all hits for a specific content item are stored in a column family with a single row, and each column is a hit for that content, using the following structure:
rowkey = 'single_row' {
id_content:hit_date, -
.
.
.
}
With this schema we obtain wide rows instead of skinny ones. The question is: how do I need to manipulate the data in Pig in order to store it in Cassandra with this schema?
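
To make the target layout concrete, here is a rough sketch in plain Python (not Pig) of the transformation I have in mind: each hit becomes a column whose name is `id_content:hit_date` with an empty value, and all columns are collected under one row key. The function name `to_wide_rows`, the sample hits, and the use of an empty string as the "no value" marker are assumptions made only for illustration. The `(key, [(name, value), ...])` output shape is my guess at what a Cassandra storage function in Pig would want, not something I have confirmed.

```python
from collections import defaultdict

ROW_KEY = "single_row"  # row key taken from the schema above


def to_wide_rows(hits, row_key=ROW_KEY):
    """Group (id_content, hit_date) pairs into (row_key, [(column_name, value), ...]).

    Hypothetical helper: the column name is the aggregate "id_content:hit_date"
    and the value is left empty, mirroring the valueless-column schema.
    """
    rows = defaultdict(list)
    for id_content, hit_date in hits:
        column_name = f"{id_content}:{hit_date}"
        rows[row_key].append((column_name, ""))  # empty value is an assumption
    # Sort column names so the sketch is deterministic.
    return [(key, sorted(columns)) for key, columns in rows.items()]


# Made-up sample data, for illustration only.
hits = [("c42", "2013-01-02"), ("c42", "2013-01-01"), ("c7", "2013-01-03")]
wide_rows = to_wide_rows(hits)
for key, columns in wide_rows:
    print(key, columns)
```

In Pig terms I imagine this corresponds to building the column name with a string concatenation and then grouping everything under a constant key, but I am not sure that is the right approach.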