In web development, we can test our apps with tools and methodologies such as unit testing (JUnit, RSpec, ...), TDD, BDD, Cucumber, end-to-end/regression/integration tests, H2 (as an in-process database), ...
But in the Hadoop and Big Data world, how do you test Hadoop/Hive/Pig code? By that I mean automating the following scenario: given a sample input, when I trigger a Hive or Pig script, then I verify that the output is as expected.
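To make the scenario concrete, here is a rough sketch of the shape of test I have in mind. It is plain Java with no Hadoop dependency: the `ScriptRunner` interface and the `wordCount` stand-in are hypothetical placeholders I made up for illustration, not a real Hadoop, Hive, or Pig API. What I am asking is what should take their place.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ScriptOutputTest {

    /** Hypothetical placeholder for "run a Hive or Pig script over some input lines". */
    public interface ScriptRunner {
        Map<String, Long> run(List<String> inputLines);
    }

    /** Stand-in for the real script: a plain in-memory word count. */
    public static Map<String, Long> wordCount(List<String> inputLines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : inputLines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    /** Given a sample input, when the script runs, then the output is as expected. */
    public static void check(ScriptRunner runner) {
        // Given: a small, fixed sample input
        List<String> input = Arrays.asList("a b", "b c b");

        // When: the script under test is triggered
        Map<String, Long> actual = runner.run(input);

        // Then: the output matches the expected result
        Map<String, Long> expected = new TreeMap<>();
        expected.put("a", 1L);
        expected.put("b", 3L);
        expected.put("c", 1L);
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        check(ScriptOutputTest::wordCount);
        System.out.println("output matches expectation");
    }
}
```

In a real setup, `check` would be a JUnit test and the runner would execute the actual script against a small local dataset instead of this in-memory stand-in.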
In more detail: is there a way to get quick feedback from these automated tests? More specifically, how can I run an in-memory HDFS? In Java with SQL databases, we use H2 to get this kind of quick feedback.
Or, more broadly, what testing strategies do people use on the Hadoop platform?