I'm writing up Twitter4J as part of a uni project, but I'm getting hung up on a few things. From the Twitter4J API documentation:
void sample()
Starts listening on random sample of all public statuses. The default access level provides a small proportion of the Firehose. The "Gardenhose" access level provides a proportion more suitable for data mining and research applications that desire a larger proportion to be statistically significant sample.
This implies that the stream offers a "default" access level, and that a second, higher level called "Gardenhose" is also available. Is this correct? And if so, how do you obtain Gardenhose access?
I'm asking because some answers on SO suggest that there is only one level of access, the Gardenhose, and I'd like to clear this up once and for all.
In addition, I would like a reference (if possible) for the proportion of tweets the sample stream provides. I've seen many people cite 1% for "default access" and 10% for "Gardenhose access", but I can't find these figures anywhere in the API documentation.
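To make those figures concrete, here is a minimal sketch of what the commonly cited percentages would mean in absolute terms. The 1% and 10% values are the unverified numbers quoted above, and the firehose volume is a made-up placeholder, not a real Twitter statistic. The class and method names are hypothetical and are not part of Twitter4J.

```java
public class SampleShare {

    // Number of statuses a stream would deliver if it carried `percent`
    // percent of `firehoseCount` statuses (integer division, rounds down).
    static long expectedSampled(long firehoseCount, int percent) {
        return firehoseCount * percent / 100;
    }

    public static void main(String[] args) {
        // Hypothetical firehose volume, chosen only for illustration.
        long firehoseCount = 1_000_000L;

        // Unverified figures often quoted for the two access levels.
        int defaultPercent = 1;
        int gardenhosePercent = 10;

        System.out.println("default:    " + expectedSampled(firehoseCount, defaultPercent));
        System.out.println("gardenhose: " + expectedSampled(firehoseCount, gardenhosePercent));
    }
}
```

If the quoted percentages are right, the two levels would differ by a factor of ten in the number of statuses received, which is why I'd like a reference for the real values.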
So to sum up, two questions:
- Does the sample stream have a "default access" and a "gardenhose access", or just one of those?
- What proportion of the Twitter Firehose does each of these access levels provide?
If replying, please include links to citable API documentation where possible.