"2-bit predictor" could be referring to either of two things, but much more likely one than the other.
The unlikely possibility is that they mean a branch table with only four entries, so two bits are used to associated a particular branch with an entry in the table. That's unlikely because a 4-entry table is so small that lots of branches would share the same table entries, so the branch predictor wouldn't be much more accurate than static branch prediction (e.g., always predicting backward branches as taken, since they're typically used to form loops).
The much more like possibility is using two bits to indicate whether a branch is likely to be taken or not. Some of the earliest microprocessors that included branch prediction (e.g., Pentium, PowerPC 604) worked roughly this way. The basic idea is that you keep a two-bit saturating counter, and make a prediction based on its current state. Intel called the states strongly not taken, weakly not taken, weakly taken, strongly taken. These would be numbered as (say) 0, 1, 2 and 3, so you can use a two-bit counter to track the states. Every time a branch is taken, you increment the number (unless it's already 3) and every time it's not taken, you decrement it (again, unless it's already 0). When you need to predict a branch if the counter is 0 or 1 you predict the branch not taken, and if it's 2 or 3 you predict it taken1.
A separate predictor entry used for each branch means each branch instruction in the program has its own entry in the branch prediction table. The alternative is some sort of mapping from branch instructions to table entries. For example, if you had a table with 220 entries, you could use 20 bits from a branch instruction's address, and use those bits as the index into the table. Assuming a machine with 32-bit addressing, and 32-bit instructions, you'd have up to 1024 branch instructions that could map to any one entry in the table (32-20-2 = 10, 210 = 1024). In reality you expect only a small percentage of instructions to be branches, some of the address space to be used for data, etc., so probably only a few branches would map to one entry in the table.
As far as the basic question of what it's asking for: they want a sequence of branch instructions that will (by what coincidence) be predicted more accurately when two branches map to the same slot in the branch predictor table than when/if each maps to a separate slot in the table. To go into just slightly more detail (but hopefully without giving away the whole puzzle), start with a pattern of branches where the branch predictor will usually be wrong. What the predictor basically does is assume that if the branch was taken the last time, that indicates that it's more likely to be taken this time (and conversely, if it wasn't taken last time, it probably won't be this time either).
So, you start with a pattern of branches exactly the opposite of that. Then, you want to add a second branch mapping to the same spot in the branch prediction table that will follow a pattern of branches that will adjust the data in the branch predictor table so that it more accurately reflects the upcoming branch rather than the previous branch.
1Technically, the Pentium didn't actually work this way, but it's how it was documented to work, and probably intended to work; the discrepancy in how it actually did work seems to have been a bug.