0 votes

How much data in gigabytes (GB) do I need to allocate if I want to store all 2^128 IPv6 addresses in a file? I'm wondering whether it's practical or viable to generate such data for eventual table storage in MySQL, matching each address to a data item so I can track a visit counter for that particular IPv6 address.

e.g. IPv6 address => visitor counter
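
Roughly what I have in mind is a table like the following (just a hypothetical sketch; the table and column names are placeholders), keyed on the raw 16-byte form of the address:

```sql
-- Hypothetical table (names are placeholders): one row per IPv6 address.
-- BINARY(16) stores the raw 128-bit address; INET6_ATON() converts the
-- textual form to those 16 bytes.
CREATE TABLE visitor_counters (
    ip     BINARY(16)      NOT NULL PRIMARY KEY,
    visits BIGINT UNSIGNED NOT NULL DEFAULT 0
);
```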

Or does anyone have a more practical solution for what I'm trying to do? I need long-term storage, so temporary storage of IPv6 addresses is not part of the question.

Even trying to store all the addresses in a single /64 network will not be possible. Each address is 16 bytes, and there are 2^64 = 18,446,744,073,709,551,616 possible addresses in a /64 network, which would be 16 times that many bytes, and that would be for only one /64 network, of which there are another 2^64 possible. - Ron Maupin
There are 3.4×10^38 IPv6 addresses. There are somewhere around 10^80 atoms in the observable universe. So you'd need a good chunk of the universe to store them all… - deceze
Also note that IPv6 is designed to be anonymising to some degree and clients will proactively rotate IPs constantly. IPv6 is pretty useless for identifying individuals. - deceze

2 Answers

3 votes

How much data in gigabytes

More than you have. The IP addresses alone, in raw form, would take about 5×10^30 GB (2^128 addresses × 16 bytes = 2^132 bytes ≈ 5.4×10^39 bytes), plus space for the counters. Even if you come up with some radical compression/optimization techniques, it will probably still be far too big.


a more practical solution to what I'm trying to do

Ehm, instantiate counters lazily? Not all IPv6 addresses are allocated/assigned (and won't be, for the foreseeable future).
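
A minimal sketch of what that could look like in MySQL, reusing the hypothetical visitor_counters table from the question (the table and column names are assumptions): create the row the first time an address shows up, and bump its counter on every later visit.

```sql
-- Upsert: insert a new counter row on first sight of an address,
-- otherwise increment the existing one.
INSERT INTO visitor_counters (ip, visits)
VALUES (INET6_ATON('2001:db8::1'), 1)
ON DUPLICATE KEY UPDATE visits = visits + 1;
```

Storage then grows with the number of distinct addresses you actually see, not with the size of the address space.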

3 votes

You won't ever need to store all 2^128 addresses; the vast majority of them are not allocated. If you wanted to store just a 32-bit counter for every address, you'd need 2^100 GB (2^128 × 4 bytes = 2^130 bytes, and 1 GB = 2^30 bytes)!

For a slightly more realistic scenario, say you get about 32 billion unique IPs (you won't get that many). You need 16 bytes for the address and, say, 8 for a 64-bit counter; that's 24 B × 32 G = 768 GB. Call it 2 TB to be safe, to account for indexes and whatnot.

32 million unique IPs? Maybe 2 GB.

In cases like this, where you'll have somewhere between linear and polynomial growth, it's usually best just to save the data as it comes in. You'll have plenty of time to figure out another solution long before it becomes a problem. Overthinking it will just lead you to premature optimization.