2
votes

I have to filter packets from a pcap file and process them further. The files are very large, so it is not feasible to read an entire file into memory at once. Scapy seems very capable, and I was able to iterate through the packets with

with PcapReader(pcap) as pcap_reader:
    for pkt in pcap_reader:
        ...

Unfortunately, I could not find a way to apply a filter (e.g. BPF) either to the PcapReader, so that only matching packets are iterated, or to pkt (which should be a scapy.packet object!?).

I saw that there is a function tdecode, a tshark decoder that takes a filter as an argument, but it offers no way to save the resulting packets into a variable; it just floods the terminal with all the results.

Is there a way of filtering packets from a .pcap file with scapy and still iterating over the results?

2
Can't you first filter with tshark (which is pretty fast) and then process the new pcap further with scapy? If your post-processing is expensive, you're probably better off using libpcap directly. – pchaigno

2 Answers

3
votes

Scapy is unbelievably slow, to the point where interactive use is the only realistic use. It also does not allow filtering packets before full (in-Python) dissection, which exacerbates the problem.

You can use libpcap, either by writing a small C-extension yourself or by using a binding, as a replacement for PcapReader. libpcap allows you to specify a filter in BPF-syntax, which is applied on the incoming packets within the library or - when capturing live from a device - by the kernel itself. This will vastly improve your performance.

The basic layout would be to:

  • open_offline() a pcap-file
  • set the BPF-filter
  • read from the libpcap-supplied handle
  • pass the incoming packet data to scapy for further inspection

You can get quite sophisticated with that.
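If a libpcap binding is not at hand, the same layout (open the file, apply a cheap filter, read records, hand only the survivors to scapy) can be approximated with the standard library alone. The following is a minimal sketch, not libpcap and not real BPF: the names iter_raw_packets, is_ipv4_tcp and filtered_packets are made up for illustration, and it assumes a classic microsecond-resolution pcap file (not pcapng) with untagged Ethernet frames.

```python
import struct


def iter_raw_packets(fileobj):
    """Yield (ts_sec, ts_usec, data) for each record of a classic pcap stream."""
    header = fileobj.read(24)  # the pcap global header is 24 bytes
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"  # file was written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"  # file was written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    while True:
        record = fileobj.read(16)  # per-packet record header
        if len(record) < 16:
            return  # clean end of file, or a truncated trailing record
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = fileobj.read(incl_len)
        if len(data) < incl_len:
            return  # truncated packet body
        yield ts_sec, ts_usec, data


def is_ipv4_tcp(frame):
    """Cheap stand-in for the BPF expression 'ip and tcp'.

    Assumes untagged Ethernet: EtherType sits at bytes 12-13 and the IPv4
    protocol field at byte 23 (14-byte Ethernet header + offset 9).
    """
    return len(frame) >= 24 and frame[12:14] == b"\x08\x00" and frame[23] == 6


def filtered_packets(path, predicate):
    """Stream the raw bytes of matching packets, one at a time."""
    with open(path, "rb") as f:
        for _sec, _usec, data in iter_raw_packets(f):
            if predicate(data):
                yield data
```

Each item yielded by filtered_packets("myfile.cap", is_ipv4_tcp) is a bytes object that can then be dissected with scapy's Ether(data), so the expensive dissection only runs on packets that passed the byte-level check. A real libpcap binding gives you full BPF expressions instead of a hand-written predicate.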

2
votes

You can use the tcpdump() function to have a BPF filter applied.

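from scapy.all import PcapReader, tcpdump

# tcpdump applies the BPF filter and writes the matches to stdout ("-w -");
# getfd=True hands that stream to PcapReader instead of waiting for tcpdump to finish.
with PcapReader(tcpdump("myfile.cap", args=["-w", "-", "my BPF filter"],
                        getfd=True)) as pcap_reader:
    for pkt in pcap_reader:
        print(pkt.summary())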

If that's of interest to you, maybe you could submit a feature request so that PcapReader() accepts a filter parameter (and behaves more or less like sniff() with an offline= parameter).
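For comparison, here is a minimal sketch of the sniff() route mentioned above. As far as I know, recent scapy versions accept offline=, filter=, prn= and store= together and run the filter through tcpdump, so tcpdump must be installed. The wrapper name process_filtered and its sniff_func parameter are hypothetical; the latter exists only so the function can be exercised without scapy.

```python
def process_filtered(pcap_path, bpf_filter, handler, sniff_func=None):
    """Call handler(pkt) for every packet in pcap_path matching bpf_filter."""
    if sniff_func is None:
        # Imported lazily so this module loads even where scapy is missing.
        from scapy.all import sniff as sniff_func
    # offline= reads from a file instead of an interface; store=False keeps
    # memory use flat because matched packets are not collected in a list.
    sniff_func(offline=pcap_path, filter=bpf_filter, prn=handler, store=False)
```

A call such as process_filtered("myfile.cap", "tcp port 80", handle_packet) would invoke handle_packet once per matching packet. Note that sniff() is callback-based rather than an iterator, and it prints whatever the callback returns unless that is None.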