5
votes

I have been looking for a way to load 802.11 packets from a .cap file into an array. So far I have found:

  • Scapy: nice, and documentation is available, but it is too slow. When I try to open a file larger than 40 MB, it just hangs until it has consumed all my RAM (all 16 GB of it), at which point my PC freezes and I have to reboot it.

  • Pyshark: doesn't have any of Scapy's problems, but the documentation is scarce, and I can't find a way to access the attributes of 802.11 packets.

So I was wondering whether there are better solutions out there, or whether someone has experience with pyshark.

from scapy.all import *
import numpy as np

counter = 0
# 802.11 subtypes are 4 bits wide, so leave room for all 16 values
Stats = np.zeros(16)
filename = 'cap.cap'

# rdpcap() dissects every packet and keeps the whole list in memory
a = rdpcap(filename)
print(len(a))
for p in a:
    # Management frames have type 0
    if p.haslayer(Dot11) and p[Dot11].type == 0:
        counter = counter + 1
        Stats[p[Dot11].subtype] = Stats[p[Dot11].subtype] + 1

print(Stats)

Note: when I run the program on a 10 MB input (for instance), it takes about 20 seconds, but it does work. Why is it so much slower than pyshark, and what kind of computation is it doing?

6
I have to work with Python on this one; it's part of a bigger framework. - MrNoober
Perhaps you could show the program you wrote to open the file with Scapy. If so, we could help you understand why it didn't work. - Robᵩ
Will do in an edit right away. - MrNoober
At work I often open larger files (around 60 MB) with rdpcap() and it definitely does NOT take up 16 GB of RAM. Have you tried removing everything else from your code and JUST calling rdpcap(), with time measurements before and after? I simply cannot believe that opening a 40 MB pcap file requires more than 16 GB of RAM. - wookie919
By the way, a 10 MB file taking 20 seconds is quite normal from my point of view. As you know, Scapy decomposes each packet into every header and field that it knows of and stores them in a nicely accessible data structure. - wookie919

6 Answers

9
votes

You can patch Scapy's utils.py so that it doesn't load every packet into memory.

Change:

def read_all(self,count=-1):
    """return a list of all packets in the pcap file
    """
    res=[]
    while count != 0:
        count -= 1
        p = self.read_packet()
        if p is None:
            break
        res.append(p)
    return res

to

def read_all(self,count=-1):
    """return an iterable of all packets in the pcap file
    """
    while count != 0:
        count -= 1
        p = self.read_packet()
        if p is None:
            break
        yield p
    return

Credit goes to http://comments.gmane.org/gmane.comp.security.scapy.general/4462 (the link is now dead).
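To illustrate why the change matters, here is a minimal sketch that puts both versions of read_all on a toy reader. FakeReader and its reads counter are hypothetical stand-ins, not part of Scapy; only the two loop bodies mirror the code above.

```python
class FakeReader:
    """Hypothetical stand-in for a pcap reader; not part of Scapy."""

    def __init__(self, packets):
        self._packets = iter(packets)
        self.reads = 0  # number of read_packet() calls made so far

    def read_packet(self):
        self.reads += 1
        return next(self._packets, None)  # None signals end of file

    def read_all_list(self, count=-1):
        """Original behaviour: read everything, then return a list."""
        res = []
        while count != 0:
            count -= 1
            p = self.read_packet()
            if p is None:
                break
            res.append(p)
        return res

    def read_all_lazy(self, count=-1):
        """Patched behaviour: hand out one packet at a time."""
        while count != 0:
            count -= 1
            p = self.read_packet()
            if p is None:
                break
            yield p


eager = FakeReader(["a", "b", "c"])
first_eager = eager.read_all_list()[0]
eager_reads = eager.reads  # the whole input was read before we saw one packet

lazy = FakeReader(["a", "b", "c"])
first_lazy = next(lazy.read_all_lazy())
lazy_reads = lazy.reads  # only the packet we asked for was read
```

One consequence to keep in mind: a generator has no len(), so code that calls len() on the result would need adjusting.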

2
votes

Scapy loads all the packets into memory and creates a PacketList instance. I think there are two solutions to your problem.

  1. Capture packets with a filter. In my work, I have never captured more than 2 MB of packets, since I only capture on one wireless channel at a time.
  2. Split the huge capture file into several smaller parts, then process them one by one.
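For the second option, Wireshark's editcap tool can split a capture by packet count. If you would rather stay in Python, the following is a rough sketch that splits the bytes of a classic pcap file (not pcapng) into smaller, self-contained pcap byte strings. The function name split_pcap_bytes and its parameters are made up for this illustration, and it assumes the standard 24-byte global header and 16-byte per-packet record headers.

```python
import struct

GLOBAL_HEADER_LEN = 24  # classic pcap file header
RECORD_HEADER_LEN = 16  # ts_sec, ts_frac, incl_len, orig_len


def split_pcap_bytes(data, packets_per_chunk):
    """Split classic pcap bytes into chunks of at most packets_per_chunk packets.

    Each returned chunk repeats the original global header, so it is a
    valid pcap file on its own.
    """
    if packets_per_chunk <= 0:
        raise ValueError("packets_per_chunk must be positive")
    if len(data) < GLOBAL_HEADER_LEN:
        raise ValueError("not a pcap file: global header too short")

    global_header = data[:GLOBAL_HEADER_LEN]
    magic = global_header[:4]
    # The magic number tells us the byte order of the record headers.
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("unsupported magic number (pcapng is not handled)")

    chunks, current = [], []
    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(data):
        header = data[offset:offset + RECORD_HEADER_LEN]
        _ts_sec, _ts_frac, incl_len, _orig_len = struct.unpack(endian + "IIII", header)
        end = offset + RECORD_HEADER_LEN + incl_len
        if end > len(data):
            break  # truncated final record: stop rather than emit garbage
        current.append(data[offset:end])
        offset = end
        if len(current) == packets_per_chunk:
            chunks.append(global_header + b"".join(current))
            current = []
    if current:
        chunks.append(global_header + b"".join(current))
    return chunks
```

Each chunk can be written to its own file and opened separately. Reading the raw bytes of a 40 MB capture is cheap; it is the per-packet dissection that costs memory.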

Hope it helps.

2
votes

If pyshark suits your needs, you can use it like so:

cap = pyshark.FileCapture('/tmp/mycap.cap')
for packet in cap:
    my_layer = packet.layer_name # or packet['layer name'] or packet[layer_index]

To see which layers are available and what attributes they have, just print them (or use layer/packet.pretty_print()), use autocomplete, or look at packet.layer._all_fields. For instance, packet.udp.srcport gives the UDP source port.

What is missing in the documentation?

Note that you can also pass a filter as an argument to FileCapture (either a display filter or a BPF filter; see the docs).
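As a rough sketch of how a display filter fits in, the code below counts 802.11 management frames by subtype. The helper names tally_field and tally_management_frames are made up for this example, and the 'WLAN' layer and 'fc_type_subtype' field names are assumed from a typical 802.11 capture; check them against what your own packets print. It needs pyshark and tshark installed.

```python
from collections import Counter


def tally_field(packets, layer="WLAN", field="fc_type_subtype"):
    """Count the values of one field across packets, one packet at a time."""
    counts = Counter()
    for packet in packets:
        # Field values come back as string-like objects; keep them as text.
        counts[str(packet[layer].get_field_value(field))] += 1
    return counts


def tally_management_frames(path):
    """Open a capture with a display filter and tally management subtypes."""
    import pyshark  # imported here so tally_field can be used without pyshark

    cap = pyshark.FileCapture(
        path,
        display_filter="wlan.fc.type == 0",  # management frames only
        keep_packets=False,  # do not cache packets that were already iterated
    )
    try:
        return tally_field(cap)
    finally:
        cap.close()
```

Because the filter runs inside tshark, packets that do not match never reach Python, and keep_packets=False stops pyshark from accumulating the ones that do.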

0
votes

Have you tried dpkt? It has a nice Reader interface which seems to lazy-load packets (I have loaded 100MB+ pcap files with it, no problem).

Sample:

from dpkt.pcap import Reader

with open(...) as f:  # open the capture in binary mode ('rb')
    for ts, buf in Reader(f):  # each item is a (timestamp, raw bytes) pair
        ...

0
votes

Thanks to @KimiNewt, and after spending some time with the pyshark source code, I now have some understanding of its nuts and bolts.

PS: opening a 450 MB file with pyshark takes almost no time, and data access is fairly easy. I don't see any downsides to using it at the moment, but I will try to keep this post up to date as my project advances.

Here is some sample code for parsing 802.11 packets with pyshark. I hope it helps those working on similar projects.

import pyshark

#Opening the cap file
filename='data-cap-01.cap'
cap = pyshark.FileCapture(filename)

#Getting a list of all field names of the WLAN layer of the first packet,
#looking something like this: ['fc_frag', 'fc_type_subtype', ..., 'fc_type']
print(cap[0]['WLAN']._field_names)

#Getting the value of a specific field, here the frame type
#(Management, Control or Data), which is represented by an integer (0, 1, 2)
print(cap[0]['WLAN'].get_field_value('fc_type'))

Later on I will be working on packet decryption for WEP and WPA and on getting layer 3 headers, so I might add that too.

0
votes

from scapy.all import PcapReader

with PcapReader('filename.pcapng') as pcap_reader:
    for pkt in pcap_reader:
        #do something with the packet
        ...

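This works well!

PcapReader is to rdpcap() what xrange() is to range() in Python 2: it yields packets one at a time instead of building the whole list in memory.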
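Applied to the counting loop from the question, a streaming version might look like the sketch below. The function names tally_management_subtypes and tally_file are made up for this example; it relies on Scapy accepting a layer name given as a string in haslayer() and in indexing.

```python
from collections import Counter


def tally_management_subtypes(packets):
    """Count 802.11 management frames (type 0) by subtype in a single pass."""
    counts = Counter()
    for pkt in packets:
        # Layers are looked up by name, so this module does not import Scapy itself.
        if pkt.haslayer("Dot11"):
            dot11 = pkt["Dot11"]
            if dot11.type == 0:
                counts[dot11.subtype] += 1
    return counts


def tally_file(path):
    """Stream a capture file through PcapReader and tally it."""
    from scapy.all import PcapReader  # imported here; requires Scapy

    with PcapReader(path) as reader:
        return tally_management_subtypes(reader)
```

Only one dissected packet is alive at a time, so memory use should stay roughly flat regardless of file size. The per-packet dissection cost is still there, so do not expect it to be faster than rdpcap().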