2
votes

I have a lab where I need to find the protocol of each packet in a huge pcap file. I am going to build a dictionary to hold them all, but my first step is just to pull the information out using dpkt. It looks like ip.get_proto is what I want, but I am missing something. I am reading http://www.commercialventvac.com/dpkt.html#mozTocId839997

#!/usr/bin/python
# -*- coding: utf-8 -*-

import dpkt
import socket
import sys
import datetime

import matplotlib.pyplot as ploot 
import numpy as arrayNum 
from collections import Counter 

packets = 0 

protocolDist = {}  

f = open('bob.pcap', 'rb')  # pcap files are binary, so open in binary mode
#f = open('trace1.pcap')
pcap = dpkt.pcap.Reader(f) 

print "Maj Version:  " , dpkt.pcap.PCAP_VERSION_MAJOR  
print "Min Version:  " , dpkt.pcap.PCAP_VERSION_MINOR 
print "Link Layer "    , pcap.datalink() 
print "Snap Len:    "  , pcap.snaplen 

# How many packets does the trace contain? Count timestamps

# iterate through packets, we get a timestamp (ts) and packet data buffer (buf)
for ts,buf in pcap:
    packets += 1
    eth = dpkt.ethernet.Ethernet(buf)
    ip = eth.data
    # what is the timestamp of the first packet in the trace?
    if packets == 1:
        first = ts 
        print "The first timestamp is %f " % (first)    
        print ip.get_proto
        break 

# What is the average packet rate? (packets/second)     
# The last time stamp
last = ts
print "The last timestamp is %f " % (ts) 
print "The total time is %f " % (last - first)
print "There are %d " % (packets)
#print "The packets/second %f " % (packets/(last-first))    


# what is the protocol distribution?
# use dictionary 

f.close()
sys.exit(0)

2 Answers

4
votes

Check ip.p. It holds the IP protocol number of the packet. For example, UDP is 17.
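As a rough illustration of what those numbers mean, here is a minimal sketch that turns a protocol number into a readable label using the `IPPROTO_*` constants from the standard `socket` module. The `PROTO_NAMES` table and the `proto_name` helper are hypothetical names made up for this example and are not part of dpkt; the table only covers three common protocols.

```python
import socket

# A few common IANA protocol numbers, taken from the socket module's constants.
# This table is deliberately small; extend it with whatever your trace contains.
PROTO_NAMES = {
    socket.IPPROTO_ICMP: "ICMP",  # 1
    socket.IPPROTO_TCP: "TCP",    # 6
    socket.IPPROTO_UDP: "UDP",    # 17
}


def proto_name(number):
    """Return a readable label for an IP protocol number such as ip.p."""
    return PROTO_NAMES.get(number, "unknown (%d)" % number)
```

Inside the packet loop you would call something like `proto_name(ip.p)`, after checking that `eth.data` really is an IP packet.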

Cheers

3
votes

If you want to turn the IP protocol number into the matching protocol, you can use

ip.get_proto(ip.p)

This helper function translates a protocol number into the corresponding protocol class. Check out https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml for the official list of IP protocol numbers. Sometimes it is useful to have the protocol in a human-readable form; I use __name__ to get the class name as a string.

proto = ip.get_proto(ip.p).__name__
print(proto)  # prints TCP for a TCP packet
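For the protocol distribution mentioned in the question, one possible approach is to tally the per-packet protocol values with `collections.Counter`. The sketch below is only an illustration: `protocol_distribution` and `ip_protocols` are hypothetical helper names, and `ip_protocols` assumes an Ethernet capture and skips anything that is not IPv4.

```python
from collections import Counter


def protocol_distribution(proto_numbers, to_name=str):
    """Count how often each protocol occurs.

    proto_numbers: iterable of IP protocol numbers (for example ip.p per packet).
    to_name: callable that turns a number into a dictionary key; defaults to str.
    """
    return Counter(to_name(p) for p in proto_numbers)


def ip_protocols(pcap_path):
    """Yield ip.p for every IPv4 packet in an Ethernet pcap file."""
    import dpkt  # imported here so the tally above can be used without dpkt

    with open(pcap_path, "rb") as f:
        for ts, buf in dpkt.pcap.Reader(f):
            eth = dpkt.ethernet.Ethernet(buf)
            # Non-IP frames (ARP, for instance) have no protocol number.
            if isinstance(eth.data, dpkt.ip.IP):
                yield eth.data.p
```

Calling `protocol_distribution(ip_protocols('bob.pcap'))` would then give a dictionary-like count keyed by protocol number as a string. Passing a different `to_name` callable lets you key the counts by a readable name instead.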