
Here are a few lines from my log file:

10/21/2015 10:16:42 AM Following hmac:c35330404902c0b1bb5c6d0718407ea12b25a464433bd1e69152ccc0e0b89c9f  with is already in database so dropping
11/21/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:30:21+0000+ 12.61 0.010  1686.00
07/21/2015 10:16:42 AM Following hmac:84d9cdb2145b7c3e0fa2d099070b7bd291c652f30ca25c69240e33ebbd2b8677  with is already in database so dropping
07/21/2016 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 20:16:18+0000+ 12.60 0.045  1686.00
07/20/2016 10:16:42 AM Following hmac:a24d19d340651e694bff854ae7469dd779b60037228bf047d8f372dee4a731e0  with is already in database so dropping
07/20/2016 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 20:31:25+0000+ 12.62 0.045  1685.00
07/20/2016 10:16:42 AM Following hmac:4e239a4b69108833e9cbc987db2014f9137679860df0ca8efdf7d09c4897d369  with is already in database so dropping
07/19/2016 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 20:46:27+0000+ 12.61 0.040  1685.00

My goal is to loop over the lines and count those containing specific text, including hmac. I have already calculated the total count, but I want to return the count of lines for the last year. Trying to extract the date portion of each line gives me this error:

ValueError: unconverted data remains:

I have tried to fix this but cannot find a solution. Here is my code:

from datetime import date
from datetime import time
from datetime import datetime
from datetime import timedelta
import os

def fileCount(fileName):

    with open(fileName) as FileObj:

        Count = 0
        todayDate = date.today()
        OneYear = str(todayDate -  timedelta(days=365))
        OneMonth = str(todayDate -  timedelta(days=30))
        ThreeMonths = str(todayDate -  timedelta(days=90))

        while True:

            line = FileObj.readline()

            Lines = "-".join(line[:11].split("/"))

            convertDate = datetime.strptime(Lines, '%m-%d-%Y')

            print convertDate

            if not line:
                break
            if "Following hmac" in line:

                Count += 1

        print "The total count is ", Count

# Call The function
def main():

    filePath = 'file.txt'

    fileCount(filePath)

if __name__ == "__main__":

    main()

I want to extract the date so that I can do date arithmetic on it and return counts for the last three, six, and 12 months.

1 Answer


The stop index for your slice includes a trailing space which is not being accounted for in the date format you provided.
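A quick way to see the problem is to print the repr of the sliced string. This is only a small sketch: the sample line is an abbreviated copy of the first log line from the question, and the names date_part and error_message are made up for illustration.

```python
from datetime import datetime

# Abbreviated copy of the first log line from the question.
line = "10/21/2015 10:16:42 AM Following hmac"

# line[:11] takes the 10 date characters plus the space that follows them.
date_part = "-".join(line[:11].split("/"))
print(repr(date_part))  # the repr makes the trailing space visible

try:
    datetime.strptime(date_part, "%m-%d-%Y")
    error_message = None
except ValueError as exc:
    # strptime rejects anything left over after the format is consumed.
    error_message = str(exc)

print(error_message)
```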

You should strip the space:

>>> datetime.strptime(Lines.rstrip(), '%m-%d-%Y')
datetime.datetime(2015, 10, 21, 0, 0)

Or change the index to 10 instead of 11 to exclude the space entirely:

Lines = "-".join(line[:10].split("/"))

Accounting for the extra space in your format is another fix:

convertDate = datetime.strptime(Lines, '%m-%d-%Y ')

You could handle other errors such as a blank line or a line without the date string by using a try/except:

lines = "-".join(line[:10].split("/"))
try:
    convert_date = datetime.strptime(lines, '%m-%d-%Y')
    print convert_date
except ValueError:
    print 'This line has a problem:', lines
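Once the date parses, counting matches within a time window could look roughly like the sketch below. It is not the only way to do it: the function name count_recent and its parameters are hypothetical, the "last year" is assumed to mean the last 365 days, and the sample lines are abbreviated versions of the ones in the question. Note that strptime can parse the slashes directly with '%m/%d/%Y', so the join/split step is not needed.

```python
from datetime import datetime, timedelta


def count_recent(lines, days, marker="Following hmac", now=None):
    """Count lines containing `marker` dated within the last `days` days.

    Hypothetical helper; `now` can be passed in to make results repeatable.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=days)
    count = 0
    for line in lines:
        if marker not in line:
            continue
        try:
            # The first 10 characters hold the date, e.g. 10/21/2015.
            stamp = datetime.strptime(line[:10], "%m/%d/%Y")
        except ValueError:
            continue  # blank or malformed line, skip it
        if cutoff <= stamp <= now:
            count += 1
    return count


# Abbreviated sample lines based on the log in the question.
sample = [
    "10/21/2015 10:16:42 AM Following hmac:c353 with is already in database so dropping",
    "11/21/2015 10:16:42 AM The data for the duplicate Hmac is : HF",
    "07/21/2015 10:16:42 AM Following hmac:84d9 with is already in database so dropping",
    "07/20/2016 10:16:42 AM Following hmac:a24d with is already in database so dropping",
    "07/20/2016 10:16:42 AM Following hmac:4e23 with is already in database so dropping",
    "",
]

# A fixed reference date is assumed here so the counts do not change over time.
reference = datetime(2016, 7, 21)
last_year = count_recent(sample, 365, now=reference)
last_three_months = count_recent(sample, 90, now=reference)
print("Last 365 days: %d, last 90 days: %d" % (last_year, last_three_months))
```

With a real file, the open file object can be passed in directly, since iterating over it yields one line at a time: with open('file.txt') as f: count_recent(f, 365).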