0
votes

I am working on a section of a project that parses logs from a Postgres database server. The application is developed in C# on .NET Framework 4.0.

A log is produced and displayed in a DataGridView with the following columns:

resultCode    Statement                  Starttime              Duration
XT001         select * from PizzaMade    01-02-2012 03:10:14    00:04:10
  • There are many log lines with the same format.
  • The DataGridView is filled from another loop by parsing a text file.

My task is to generate stats from the data in the grid, in the following format:

Statement                             Count    countpercent    Occurs_on
select * from PizzaMade               3        1.42            01/02 at 03pm [00:04:10], 01/02 at 04 [00:01:04]
select id,qty,table from PizzaMade    12       5.12            ...........

So basically the stats reflect the following:

  • a) the statement executed
  • b) count: the number of times it appears in the grid
  • c) percentage: the share of the total count that this statement accounts for
  • d) a concatenated string containing each start time and duration

» The stats are generated as a DataTable first, using a foreach loop

foreach(DataGridViewRow dr in LogGrid.Rows)
{
// search in the Datatable if the statement is present
// if it is then add count , add starttime and duration to the column
// if not then add a newrow

}

» After populating the DataTable, I use a loop to calculate the total count

int totalcount = 0;
foreach (DataRow dr in StatTable.Rows)
{
totalcount = totalcount + Convert.ToInt32(dr["count"].ToString());
}

» After calculating the count, there is a loop to calculate the percentage

foreach (DataRow dr in StatTable.Rows)
{
    int c = Convert.ToInt32(dr["count"].ToString());
    dr["countpercent"] = c * 100.0 / totalcount; // 100.0 forces floating-point division; (c/totalcount) on ints truncates to 0
}

Although everything seems OK, the whole method is sluggish with a large number of rows.

  • Can you please suggest ways to improve the performance?

Thanks, arvind

2
You could try the Compute method of the DataTable to get the total count, but I don't know if it will increase performance. - mslliviu
I tested the Compute method on sample data of 1 million records: the foreach loop takes 545 ms and the Compute method 1850 ms, i.e. the foreach loop is faster. - arvind

2 Answers

1
votes

Since you're parsing text logs, you might improve performance by operating on objects rather than on the grid. The grid could also be bound to the list of parsed logs. It could look like this:

public class LogItem
{
    public string ResultCode { get;set;}
    public string Statement { get;set;}
    public DateTime StartTime { get;set;}
    public TimeSpan Duration { get;set;}
}

Then your grid could be bound to a BindingList that contains all parsed log items. Once you have the list, you can access the data in a more uniform way:

foreach (string statement in logItems.Select(x => x.Statement).Distinct())
{
    int count = logItems.Count(x => x.Statement == statement);
    // Multiply by 100.0 so the division is floating-point; int / int would truncate to 0.
    double percentage = count * 100.0 / logItems.Count();
    // any additional data
}
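
The loop above rescans the list once per distinct statement, which can itself get slow with many distinct statements. A possible variation, as a minimal sketch, groups the items once with `GroupBy` and derives all four columns per group. `StatementStat` and `LogStats.Build` are hypothetical names, and the `Occurs_on` format ("dd/MM at HH [duration]") is an assumption based on the sample in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class LogItem
{
    public string ResultCode { get; set; }
    public string Statement { get; set; }
    public DateTime StartTime { get; set; }
    public TimeSpan Duration { get; set; }
}

// Hypothetical result type; the properties mirror the columns in the question.
public class StatementStat
{
    public string Statement { get; set; }
    public int Count { get; set; }
    public double CountPercent { get; set; }
    public string OccursOn { get; set; }
}

public static class LogStats
{
    // Groups the items by statement once and builds one stat row per group.
    public static List<StatementStat> Build(IList<LogItem> logItems)
    {
        int total = logItems.Count;
        return logItems
            .GroupBy(x => x.Statement)
            .Select(g =>
            {
                int count = g.Count();
                return new StatementStat
                {
                    Statement = g.Key,
                    Count = count,
                    // 100.0 forces floating-point division.
                    CountPercent = Math.Round(count * 100.0 / total, 2),
                    // Assumed display format: "dd/MM at HH [hh:mm:ss]".
                    OccursOn = string.Join(", ", g.Select(x => string.Format(
                        CultureInfo.InvariantCulture, "{0} [{1}]",
                        x.StartTime.ToString("dd'/'MM 'at' HH", CultureInfo.InvariantCulture),
                        x.Duration)))
                };
            })
            .ToList();
    }
}
```

The resulting list can be bound to a second grid in the same way as the log items. Whether this beats the hand-written loops on your data is worth measuring rather than assuming.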

If you'd like to be extra performant and fancy, you could save all parsed log files to a database and write queries for the data you need.

1
votes

I have some suggestions.

You can apply something like this to your looping process using LINQ. Basically, LINQ lets you express the loops as concise queries, which can perform well (measure against your foreach loop to confirm).

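// Requires: using System.Data; using System.Linq; and a reference to System.Data.DataSetExtensions.
DataTable obj = new DataTable();
obj.Columns.Add("count", typeof(Int32));

DataRow dr = obj.NewRow();
dr[0] = 10;
obj.Rows.Add(dr);

DataRow dr1 = obj.NewRow();
dr1[0] = 5;
obj.Rows.Add(dr1);

obj.Columns.Add("countpercentage");

// Sum the "count" column once.
int intCount = (from DataRow drrow in obj.Rows
                select drrow.Field<int>("count")).Sum();

// Write each row's share of the total as a percentage.
// Multiplying by 100.0 keeps the fraction; integer division would truncate it.
(from DataRow drtemp in obj.Rows
 select drtemp).ToList<DataRow>()
 .ForEach(x => x.SetField<string>("countpercentage", (x.Field<Int32>("count") * 100.0 / intCount).ToString("F2")));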
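
The snippet above covers the total and the percentage, but not the first loop from the question, which searches the stats table for every grid row. One way to avoid that search, sketched here under assumptions, is to accumulate counts in a `Dictionary` keyed by statement and write the DataTable only once at the end. `LogEntry`, `StatTableBuilder`, and the `Occurs_on` format are hypothetical; the column names follow the question:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Text;

// Hypothetical input type standing in for one parsed grid row.
public class LogEntry
{
    public string Statement;
    public DateTime StartTime;
    public TimeSpan Duration;
}

public static class StatTableBuilder
{
    // Per-statement running totals, kept outside the DataTable while counting.
    private sealed class Accumulator
    {
        public string Statement;
        public int Count;
        public readonly StringBuilder OccursOn = new StringBuilder();
    }

    public static DataTable Build(IEnumerable<LogEntry> entries)
    {
        var byStatement = new Dictionary<string, Accumulator>();
        var inOrder = new List<Accumulator>(); // preserves first-seen order
        int total = 0;

        foreach (LogEntry e in entries)
        {
            Accumulator acc;
            if (!byStatement.TryGetValue(e.Statement, out acc))
            {
                acc = new Accumulator { Statement = e.Statement };
                byStatement.Add(e.Statement, acc);
                inOrder.Add(acc);
            }
            acc.Count++;
            total++;
            if (acc.OccursOn.Length > 0) acc.OccursOn.Append(", ");
            // Assumed display format: "dd/MM at HH [hh:mm:ss]".
            acc.OccursOn
               .Append(e.StartTime.ToString("dd'/'MM 'at' HH", CultureInfo.InvariantCulture))
               .Append(" [").Append(e.Duration.ToString()).Append(']');
        }

        var table = new DataTable();
        table.Columns.Add("Statement", typeof(string));
        table.Columns.Add("count", typeof(int));
        table.Columns.Add("countpercent", typeof(double));
        table.Columns.Add("Occurs_on", typeof(string));

        // Each row is written exactly once, after all counting is done.
        foreach (Accumulator acc in inOrder)
        {
            table.Rows.Add(acc.Statement, acc.Count,
                Math.Round(acc.Count * 100.0 / total, 2), acc.OccursOn.ToString());
        }
        return table;
    }
}
```

With typed columns, the values can later be read with `Field<int>("count")` instead of `Convert.ToInt32(dr["count"].ToString())`, which skips a string round-trip per row.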