python - Find the biggest interval in a list of time stamps (Perl preferred)
I'd accept any of the interpreted languages: Perl, Python, Bash, etc. I'd prefer Perl because I'm trying to learn it. I have a list of timestamps like:
17:31:16
17:31:16
17:31:18
17:31:29
I want to find some of the largest intervals (the top 5) between two consecutive lines, and return the time stamps and line numbers. This is a log file from a software build, and I'm trying to determine which steps took the longest. The example I gave is filtered; the actual lines look like: [15:57:42]: cc net/sunrpc/xprtsock.o. If you can give me a program that parses that format, it would be a little easier. It should return the line numbers where the biggest differences in time occurred.
This is what I used to isolate the timestamps from the log:
perl -lane 'print $1 if $_ =~ /^\[(\d+:\d+:\d+)\]:*/'
The type of output I'd like to achieve is something like:
line 574 20:04:54
line 575 20:24:55
difference 00:20:01
If you don't want to solve the problem, I'd be happy to see pseudocode or any advice at all. I have spent some time on this and have no useful code to show for it.
I'd upgrade the time-matching regex a bit, to capture the components of the time separately. Do you have to worry about builds that started before midnight and ran until the morning of the next day?
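As a minimal sketch of that regex upgrade, the snippet below captures hours, minutes, and seconds separately from the bracketed log format in the question and converts them to seconds since midnight. The helper names `line_to_seconds` and `elapsed` are hypothetical, and the midnight handling assumes a build crosses midnight at most once between two consecutive lines, which the log alone cannot confirm.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Convert a log line such as "[15:57:42]: cc net/sunrpc/xprtsock.o" into
# seconds since midnight; returns undef when the line has no timestamp.
sub line_to_seconds {
    my ($line) = @_;
    return undef unless $line =~ /^\[(\d+):(\d\d):(\d\d)\]/;
    return ($1 * 60 + $2) * 60 + $3;
}

# Seconds elapsed between two offsets. ASSUMPTION: a later offset that is
# smaller than the earlier one means the clock wrapped past midnight once.
sub elapsed {
    my ($prev, $curr) = @_;
    my $diff = $curr - $prev;
    $diff += 24 * 60 * 60 if $diff < 0;
    return $diff;
}

print line_to_seconds("[15:57:42]: cc net/sunrpc/xprtsock.o"), "\n";   # 57462
print elapsed(line_to_seconds("[23:59:50]: a"),
              line_to_seconds("[00:00:05]: b")), "\n";                 # 15
```

If the log never spans midnight, the wrap-around branch simply never fires, so it costs nothing to leave in.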
#!/usr/bin/env perl
use strict;
use warnings;

my $oldtime   = "";  # hh:mm:ss of the end of the longest interval
my $oldlineno = 0;   # line number in the file of the second line
my $oldoffset = 0;   # offset in seconds from midnight of the second command
my $olddiff   = 0;   # time taken by the longest command

# Split a number of seconds into (hours, minutes, seconds)
sub hhmmss
{
    my($time) = @_;
    my(@tm) = (int($time/3600), int($time/60)%60, $time%60);
    return @tm;
}

while (<>)
{
    chomp;
    next unless m/^((\d\d):(\d\d):(\d\d))\s+/;
    my $newoffset = (($2 * 60) + $3) * 60 + $4;
    if ($oldoffset == 0)
    {
        # First matching line: nothing to compare against yet
        $oldtime   = $1;
        $olddiff   = 0;
        $oldoffset = $newoffset;
        $oldlineno = $.;
    }
    elsif (($newoffset - $oldoffset) > $olddiff)
    {
        # New longest interval
        $oldtime   = $1;
        $olddiff   = $newoffset - $oldoffset;
        $oldoffset = $newoffset;
        $oldlineno = $.;
    }
}

if ($oldoffset != 0)
{
    my $prvlineno = $oldlineno - 1;
    my $newoffset = $oldoffset - $olddiff;
    my(@tm) = hhmmss($newoffset);
    printf "line $prvlineno: %.2d:%.2d:%.2d\n", $tm[0], $tm[1], $tm[2];
    print "line $oldlineno: $oldtime\n";
    @tm = hhmmss($olddiff);
    printf "diff: %.2d:%.2d:%.2d\n", $tm[0], $tm[1], $tm[2];
}
Given the data file (data) and the script above (dt.pl):
17:31:16 line1
17:31:18 line2
17:31:29 line3
17:33:59 line4
18:00:21 line5
18:21:03 line6
18:41:25 line7
19:51:54 line8
19:52:34 line9
The scriptlet below produces the output shown:
$ for i in $(seq 1 9); do sed ${i}q data | perl dt.pl; done
line 0: 17:31:16
line 1: 17:31:16
diff: 00:00:00
line 1: 17:31:16
line 2: 17:31:18
diff: 00:00:02
line 2: 17:31:18
line 3: 17:31:29
diff: 00:00:11
line 3: 17:31:29
line 4: 17:33:59
diff: 00:02:30
line 4: 17:33:59
line 5: 18:00:21
diff: 00:26:22
line 4: 17:33:59
line 5: 18:00:21
diff: 00:26:22
line 6: 18:00:21
line 7: 18:41:25
diff: 00:41:04
line 7: 18:41:25
line 8: 19:51:54
diff: 01:10:29
line 7: 18:41:25
line 8: 19:51:54
diff: 01:10:29
$
I'd love to hear how you thought about this problem before you wrote any of the code.
In this problem, it is necessary to keep a record of (the relevant parts of) the previous line's information in order to calculate the difference between it and the current line. You also need to keep the current maximum difference, which can't formally be established until you've read the second matching line. That drives the design. The big repeat in the code could be reduced to nothing by assigning three values unconditionally and the fourth ($olddiff) conditionally. After that, it is a question of mechanics and tactics.
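The same design can be stretched to the "top 5" the question asks for. The sketch below is one possible variant, not the script above: it records every gap between consecutive matching lines, then sorts. It also updates the previous-line state on every matching line, whereas the script above only advances $oldoffset when a new maximum is found (which is why its seventh sample run pairs line 6 with 18:00:21). The names `top_gaps` and `fmt_hhmmss` are hypothetical, the regex is assumed to accept both the bare and the bracketed timestamp formats, and midnight roll-over is not handled.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the $n largest gaps between consecutive timestamped lines, largest
# first. Each gap is a hash ref with line numbers, time strings, and seconds.
sub top_gaps {
    my ($n, @lines) = @_;
    my ($prev_time, $prev_offset, $prev_lineno);
    my @gaps;
    my $lineno = 0;
    for my $line (@lines) {
        $lineno++;
        # ASSUMPTION: timestamps look like "hh:mm:ss" or "[hh:mm:ss]"
        next unless $line =~ /^\[?((\d\d):(\d\d):(\d\d))\]?/;
        my ($time, $offset) = ($1, ($2 * 60 + $3) * 60 + $4);
        if (defined $prev_offset) {
            push @gaps, {
                from_line => $prev_lineno, from_time => $prev_time,
                to_line   => $lineno,      to_time   => $time,
                diff      => $offset - $prev_offset,
            };
        }
        # Previous-line state advances on every matching line
        ($prev_time, $prev_offset, $prev_lineno) = ($time, $offset, $lineno);
    }
    my @sorted = sort { $b->{diff} <=> $a->{diff} } @gaps;
    $#sorted = $n - 1 if @sorted > $n;    # keep only the first $n entries
    return @sorted;
}

# Format a number of seconds as hh:mm:ss
sub fmt_hhmmss {
    my ($s) = @_;
    return sprintf "%02d:%02d:%02d", int($s / 3600), int($s / 60) % 60, $s % 60;
}

# Sample timestamps taken from the data file shown earlier
my @data = map { "$_ line" } qw(17:31:16 17:31:18 17:31:29 17:33:59 18:00:21
                                18:21:03 18:41:25 19:51:54 19:52:34);
for my $g (top_gaps(5, @data)) {
    printf "line %d %s  line %d %s  difference %s\n",
        $g->{from_line}, $g->{from_time}, $g->{to_line}, $g->{to_time},
        fmt_hhmmss($g->{diff});
}
```

Storing every gap costs memory proportional to the log length, which should be fine for a build log; for very large inputs, keeping only a running list of the five largest gaps would do the same job.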
Matching across multiple lines is a nuisancy process; you have to deal with preserving the appropriate state. Partly, it is a question of experience; after you've done this kind of thing a few dozen times, it doesn't take long the next time.