linux - BASH: Sum size of same name directories
First of all, I'm a bash noob, so please be gentle :)
I'm trying to sum the sizes of folders that are in different places but have the same name. The layout looks like this:
root
--- directory 1
------ folder 1
--------subfolder 1
--------subfolder 2
------ folder 2
--------subfolder 3
--------subfolder 4
------ folder 3
--------subfolder 5
--------subfolder 6
--- directory 2
------ folder 1
--------subfolder 1
--------subfolder 2
------ folder 2
--------subfolder 3
--------subfolder 4
------ folder 3
--------subfolder 5
--------subfolder 6

I'm trying to sum the sizes of subdirectories 1 to 6 and output the result to a .csv file.
At the moment I'm outputting the sizes of the subdirectories to 2 separate CSV files: one for directory 1 and one for directory 2.
To get the sizes of the subfolders, I currently run this wherever I need them:
    du -h --max-depth=1 --block-size=GB * | grep "[\/]" | sort -n -r > ~/lists/disks/rc_job.csv

The output:
    40GB folder1/subfolder1
    15GB folder1/subfolder2
    10GB folder2/subfolder 3
    ...

I have one output for directory 1 and one for directory 2. I'd like to sum the sizes of the subfolders across directory 1 and 2 and have an output that looks like this:
    60GB subfolder1
    25GB subfolder2
    10GB subfolder3

where subfolder1 is directory1/folder1/subfolder1 + directory2/folder1/subfolder1.
This is my first post here, so I don't know if this is enough info. I'd be pleased to provide more if necessary. I'm pretty sure it can be done with awk, but I haven't used it yet.
Cheers!
Edit, to answer the question in the comments:
Here is (part of the) output of this command:

    du -h /net/rcq-rp/job/rcq/vault/image/film /net/rcq-rp/job/rcq/film --max-depth=1 --block-size=GB *

    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0010
    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0020
    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0030
    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0035
    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0040
    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0045
    2GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0050
    1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0060
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0010
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0020
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0030
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0035
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0040
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0045
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0050
    1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0060

Ideally the final output would be:

    2GB nr106_0010
    etc...
One way to do this is with an associative array. An associative array maps a series of keys to values, for example:

    directory1 -> 10 GB
    directory2 -> 12 MB
    directory3 -> 40 KB

The keys in an associative array must be unique. That's great! The paths to our directories are unique. Let's put them in an associative array. I'll show how to do this in awk, but plenty of other languages have associative arrays (like Perl, which calls them hashes).
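To make the "keys must be unique" point concrete before applying it to du, here is a tiny sketch with made-up data. The names and numbers are only for illustration, and show_keys is a hypothetical helper name, not part of any tool. Storing a value under a key that already exists replaces the earlier value:

```shell
# show_keys is a hypothetical helper: it reads "name size" pairs on stdin
# and stores each size under its name in an awk associative array.
show_keys() {
    awk '{ sizes[$1] = $2 }    # a repeated key replaces the earlier value
         END { print "directory1 ->", sizes["directory1"]
               print "directory2 ->", sizes["directory2"] }'
}

# Made-up sample data; directory1 appears twice.
printf '%s\n' 'directory1 10' 'directory2 12' 'directory1 99' | show_keys
```

This prints directory1 -> 99 and directory2 -> 12: the second directory1 line overwrote the first one, because a key can only hold one value.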
    du | awk '{ val = $1; dir = $2; sizes[dir] = val }'

(I took out the arguments you pass to du for simplicity.)
What does this do? awk reads the output of du line by line; for each line, it adds an element to the associative array sizes, with the directory name as the index and the size as the value. If our original input looked like this

    40GB folder1/subfolder1
    15GB folder1/subfolder2
    10GB folder2/subfolder1

our array would look like this:
    sizes[folder1/subfolder1] -> 40GB
    sizes[folder1/subfolder2] -> 15GB
    sizes[folder2/subfolder1] -> 10GB

But in our final output we only want to see the values for the subdirectories. awk has functions for string manipulation, so let's tweak our code to strip off the leading directories:
    du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] = val }'

The sub function strips off everything from the beginning of the path through the last /. Now our array looks like this:
    sizes[subfolder2] -> 15GB
    sizes[subfolder1] -> 10GB

Great! We only have values for the subdirectories. But there's one little problem: the values aren't totals. Since we had more than one subdirectory named subfolder1, we overwrote the first value (40GB) with the second one (10GB). When we run into an index that already exists in our array, we want to add the new value to the existing value:
    du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }'

(I changed sizes[dir] = val, which uses assignment, to sizes[dir] += val, which adds val to whatever is already in sizes[dir].)
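If you want to try the += version without waiting for du to walk a big tree, here is a small sketch that feeds awk the three sample lines from above instead. sample_du and sum_by_name are hypothetical helper names used only for this illustration:

```shell
# sample_du stands in for real du output (same sample lines as above).
sample_du() {
    printf '%s\n' \
        '40GB folder1/subfolder1' \
        '15GB folder1/subfolder2' \
        '10GB folder2/subfolder1'
}

# sum_by_name strips the leading directories from each path and adds the
# size to the running total stored under the remaining name.
sum_by_name() {
    awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }
         END { print sizes["subfolder1"], sizes["subfolder2"] }'
}

sample_du | sum_by_name
```

This prints 50 15: the two subfolder1 entries (40 and 10) are added together, while subfolder2 only appears once.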
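awk magically takes care of a few things for us, like converting 15GB to the number 15 (when a string is used in arithmetic, awk uses its leading numeric part). Now our array looks like this:

    sizes[subfolder2] -> 15
    sizes[subfolder1] -> 50

which shows the totals we're looking for. Now, how do we display this? We can loop through the array and print out the keys and values like this: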
    du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }
              END { for (dir in sizes) print dir, sizes[dir], "GB" }'

and our results are
    subfolder1 50 GB
    subfolder2 15 GB

Edit: here are the results using the du output in the updated question.

    nr106_0060 2 GB
    nr106_0050 3 GB
    nr106_0045 2 GB
    nr106_0040 2 GB
    nr106_0035 2 GB
    nr106_0030 2 GB
    nr106_0020 2 GB
    nr106_0010 2 GB
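Since the question asks for a sorted .csv, here is one possible way to finish it off. Treat it as a sketch, not the definitive solution: sum_dirs_csv is a hypothetical helper name, the "name,total" column layout is my assumption about what the CSV should look like, and it assumes the paths contain no whitespace (awk splits fields on whitespace, so a name like "subfolder 3" would be cut short).

```shell
# sum_dirs_csv (hypothetical helper) reads du-style lines ("<size> <path>")
# on stdin and writes "name,total" rows, largest total first.
# Assumption: paths contain no whitespace.
sum_dirs_csv() {
    awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }
         END { for (dir in sizes) print dir "," sizes[dir] }' |
    sort -t, -k2,2nr -k1,1    # numeric, descending on the total; name breaks ties
}

# Try it on a few lines taken from the du output in the question.
printf '%s\n' \
    '2GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0050' \
    '1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0060' \
    '1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0050' \
    '1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0060' |
sum_dirs_csv
```

This prints nr106_0050,3 followed by nr106_0060,2. With real data you would pipe your du command into sum_dirs_csv and redirect the result to your .csv file. The sort is a separate step because awk's for (dir in sizes) loop does not visit the keys in any guaranteed order.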