linux - BASH: Sum size of same name directories
First of all, I'm a bash noob, so please be gentle :)
I am trying to sum the sizes of folders that live in different places but have the same name. The layout looks like this:
root
--- directory 1
------ folder 1
-------- subfolder 1
-------- subfolder 2
------ folder 2
-------- subfolder 3
-------- subfolder 4
------ folder 3
-------- subfolder 5
-------- subfolder 6
--- directory 2
------ folder 1
-------- subfolder 1
-------- subfolder 2
------ folder 2
-------- subfolder 3
-------- subfolder 4
------ folder 3
-------- subfolder 5
-------- subfolder 6
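I am trying to sum the sizes of subdirectories 1 to 6 and output the result to a .csv file.
At the moment I am outputting the sizes of the subdirectories to 2 separate csv files: 1 for directory 1 and 1 for directory 2.
This is what I currently have to output the sizes of the subfolders; I run it wherever I need them: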
du -h --max-depth=1 --block-size=GB * | grep "[\/]" | sort -n -r > ~/lists/disks/rc_job.csv
The output is:
40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder3
...
I have 1 output for directory 1 and 1 for directory 2. I would like to sum the sizes of the subfolders across directory 1 and 2, and have an output that looks like this:
60GB subfolder1
25GB subfolder2
10GB subfolder3
where subfolder1 is directory1/folder1/subfolder1 + directory2/folder1/subfolder1.
This is my first post here, so I do not know if this is enough info. I'd be pleased to provide more if necessary. I'm pretty sure this can be done with awk, but I haven't used it yet.
Cheers!
Edit, to answer the question in the comments:
(Part of the) output of du -h /net/rcq-rp/job/rcq/vault/image/film /net/rcq-rp/job/rcq/film --max-depth=1 --block-size=GB * :
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0010
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0020
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0030
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0035
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0040
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0045
2GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0050
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0060
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0010
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0020
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0030
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0035
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0040
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0045
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0050
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0060
Ideally the final output would be:
2GB nr106_0010
etc...
One way to do this is with an associative array. An associative array maps a series of keys to values, for example:
directory1 -> 10 GB
directory2 -> 12 MB
directory3 -> 40 KB
The keys in an associative array must be unique. That's great! The paths to our directories are unique, so let's put them in an associative array. I'll show how in awk, but plenty of other languages have associative arrays (like Perl, which calls them hashes).
du | awk '{ val = $1; dir = $2; sizes[dir] = val }'
(I took out the arguments you pass to du for simplicity.)
What does this do? awk reads the output of du line by line; for each line, it adds an element to the associative array sizes, with the directory name as the index and the size as the value. If our original input looked like this
40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder1
our array would look like this:
sizes[folder1/subfolder1] -> 40GB
sizes[folder1/subfolder2] -> 15GB
sizes[folder2/subfolder1] -> 10GB
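To try the array-building step without touching a real disk, here is a minimal sketch that feeds the three sample lines to awk through printf instead of du. The function name build_sizes is made up for illustration, and the END block exists only to print the array. The output goes through sort because awk does not promise any particular order for a for-in loop.

```shell
# build_sizes: hypothetical helper (not from the original answer).
# Reads "size path" lines on stdin, stores them in the sizes array,
# then prints one "path -> size" line per array entry.
build_sizes() {
  awk '{ val = $1; dir = $2; sizes[dir] = val }
       END { for (dir in sizes) print dir, "->", sizes[dir] }' | sort
}

# Sample input standing in for real du output.
printf '%s\n' \
  '40GB folder1/subfolder1' \
  '15GB folder1/subfolder2' \
  '10GB folder2/subfolder1' | build_sizes
```

Each full path is its own key at this point, so nothing has been merged yet.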
But in our final output we only want to see the values for the subdirectories. awk has functions for string manipulation, so let's tweak our code to strip off the leading directories:
du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] = val }'
The sub function strips off everything from the beginning of the path through the last /. Now our array looks like this:
sizes[subfolder2] -> 15GB
sizes[subfolder1] -> 10GB
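As a quick check of what that regular expression does, the sketch below applies the same sub call to a few paths. strip_dirs is a hypothetical name and the sample paths are only illustrative. Because .* is greedy, the match runs up to the last slash, and a path with no slash at all is left unchanged.

```shell
# strip_dirs: hypothetical helper that prints only the last path component
# of the first field on each input line, using the same sub() call as above.
strip_dirs() {
  awk '{ dir = $1; sub(/^.*\//, "", dir); print dir }'
}

# A two-level path, a deeper path, and a bare name with no slash.
printf '%s\n' 'folder1/subfolder1' 'a/b/c/subfolder2' 'subfolder3' | strip_dirs
```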
Great! Now we have values for just the subdirectories. But there's 1 little problem: the values aren't totals. Since we had more than 1 subdirectory named subfolder1, we overwrote the first value (40GB) with the second one (10GB). When we run into an index that already exists in our array, we want to add the value to the existing value:
du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }'
(I changed sizes[dir] = val, which uses assignment, to sizes[dir] += val, which adds val to whatever is already in sizes[dir].)
awk magically takes care of a few things for us, like converting 15GB to the number 15. Now our array looks like this:
sizes[subfolder2] -> 15
sizes[subfolder1] -> 50
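The conversion is awk's ordinary string-to-number rule: when a string is used in arithmetic, awk reads the leading numeric part and ignores the rest, so the unit suffix is simply dropped. The small sketch below (to_number is a hypothetical helper) makes that visible. It also suggests why the totals only make sense when every line uses the same unit, which is what the --block-size option arranges.

```shell
# to_number: hypothetical helper that forces the first field of each line
# into a number by adding 0, the same coercion that += triggers.
to_number() {
  awk '{ print $1 + 0 }'
}

# The unit suffix is ignored, so 512MB would be counted as 512, not 0.512.
# A string with no leading digits converts to 0.
printf '%s\n' '15GB' '40GB' '512MB' 'abc' | to_number
```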
which shows the totals we're looking for. Now, how do we display this? We can loop through the array and print out the keys and values like this:
du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val } END { for (dir in sizes) print dir, sizes[dir], "GB" }'
And our results are
subfolder1 50 GB
subfolder2 15 GB
Edit: here are the results using the du output in the updated question.
nr106_0060 2 GB
nr106_0050 3 GB
nr106_0045 2 GB
nr106_0040 2 GB
nr106_0035 2 GB
nr106_0030 2 GB
nr106_0020 2 GB
nr106_0010 2 GB
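Since the question asked for a sorted .csv file, one possible way to finish the pipeline is sketched below. The layout (name,size with the largest total first) is an assumption, not something spelled out in the question, and sum_by_name is a hypothetical function name. The sample lines are taken from the du output in the question and fed in through printf so the sketch runs without the real directories.

```shell
# sum_by_name: hypothetical helper. Reads "size path" lines on stdin,
# totals the sizes per last path component, and prints "name,total" rows
# sorted by total (largest first), then by name.
sum_by_name() {
  awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }
       END { for (dir in sizes) printf "%s,%d\n", dir, sizes[dir] }' |
    sort -t, -k2,2nr -k1,1
}

# Four lines from the question's du output, two per directory name.
printf '%s\n' \
  '1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0010' \
  '2GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0050' \
  '1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0010' \
  '1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0050' | sum_by_name
```

With real data, the du command from the question would be piped into sum_by_name in place of printf, with the result redirected to the .csv file. Like the one-liners above, this sketch takes only the second whitespace-separated field as the path, so directory names containing spaces would need different handling.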