linux - BASH: Sum size of same-name directories


First of all, I'm a bash noob, so please be gentle :)

I'm trying to sum the sizes of folders that live in different places but have the same name. The layout looks like this:

root
--- directory 1
------ folder 1
--------subfolder 1
--------subfolder 2
------ folder 2
--------subfolder 3
--------subfolder 4
------ folder 3
--------subfolder 5
--------subfolder 6
--- directory 2
------ folder 1
--------subfolder 1
--------subfolder 2
------ folder 2
--------subfolder 3
--------subfolder 4
------ folder 3
--------subfolder 5
--------subfolder 6

I'm trying to sum the sizes of subfolders 1 to 6 and output the result to a .csv.

At the moment I'm outputting the sizes of the subdirectories to 2 separate CSV files: 1 for directory 1 and 1 for directory 2.

At the moment I have this to output the sizes of the subfolders, which I run wherever I need them:

du -h --max-depth=1 --block-size=GB * | grep "[\/]" | sort -n -r > ~/lists/disks/rc_job.csv

The output:

40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder3
...

So I have 1 output for directory 1 and 1 for directory 2. I'd like to sum the sizes of the subfolders in directory 1 and 2, and have an output that looks like this:

60GB subfolder1
25GB subfolder2
10GB subfolder3

where subfolder1 = directory1/folder1/subfolder1 + directory2/folder1/subfolder1

This is my first post here, so I don't know if this is enough info. I'd be pleased to provide more if necessary. I'm pretty sure this can be done with awk, but I haven't used it yet.

Cheers!

Edit, to answer the question in the comments:

(Part of the) output of du -h /net/rcq-rp/job/rcq/vault/image/film /net/rcq-rp/job/rcq/film --max-depth=1 --block-size=GB * :

1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0010
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0020
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0030
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0035
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0040
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0045
2GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0050
1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0060
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0010
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0020
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0030
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0035
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0040
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0045
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0050
1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0060

Ideally, the final output would be:

2GB nr106_0010
etc...

One way is to use an associative array. An associative array maps a series of keys to values, for example:

directory1 -> 10 GB
directory2 -> 12 MB
directory3 -> 40 KB

The keys in an associative array must be unique. That's great! The paths to our directories are unique too, so let's put them in an associative array. I'll show how to do this in awk, but plenty of other languages have associative arrays (like Perl, which calls them hashes).

du | awk '{ val = $1; dir = $2; sizes[dir] = val }' 

(I took out the arguments you pass to du, for simplicity.)

What does this do? awk reads the output of du line by line; for each line, it adds an element to the associative array sizes, with the directory name as the index and the size as the value. If our original input looked like this

40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder1

our array would look like this:

sizes[folder1/subfolder1] -> 40GB
sizes[folder1/subfolder2] -> 15GB
sizes[folder2/subfolder1] -> 10GB
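The one-liner above fills the array but never prints it. As a quick way to check what ended up in it, here is a small sketch that feeds the three sample lines in with printf (standing in for real du output) and dumps the array at the end. The variable names sample and stored are only for illustration.

```shell
# Canned input standing in for real du output: "<size> <path>" per line.
sample='40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder1'

# Store each size under its full path, then dump the array once all input is read.
# The loop order is not specified by awk, so sort makes the listing stable.
stored=$(printf '%s\n' "$sample" |
  awk '{ val = $1; dir = $2; sizes[dir] = val }
       END { for (dir in sizes) print dir, sizes[dir] }' |
  sort)

printf '%s\n' "$stored"
```

Each full path is a distinct key here, so nothing has been merged yet.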

But in our final output we only want to see the values for the subdirectories. awk has functions for string manipulation, so let's tweak our code to strip off the leading directories:

du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] = val }' 

The sub function strips off everything from the beginning of the path through the last /. Now our array looks like this:

sizes[subfolder2] -> 15GB
sizes[subfolder1] -> 10GB
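To see the stripping step on its own, here is a sketch that runs the same sub() call over the sample lines and prints only the resulting key. The printf input is again a stand-in for du output, and the variable names are made up for the example.

```shell
# Canned input standing in for real du output: "<size> <path>" per line.
sample='40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder1'

# The regex /^.*\// is greedy, so it matches from the start of the string
# up to and including the LAST slash; replacing it with "" leaves the final
# path component only.
keys=$(printf '%s\n' "$sample" |
  awk '{ dir = $2; sub(/^.*\//, "", dir); print dir }')

printf '%s\n' "$keys"
```

Note that $2 is the second whitespace-separated field, so this sketch assumes the paths contain no spaces.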

Great! We have values for just the subdirectories. But there's 1 little problem: the values aren't totals. Since we had more than 1 subdirectory named subfolder1, we overwrote the first value (40GB) with the second one (10GB). When we run into an index that already exists in our array, we want to add the new value to the existing value:

du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }' 

(I changed sizes[dir] = val, which uses assignment, to sizes[dir] += val, which adds val to whatever is already in sizes[dir].)
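The difference between the two operators is easy to check side by side. The sketch below runs both variants over the same canned lines and prints what each one leaves in sizes["subfolder1"]. The variable names are hypothetical, and the input stands in for du output.

```shell
# Canned input standing in for real du output: "<size> <path>" per line.
sample='40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder1'

# Plain assignment: the second subfolder1 line replaces the first.
overwritten=$(printf '%s\n' "$sample" |
  awk '{ dir = $2; sub(/^.*\//, "", dir); sizes[dir] = $1 }
       END { print sizes["subfolder1"] }')

# Addition: += uses each size as a number, so 40GB and 10GB add up as 40 + 10.
summed=$(printf '%s\n' "$sample" |
  awk '{ dir = $2; sub(/^.*\//, "", dir); sizes[dir] += $1 }
       END { print sizes["subfolder1"] }')

printf 'assignment: %s\naddition: %s\n' "$overwritten" "$summed"
```

Because the addition only looks at the leading number, the unit suffix is ignored. Mixing units (say MB and GB lines) would give meaningless totals, so it seems safest to keep du on a single block size.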

awk magically takes care of a few things for us, such as converting 15GB to the number 15. Now our array looks like this:

sizes[subfolder2] -> 15
sizes[subfolder1] -> 50

That shows the totals we're looking for. Now, how do we display this? We can loop through the array and print out the keys and values like this:

du | awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }
          END { for (dir in sizes) print dir, sizes[dir], "GB" }'

And our results are

subfolder1 50 GB
subfolder2 15 GB
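If you want to try this without touching a real disk, the sketch below replays the sample lines through the same awk program. awk does not specify the order in which a for (dir in sizes) loop visits keys, so the output is piped through sort here to make it predictable. The variable names are only for illustration.

```shell
# Canned input standing in for real du output: "<size> <path>" per line.
sample='40GB folder1/subfolder1
15GB folder1/subfolder2
10GB folder2/subfolder1'

# Sum sizes per final path component, print them once all input is read,
# then sort so the line order does not depend on awk's internal hashing.
totals=$(printf '%s\n' "$sample" |
  awk '{ val = $1; dir = $2; sub(/^.*\//, "", dir); sizes[dir] += val }
       END { for (dir in sizes) print dir, sizes[dir], "GB" }' |
  sort)

printf '%s\n' "$totals"
```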

Edit: here are the results using the du output in the updated question.

nr106_0060 2 GB
nr106_0050 3 GB
nr106_0045 2 GB
nr106_0040 2 GB
nr106_0035 2 GB
nr106_0030 2 GB
nr106_0020 2 GB
nr106_0010 2 GB
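Since the question asks for a .csv, here is one possible way to wrap the same idea in a function that writes name,total lines, largest first. The function name sum_by_basename is hypothetical, and the input is a few lines copied from the du output in the question rather than a live du run.

```shell
# Hypothetical helper: reads "<size> <path>" lines on stdin and writes
# "name,total" CSV lines, sorted by total (descending) and then by name.
sum_by_basename() {
  awk '{ dir = $2; sub(/^.*\//, "", dir); sizes[dir] += $1 }
       END { for (dir in sizes) printf "%s,%d\n", dir, sizes[dir] }' |
    sort -t, -k2,2nr -k1,1
}

# A few lines taken from the question's du output, used as canned input.
csv=$(printf '%s\n' \
  '2GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0050' \
  '1GB /net/rcq-rp/job/rcq/vault/image/film/nr106/nr106_0060' \
  '1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0050' \
  '1GB /net/rcq-rp/job/rcq/film/nr106/nr106_0060' |
  sum_by_basename)

printf '%s\n' "$csv"
```

With real data you would pipe your own du command into the function and redirect the result, for example to ~/lists/disks/rc_job.csv. Keep in mind that du also prints lines for the parent directories it was given, and those would show up as rows of their own unless you filter them out first.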
