Apache Pig - I would like to transform a map into a field in a Pig Latin script
The description of the tuples of relation (A) is as follows: {a: int, b: int, c: map[]}. The map contains exactly one chararray key, which is not predictable. For example, a sample of the tuples is:
    (1, 100, [key.152#hello])
    (8, 110, [key.3000#bonjour])
    (5, 103, [key.1#hallo])
    (5, 103, [])
    (8, 104, [key.11#buenosdias])
    ...
I would like to transform relation (A) into a relation B whose description would be: {a: int, b: int, c: chararray}
With the sample above, this would give:
    (1, 100, hello)
    (8, 110, bonjour)
    (5, 103, hallo)
    (8, 104, buenosdias)
    ...
(I want to filter out the empty maps too.)
Any ideas?
Thank you.
Though writing a UDF is the right solution, if you want a quick hack, the following solution using a regex might help.
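    -- Load the map column as plain text (the default PigStorage loader expects tab-separated fields).
    A = LOAD 'sample.txt' AS (a:int, b:int, c:chararray);
    -- Split "[key#value]" at the '#': the part after it is "value]".
    B = FOREACH A GENERATE a, b, FLATTEN(STRSPLIT(c, '#', 2)) AS (key:chararray, value:chararray);
    -- Split again at ']' to drop the closing bracket.
    C = FOREACH B GENERATE a, b, FLATTEN(STRSPLIT(value, ']', 2)) AS (value:chararray, ignore:chararray);
    -- Rows with an empty map "[]" have no value, so they are dropped here.
    D = FILTER C BY value IS NOT NULL;
    E = FOREACH D GENERATE a, b, value;
    STORE E INTO 'output/E';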
For the sample input

    1 100 [key.152#hello]
    8 110 [key.3000#bonjour]
    5 103 [key.1#hallo]
    5 103 []
    8 104 [key.11#buenosdias]

the above code produces the following output:

    1 100 hello
    8 110 bonjour
    5 103 hallo
    8 104 buenosdias
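Since the answer says a UDF is the right solution, here is a minimal sketch of the core logic such a UDF might contain. The class name `FirstMapValue` and the method `firstValue` are hypothetical, and the sketch assumes the map reaches Java as a `Map<String, Object>` with at most one entry, as described in the question. It is plain Java without the Pig dependency, so it only illustrates the idea and is not a drop-in UDF.

```java
import java.util.Map;

// Hypothetical helper: the logic a Pig UDF could apply to the map field.
final class FirstMapValue {

    private FirstMapValue() {
    }

    // Returns the value of the first entry as a string, or null when the
    // map is null or empty, so that such rows can be filtered out later.
    static String firstValue(Map<String, Object> map) {
        if (map == null || map.isEmpty()) {
            return null;
        }
        Object value = map.values().iterator().next();
        return value == null ? null : value.toString();
    }
}
```

In a real UDF, this logic would presumably sit in the `exec(Tuple input)` method of a class extending `org.apache.pig.EvalFunc<String>`, with the map read from `input.get(0)`. Rows for which it returns null could then be removed with a `FILTER ... BY ... IS NOT NULL` step, as in the script above.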