Converting a nested JSON into csv with jq.md
May 19, 2020 ยท View on GitHub
Query: Converting a nested JSON into csv with jq
I would like to convert the following JSON into a csv format using jq. I know there are tons of similar questions but I could not figure it out based on them.
{
"one": {
"firstName": "John",
"lastName": "Smith",
"gender": "man",
"age": 32,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021"
}
},
"two": {
"firstName": "Johnny",
"lastName": "Smithy",
"gender": "man",
"age": 33,
"address": {
"streetAddress": "22 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021"
}
}
}
The output should look like the following. I'm struggeling with the nested object as value of the address key.
number,firstName,lastName,gender,age,streetAddress,city,state,postalCode
one,John, Smith,man, 32, 21 2nd Street, New York, NY, 10021
two, Johnny, Smith,man, 33, 22 2nd Street, New York, NY, 10021
The Best I could do is the foillowing but it does not come close... Your help is much apreciated
jq --raw-output 'to_entries | map_values({ job: .key } + .value )'
A:
in this qeury, the best approach using jtc
would be to linearize each record and then template-interpolate it:
bash $ hdr='"number,firstName,lastName,gender,age,streetAddress,city,state,postalCode"'
bash $
bash $ <file.json jtc -w[:] -pi[:][address] -T'{{}}' /\
-w' ' -qqT"$hdr" -w'[:]<L>k' -T'"{L},{$c},{$e},{$d},{$a},{$h},{$b},{$g},{$f}"'
number,firstName,lastName,gender,age,streetAddress,city,state,postalCode
one,John,Smith,man,32,21 2nd Street,New York,NY,10021
two,Johnny,Smithy,man,33,22 2nd Street,New York,NY,10021
bash $
Explanation:
- the firs option set linearizes each of the top objects by moving into the destination points
addressobjects:
bash $ <file.json jtc -w[:] -pi[:][address] -T'{{}}'
{
"one": {
"age": 32,
"city": "New York",
"firstName": "John",
"gender": "man",
"lastName": "Smith",
"postalCode": "10021",
"state": "NY",
"streetAddress": "21 2nd Street"
},
"two": {
"age": 33,
"city": "New York",
"firstName": "Johnny",
"gender": "man",
"lastName": "Smithy",
"postalCode": "10021",
"state": "NY",
"streetAddress": "22 2nd Street"
}
}
-w[:]these insertion (destination) points-pi[:][address] -T'{{}}': moves (insert and then purge) eachaddressobject into destination points. Though it's a bit tricky: it requires a template{{}}- why? If there wasn't such template (which represents the entireaddressobject) thenaddressobject gets inserted into the iterable whereaddressobject already exists, so no actual insertion occurs. If such template applied - then insertion occurs of the standalone "lableless" object (thus, such template anonymizes the object) - and in such case the standalone object gets fully inserted.- the second option set:
-w' ' -T"$hdr": prints the header (walk-w' 'is dummy) via template-w'[:]<L>k' -T'"{L},{$c},{$e},{$d},{$a},{$h},{$b},{$g},{$f}"': walks each object ([:]) and memorizes its label (<L>k) into the namespaceL, then template-interpolates each record using auto-tokens and the namespaceLin the required order-qqdrops the outer quotation marks of the results