Extracting Values from a JSON string

Data Pipeline

Say you have a JSON string from which you’d like to extract specific values. This can be accomplished in Chartio’s data pipeline, adhering to SQLite syntax and the JSON_EXTRACT function presented in this article.

A non-nested JSON string extraction example

We have a table chart showcasing our JSON string column:

JSON string column

We want to extract our device id, yet, are puzzled on how to accomplish pulling this number out. This is where we turn to the data pipeline to help transform the current output. First, we have to utilize an add column step, title it, in this case Device Id, choose Custom formula, then use the function JSON_EXTRACT:

JSON

The formula can be broken down into two parts: First, we add the column containing the JSON string between double quotes and add a comma. Then, we set the path using single quotes. The path will include a $ sign (which gives us access into the JSON row), a period, and the label to the left of the value we want to pull. Because we chose device_id, we extracted the values to the right of “device_id”: in the string, giving us the output below:

Device

From here, we can continue to extract other values we’d want. If we wanted to extract countries, we would follow the same logic:

Country extract

Giving us the second column values we’ve extracted:

Country extract output

A nested JSON string extraction walkthrough

Our JSON string wasn’t nested in the previous example, if the values were, we would have to change our approach a bit. Let’s say we want to extract the same country values as our last example, but now, our JSON string is nested, indicated by the brackets (Note: we’ve changed the name of the column that has our JSON string to Device2):

Nesting the JSON string

Our function would now be:

Nested string output

If we were to use the previous formula, this would pull all the information between the brackets to the right of “deviceInformation” and our new column would yield:

{”device_id”: 0, “device_type”: “sensor-ipad”, “ip”: “68.161.225.1”, “cca3”: “USA”, “cn”: “United States”}

The 0 inside the bracket for our function’s path lets us pull values inside the first bracket of deviceInformation, where cn has the country value to the right of its colon.

The function path JSON_EXTRACT(“JSON Column”, ‘$.deviceInformation[0].cn’) outputs the country information in each row:

Extraction output

Another nested JSON string extraction walkthrough

Now, lets say our JSON string is nested even further, it may look something like this:

{“deviceInformation”:[{”device_id”: 0, “device_type”: “sensor-ipad”, “ip”: “68.161.225.1”, “countryInformation”:[{”cca3”: “USA”, “cn”: “United States”}]}], “temp”: 25, “signal”: 23, “battery_level”: 8, “c02_level”: 917, “timestamp” :1475600496 }

To extract the country values again, we would once again change our path in the add column step:

JSON_EXTRACT(“JSON Column”, ‘$.deviceInformation[0].countryInformation[0].cn’)

The first part of the function, $.deviceInformation[0] brings in:

{”device_id”: 0, “device_type”: “sensor-ipad”, “ip”: “68.161.225.1”, “countryInformation”:[{”cca3”: “USA”, “cn”: “United States”}]}

We still need to move past the other bracket to the right of countryInformation, which is the next part of our function’s path:

$.deviceInformation[0].countryInformation[0] brings in:

{”cca3”: “USA”, “cn”: “United States”}

Removing the brackets has given us access to easily extract the country values. The final step of our path is calling on the name that contains these values, cn, which would be:

$.deviceInformation[0].countryInformation[0].cn

The function path JSON_EXTRACT(“JSON Column”, ‘$.deviceInformation[0].countryInformation[0].cn’) would give an output of: United States.


Related Data Pipeline Help Articles

See more