Parsing Dynamic JSON Data in Power Query: A Quick Coding Guide

blog image

Learn how to efficiently convert any JSON data into a fully expanded table format using Power Query, reducing the need for custom code and streamlining your data processing.

Introduction: Understanding JSON Data and Power Query 

JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. It’s widely used in web services and APIs to exchange data between clients and servers. JSON’s structure typically includes nested arrays and objects, making it flexible for various types of data but also challenging to work with when it comes to data transformation and analysis. 

Power Query is a data connection technology that enables you to discover, connect, combine, and refine data across a wide variety of sources. It is integrated into Microsoft products like Excel and Power BI, providing users with a powerful tool to transform and prepare data for analysis. Power Query’s ability to automate data transformation tasks and handle large datasets makes it an essential tool for business intelligence professionals. 


The Challenge with JSON Data 

JSON (JavaScript Object Notation) is a common data format, especially in business contexts where data is exchanged between systems. While JSON’s nested structure is useful for organizing data, it can be challenging to work with tools like Power BI, where a flat table format is often required for analysis. Traditionally, handling varied JSON structures involves writing custom code for each case, which can be time-consuming and difficult to maintain.

To address this, we developed a Power Query solution that dynamically converts any JSON data into a fully expanded table format. This blog explains how the solution works, provides the code, and highlights the key logic behind it. 


The Problem: Handling Varied JSON Structures 

JSON files often contain nested records and lists, making it difficult to transform them into a usable table format. Manually writing different code for each variation of JSON data can be tedious and error-prone, especially when dealing with data from multiple sources or systems. 

Our goal was to create a solution that could handle any JSON structure, automatically expanding it into a flat table without the need for constant code adjustments. 


The Solution: Dynamic JSON Parsing with Power Query 

We created a two-part Power Query script—JsonParser and ColumnExpander—that dynamically parses and expands JSON data into a table. Here’s a brief overview of how each script works: 

JsonParser: Loading and Parsing JSON Data 

The JsonParser script is responsible for loading the JSON data from a SharePoint site, parsing it, and preparing it for expansion. 

let 
JsonParser = (FileNameWithoutExtension as text) =>
let
JsonFileName = FileNameWithoutExtension & ".json",
SharePointFilesList = SharePoint.Files("https://YourSharepointSiteURL/", [ApiVersion = 15]),
FileContent = SharePointFilesList{[Name=JsonFileName, #"Folder Path"="https://YourSharepointFolderURL"]}[Content],
JsonParsedData = Json.Document(FileContent, 1252),
DataAsTable = if JsonParsedData is record then Record.ToTable(JsonParsedData) else if JsonParsedData is list then Table.FromList(JsonParsedData, Splitter.SplitByNothing(), null, null, ExtraValues.Error) else null,
ExpandedData = if DataAsTable <> null then ColumnExpander(DataAsTable, 0) else null
in
ExpandedData
in
JsonParser
Key Points:
  • File Loading and Parsing: The script loads the specified JSON file from SharePoint and parses it into a table. It can handle both record and list structures. 
  • Initial Expansion: If the parsed data is valid, the script initiates the expansion process using the ColumnExpander function. 

ColumnExpander: Expanding All Columns 

The ColumnExpander script is designed to recursively expand any nested lists or records in the table, ensuring that the JSON data is fully flattened into a table format. 

let 
  ColumnExpander = (TableToExpand as table, ColumnToExpand as number) => 
    let 
      CurrentColumnIndex = ColumnToExpand, 
      CurrentColumnName = Table.ColumnNames(TableToExpand){CurrentColumnIndex}, 
      CurrentColumnContents = Table.Column(TableToExpand, CurrentColumnName), 
      HasRecords = List.Contains(List.Transform(CurrentColumnContents, each _ is record), true), 
      HasLists = List.Contains(List.Transform(CurrentColumnContents, each _ is list), true), 
      CanExpandColumn = HasRecords or HasLists, 
      IntermediateTable = 
        if CanExpandColumn then 
          if HasRecords then 
            let 
              RecordFieldsToExpand = List.Distinct( 
                List.Combine( 
                  List.Transform( 
                    CurrentColumnContents, 
                    each if _ is record then Record.FieldNames(_) else {} 
                  ) 
                ) 
              ), 
              NewExpandedColumnNames = List.Transform( 
                RecordFieldsToExpand, 
                each CurrentColumnName & "." & _ 
              ) 
            in 
              Table.ExpandRecordColumn( 
                TableToExpand, 
                CurrentColumnName, 
                RecordFieldsToExpand, 
                NewExpandedColumnNames 
              ) 
          else if HasLists then 
            Table.ExpandListColumn(TableToExpand, CurrentColumnName) 
          else 
            null 
        else 
          TableToExpand, 
      FullyExpandedTable = 
        if CanExpandColumn then 
          ColumnExpander(IntermediateTable, CurrentColumnIndex) 
        else if CurrentColumnIndex = (Table.ColumnCount(IntermediateTable) - 1) then 
          IntermediateTable 
        else 
          ColumnExpander(IntermediateTable, CurrentColumnIndex + 1) 
    in 
      FullyExpandedTable 
in 
  ColumnExpander
Key Points:
  • Recursive Expansion: The script checks each column for nested records or lists and expands them accordingly. 
  • Flexibility: The script continues to expand columns until all nested data is fully flattened, regardless of the depth or complexity of the original JSON structure. 

Why This Solution Matters 

This Power Query solution simplifies the process of working with JSON data. Instead of writing new code for each variation in your data, you can use these scripts to automatically flatten any JSON structure into a table. This saves time, reduces errors, and streamlines your data processing workflow. 


Conclusion 

Transforming JSON data into a usable format doesn’t have to be complicated. With the JsonParser and ColumnExpander scripts, you can automate the process of expanding JSON data into a table, allowing you to focus on what really matters—analyzing and using your data. 

If your business is dealing with complex JSON data and needs a reliable, scalable solution, contact us today to learn how we can help you streamline your data processing workflows. 


FAQs

Q1: What exactly is this solution designed to do? 

The solution suggested above is designed to dynamically transform JSON data of varying structures into a flat, table format using Power Query. It automates the expansion of nested records and lists within the JSON, eliminating the need for manual adjustments or custom coding for each new JSON structure. 

Q2: How does this solution differ from traditional JSON processing methods 

Traditional JSON processing often involves writing specific code for each unique JSON structure, which can be time-consuming and prone to errors. This solution, however, is flexible and dynamic, allowing it to handle any JSON structure automatically. It simplifies the process by recursively expanding all nested elements, making the data ready for analysis without the need for manual intervention. 

Q3: Can this solution handle any type of JSON data? 

Yes, it can dynamically expand any JSON data, whether it contains nested records, lists, or a mix of both. 

Q4: Is this solution scalable for large datasets? 

Absolutely. Power Query is optimized for handling large datasets efficiently, although performance may vary depending on the complexity of the JSON. 

Q5: How does this compare to other JSON parsing methods in Power BI? 

Most methods require manual adjustments for different JSON structures. This solution automates that process, making it more efficient and reducing the risk of errors. 

Q6: Can these scripts be customized for specific business needs? 

Yes, they can be tailored to meet the specific requirements of your business, whether you need custom naming conventions, additional data transformations, or integration with other systems. 

Leave a Reply

DMCA.com Protection Status

Get a Free Data Analysis Done!

Need experts help with your data? Drop Your Query And Get a 30 Minutes Consultation at $0.

They have the experience and agility to understand what’s possible and deliver to our expectations.

Drop Your Concern!