Tuesday, January 3, 2012

Hierarchy – Reflexive Transitive Closure



I have always wanted to make more use of the list box with the tree view option.  It has several advantages that make it a great way to select dimensions in QlikView:
  •  Ability to fit several selectable fields into a single object.
  •  Naturally shows the hierarchal relationship for the displayed fields.
  •  Intuitive to navigate for anybody who has come within 100 yards of a computer.






But many people, including myself have shied away from using the Tree View option because it is hard to get right.  What developers quickly realize is that selecting a non-leaf parent does not associate to the children of that parent.  This is because the QlikView function, Hierarchy, creates an expanded nodes table, but does not create the optimal Reflexive Transitive Closure style of this table.

Expanded Nodes table using Hierarchy function alone:
NodeID
Name1
Name2
Path
Depth
1
Agatha

Agatha
1
2
Agatha
Aaron
Agatha/Aaron
2
3
Agatha
Adam
Agatha/Adam
2

In above table if Agatha has no sales because she is the manager, when someone selects her in the Tree View list box the sales for children Aaron and Adam will not appear.  In addition, this behavior is even more confusing because the path changes depending on whether Agatha is expanded or not.  This results in the association sometimes working and sometimes not working.

Reflexive Transitive Closure:
NodeID
Name1
Name2
Path
Depth
2
Agatha
Aaron
Agatha
1
2
Agatha
Aaron
Agatha/Aaron
2
3
Agatha
Adam
Agatha
1
3
Agatha
Adam
Agatha/Adam
2

In this version, the association will work correctly every time.  This is because the NodeID (linking back to your transactions) has extra rows to account for the changing path as one clicks higher in the hierarchy.  Only the leaf nodes (NodeIDs existing in transactions) need to be included in the table.

Now the question is how would you create this table.  I have created a downloadable example with the actual script but here are the steps.
  • Load in transactions first because they will be used to identify the leaf nodes.
  • Use an exists(NodeID) or a join to add an IsLeaf flag to your adjacent nodes data.  You can load the adjacent nodes table using exists in one step.  I chose to use a join after the adjacent nodes load to keep the solution portable.
  • Use the Hierarchy function to flatten your adjacent nodes table into an expanded nodes table.  You will want to use the Path option and remember to leave in your IsLeaf field.  This leaves you with the traditional hierarchy table that includes all nodes. 
  • We only need the leaf nodes so now we resident load this table into a new table, only loading the leaf nodes and sorting by the depth of the levels. 
  • Now we create a looping structure going from the deepest level and working up to the highest parents.  Each loop adds records up the hierarchy so that a child has a corresponding path all the way up.  The loop will run for as many levels as you have minus 1. 
  • Then just drop any of the temporary tables you created and any extraneous fields not needed in the final data structure.







So this will create an expanded nodes table with reflexive transitive closure.  There are still some issues with the way QlikView handles this:
  •          You will want to select Suppress Horizontal Scroll Bar in the list box due to a rendering issue in AJAX. 
  •           There will still be some confusion in selecting because depending on whether the parent is expanded or not, the children might not look selected.  I have not been able to work through that and have not had successful results trying to tie a trigger to it.
  •           The last item is that if you have a model where the parents have transactions as well as the children (different levels), you cannot isolate the parent’s transactions in the list box, since we have forced the association to the children.  The solution here would be to either go back to the traditional hierarchy table or to use an additional list box to select a single parent.  I mostly find this to not be an issue since usually the transactions are only at the lowest level.




Please enjoy and I look forward to any comments or additional analysis on this topic.

2 comments:

  1. how can I view the network in graphics mode similar to an organizational chart of a company? I've tried to do with google api but it still fails.

    ReplyDelete
  2. I'm still new to QV and don't quite get how this works. It seems to load MUCH faster than using a character-delimited string in a pre-built table, besides being able to select each node (parent/child).

    I have a set of data that the user would like to have in a drilldown or tree structure. Date / Error Type / Error Cat / Error Detail. The date doesn't have to be in the tree (I hope), since it's selected from a dropdown, although it is in the data table. Each Error Detail is only associated with one Error Category; each Error Category is only associated with one Error Type, but there can be multiple Error Types associated with each Date.

    The tree table (a cut down version of a 14M-row table) has each of those fields, plus the delimited field. What would be the best way to go about this?

    Thanks,
    -- Matt

    ReplyDelete