My end goal is to create a tree visualization from a hierarchical JSON file using D3.js.

The hierarchy I need to represent is shown in this diagram: A has children B, C, and D; B has children E, F, and G; C has children H and I; and D has no children. The nodes will have multiple key:value pairs. I've only listed 3 for simplicity.

                             -- name:E
                            |   type:dkBlue
                            |   id: 005
                            |
                            |-- name:F
            -- name:B ------|   type:medBlue 
            |  type:blue    |   id: 006
            |  id:002       |
            |               |-- name:G
            |                   type:ltBlue
 name:A ----|                   id:007     
 type:colors|
 id:001     |-- name:C  ----|-- name:H
            |   type:red    |   type:dkRed         
            |   id:003      |    id:008
            |               |  
            |               |
            |               |-- name:I
            |                   type:medRed
            |                   id:009
            |-- name:D
                type:green
                id: 004

My source data in R looks like:

nodes <-read.table(header = TRUE, text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")

links <- read.table(header = TRUE, text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

I must convert it to the following JSON:

{"name": "A",
 "type": "colors",
 "id" : "001",
 "children": [
    {"name": "B",
      "type": "blue",
      "id"  : "002", 
      "children": [
          {"name": "E",
           "type": "dkBlue",
           "id"  : "005"},
          {"name": "F", 
           "type": "medBlue",
           "id": "006"},
          {"name": "G", 
           "type": "ltBlue",
           "id": "007"}
    ]},
    {"name": "C",
      "type": "red",
      "id"  : "003", 
      "children": [
          {"name": "H",
           "type": "dkRed",
           "id"  : "008"},
          {"name": "I", 
           "type": "medRed",
           "id": "009"}
    ]},
    {"name": "D",
      "type": "green",
      "id"  : "004"}
]}

I would appreciate any help you may be able to offer!

[UPDATE 2017-04-18]

Based on Ian's references I looked into R's data.tree package. I can recreate my hierarchy if I restructure my data as shown below. Note that I've lost the type of relation (hasSubCat) between each node, the value of which can vary for each link/edge in real life. I am willing to let that go (for now) if I can get a workable hierarchy. The revised data for data.tree:

df <-read.table(header = TRUE, text = "
paths  type     id 
A      colors   001
A/B    blue     002
A/B/E  dkBlue   005
A/B/F  medBlue  006
A/B/G  ltBlue   007
A/C    red      003
A/C/H  dkRed    008
A/C/I  medRed   009
A/D    green    004
")

library(data.tree)

myPaths <- as.Node(df, pathName = "paths")
# ratio of leaf nodes to branch (non-leaf) nodes
myPaths$leafCount / (myPaths$totalCount - myPaths$leafCount)
print(myPaths, "type", "id", limit = 25)

The print displays the hierarchy I sketched out in the original post and even contains the key:values for each node. Nice!

  levelName    type id
1 A          colors  1
2  ¦--B        blue  2
3  ¦   ¦--E  dkBlue  5
4  ¦   ¦--F medBlue  6
5  ¦   °--G  ltBlue  7
6  ¦--C         red  3
7  ¦   ¦--H   dkRed  8
8  ¦   °--I  medRed  9
9  °--D       green  4

Once again I am at a loss for how to translate this from the tree to nested JSON. The example here https://ipub.com/data-tree-to-networkd3/ , like most examples, assumes key:value pairs only on leaf nodes, not branch nodes. I think the answer is in creating a nested list to feed into RJSONIO or jsonlite, but I have no idea how to do that.

Ian Wesley: You may want to look at this: stackoverflow.com/questions/12818864/…
Tim: Hi Ian, the example you cite gets me close, but I am struggling to adapt it to the point where I have the required key:value pairs for each "node" in the tree. The recursive approach in that example only provides key:value pairs for the terminal nodes.
Ian Wesley: Tim, your problem is complicated enough that I would need to hack at it for a bit, and unfortunately I don't have time at the moment. Someone with more skill than me could probably solve it faster. If you are having problems with the recursive approach, another option would be to build a tree from the top down, which is simpler to conceptualize. Here is the vignette for the data.tree package: cran.r-project.org/web/packages/data.tree/vignettes/…. You can add each child and then add attributes for each child by name. You can then export these to JSON using the following:
Ian Wesley: ipub.com/data-tree-to-networkd3. I know this is not a great answer, but I hope it is helpful.
Ian Wesley: 'l <- ToListExplicit(myPaths, unname = TRUE); toJSON(l, pretty = TRUE)' seems to match your JSON format. But I am probably missing something.
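
The last suggestion in the comments can be sketched end to end as follows. This is a minimal sketch rather than a tested solution for real data. It assumes the data.tree and jsonlite packages are installed. Two details are additions that do not appear in the comment: every column is read as character (colClasses = "character") so the IDs keep their leading zeros, and auto_unbox = TRUE is passed so single values are written as plain strings. The names tree, tree_list and tree_json are made up for illustration.

```r
library(data.tree)
library(jsonlite)

# same path-based layout as in the update, read as character columns
df <- read.table(header = TRUE, colClasses = "character", text = "
paths  type     id
A      colors   001
A/B    blue     002
A/B/E  dkBlue   005
A/B/F  medBlue  006
A/B/G  ltBlue   007
A/C    red      003
A/C/H  dkRed    008
A/C/I  medRed   009
A/D    green    004
")

# build the tree from the path column
tree <- as.Node(df, pathName = "paths")

# nested list in which every node carries its fields plus a "children" element
tree_list <- ToListExplicit(tree, unname = TRUE)

# serialize; auto_unbox writes length-one vectors as scalars
tree_json <- toJSON(tree_list, pretty = TRUE, auto_unbox = TRUE)
print(tree_json)
```

Branch nodes such as B and C keep their type and id fields here because ToListExplicit copies the fields of every node, not only those of the leaves.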

3 Answers

3
votes

data.tree is very helpful and probably the better way to accomplish your objective. For fun, I will submit a more roundabout way to achieve your nested JSON using igraph and d3r.

nodes <-read.table(header = TRUE, text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")

links <- read.table(header = TRUE, text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

library(d3r)
library(dplyr)
library(igraph)

# make it an igraph
gf <- graph_from_data_frame(links[,c(1,3,2)],vertices = nodes)

# if we know that this is a tree with root as "A"
#  we can do something like this
df_tree <- dplyr::bind_rows(
  lapply(
    all_shortest_paths(gf,from="A")$res,
    function(x){data.frame(t(names(unclass(x))), stringsAsFactors=FALSE)}
  )
)

# every path starts at the root, so the first column is all "A"; discard it
df_tree <- df_tree[,-1]
# row 1 is the path from the root to itself, which is now all NA, so set it to "A"
df_tree[1,1] <- "A"

# now add node attributes to our data.frame
df_tree <- df_tree %>%
  # let's get the last non-NA in each row so we can join with nodes
  mutate(
    last_non_na = apply(df_tree, MARGIN=1, function(x){tail(na.exclude(x),1)})
  ) %>%
  # now join with nodes
  left_join(
    nodes,
    by = c("last_non_na" = "name")
  ) %>%
  # now remove last_non_na column
  select(-last_non_na)

# use d3r to nest as we would like
nested <- df_tree %>%
  d3_nest(value_cols = c("ID", "type"))
1
votes

Consider walking down the levels iteratively converting dataframe columns to a multi-nested list:

library(jsonlite)
...
# look up one node by name and return its row as a named list
df2list <- function(i) as.list(nodes[nodes$name == i,])

# GRANDPARENT LEVEL
jsonlist <- df2list('A')
# PARENT LEVEL
jsonlist$children <- lapply(c('B','C','D'), df2list)
# CHILDREN LEVEL
jsonlist$children[[1]]$children <- lapply(c('E','F','G'), df2list)
jsonlist$children[[2]]$children <- lapply(c('H','I'), df2list)

toJSON(jsonlist, pretty=TRUE)

However, with this approach you will notice that the one-length elements are enclosed in brackets. Each node has to be a list so that it can hold nested children, and toJSON writes every length-one vector inside a list as a one-element array.

Hence, consider cleaning up the extra brackets with nested gsub calls, which still renders valid JSON:

output <- toJSON(jsonlist, pretty=TRUE)

gsub('"\\]\n', '"\n', gsub('"\\],\n', '",\n', gsub('": \\["', '": "', output)))
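
As an alternative to the gsub cleanup, jsonlite's toJSON has an auto_unbox argument that writes length-one vectors as scalars instead of one-element arrays. A minimal sketch on a single hand-made node (the variables node, boxed and unboxed are made up for illustration):

```r
library(jsonlite)

# one node as a named list of length-one character vectors
node <- list(ID = "005", name = "E", type = "dkBlue")

# default: every length-one vector becomes a one-element array
boxed <- toJSON(node)
print(boxed)
# {"ID":["005"],"name":["E"],"type":["dkBlue"]}

# auto_unbox = TRUE: length-one vectors become plain JSON strings
unboxed <- toJSON(node, auto_unbox = TRUE)
print(unboxed)
# {"ID":"005","name":"E","type":"dkBlue"}
```

Unlike a regular-expression cleanup, this does not depend on how the pretty-printed text happens to be laid out.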

Final Output

{
  "ID": "001",
  "name": "A",
  "type": "colors",
  "children": [
    {
      "ID": "002",
      "name": "B",
      "type": "blue",
      "children": [
        {
          "ID": "005",
          "name": "E",
          "type": "dkBlue"
        },
        {
          "ID": "006",
          "name": "F",
          "type": "medBlue"
        },
        {
          "ID": "007",
          "name": "G",
          "type": "ltBlue"
        }
      ]
    },
    {
      "ID": "003",
      "name": "C",
      "type": "red",
      "children": [
        {
          "ID": "008",
          "name": "H",
          "type": "dkRed"
        },
        {
          "ID": "009",
          "name": "I",
          "type": "medRed"
        }
      ]
    },
    {
      "ID": "004",
      "name": "D",
      "type": "green"
    }
  ]
} 
1
votes

A nice, if a bit difficult to wrap one's head around, way of doing this is with a recursive (self-referential) function, as in the following:

nodes <- read.table(header = TRUE, colClasses = "character", text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")

links <- read.table(header = TRUE, colClasses = "character", text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

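convert_hier <- function(linksDf, nodesDf, sourceId = "startID", 
                         targetId = "endID", nodesID = "ID") {
  # recursively build the nested list for one node and all of its descendants
  makelist <- function(nodeid) {
    # IDs of the direct children of this node
    child_ids <- linksDf[[targetId]][which(linksDf[[sourceId]] == nodeid)]

    # leaf node: return its attributes only
    if (length(child_ids) == 0) 
      return(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]))

    # branch node: attributes plus a "children" list built by recursion
    c(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]), 
      children = list(lapply(child_ids, makelist)))
  }

  # the root is the ID that never appears as a link target (assumes a single root)
  ids <- unique(c(linksDf[[sourceId]], linksDf[[targetId]]))
  rootid <- ids[! ids %in% linksDf[[targetId]]]
  jsonlite::toJSON(makelist(rootid), pretty = TRUE, auto_unbox = TRUE)
}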

convert_hier(links, nodes)

A few notes:

  1. I added colClasses = "character" to your read.table commands so that the ID numbers are not coerced to integers with no leading zeros and so that the strings are not converted into factors.
  2. I wrapped everything in the convert_hier function to make it easier to adapt to other scenarios, but the real magic is in the makelist function.
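
To see the recursion in isolation, here is a stripped-down sketch of the same idea on a toy link table, using base R only. The names toy_links and build are hypothetical and not part of the answer's code; each node carries only its id, so the shape of the recursion is easier to follow.

```r
# toy parent/child table: 001 -> 002, 001 -> 003, 002 -> 004
toy_links <- data.frame(
  startID = c("001", "001", "002"),
  endID   = c("002", "003", "004"),
  stringsAsFactors = FALSE
)

# return a nested list for one node; the function calls itself for each child
build <- function(id) {
  kids <- toy_links$endID[toy_links$startID == id]
  node <- list(id = id)
  # only branch nodes get a "children" element
  if (length(kids) > 0) node$children <- lapply(kids, build)
  node
}

tree <- build("001")
str(tree)
```

The recursion stops on its own at leaf nodes, because a node with no outgoing links never calls build again.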