data.tree 0.2.0 ‘Elder’ on CRAN


Today, the latest version of the data.tree package was published to CRAN. This version has been named Elder because, in Celtic symbolism, the elder tree stands for transition, evolution, and continuation, at least according to this site. Transition, evolution, and continuation. Sigh, beautiful.

Back to the here and now: Thanks a lot to everyone who has made this possible, most notably Timelyportfolio, Vince Nijs, and Holger Jouanne-Diedrich with their helpful questions, suggestions, and testing.

What is data.tree?

The package is an implementation of an ordered, bi-directional tree data structure. If that sounds daunting, you might want to read up on tree structures on Wikipedia. Or focus more on the why: Have you ever found yourself lost in lists-of-lists-of-lists? If so, data.tree offers you a clean, conceptually sound alternative.

In a nutshell, data.tree lets you do the following:

  1. Convert tables to a hierarchic in-memory data structure, aka tree, and back. This is useful for inspecting, printing, plotting, and transforming hierarchic data. For example, with data.tree, it is often very simple to convert a complex JSON structure to a data.frame and vice versa.
  2. Build hierarchic algorithms (machine learning, game complexity, genetics, etc.)
  3. Traverse a tree to collect values, aggregate child values, route, prune, and do other operations (read more on tree traversal here)
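As a minimal sketch of the first and third points, the snippet below builds a tree from a small table, aggregates a value up to the root, and converts the tree back to a table. The toy data and the cost column are made up for illustration, and the calls assume a reasonably recent data.tree; details may differ between versions.

```r
library(data.tree)

# Hypothetical toy data: one row per leaf, with the path in a pathString column
df <- data.frame(
  pathString = c("company/IT/Outsource",
                 "company/IT/Switch to R",
                 "company/Research/New Product"),
  cost = c(400000, 50000, 750000),
  stringsAsFactors = FALSE
)

# Table -> tree: as.Node splits pathString on "/" to build the hierarchy
tree <- as.Node(df)
print(tree, "cost")

# Aggregate child values: sum the leaf costs up to the root
total <- Aggregate(tree, "cost", sum)

# Tree -> table: one row per leaf again
leaves <- ToDataFrameTable(tree, "pathString", "cost")
```

The pathString column is what carries the hierarchy here; any other column simply becomes a field on the corresponding node.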

Example use cases for data.tree

Read the introduction or the example applications vignette for more information. The latter includes the following examples:

  1. TreeMap: Imagine you want to plot a treemap of the world population, by continent. How can we make sure that only the 7 largest countries are shown per continent, while the rest is grouped in a catch-all country?
  2. Portfolio Breakdown: We break down investment portfolio positions into asset classes and sub asset classes. We calculate duration and exposure per asset class / sub asset class.
  3. ID3: Implement this early machine-learning algorithm in a few lines of code.
  4. Jenny Lind: Calculate and plot a decision tree, YAML conversion, and more
  5. Bubble Chart: Convert a complex JSON to a simple data.frame and plot Mike Bostock’s famous bubble example
  6. File Explorer: Use the Listviewer htmlwidget to build an expandable tree widget that lets you browse through the files in your file system
  7. Gene Defect: Simulate a multi-generation population inheriting a gene defect according to probabilistic rules, and estimate the probability distribution of the defect in the n-th generation
  8. Tic-Tac-Toe: Do a brute-force solution of this well-known 3×3 game

I will post some of these examples to this blog in the next few weeks.

What’s new?

Since the last CRAN release, tons of features have been added. Some of the highlights are:

  • you can now climb trees directly, e.g. acme$IT$R, and you can climb by any field, e.g. acme$Climb(position = c(1, 3))
  • you can not only assign values to a node, but also functions. This is an extremely powerful tool, sort of a hierarchic spreadsheet.
  • lots of conversions: ape/phylo, igraph, Newick, dendrogram, network, JSON, YAML, table, etc.
  • many handy methods: better Aggregate, Cumulate, Prune
  • added explicit traversal: do t <- Traverse(root) once, and then re-use the traversal t on many Get, Set, and Do calls
  • added support for default formatters
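To make climbing and explicit traversal a bit more concrete, here is a small sketch on a hand-built tree. The node names and the cost and visited fields are hypothetical, and the example assumes the functions behave as described in the list above.

```r
library(data.tree)

# Build a tiny tree by hand (names and values are made up)
root <- Node$new("acme")
it <- root$AddChild("IT")
outsource <- it$AddChild("Outsource")
outsource$cost <- 400000
agile <- it$AddChild("Go agile")
agile$cost <- 250000

# Climb directly by name ...
root$IT$Outsource$cost

# ... or by any field, here the position among siblings at each level
second <- root$Climb(position = c(1, 2))
second$name

# Traverse once, then re-use the traversal for several calls
t <- Traverse(root, traversal = "post-order")
levels <- Get(t, "level")                       # collect a field from each node
Do(t, function(node) node$visited <- TRUE)      # run a function on each node
```

Re-using t avoids walking the tree again for every Get, Set, or Do call, which is presumably most useful on large trees.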

Besides that, the main focus was on cleaning up the interface, and improving the documentation. On the downside, there are a few breaking changes, namely:

  • level is now 1-based (the root will have level = 1)
  • depth is now called height
  • in Get and similar functions: filterFun is now called pruneFun, and there is a new argument filterFun
  • Find is now called Climb

For details, please see the NEWS file.

Unfortunately, not all planned features made it into the release. But rest assured that they are still in the pipeline. For example, I’d like to add a conversion from and to party class from partykit.

If you are interested in my work, or if you have questions, suggestions, or additional interesting examples, please drop me a line. Or even better, just create a pull request on GitHub. There, you can also watch or star data.tree, e.g. if you want to be notified about changes in the package, or just to show how much of a fan you are.

A Short Note on the Featured Image

Obviously, I was looking for an elder tree, but when I came across this Tim Allen look-alike from a construction company, I just couldn’t resist. My company’s name is ipub, and people often ask me what it’s got to do with beer. Or Apple, for that matter. Cider? Not really. Nothing, really. But still, I couldn’t resist!

 
