An unexpected gotcha in ZooKeeper watches

March 9, 2011

I’ve been working with ZooKeeper lately. Specifically, we’re trying to use it to manage cluster topology (this work is related to my previous post on Berkeley DB Java Edition). In the ZooKeeper Java bindings, the watch functionality looks like a beautiful thing. It is, actually… however, there is a gotcha.

ZooKeeper will let you watch any znode for one of four events: created, deleted, data changed and children changed. These four events encompass everything you would ever want to know about a znode. These events aren’t where the gotcha lies. The gotcha lies in when they are fired.

The created and deleted events are fired (predictably) whenever a watched znode is created or deleted. Data changed is fired whenever a watched znode’s data is changed. Children changed is fired whenever a child is added to or removed from a watched znode. (In the Java bindings these are the NodeCreated, NodeDeleted, NodeDataChanged and NodeChildrenChanged event types.) This seems like it would meet all your needs, unless…

Let’s say you have a znode named “/members” to represent the members of a cluster. Each member creates an ephemeral node “/members/member-N”. It would be useful for clients wishing to communicate with the cluster to be alerted anytime a member is added to or removed from /members. So you subscribe a watcher to /members. When /members/member-1 joins the group, your code waits for the created event… which never comes. Likewise, when /members/member-1 leaves the group, your code waits for the deleted event… which never comes. ZooKeeper doesn’t roll like that. Instead, ZooKeeper fires only the children changed event on /members.

This seems all well and fine: you’ve been notified that a change has occurred. Unfortunately, you don’t know whether it was a creation or a deletion that triggered the children changed event, or which child was involved. If you’re using an internal data structure to represent the current state of the cluster, you don’t know which path needs to be added to or removed from it. Instead, you must fetch the entire list of children for /members and update your internal state (either rewriting it entirely or doing a diff and update).
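Here is a minimal sketch of the “diff and update” option. It is only an illustration, not code from our project: `MembershipDiff`, `between`, `joined` and `left` are hypothetical names, and the sketch uses plain Java collections so it runs without a ZooKeeper server. In real code, the `current` list would presumably come from calling `getChildren("/members", watcher)` on the ZooKeeper client each time the children changed event fires. That call returns child names relative to the parent (for example “member-1”) and also re-registers the watch, which matters because ZooKeeper watches are one-time triggers.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical helper (not part of ZooKeeper): compares the children we
 * already know about with a freshly fetched list of children.
 */
final class MembershipDiff {
    final Set<String> joined; // present now, absent from the known state
    final Set<String> left;   // present in the known state, absent now

    private MembershipDiff(Set<String> joined, Set<String> left) {
        this.joined = joined;
        this.left = left;
    }

    static MembershipDiff between(Collection<String> known, Collection<String> current) {
        Set<String> joined = new TreeSet<>(current);
        joined.removeAll(known);
        Set<String> left = new TreeSet<>(known);
        left.removeAll(current);
        return new MembershipDiff(joined, left);
    }

    public static void main(String[] args) {
        // Internal state before the children changed event.
        Set<String> known = new TreeSet<>(Arrays.asList("member-1", "member-2"));
        // Stand-in for the list returned by getChildren("/members", watcher).
        List<String> current = Arrays.asList("member-2", "member-3");

        MembershipDiff diff = MembershipDiff.between(known, current);
        System.out.println("joined: " + diff.joined); // joined: [member-3]
        System.out.println("left: " + diff.left);     // left: [member-1]

        // Apply the diff so the internal state matches the cluster again.
        known.removeAll(diff.left);
        known.addAll(diff.joined);
        System.out.println("known: " + known);        // known: [member-2, member-3]
    }
}
```

Because a single children changed event may stand in for several creations and deletions that happened before you re-read the list, diffing against the full list is safer than trying to guess a single change per event.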

Anyway, this isn’t earth-shattering, and it’s not a problem that can’t be solved. But one day I spun my wheels for a bit wondering why I wasn’t seeing any creation or deletion events for children of a watched znode, and I didn’t immediately know why. So I figured I’d share this and (hopefully) save someone some time.

For more information on ZooKeeper watches, read Animesh Kumar’s very nice writeup.

Categories: Development
  1. srikanth
    March 13, 2011 at 7:10 pm | #1

    Thanks. I was planning to use zookeeper for something similar. Your post has saved me some confusions I might have otherwise gone through.
