# A more compact ERC mode line

I use ERC to read IRC in Emacs (as an aside, I recently had to explain to the intern on my team what IRC is). ERC comes with a nice feature called Channel Tracking. What this does is notice when there's activity in IRC channels which are not visible and update your mode line to give you a visual indication that there's new traffic waiting to be read.

Of course, just adding a list of channel names would quickly take up the entire mode line, so there is a need to produce short but meaningful abbreviations of some sort. The variable `erc-track-shorten-function`

is expected to contain a function that ERC will use for this purpose. The function takes a list of strings naming the active channels & returns the abbreviations to be used in the mode line.

By default, this is `erc-track-shorten-names`

, it is attributed to Alex Schroeder, and it works by looking for the shortest prefix of each active channel still unique among all channels. So, for instance, if my active channels are `'("#emacs" "#indieweb" "#indieweb-dev")`

, `erc-track-shorten-names`

returns `'("#e" "#indieweb" "#indieweb-")`

(#emacs is the only channel I have joined whose name begins with e).

That is certainly better, but it bothered me that "#indeweb" & "#indieweb-" shared so much text, even though "#in" would be enough to distinguish both of them among my channels. I sat down yesterday evening to see what I could do.

I first thought to post-process the output of `erc-track-shorten-names`

, looking for common prefixes. I did this by inserting the active channels into a trie, and looking for nodes that both had a value and had children. While playing around with that, I realized there was a much simpler way:

Let `ALL-CHANNELS`

be all my IRC channels, and `ACTIVE-CHANNELS`

those that are active.

1. insert ALL-CHANNELS into a trie T 2. for each CHAN in ACTIVE-CHANNELS DO 2.1 let AUX be the empty string 2.2. search T for CHAN; at each step, let I be the number of children for the current node; A. if I > 1, the current character is necessary to distinguish CHAN among all others; append the current character to AUX B. if the current node has a value, and that value isn't equal to CHAN, then this node is a prefix of CHAN; again append the current character to AUX in order to distinguish CHAN at the completion of the search, AUX will contain a minimal substring of CHAN that is still sufficient to distinguish it from all other channels

Let's look at a quick example: suppose I'm in the channels "#i3", "#indieweb", "#indieweb-dev", "#scheme" and "irc.freednode.net". This is a very crude sketch of the resulting trie; I'm indicating nodes with a single child by just putting them adjacent to one another, so for instance "n d" means the node corresponding to "n" has a single child: that corresponding to "d".

<root> --> # -> i -> 3 ("#i3") | | | | | +--> n d i e w e b ("#indieweb") | | | | | +-> - d e v ("#indieweb-dev") | | | +--> s c h e m e ("#scheme") | +--> i r c . f r e e n o d e . n e t ("irc.freenode.net")

The root node has two children: "#" and "i". "#" has two children "i", and "s". "#i" has a two children "3" and "n", and "#i3" has no children along with a node value of "#i3".

Applying the above algorithm to "#indieweb-dev" means we're going to walk through this trie down to the node "#indieweb-dev". `AUX`

starts as "", but at the nodes <root>, "#", & "i" we see multiple children, and so we append the current character at each step, giving "#in" by the time we make it onto that long run of single children. We append nothing more to `AUX`

until we reach the node "#indieweb" where the second rule applies: we've found a channel name that's a prefix of the active channel for which we are searching. That means we append the current character to `AUX`

to get "#in-". The rest of the nodes all have one child until we complete our search.

This algorithm yields the abbreviations `'("#i3" "#in", "#in-", "#s" and "i")`

– much tighter (thirteen characters in total, as opposed to twenty-five).

The implementation was pretty straightforward. My trie implementation is just a struct with an optional value & children. I use an association list for the children:

(cl-defstruct trie-node value children) (defun sp1ff/trie-create () (make-trie-node :value nil :children nil))

I'm not a good enough LISP programmer to assess the efficiency of this implementation. Maybe a hash table would be faster? Also, tries are notorious for using space inefficiently when naively implemented, and this implementation is truly naive. See here for what's involved in coding up a space-efficient implementation. Given my application, I suspect I'll be fine.

String insertion is the obvious algorithm; iterate over the string to be inserted, descending into the trie for each character. At the end, set the node's value to the string:

(defun sp1ff/trie-insert (node text) "Insert TEXT into the trie rooted at NODE." (cl-loop for c in (append text nil) do (let* ((child (assq c (trie-node-children node)))) (if child (setq node (cdr child)) (let ((new-node (make-trie-node :value nil :children nil))) (setf (trie-node-children node) (append (list (cons c new-node)) (trie-node-children node))) (setq node new-node))))) (setf (trie-node-value node) text))

Traversal is similarly simple; walk the tree as far as you can based on the search string, return what you've got.

(defun sp1ff/trie-find (node text &optional f) "Search the trie rooted at NODE for TEXT; if given invoke F at each step." (cl-loop for c in (append text nil) do (let* ((val (trie-node-value node)) (children (trie-node-children node)) (child (and children (assq c children)))) ;; NB `child' is the node corresponding to `c' (if (functionp f) (funcall f c val children)) (if child (setq node (cdr child)) (cl-return val)))) (if node (trie-node-value node) nil))

With those building blocks, I can now implement by own shortening function:

(defun sp1ff/erc/shorten-names (channel-names) "Given a list of strings CHANNEL-NAMES, produce a corresponding set of sub-strings, each sufficient to distinguish it's corresponding channel among all channels." (let ((trie (sp1ff/trie-create))) (mapcar (lambda (x) (sp1ff/trie-insert trie x)) (erc-all-buffer-names)) (mapcar (lambda (chan) (let ((aux "")) (sp1ff/trie-find trie chan (lambda (c val children) (let ((i (length children))) (if (or (> i 1) (and val (not (string= val chan)))) (setq aux (concat aux (string c))))))) aux)) channel-names)) (setq erc-track-shorten-function 'sp1ff/erc/shorten-names)

The result:

07/24/20 14:27