All trees with at most 30 vertices have a harmonious labelling

In order to attract more discussion, exceptionally I will write a post in English instead of in Chinese.

What is harmonious labelling?

Let $T=(V,E)$ be a tree and $n=Card(E)$ its number of edges. A harmonious labelling for the tree $T$ is a surjection

$f: V \rightarrow \{0, 1, \ldots, n-1 \}$

such that the induced edge labelling function

$g: E \rightarrow \{0, 1, \ldots, n-1 \}, \{ v_1, v_2 \} \mapsto f(v_1)+f(v_2) \mod n$

be a bijection.

Note that for a tree, $Card(V)=Card(E)+1$, therefore in a harmonious labelling $f$ there are exactly a pair of vertices sharing the same value. Furthermore, we can see that if $f$ is a harmonious labelling for a tree $T$, then $f+k \mod n$ for arbitrary k is also a harmonious labelling for $T$. By this symmetry we can set the replicated value in $f$ to 0. Such a labelling $f$ falls also in the definition of felicitous labelling (modulo n).

A tree allowing at least a harmonious labelling is called a harmonious tree. It is conjectured by Graham and Sloane that every tree is harmonious. Aldred and McKay performed a computer search to verify that every tree with at most 26 vertices is harmonious. There are also many theoretical approach towards the conjecture about harmonious tree. I suggest Gallian’s survey on graph labelling as a great source of information.

Recently I have worked out an algorithm to search for harmonious labelling given a certain tree. With this homebrew algorithm, using an exhaustive computer search, I’ve verified that every tree with at most 30 vertices has at least a harmonious labelling.

Description of algorithm

The algorithm is not a neat one, basically I glued up a few searching techniques as a whole, using one after another.

As usual, we can treat the problem of finding a harmonious labelling as a problem of constraint satisfaction. Constraints here are mostly global ones, and I cannot see a very good way to propagate them among vertices other than doing a backtracking search. In fact this is what I’ve done for graceful labelling.

The problem of global constraints is that they are weak for consistency method. When few values are determined, they can barely reduce any domain. But what if we try to do a partial backtracking search, then a constraint propagation with the reduced domains and constraints? At this time, for a part of vertices the value is determined, and for the rest their domains are reduced greatly. For the global constraints, they are still global, but stronger in the sense of power to reduce domains. Therefore, we are more likely to be able to propagate constraints efficiently in this situation.

As I am not an expert in constraint programming, the paragraph above may not be very accurate for someone doing constraint programming. But at least I think the idea is there. Please correct me if I’ve said something stupid.

Anyway, the actual strategy employed is as following: we first do a random partial backtracking search to find an assignment compatible to constraints only for internal nodes, then using this assignment to calculate domains and constraints for leaves of the tree and perform consistency method to get an answer. The random partial backtracking is speedy as we backtracks seldom due to relaxed constraint, and the consistency method afterwards is also speedy due to smaller domains and stronger constraints. In a whole, they go really fast.

Evidently this is nothing near a systematic exploration of the solution space, it is more like sampling, and it has absolutely no guarantee on finding a solution. Hence I run it several times to increase the probability of resolution.

After running this two-step constraint stuff, I run a full backtracking search with a bound on number of backtracks if the previous step didn’t succeed in resolution. If the full backtracking didn’t work neither, a hill-climbing tabu search is performed to wrap the whole thing up. These two techniques are recycled from the similar verification I’ve done for graceful labelling.

With this hybrid algorithm, it is easy to verify that every tree with at most 30 vertices is harmonious. Just enumerate every free tree, then find a harmonious labelling for each one of them.

Result

With this approach I am able to verify that every tree with at most 30 vertices is harmonious.

The verification of trees with 29 vertices took about 270 CPU hours, for the case of 30 vertices it took about 855 CPU hours. The CPU time measured here is all based on a single core of a C2D T7200 @2.0G. This seems to disprove my claim that the time taken to verify for all trees with n vertices is about $O(3.75^n)$. And now I am not sure what the correct base should be. Judging from the running time for case of 28, 29 and 30 vertices, it should be something like 3.17, but I need more evident to conclude. Maybe I should run this algorithm on randomly sampled free trees of different sizes to collect empirical runtime data.

Discussion

For searching harmonious labelling for trees, we have introduced a new technique which I would like to call it “two-step search” for convenience. In this part of the post I would like to talk about this technique.

First of all, I don’t know if this technique exists already. It just came up to my mind one day. I haven’t found any literature on similar technique. After all, I am not really familiar with constraint programming and I have only read a book and a few papers in this domain, plus half of an introduction course. It is really probable that I’ve reinvented the wheel again. If anyone knows something similar to this idea, please give me a pointer so that I can try to understand it better.

So the rest will be my personal thoughts. They may be wrong, so if anyone spots an error, please point it out.

This two-step search can be viewed as an extension of both backtracking and consistency method. If we see it as backtracking, the consistency method can be viewed as a tool to quickly check for conflicts. If we see it as consistency method, the backtracking part can be viewed as a tool to temporary introduce additional constraints for a better and quicker consistency checking and constraint propagation. Backtracking is cool for global constraints, consistency method is systematic and complete. If we can find a good middle point, it may be possible that resulting combination be good for solving problems with a lot of global constraints.

This two-step search can be easily turned into a systematic search, we just need to do the backtracking in a systematic way. But in this case, there may be a lot of redundant calculation in the consistency part, as the potential partial solutions provided by partial backtracking may only differ by a few values. Therefore I prefer to use this two-step search in a randomized, sampling fashion. It will work well if there are many solutions and they are dispersed in the searching space. I think this is the case for harmonious labelling.

The choice of middle point seems to be important too, as least for me. For harmonious trees, I’ve chosen to do backtracking on internal nodes and leave leaves to consistency method. This is majorly for cutting up dependence between remaining variables. There are two reasons to do this: I don’t want to write complicated consistency method, and the consistency method may run faster with fewer dependence. In this way, I am able to write a naïve constraint propagation routine for the consistency part. This choice, of course, sacrifices some performance of the backtracking part, because there may be more uncertainty in this part.

I don’t know if this technique can be applied to other problems. For graceful trees, because the ordinary backtracking is doing so well that I am not motivated to use two-step search in it. Maybe I will find another suitable problem.