rKanren - Guided Search in miniKanren, Part I

Reading time ~16 minutes

In a previous post, I mentioned a rough idea to allow users to perform guided search in miniKanren. As part of my qualifying examination with Dan Friedman, I rewrote the deep structural segments of miniKanren to allow users to do exactly that! (The code will be posted in a public repository when the paper is published or rejected.)

This post will discuss the basic usage of this new search strategy through the condr form. This form uses a revised implementation to associate ranks with individual search paths, allowing us to guide the answer search.

Generic Answer Order

To help describe the implementation and how it “ticks”, we’ll start with some simple examples to describe the new search strategy. First, consider the following miniKanren programs using conde and their associated output:

> (run* (q)
    (conde
      ((== q #t)) 
      ((== q #f))))

(#t #f)

> (run* (q)
    (conde   
      ((== q #f)) 
      ((== q #t))))

(#f #t)

Using conde causes the goals listed to be explored in a first-in, first-out style, exploring the first goal of the first clause, then the first goal of the second clause, and so on.

Depending on the order of clauses to change the answer order, however, is less than ideal: programmers must carefully modify programs to change the order of the answer stream, and the result may be unexpected. One alternative is to provide a new form, condr, which associates a numeric value with each clause in order to guide the search:

> (run* (q)
    (condr
      (2 (== q #f))
      (1 (== q #t))))

(#t #f)

Complex Searches

While the searches are associated with a rank, there are a number of other considerations when designing guided searches. The most important is that more goals mean more work, and the work done during a computation contributes to the cost, and thus the computation’s rank. Thus if a search path is ranked lower but does significantly more work, the cost increase may cause the answer to occur later.

> (run* (q)
    (fresh (a b)
      (condr
        (2 (== q #f))
        (1 (== q `(,a ,b))
           (== a #t)
           (== b #t)))))

(#f (#t #t))

This additional computation cost may be dealt with by increasing the rank of other branches: each goal contributes one more point of work, so we must increase the first goal to rank 4 in order to produce the expected behavior once again.

> (run* (q)
     (fresh (a b)
       (== q `(,a ,b))
       (conde
         ((condr
            (2 (== a 'a) (== b 'b))
            (1 (== a 'b) (== b 'a))))
         ((condr
            (2 (== a 'a) (== b 'a))
            (1 (== a 'b) (== b 'b)))))))

((b a) (b b) (a b) (a a))

Recursive Weights

The original motivation for developing condr was to find deeper answers more quickly. Large search trees may be easily generated using recursive functions, but previous miniKanren implementations have been unable to return answers found recursively earlier than ground answers listed in the function’s body.

Ground answers are answers found at the end of search branches, and grounding out indicates arriving at the end of a search path and finding an answer. Consider the following function, written with conde, and associated call. It has two recursive options and a single clause (the first one) that grounds the search path.

(define recur-e
  (lambda (e)
    (fresh (a b)
      (conde
        ((== e '(x)))
        ((== e `(b . ,a)) (recur-e a))
        ((== e `(a . ,b)) (recur-e b))))))

> (run 5 (q) (recur-e q))
((x) (b x) (a x) (b b x) (a b x))

The conde clause that grounds out first—in this case, the first clause—is the first answer, and each subsequent answer is produced in order of the conde clauses. If we would like these answers in a different order, we may use condr to rank our desired answer order.

(define recur-r
  (lambda (e)
    (fresh (a b)
      (condr
        (10 (== e '(x)))
        (4  (== e `(b . ,a)) (recur-r a))
        (2  (== e `(a . ,b)) (recur-r b))))))

> (run 5 (q) (recur-r q))
((x) (a x) (b x) (a a x) (b a x))

This result may be somewhat unexpected: though the first clause is ranked significantly higher, and thus should occur later than the other clauses, it is still produced as the first answer. Luckily, the intuition here is straight-forward: the other two clauses must eventually ground out in a via the first clause, and ranks are cumulative. As a result, the answer (x) has rank 10 while the answer (a x) has rank 12 and the answer (b x) has rank 14.

If we would like complex answers earlier, we must change the cost of grounding during later recursive calls. This is a relatively simple fix: each rank is a full Scheme expression that evaluates to a natural number, so we may use an extra parameter to keep track of our recursive depth and use the information to change the grounding cost:

(define recur-r-n
  (lambda (e n)
    (fresh (a b)
      (condr
        ((if (< n 1) 10 1)
         (== e '(x)))
        (4 (== e `(b . ,a))
           (recur-r-n a (add1 n)))
        (2 (== e `(a . ,b))
           (recur-r-n b (add1 n)))))))

> (run 5 (q) (recur-r-n q 0))
((a x) (b x) (a a x) (x) (a b x))

Our recursion tracks the depth, and after the first two steps the grounding cost is reduced from 10 to 1, causing the answers (a x), (b x), and (a a x) each to have a total cost lower than 10. This technique allows users to explore deeper branches of the search space before considering the shallow answers.

Summary

The behavior of condr is modeled after the A* search technique, where the traditional heuristic approach has been discarded in favor of simple Scheme expressions that evaluate to numeric values. (This may lead to inadmissible heuristics—heuristics that may overestimate the cost to reach an answer—and programmers should take care to fine-tune the numeric values used.) Consider revising recur-p by ranking the first clause with one hundred: 100 recursive steps would need to be taken before any ground answers were produced, and yet the answer order would be identical to the one presented.

More importantly, the condr form still performs a complete search: unlike the conda and condu forms that provide users some control over search paths but ultimately discards certain answers, condr will eventually find every answer in finite search spaces. Furthermore, condr is strictly more expressive than conde: conde’s behavior may be reproduced by replacing each rank in condr with the number 0.

Using LaTeX for Programming Language Semantics

LaTex is a fantastic tool for typesetting, but there seem to be a serious gapin documentation for using it to lay out programming semanti...… Continue reading

A Small ALU in Haskell, Part I

Published on April 24, 2015

The Refined Gradual Guarantee and Compilation

Published on April 15, 2015