We worked on the Roman Numeral kata, which is just converting between Roman numerals and decimal numbers. We first tackled converting decimal to Roman. Our first attempt looped over the (diminishing) decimal value, checking a set of conditions to determine what to pull out of the decimal and emit as the next largest Roman numeral.

This approach worked, but no one felt good about the big set of cases, so we refactored it to be based on a data table. The table-driven version loops instead over the set of Roman numeral values from big to small, repeatedly tearing off the appropriate hunks.

    (def numerals {1 "I", 4 "IV", 5 "V", 9 "IX", 10 "X", 40 "XL", 50 "L", 90 "XC",
                   100 "C", 400 "CD", 500 "D", 900 "CM", 1000 "M"})

    (defn to-roman [decimal]
      (loop [[num & r :as nums] (reverse (sort (keys numerals)))
             roman ""
             d decimal]
        (if num
          (if (>= d num)
            (recur nums (str roman (get numerals num)) (- d num))
            (recur r roman d))
          roman)))

In the main loop we have three variables:

- `nums` – The sequence of Roman numeral values still to try. It is destructured into the first value `num` and a sequence `r` of the rest of them, while the whole sequence is preserved as `nums`. It is initialized from our data table by sorting the keys in descending order (1000, 900, …).
- `roman` – The accumulator for the result, initialized to the empty string.
- `d` – The remaining decimal value to be converted, initialized to the input.

In each step of the loop, we have three cases:

- We still have numerals to try and d has enough left in it to cover the current numeral – in this case, subtract the value of the numeral from d and append the Roman numeral to the accumulator.
- We still have numerals to try and d does NOT have enough left in it to apply to the current numeral – in this case, drop the current numeral and try the next one.
- We have no numerals left to try – just return the accumulated value.
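To make the three cases concrete, here is a small tracing sketch. The helper name `to-roman-steps` is hypothetical (it was not part of our session); it runs the same loop as to-roman but records each numeral emitted together with the decimal value left over. The table is restated only so the block runs on its own.

```clojure
;; Same data table as in the post, restated so this block is self-contained.
(def numerals {1 "I", 4 "IV", 5 "V", 9 "IX", 10 "X", 40 "XL", 50 "L", 90 "XC",
               100 "C", 400 "CD", 500 "D", 900 "CM", 1000 "M"})

;; Hypothetical helper: same loop shape as to-roman, but it collects
;; [emitted-numeral remaining-decimal] pairs instead of building a string.
(defn to-roman-steps [decimal]
  (loop [[num & r :as nums] (reverse (sort (keys numerals)))
         steps []
         d decimal]
    (cond
      ;; no numerals left to try: return what we collected
      (nil? num) steps
      ;; d covers the current numeral: emit it, subtract, retry same numeral
      (>= d num) (recur nums
                        (conj steps [(get numerals num) (- d num)])
                        (- d num))
      ;; d is too small: drop the current numeral and try the next one
      :else (recur r steps d))))

(to-roman-steps 1994)
;; => [["M" 994] ["CM" 94] ["XC" 4] ["IV" 0]]
```

Each pair is one pass through the first case; the skipped numerals (500, 400, 100 and so on) are passes through the second case and leave no trace.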

One key insight we had during this change was to treat the two-letter combinations (like IX) as numeral values in their own right instead of special-casing them. We considered writing some code to generate the numerals table from a smaller starting table. That still appeals to me, but it certainly would have been more effort than it was worth.
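For the curious, generating the table might look something like the sketch below. We did not write this during the session; the names `base-numerals`, `subtractive-pairs` and `generated-numerals` are made up for illustration, and it assumes the only two-letter combinations wanted are a power of ten subtracted from the next 5x or 10x value.

```clojure
;; Hypothetical smaller starting table: single-letter numerals only.
(def base-numerals {1 "I", 5 "V", 10 "X", 50 "L", 100 "C", 500 "D", 1000 "M"})

;; Build the subtractive entries (IV, IX, XL, XC, CD, CM) by pairing each
;; power of ten with the values 5x and 10x larger, when those are in the table.
(defn subtractive-pairs [base]
  (into {}
        (for [p [1 10 100]
              mult [5 10]
              :let [big (* p mult)]
              :when (and (contains? base p) (contains? base big))]
          [(- big p) (str (base p) (base big))])))

(def generated-numerals
  (merge base-numerals (subtractive-pairs base-numerals)))
```

The result has the same thirteen entries as the hand-written table, which is a fair argument that typing them in was the right call.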

At this point we did some more testing to satisfy ourselves that it was working. I did not preserve the tests but we just applied to-roman to a range of inputs and eyeballed them, then did some spot checks for larger ones (1492, 1977, etc).

Next we tackled the from-roman as the inverse of to-roman, pulling off the first character (or two!) from the input roman numeral and adding its value to an accumulated decimal. We knew we needed a data table for this one as well so we worked in the repl to munge our prior table till we had something that looked useful, then encoded the process of producing it as a digits table. We ended up reworking that on our second pass.

I did not preserve our first attempt at the from-roman code unfortunately. It worked but had a lot of brute force conditions in it. The digits table was unnecessarily complicated. This is our second pass at cleaning it up:

    (def digits (apply hash-map (mapcat reverse numerals)))

    (defn from-roman [roman]
      (loop [[c1 c2 & r] roman
             d 0]
        (if-let [num (and c2 (get digits (str c1 c2)))]
          (recur r (+ d num))
          (if-let [num (get digits (str c1))]
            (recur (apply str c2 r) (+ d num))
            d))))

We loop through the roman numeral and pull off the first char, second char, and the rest of the chars, accumulating our final answer into d.

For each pass of the loop, if there are at least two characters left in the Roman numeral, we look up that two-char sequence in the digits table. If we find it (ex: IX), we recur with the rest of the chars and the updated accumulated value. If we don’t, we try again with just the first character (ex: X), pushing the second character back onto the rest. This also neatly handles the no-chars case: `(str c1)` of a nil `c1` is the empty string, which is not in the table, so the lookup returns nil and we return the accumulated value (all other lookups should succeed).
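If the destructuring is hard to picture, the following sketch shows what the loop binding sees for a few inputs. The helper `split-head` is hypothetical and exists only to expose the three pieces.

```clojure
;; Hypothetical helper: expose what [c1 c2 & r] binds for a given string.
(defn split-head [s]
  (let [[c1 c2 & r] s]
    [c1 c2 r]))

(split-head "XIV")   ; => [\X \I (\V)]   two chars plus a seq of the rest
(split-head "X")     ; => [\X nil nil]   only one char left
(split-head "")      ; => [nil nil nil]  nothing left

;; The no-chars case: str of nil is the empty string, which is not a key.
(str nil)            ; => ""
(get {"X" 10} "")    ; => nil
```

Note that `r` is a sequence of characters rather than a string, which matters when the second character has to be pushed back onto it.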

We had a bug in our first refactoring to this form in recombining c2 back into r (weren’t using apply). Since we had to-roman and from-roman, we were testing by round-tripping over a range of values and checking for equality. We were seeing about 20% failures over the range 1:10000. We found an example value that failed and added a println to the top of the loop to dump the values which quickly highlighted the problem.
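I no longer have the broken version, so the following is only a guess at its shape, with hypothetical names `rejoin` and `rejoin-buggy`. It shows why calling str directly on the leftover character and the rest-sequence goes wrong, and why it only fails some of the time.

```clojure
;; What the working code does: spread the rest-seq into str's arguments.
(defn rejoin [c2 r]
  (apply str c2 r))

;; A plausible form of the bug (assumption): str receives the seq as a single
;; argument and prints it as a whole, parentheses and all.
(defn rejoin-buggy [c2 r]
  (str c2 r))

(rejoin \I (seq "II"))        ; => "III"
(rejoin-buggy \I (seq "II"))  ; not "III"
(rejoin-buggy \I nil)         ; => "I", so short inputs still passed
```

Because the buggy form behaves whenever the rest is nil, only numerals that hit the single-character branch with more input still pending would fail, which fits a partial failure rate.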

I cleaned up the repl checks we were doing into test-round-trip:

    ;; deftest and is come from clojure.test
    (require '[clojure.test :refer [deftest is]])

    (deftest test-round-trip
      (let [nums (range 1 10000)
            round-trip (map #(from-roman (to-roman %)) nums)]
        (is (= nums round-trip))))
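A single equality check only says that something failed. When hunting for a failing example, a helper along these lines would list the offending inputs instead. This is a sketch, not something we wrote at the time; `round-trip-failures` is a hypothetical name, and it takes the two conversion functions as arguments so it does not depend on the definitions above.

```clojure
;; Hypothetical helper: return the inputs that do not survive
;; encode followed by decode.
(defn round-trip-failures [encode decode inputs]
  (remove #(= % (decode (encode %))) inputs))

;; Stand-in codec for illustration only: numbers to strings and back.
(round-trip-failures str #(Long/parseLong %) (range 1 6))
;; => ()
```

With the kata functions loaded, the call would be `(round-trip-failures to-roman from-roman (range 1 10000))`, and taking the first element gives a value to feed to the println-instrumented loop.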

That was a fun kata! I wasn’t quite sure how it would work as I’ve never done it before but it wasn’t as bad as I feared.

Brief description:

This informal talk will describe an approach for running queries as Hadoop Map/Reduce jobs from Clojure. The talk will cover:

- representing queries as s-expressions in Clojure
- brief introduction to the Cascading library
- compiling s-expressions into Cascading flows
- storing semantic data in Hadoop

The Clojure lunch club meets on the 4th Thursday of the month at 11:30 am in the Revelytix offices. The office is located at 680 Craig Rd, Suite 106, Creve Coeur 63141. Lunch is provided!

It’s a great overview of some of the interesting things done at Revelytix in building a stream-processing query engine and how Clojure has enabled that work. Topics include: how a query engine works, buffering to disk with memory-mapped files (some Java), representing data as s-expressions, and some real world Clojure use.

Meeting: Thursday, Oct. 20th, 2011 at 11:30 am

Location: 1099 Milwaukee, Kirkwood, MO 63122 – Room 50 in basement

Pizza will be provided!

Nate Young will be talking about some code he’s been writing to implement regex within Clojure. [No snarky comments about Clojure already supporting regex are required.]

Hope to see you there!

Two topics for discussion this month:

- Jeff Sigmon – a small app that scrapes a website with enlive and outputs text to speech using the MARY TTS Java library
- Dave McNeil – will show some interesting design aspects of an RDF library that starts from core data structures and decorates abstractions over the top to add features

Hope to see you there!

Last month we did a brief intro to Clojure and talked about various things to do at our meetings. One of the first things we plan to do starting this month is work through a Clojure problem as a group, discovering how to work from problem statement to tests and code.

Hope you can join us!

In last week’s cljub session, a question was posed as to how to start thinking in terms of functional programming, coming from an object-oriented programming background. This question, and the topic of “FP uptake”, is of interest to me, not only because I’m trying to figure this out myself, but also because our company has gone down the path of using FP for our core product development (and most developers are still coming from an OO background).

At the time, I was in the middle of reading John Hughes’ paper from 1984, Why Functional Programming Matters, and thought of suggesting that as one (of many) ways to “grok FP”. But the paper has some funny syntax in it (relative to my 1 and only FP data point, clojure). So I decided to try to implement some of the examples in the paper using clojure. Below (couldn’t figure out how to attach) is a REPL session I did to implement the first example, a Newton-Raphson square root algorithm, from section 4.1. I annotated the session so people new to clojure might see how someone might incrementally work up a function to solve a problem in clojure using the repl.

This implementation uses a few interesting functions and capabilities in clojure: iterate, partial, cons, recur, destructuring in argument lists, variable-length arg lists, passing functions as arguments.

The paper can be found on the web in various forms. One is here: http://www.cse.chalmers.se/~rjmh/Papers/whyfp.pdf. There’s also an interview with Hughes on InfoQ: http://www.infoq.com/interviews/john-hughes-fp. [BTW, in that interview, Hughes says his favorite book is “Learning Functional Programming” by Paul Hudak; that book uses Haskell though].

If someone is new to clojure and wants to learn a lot fast, you may be interested in doing something similar with the examples in sections 4.2 (numerical differentiation) and 4.3 (numerical integration). This was the first time I explicitly used lazy evaluation to actually solve a problem, and was also the first time I did anything with destructuring, so this was well worth my time.

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ; Why Functional Programming Matters, John Hughes, 1984.
    ; Section 4.1, Newton-Raphson Square Roots
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

    ; Iteration function, version 1
    (defn fx [n x]
      (/ (+ x (/ n x)) 2))

    (fx 100 10)
    ; 10
    (fx 100 11)
    ; 221/22
    (fx 100 12)
    ; 61/6

    ; Need to call this function repeatedly, passing in the latest estimate x each time.
    ; The paper uses the function 'repeat', let's start there.
    (doc repeat)
    ; Nope, that just repeats the value or sequence you pass in; no functions involved.
    (doc repeatedly)
    ; Nope, this calls the same no-arg function repeatedly. Not much use for this situation.
    (find-doc "repeat")
    ; Lots of spammage, nothing useful here...

    ; Finally ran across 'iterate' function somewhere on internet while looking for
    ; something totally different...
    (doc iterate)
    ; -------------------------
    ; clojure.core/iterate
    ; ([f x])
    ; Returns a lazy sequence of x, (f x), (f (f x)) etc. f must be free of side-effects

    ; Note that the function passed to iterate takes 1 arg, so 'bind' in the 1st arg to
    ; the fx function using the clojure 'partial' function.
    (iterate (partial fx 100) 15)
    ; Ruh roh, just killed my repl with an infinite sequence! (But it seemed like it worked)
    (take 5 (iterate (partial fx 100) 15))
    ; (15 65/6 1565/156 976565/97656 381469726565/38146972656)

    ; Don't really like the ratios, so convert to float instead - iteration fn, version 2
    (defn fx2 [n x]
      (float (/ (+ x (/ n x)) 2)))

    (take 5 (iterate (partial fx2 100) 15))
    ; (15 10.833333 10.032051 10.0000515 10.0)

    ; First cut at the function to check for convergence. The paper's definition looks
    ; similar to clojure's destructuring, so try that.
    (defn within [[a b remainder]]
      (if (<= (abs (- a b)) 0.0001)
        b
        (recur (cons b remainder))))
    ; java.lang.Exception: Unable to resolve symbol: abs in this context
    (doc repeat)
    ; Nuthin.

    ; I thought clojure had an 'abs', maybe that's incanter. Anyways, it's easy to write one.
    (defn abs [x]
      (if (< x 0) (- 0 x) x))

    (abs 3)
    ; 3
    (abs 0)
    ; 0
    (abs -1)
    ; 1

    ; Now back to our regularly scheduled programming...
    (defn within [[a b remainder]]
      (if (<= (abs (- a b)) 0.0001)
        b
        (recur (cons b remainder))))

    (within [10 11 12 13 14])
    ; java.lang.IllegalArgumentException: Don't know how to create ISeq from: java.lang.Integer
    ; Gaaa. Something's amiss, probably my destructuring attempt.

    ; Make sure I'm calling cons the right way...
    (doc cons)
    ; cons needs a seq for 2nd arg

    ; Digression to check destructuring...
    (defn test-destruct-args [[a b c]]
      (println a (type a) b (type b) c (type c)))

    (test-destruct-args [1 2 3 4 5])
    ; 1 java.lang.Integer 2 java.lang.Integer 3 java.lang.Integer
    ; nil
    ; Last arg is being seen as a scalar, need it to be the 'rest of the sequence'

    (defn test-destruct-args [[a b & c]]
      (println a (type a) b (type b) c (type c)))

    (test-destruct-args [1 2 3 4 5])
    ; 1 java.lang.Integer 2 java.lang.Integer (3 4 5) clojure.lang.PersistentVector$ChunkedSeq
    ; nil

    ; Need to put an '&' before last arg to get the 'rest of the sequence'
    (defn within [[a b & remainder]]
      (if (<= (abs (- a b)) 0.0001)
        b
        (recur (cons b remainder))))

    (within [10 11 12 13 14])
    ; java.lang.NullPointerException
    ; Not handling the 'end of sequence' condition, (- a b) blows up with nils.
    ; [Also, not handling the termination condition for the recur, so we'd go on forever!]

    (defn within [eps [a b & remainder]]
      (if (<= (abs (- a b)) eps)
        b                                    ; if true part
        (if-not (nil? remainder)
          (recur eps (cons b remainder))
          nil)))                             ; else part
    ; Oh, btw, added in eps as an argument above, removed the hard-coded value

    (within 0.0001 '(12 13 14 15 15.1 15.11 15.111 15.1111 15.11111 15.111111))
    ; 15.1111
    (within 0.0001 (iterate (partial fx2 100) 15))
    ; 10.0

    ; Now put together the iteration function fx (or fx2) and the convergence function, within
    (within 0.0001 (iterate (partial fx2 100) 900))
    ; 10.0
    (within 0.0001 (iterate (partial fx2 100) 0))
    ; java.lang.RuntimeException: java.lang.ArithmeticException: Divide by zero
    ; ignore trapping edge cases
    (within 0.0001 (iterate (partial fx2 100) 1))
    ; 10.0
    (within 0.000000001 (iterate (partial fx2 100) 1))
    ; 10.0
    (within 0.000000001 (iterate (partial fx2 877) 1))
    ; 29.614185
    (within 0.000000001 (iterate (partial fx 877) 33))
    ; 725383613681596145326847250447451516047505115519/24494464201290276417946743597698555717522343369
    (float (/ 725383613681596145326847250447451516047505115519 24494464201290276417946743597698555717522343369))
    ; 29.614185
    (float 725383613681596145326847250447451516047505115519/24494464201290276417946743597698555717522343369)
    ; 29.614185

    ; Define the sqrt function now that we have the pieces...
    (defn sqrt [eps a0 n]
      (within eps (iterate (partial fx n) a0)))

    (sqrt 0.0000001 14 100)
    ; big long ratio, yuck

    ; I could redefine sqrt with my other iteration function, fx2. But why not parameterize
    ; the sqrt function with the iteration function?
    (defn sqrt [iter-fn eps a0 n]
      (within eps (iterate (partial iter-fn n) a0)))

    (sqrt fx 0.0000001 14 100)
    ; 15917322219892801768783874/1591732221989280176878387
    (float 15917322219892801768783874/1591732221989280176878387)
    ; 10.0
    (sqrt fx2 0.0000001 14 100)
    ; 10.0

    ; But of course it's gratuitous overkill (and confusing to users) to always specify that
    ; function. So customize for a particular usage (application, enterprise library, etc).
    (def my-sqrt (partial sqrt fx 0.0000001))
    (my-sqrt 14 100)
    ; 15917322219892801768783874/1591732221989280176878387
    (def my-sqrt (partial sqrt fx2 0.0000001))
    (my-sqrt 14 100)
    ; 10.0

    ; So, might as well just go for the gold now and parameterize the convergence fn:
    (defn sqrt [iter-fn conv-fn eps a0 n]
      (conv-fn eps (iterate (partial iter-fn n) a0)))

    (def my-sqrt (partial sqrt fx2 within 0.0000001))
    (my-sqrt 14 100)
    ; 10.0
    (my-sqrt 14 877)
    ; 29.614185

    ; It is easy now to fold in the 'relative' version of the convergence fn from the paper.

    ;;;;;;;;;;;;;;;;
    ; Post-mortem
    ;;;;;;;;;;;;;;;;
    ; Lessons:
    ; - get the pieces working first, then put them together
    ; - don't worry about non-essentials (like the tolerance above), until you get the essentials
    ; - break out and experiment 'in general' when necessary; understand how something works

    ; Why this particular ordering of the parameters in the sqrt argument list?
    ; - iteration & convergence functions 1st, to partial out for particular application
    ; - eps second, to partial out for particular use in a fn or module (for known 'size' #'s)
    ; - initial value third, possibility to partial out for particular function or module
    ; - number to be square-rooted last - always different

    ; If you were really doing what I did above, parameterizing the sqrt function with the
    ; various other functions, the generic sqrt would likely be named something else, like
    ; 'sqrt-gen', and the specific incantations named something more user-friendly, like
    ; 'sqrt' (if only 1 used primarily) or 'sqrt1', 'sqrt2' or 'sqrta', 'sqrtb', etc.
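As a follow-up to the remark about folding in the 'relative' convergence function, here is one way it might look. This is a sketch in the spirit of the paper's relative, not a transcription of it: the names `next-estimate` and `relative` are mine, the test compares the difference against eps scaled by the current estimate, and Math/abs is used so the block does not depend on the hand-rolled abs from the session.

```clojure
;; Iteration step as in the session, with a double literal so results are doubles.
(defn next-estimate [n x]
  (/ (+ x (/ n x)) 2.0))

;; Hypothetical 'relative' convergence fn: same shape as 'within', but the
;; tolerance scales with the size of the current estimate b.
(defn relative [eps [a b & remainder]]
  (cond
    ;; ran out of estimates before converging
    (nil? b) nil
    ;; successive estimates agree to within eps, relative to b
    (<= (Math/abs (double (- a b)))
        (* eps (Math/abs (double b)))) b
    ;; otherwise slide the window forward by one estimate
    :else (recur eps (cons b remainder))))

(relative 1e-9 (iterate (partial next-estimate 877) 1))
```

Since it takes the same arguments as within, it should drop straight into the final form of sqrt above, for example `(sqrt fx2 relative 0.0000001 14 100)`.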

The club will meet at 1099 Milwaukee, Kirkwood, MO 63122 in the Boiler Room, located in the NE corner of the lower floor. Tentatively, the meetings will be on the 4th Thursday of the month starting August 26th at 11:30 am – 12:30 pm.

A google group has been created for discussion of topics and other stuff. The first meeting will tentatively discuss the just released Clojure 1.2 features (records, protocols, etc).

All levels of Clojure experience are welcome! If you’re just getting started, we’re happy to help.
