Summary

This was my fifth Clojure conference. I’ve become accustomed to having my mind blown, being humbled, entertained, charmed, and inspired, and coming away from each conference with a bunch of new tools and tricks to try.

This conj was no exception. Changes this year:

  • Greater diversity. This is a welcome shift, though still far from optimal.
  • Videos from talks went online within hours. This is also a big improvement over some of the previous conferences.
  • I knew and recognized more people.
  • I understood more of the content.
  • I came with colleagues – a very welcome change.

As always I came away with a renewed appreciation for the intellectual depth and the warmth of this community. The first talk by Paul deGrandis I found especially noteworthy. The work he, Tim Baldridge, and others at Cognitect did for Consumer Reports showed the power of the Clojure language, philosophy and ecosystem. They created not just a single Web service or data import mechanism or rich reporting client but, in each case, developed data-driven (declarative) frameworks for creating those things, in less than a month per framework, with only two developers. I’ve been assuming that doing things well with Clojure takes more time initially but pays off in the long run due to the benefits which comes from focusing on simplicity. Now I still believe in the latter claim (benefits of simplicity) but have in hand a vivid counterexample for the former (speed of development). I was also particularly impressed that they spent half their time on design. I think the talk (see links below) is well worth watching and contemplating.

Thanks to my employer OpinionLab for sending me to the conj, and for letting me post these notes on my personal blog! By the way, we’re hiring.

Notes

What follows are my own terse notes (quotes, links, examples, things to try or think about), ordered by talk, in case someone else might find them helpful.

Notes

Paul deGrandis - Data → Services and Apps

Amazing talk.

Video link: https://www.youtube.com/watch?v=BNkYYYyfF48

Slides: https://github.com/ohpauleez/data-driven-slides

Free the data! (From RDBMSs)

Prototype services in hours instead of days/weeks.

Services only do a handful of things. This can be abstracted away (service of services).

“VASE” - stored in Datomic; based on Pedestal

We spent half our time in design.

  • Superpowers - Clojure/ClojureScript/Datomic/core.logic/…
  • Power in constraints and simplicity.
  • Combinations multiply impact.
  • Holistic exponential value combination
  • Be exploratory; think holistically
  • Write it all down.
  • Constrain the design space.
  • Think critically and SLOWLY.
  • Visualize outcomes and possibilities.

Anna Pawlicka - OM for Visualization

Dimple.js - higher level lib on top of D3

http-kit and sente - websockets

sente has core.async as well – “just works”

Colin Fleming - Cursive

Cursive plugin for IntelliJ continues to improve.

Text is the wrong level of abstraction for working w/ our code.

Type less – let the tools do more.

“I object to doing things that computers can do.” – Colin Shivers

Determining if a macro expansion terminates is the halting problem.

“You should never have to see your namespace declaration. I hate that thing.”

Text is a serialization protocol for our programs.

For Emacs, look at clj-refactor.

Steve Miner

“Herbert” schema generator (very \(\alpha\)) → test.check property expressions

(hg/generator '[(:= V [int+]) (in V)])

replaces

(gen/bind (gen/not-empty (gen/vector gen/int))
  #(gen/tuple (gen/return %) (gen/elements %)))

Can also use regexps, e.g. (str #"fo*bar+")

“Regex’s are probably the most powerful DSL in history.”

Use Herbert both for test.check generators AND validation.

Ghadi Shayban

Nashorn

Truffle and Graal

Rewriting nodes of ASTs based on optimistic typing

Inlining like macroexpansion

TruffleC - crazy project: inline trees of C in trees of Ruby and vice versa.

Clojure PersistentVector implements \(\approx 20\) interfaces(!).

Why even run Java bytecode? (Future tools may make bytecode less relevent.)

“Low-level stuff is fun.”

Bozhidar Batsov - History of Clojure Tooling for Emacs

CIDER maintainer. Dark knight of the order of Emacs.

“You are the architect of the structure you want, built on a foundation [Emacs].”

CIDER: 85 contributors v. 1 contributor for Cursive.

TODO Try sexp-fu, paxedit, rainbow-identifiers, refactor-nrepl.

Steven Yi

Music – Pink and Score. Overtone wraps Supercollider; Pink and Score replace those.

Linking space and time.

Lock-free audio functions.

Filters/delay lines require state.

Rich Hickey – Inside Transducers

Transduce = “lead across.” map, filter, take, mapcat, partition-by now all return transducers.

Transducer takes and return 3 arity functions; arity-0 flows through to wrapped fn. Arity-1 is used for completion, if none flow through. Arity-2 is reducing stop – you can morph input, call nested (or not), expand, etc.

Map:

(defn map
  ([f]
     (fn [rf]
       (fn
         ([] (rf))
         ([result] (rf result))         ;; call reducing fn
         ([result input]
            (rf result (f input)))))))  ;; ACTUAL map operation

Filter:

(defn filter
  ([pred]
     (fn [rf]
       (fn
         ([] (rf))
         ([result] (rf result))  ;; call reducing fn
         ([result input]
            (if (pred input)     ;; Again, the meat
              (rf result input)
              result))))))

take requires state (volatile!).

Mapcat is simple composition:

(defn mapcat
  ([f] (comp (map f) cat)))

where cat is a transducer which concatenates the contents of each input into the reduction. Expands input into many inputs in the next stage. It is semantically opposite to a reduction.

partition-by builds up intermediate collections and returns the last during the 1-arity (complete) phase if given a finite input.

(defn transduce
  ([xform f coll] (transduce xform f (f) coll))
  ([xform f init coll]
    (let [xf (xform f)  ;; transformation of reducing step
          ret (reduce coll xf init)]
      ;; completion:
      (xf ret))))

Transducers are “pushy,” rather than lazy. Especially at completion.

educe – to lead out. Lead back, lead across, lead out. Return an “eduction” which is the reduction without actually carrying it out.

Good for inexpensive work that you’re not ready to use yet. Still composes with subsequent transductions.

Parallel transducers are coming (like reducers).

More core.async:

Channels will be deref-able with blocking semantics (or timeout).

“Promises are cool, but channels are cooler.” Promise semantics coming to channels as “promise channels”. Write once, read many. Always write-ready. All readers will unblock together.

Don’t do a lot of work in channel transducers (they block the channel); use pipelines instead. Parallel with user control of number of threads. But can’t parallelize stateful transducers.

[We can make fruitful use of core.async and transducers at work… best to start understanding them now.]

UNSESSION: Bozhidar Batsov

CIDER Demo – new features

  1. Now don’t have to evaluate namespace. Use :init-ns option to :repl-options.

    Look at *nrepl-messages* (recently upgraded) if problems.

  2. cider-undef undefines a var.

  3. cider-refresh does Stuart Sierra trick for reloading namespaces. “30 years from now people will still be using Emacs… but not IntelliJ.”

  4. M-. can show Java code now (and presents a selection of which class to look at).

  5. C-h m show all keybindings in scope.

  6. C-h f show keybindings for mode.

  7. C-c C-d show JavaDoc (browser not needed).

  8. smex (on top of IDO). Suggest recent commands.

  9. Look at ac-cider for autocomplete. “This is Emacs – anything is possible.” “I have been drinking a lot of Cider while fixing CIDER bugs… it’s a problem that kind of fixes itself.”

    See also compliment Clojure completion library (Clojure only).

  10. Grimoire integration (in-editor or in-browser).

  11. Tracing functionality cider-var-toggle-trace (C-c M-t v).

  12. Choose recent file?

  13. C-u C-c C-z? change namespace in REPL.

  14. C-c M-o clears everything.

  15. cider-browse-ns – list of all namespaces, explorable.

  16. Can open (browse) jar files from Emacs!!!

  17. dash.el Clojure-idiomatic Elisp.

  18. cider-inspect function for introspecting data structures/objects.

  19. Awesome package from Ustun: https://github.com/ustun/emacs-friends for launching just about anything from inside Emacs.

  20. helm-imenu for inventory of / searching in current namespace.

“If you know Clojure, learning Emacs Lisp is very easy.”

Nathan Herzing & Chris Shea – voting infrastructure

Voting infrastructure – Turbovote

Architecture:

USPS | FTP frontend | REDIS | API <> Datomic
                              API | frontend

Cool interactive Om/core.async/Datomic example.

TODO Take another look at Pedestal

[Learned from Ustun Ozgur – git log -p – awesome for delving into change details. See also http://www.philandstuff.com/2014/02/09/git-pickaxe.html ].

Michał Marczyk…

… Ported PDS’s to ClojureScript.

2r100111 legal syntax for an integer :-)

Could we do better than core.rrb-vector (radix search) for performance? Probably not; but adding requirements until data structure isn’t optimal; then, optimize again. E.g. concatenate, slice, reverse.

data.avl – AVL tree (http://en.wikipedia.org/wiki/AVL_tree) – like red-black tree; lookup / insert / delete all \(O(\log n)\). Lookup, insert, delete “universally known.”

“Well-known” operations: join, split, range.

Julian Gamble – core.async

Paradigms of core.async

Forthcoming Clojure Recipes book

core.async is about:

  • Queues in your application; asynchronous communication.
  • Making your program simpler to reason about.
  • “[In Lisp], you can extend the compiler simply by shipping the library.”
  • Modeling time as reading from a queue

Example:

go block >! my-q <! second go block
go block | my-q | second go block
  • 10k “processes” in browser (TBaldridge)
  • David Nolen “render pipeline” example – guarantees queues don’t back up.
  • JS event queue–handler puts event on queue; event loop processes these.
  • Parallelism is not concurrency – in concurrency, there is shared state.
  • Rich Hickey ant demo – multiple threads (agents) against one data structure
    • previously impossible in the browser
    • demo: no async version (performs poorly)
    • add core.async for evaporation; modest improvement
    • adopt Nolen minecraft demo (.cljs animations) – imperative style loops / aset’s.
  • Locks and callbacks don’t scale.

Jeanine Adkisson – Variants are not Unions

Nice talk.

@jneen language developer

  • Clojure – “a viable alternative to Ruby.”
  • Variants a pattern from typed languages.
  • Values are nice. No one is going to come around and change them.
  • Variants are tagged data. [:tagged data]
  • Destructuring via core.match.
  • Book store example
    • Antipattern: maps or database tables populated with missing fields. Makes her sad.
    • {:address nil, :email nil, :store-id nil} will show up in your run time.
  • Maps model and by putting data together:

    {:a 1
     ;; and
     :b 2}
    
  • Model OR:

    {:address "a"
     ;; or
     :email "b"}
    
  • Variants have multiple constructors and single destructors.
  • Core.typed can check for things that can go wrong:

    (defalias MyVariant
      (U [(Value :tag1) String]
         [(Value :tag2) Sym]
         [(Value :tag3) ...]))
    
  • Herbert schemas can also support variants:

    '(or [:tag1 str]
         [:tag2 sym]
         [:tag3 ...])
    
  • Recursive variants and (untyped) λ calculus
  • hiccup uses recursive variants to model HTML
  • Erlang does this stuff well also.
  • Closed Variants restrict creation of new tags
  • Variant examples:
    • Loop variant where control decision made elsewhere
    • result variant. [:ok val] and [:err message].
    • Datomic stores variants efficiently (e.g., :db.variant/type :xx/order)
      • Polymorphism straightforward
      • Schema example
  • Variants are everywhere (begging to get out)

    Especially in shallow subclass heirarchies

  • This idiom has a long pedigree from typed languages. Use it.

Lucas Cavalcanti & Edward Wible – Exploring four hidden superpowers of Datomic

@lucascs São Paulo and remote.

Why build a bank from scratch in Brazil using Clojure and Datomic?

Banks are slow to adapt to new realities.

Barriers for entry are substantial.

Datomic – is an alien spaceship.

Superpowers

  1. Audit trail. Metaphor is like Git.
  2. Authorization: Queries which traverse graph of relationships.
    • recursive rule definitions
    • queries using such rules
    • makes ownership/permissioning simple(r)
  3. HTTP Cache – Last-Modified header → 304 Not Modified
  4. Mobile Sync – pass update URL w/ timestamp in response header
    • d/history gives ALL facts, including retracted ones.
  5. bonus Future DBs: Can create future (simulation) data and “include” it with a DB.
  6. bonus Testing: Make virtual database populated with test variables.
  7. bonus Schema Extension: Transact schema itself into Datomic (migrations) :pii example.
  8. non-feature – sharding reads. Currently one transactors; will scale as needed for availability. “Canary shard.” Shard-specific peer cache. ACID transactions inter-shard.
  9. bonus DB Aggregation. Give Data Science dept. a read-only copy of DBs using their own CPUs. Knowing when things happened is crucial for doing machine learning (e.g., testing predictions).

We never take our system down.

Lightning talks

  • Chris Oakman’s friend
    • Pros http://pteroattack.com/
      • CLJS needs a website and an easier compiler tool
      • cljs.infolook at this
      • More helpful compiler UI
  • Reid Draper - finding simple failing test cases with test.check

NightCode… and games. – Zach Oakes

Game Development is a gateway drug for programming, and for Clojure. So a solid game development story is good for the ecosystem.

Mainstream game development environments well-suited for “big” games. But for indie games and “art games,” artistic merit and interesting gameplay are the differentiator. 2D is more abstract and better for some things.

(Check out Lone Survivor (2012))

Tools for Indie Games

  1. Hosted languages. Frameworks like unity (CLR) and libGDX (JVM) very popular for indie developers. “I’m afraid of the \(z\)-axis.”

    play-clj experimental

    Desktop, “iOS” and Android in one project.

    Transforming entities is most of what you do.

    Entities:

    • Particle effect editor → LibGDX entity
    • Texture
    • Shape

    Teaching problem: modify an existing game.

  2. Functional and logic programming

    Uncommon in game development. Entity Component Systems

    Papers to read:

    • A logical approach to building dungeons (roguelike games)
    • Game Dialogue as Expressions in First-Order Logic

    Time rewinding is simple – built into clj-play (awesome demo).

  3. Interactivity (REPL)

    Biggest selling point for Clojure game development. Esp. important for art. (Ebert: “Games can’t be art.”)

    “I don’t think art has to make sense. If it didn’t, David Lynch wouldn’t be a famous director.”

    “I really love unfinished games… they avoid a lot of common game tropes… like the game has an end.”

Glenn Vanderberg – Algorithms of TeX… in Clojure (Cló)

IBM 650 Knuth “An Appreciation from the Field.”

3 Goals of TeX (“the greatest yak shave in history”)

  • Utility
  • Discovery
  • Education

“Breaking Paragraphs into Lines” – Knuth paper

TCP/IP flag day 1 Jan. 1983

John Lions commentary on the Unix OS

Showpiece of literate programming.

Matz quote (see photo).

Reading TeX code is opening a time capsule of another era. “The last hurrah of procedural structured programming.”

seque-> parallel threading macro (!)

Study a modern implementation to compare with the original.

Impedence mismatch between Clojure and TeX. How much compatibility is needed? Is this a problem in Clojure? In TeX? How much of the quirks in TeX are intrinsic?

Embrace mutation when appropriate.

Java NIO scatter/gather → TeX font metric files.

The Gifts of 30 Years – “It’s 2014 and we’re still using….”

Zach Tellman/Factual – ABC (Always Be Composing)

WebGL - Sierpinski triangles

Macros are not our most composable abstraction

Deferring invocation allows one to do more with it. But functions are somewhat “opaque” to examination. Data is better. Prefer:

Data > Functions > Macros

in terms of generality.

However, for execution,

Data < Functions < Macros

Taxonomies – value of periodic table (“gaps”).

Abstractions take us further from a particular problem, towards more generality. This is a balance. E.g. transducers – they are a specific kind of composition, and do not compose w/ e.g. sort. They are not easy for beginners to wrap their heads around.

Composing data – example of mean, variance, …. and data sets whose mean and variance are all the same.

“Premature specialization is the root of all evil.”

ztellman/automat – “better automata through combinators”

[Beautiful automata / Kleene-y slides here]

“We are side-effecting the world.” Backpressure in TCP/IP &c.; structural in most implementations (queues, core.async, …). But “causality doesn’t have to be structural.”

ztellman/manifold — streams vs. deferreds; like channels and promise-channels in core.async. Beautifully composable; higher-level abstraction than core.async. let-flow – data flow in complex DAGs similar to Prismatic’s graph.

Code is data, but code is a very narrow subset of data; code’s semantics are fixed; data’s is arbitrary.

Tradeoffs – how many affordances can we give to users?

What are the different forces at play in these design questions? These decisions can make the ecosystem flourish or wither.

Building a Data Pipeline with Clojure and Kafka – David Pick (Braintree)

Amazon Redshift (data warehouse)

Keeping track of deleted data is very hard to do. E.g. extra columns (created_at, updated_at).

Data pipeline + search/reporting infrastructure [sound familiar?]

  • First attempt: SQL powered. Slow and limited.
  • Harvest PostgreSQL change log?

    DB | ..... | Redshift
               | Elasticsearch
    

    But Redshift dies frequently.

  • PGQ_ async. processing of live transactions. Triggers on CRUD. When transaction fails, queue not populated. ACID and fast. So:

    PGQ | Redshift
        | Elasticsearch
    
  • Apache Kafka. pub/sub w/ multiple nodes in between (high avail.). Partitions (ordered) shard topics (not necessarily ordered). Can keep data for e.g. 1 month. 300k msgs/sec rates. Can re-wind Kafka streams in case a job dies.

    Pipeline:

    shards
         0 | datastream  | redshift loader
                         | ES loader
           | eventstream | ES loader
         1 ...
         2 ...
    
  • Why Clojure? Why not Ruby? (Braintree is mostly a Ruby shop.)
    • JVM – ES, Kafka client, etc. already there.
    • Concurrency – architecture is inherently concurrent, and Ruby sucks at this.
    • Abstractions – this is the main point
      • Infinite sequences (this is the primary abstraction of a data pipeline)
      • Error handling
        • First try – threads
        • Second try – agents. They can handle errors, but they don’t talk to each other… not the right abstraction.
        • 3rd try – core.async – single go block per merchant.

          datastream  |             | 0 |
          eventstream | distributor | 1 | Elasticsearch
                                    | 2 |
          

          But, had race conditions and callback-like semantics.

        • Actor semantics turned out to be a good fit.
          • Can think about single merchant at a time.
          • Can put backpressure
          • Avoid races by notification
          • Pulsar/quasar provided semantics for this
    • Lessons learned:
      • GC is problematic
      • Use G1GC
      • Tune your heap size – smaller minimizes pause times (Clojure objects tend to be needed for a short time)
      • Used JMX for almost everything
      • Default configs for ES, etc. are bad for production
      • Use a model that avoids race conditions

Ashton Kemerling – Generative Integration Testing

Pivotal Labs – Pivotal Tracker

How do we make software more reliable? Software either needs to get simpler, and/or better unit and integration tests.

52k LOC Ruby. 12k+ RSpec examples …

Time, creativity, edge/ordering cases, emergent behavior: these are all causes of bugs which creep into production.

Downside of generative testing: test duration; reduced assertion power.

GT is helpful for Web apps because it’s hard to simulate many different sequence of user actions, and harder to find the minimal sequence which introduces a failure.

Tools:

  • Clojure
  • test.check
  • clj-webdriver
  • JDBC or similar. Make sure queries are fast!
  • clj-http (optional)

Before tests

  • copy/setup database state
  • setup all browsers

Before run

  • restore DB state
  • refresh browsers and clear caches
  • generate actions (generators) – what the user will do
  • execute actions (action runners)
  • assert (assertion engines)

Actions are composable and atomic. e.g., drag-drop, click, …. They must completely describe all test behavior.

Don’t worry about eliminating impossible sequences from your test cases.

Multimethods very helpful for describing actions.

At a higher level, strategies separate context from action. Can be parallel or serial. They include setup and assertions.

Make invalid actions a no-op.

Assertions are trickier in GT. Can’t depend on input data. Must be true every time.

Examples (from Pivotal Tracker):

  • changes should synchronize
  • client/server state parity
  • presence of alerts and error dialogs
  • no crashes

Assertions can be slow, and this will reduce number of runs you can do.

Demo, via video.

Stewardship: the Sobering Parts – Brian Goetz

“Blub Language Architect”

“You have your brushes and your colors… go paint paradise, and then, in you go!”

Pragmatism – “there is no good, there is only good for.” – Yoda

COBOL was an able predator for the problem ecosystem of the mid 1960s. ALTER statement.

Steward: A person employed to manage another’s property, especially a large house or estate.

How much change, how fast?

  • Better safe than sorry
  • Compatibility
  • Don’t alienate users

vs.

  • Adapt to change
  • Fix mistakes.

Put more eggs in fewer baskets – invest in changes w/ high cost/benefit ratio. “That’s not where I want to expend my complexity budget.”

Ford: practical cars for ordinary people.

“Every feature is one step towards collapsing in your own complexity.”

“Developers overestimate the importance of code, and underestimate the importance of stable programs.”

Central park == USD \(\frac{1}{2} 10^{12}\). Based on trust. Breaking changes are like building in Central Park.

Evolution of λ in Java 8. The obvious choice may not be the one you want. Java’s ALTER statement is only one bad decision away. If you don’t know the right thing to do, at the very least, do nothing.

Hardware: memory access has gotten more expensive and less predictable. Cache-miss can equal 1000 arithmetic instructions. Future JVM: support for values! Implicit immutability, perf. improvement and predictability. “Codes like a class, behaves like an int.” Non-Java languages will see the biggest benefits.

Tail recursion not supported due to a somewhat bone-headed decision. This may get fixed.



blog comments powered by Disqus

Published

27 November 2014

Tags