Five years of decline

(the command-line parsing library)

It’s been a little over five years since I first published decline, a functional and UNIXy command-line parser for Scala. (If you’re not familiar with decline, the documentation might help.)

Hash-Range Partitioning

Better partitioning for distributed data

In distributed systems, it’s extraordinarily common to want to split a large dataset across some number of physical shards or partitions. This is commonly done by taking the key, hashing it, and then taking the hash modulo the number of partitions.
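A minimal sketch of the hash-then-modulo scheme, assuming string keys and a fixed partition count. The names `HashModPartitioner` and `partitionFor` are hypothetical, and MurmurHash3 from the Scala standard library stands in for whatever hash function a real system would use.

```scala
import scala.util.hashing.MurmurHash3

object HashModPartitioner {
  // Map a key to a partition index in the range [0, numPartitions).
  def partitionFor(key: String, numPartitions: Int): Int = {
    require(numPartitions > 0, "numPartitions must be positive")
    // Assumption: any well-distributed hash would do here.
    val hash = MurmurHash3.stringHash(key)
    // The hash may be negative; floorMod keeps the result non-negative,
    // where the plain % operator would not.
    Math.floorMod(hash, numPartitions)
  }
}
```

The same key always lands on the same partition for a given partition count, but changing `numPartitions` reassigns most keys.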

Releasing Coast

Samza, Cycles, and Streaming for People

Today, I’m delighted to announce the 0.2 release of the coast project: a high-level streaming toolkit written in Scala. coast is designed around Kafka’s partitioned log model, and supports complex streaming topologies with unusually strong messaging guarantees and no need for a central coordinator. The current release includes a new backend that compiles to Samza and supports exactly-once semantics for messages and state, support for cyclic dataflow graphs, and a bunch of improvements to the core library and documentation.

Doing the Impossible

Exactly-once Messaging Patterns in Kafka

Exactly-once messaging is something of a holy grail in the Kafka ecosystem – widely sought-after but rarely encountered. There are a handful of systems that promise exactly-once semantics, but none of them is a general-purpose solution: they’re often too task-specific, too heavyweight, or too broken, and sometimes all three. Complicating the picture is the fact that exactly-once message delivery is, in general, impossible.