zooko.com

*introduction | *current events | *projects | *stuff

reading

All hyperlinks which lead to a different web site are in this pretty green color: sample link

status of things I'm reading

title author(s) type status, last touched notes
Integrity (I ) Codes: Message Integrity Protection and Authentication Over Insecure Channels Mario Cagalj, Srdjan Capkun, RamKumar Rengaswamy, Ilias Tsigkogiannis, Mani Srivastava and Jean-Pierre Hubaux research paper 2006-04-07 notes below
Integrity (I ) Codes: Message Integrity Protection and Authentication Over Insecure Channels Mario Cagalj, Srdjan Capkun, RamKumar Rengaswamy, Ilias Tsigkogiannis, Mani Srivastava and Jean-Pierre Hubaux research paper 2006-04-07 notes below
A refutation of Metcalfe's Law and a better estimate for the value of networks and network interconnections A. Odlyzko, B. Tilly research paper 2005-05-23 notes below
Structure and Interpretation of Computer Programs Hal Abelson and Gerald Sussman non-fiction still studying it, again, 2005-05-23. This is the third time I've read this book (not counting an abortive start on my third pass last year), but this time I'm actually doing all the exercises and checking my answers, and comparing notes with Aaron Swartz!
Outflanking and securely using the PIN/TAN-System A. Wiesmaier, M. Fischer, E. Karatsiolis, M. Lippert research paper 2004-10-13 notes below
P is not NP M. Ionescu research paper 2004-10-13 notes below
Reperasure: Replication Protocol using Erasure-code in Peer-to-Peer Storage Network Z. Zhang, Q. Lian research paper 2004-08-02 notes below
Analysis of the WinZip encryption method T. Kohno research paper started 2004-03-12 practical cryptanalysis
The Impact of DHT Routing Geometry on Resilience and Proximity K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, I. Stoica research paper started 2003-09-18 notes below
The Small-World Phenomenon: An Algorithmic Perspective Jon Kleinberg research paper haven't started yet 2003-09-18
On Preventing Replay Attacks on Security Protocols Streekanth Malladi, Jim Alves-Foss, Robert B. Heckendorn research paper haven't started yet 2003-08-26 (then why is it on this page? So I can easily find it to print it out at the local printshop)
Environmental Requirements for Authentication Ran Canetti, Catherine Meadows, Paul Syverson research paper haven't started yet 2003-08-26 (then why is it on this page? So I can easily find it to print it out at the local printshop)
Skip graphs James Aspnes, Gauri Shah research paper finished 2003-09-13 notes below
Symphony: Distributed Hashing In A Small World Gurmeet Singh Manku, Mayank Bawa, Prabhakar Raghavan research paper haven't started yet 2003-08-26 (then why is it on this page? So I can easily find it to print it out at the local printshop)
Exploiting Network Proximity in Distributed Hash Tables Miguel Castro, Peter Druschel, Y. Charlie Hu, Antony Rowstron research paper re-read 2003-08-22 notes below
The Codebreakers David Kahn non-fiction on chapter 18 ("Russian Cryptography"), 2003-08-21 The definitive history of cryptology from antiquity to World War 1. Unfortunately this book was written before some of the stories of World War 2 were declassified.
An Excess-Based Economic Model for Resource Allocation in Peer-to-Peer Networks Christian Grothoff research paper finished 2003-08-21 notes below
A Transport Layer Abstraction for Peer-to-Peer Networks Ronaldo Ferreira, Christian Grothoff, Paul Ruth research paper finished 2003-08-21 notes below
Tweakable Block Ciphers Moses Liskov, Ronald L. Rivest, David Wagner research paper finished (skipped proofs) 2003-07-29 notes below
A Theoretical Treatment of Related-Key Attacks: RKA-PRPs, RKA-PRFs, and Applications Mihir Bellare, Tadayoshi Kohno research paper started 2003-07-29 notes below
A tweakable enciphering mode Shai Halevi, Phil Rogaway research paper started 2003-07-29 notes below
A Deepness In The Sky Vernor Vinge fiction half-way through (second reading) 2003-07-29 One of the best science fiction novels ever. Amber and I are reading it aloud in the evenings.
A Designer's Guide to KEMs Alexander W. Dent research paper finished (skipped the proofs) 2003-06-19
Robust and Efficient Data Management for a Distributed Hash Table Josh Cates, master's thesis, MIT research paper finished 2003-06-17 The system described is very like Mojo Nation (except, of course, layered on top of Chord). One fact that was new to me is that the IDA erasure code is probabilistic.
One Hop Lookups for Peer-to-Peer Overlays Anjali Gupta, Barbara Liskov, and Rodrigo Rodrigues, MIT Laboratory for Computer Science research paper finished (first pass) 2003-06-09 coming eventually...
Luby-Rackoff Backwards: Increasing Security by Making Block Ciphers Non-Invertible Mihir Bellare, Ted Krovetz, and Phillip Rogaway research paper finished (first pass) 2003-06-04 notes below
Something Wicked This Way Comes Ray Bradbury fiction finished 2003-05-30 notes below
Linked: the New Science of Networks Albert-László Barabási popular science just started 2003-05-29 It was a birthday gift from an Mnet Hacker. I had put it on my amazon.com wishlist because Raph Levien recommended it.
Koorde M. Frans Kaashoek, David R. Karger research paper re-reading, 2003-05-28 notes below
Kademlia Petar Maymounkov, David Mazières research paper re-reading, 2003-05-25 notes below
Off-The-Record Messaging Ian Goldberg, Nikita Borisov, and Eric Brewer research paper still studying it 2003-04-24 In order to learn desiderata and techniques for EGTPv2.
Capability Myths Demolished Mark Miller, Ka-Ping Yee, and Jonathan Shapiro research paper still studying it 2003-05-20 notes below
A Security Kernel Based on the Lambda-Calculus (1996) Jonathan Rees research paper still studying it 2003-05-22 This thesis builds the theory and implementation of capabilities on a simple and elegant foundation -- the Scheme programming language. More coming eventually.
Sandman: Preludes and Nocturnes Neil Gaiman comic book (that really seems like the wrong term for it. "graphic novel"?) finished (at least the second reading), 2003-05-28 notes below
Topology-Aware Routing Miguel Castro, Peter Druschel, Y. Charlie Hu, Antony Rowstron research paper finished, 2003-05-25 notes below
Night Shift Stephen King fiction finished, 2003-05-26 notes below
David's Sling Marc Stiegler science fiction finished, 2003-04-29 Because I liked EarthWeb, and I bought this paperback from Darius Bacon.
The Case For Mars Robert Zubrin non-fiction finished, 2003-04-23 Because my brother gave it to me, and I like space exploration.
Sloppy hashing and self-organizing clusters Michael Freedman and David Mazières research paper finished, 2003-04-22 Because the authors are good at inventing new emergent networking and privacy ideas.
Crypto Steven Levy non-fiction finished, 2003-04-21 Because I loved "Hackers: Heroes", and I wanted to learn more about about the recent history of crypto.
The Flask Security Architecture: System Support for Diverse Security Policies Ray Spencer, Stephen Smalley, Peter Loscocco, Mike Hibler, David Anderson, and Jay Lepreau research paper haven't started, never Because David Wagner suggested it as a potential counter-example to my claim that Access Control Lists can't allow users to use the Principle of Least Privilege.
Sub-Operating Systems: A New Approach to Application Security Sotiris Ioannidis and Steve Bellovin research paper haven't started, never Because David Wagner suggested it as a potential counter-example to my claim that Access Control Lists can't allow users to use the Principle of Least Privilege.

notes about things I'm reading or have read

A_refutation_of_Metcalfe_s_Law 2005-05-23 dura-link v1.0.1 entry above

Recommended to me by Kragen Sitaker. Excellent stuff! Score another major win for Odlyzko by applying empirical data and reasonable argument to an important theoretical question. The basic argument is that the aggregate value of a network is better approximated as O(n log n) ("Odlyzko's Law") instead of O(n^2) ("Metcalfe's Law"), or O(2^n) ("Reed's Law"). Even though you now know the basic argument, I strongly suggest that you read the actual paper -- it is easy to read and short, but packed with useful observations.

Integrity (I ) Codes: Message Integrity Protection and Authentication Over Insecure Channels 2006-04-07 dura-link v1.0.0 entry above

I wasn't interested in the radio setting, but

Outflanking and securely using the PIN/TAN-System 2004-10-13 dura-link v1.0.0 entry above

There is nothing particularly surprising here, but I enjoy reading the details of this kind of cracking. One interesting detail was that the attackers needed to hurry and perform their illegitimate transaction before the user restarted his web browser and obviated their stolen TAN. Hm...

P is not NP 2004-10-13 dura-link v1.0.0 entry above

I haven't read it yet. Since its claims are so bold (see title), and since no complexity theorists are talking about it as far as I've noticed, it must be oddball in some way that makes legitimate scientists want to ignore it. It also comes with a "please break me" cryptographic challenge, which always makes me think "oddball". (Note: "please break me" is, of course, the current state of the art in the design of symmetric crypto primitives. However, the way that the reputable researchers plea for attacks is by publishing a paper in one of the reputable fora, rather than by explicitly pleading on a web page. Is this the Scientific Method at Work, or is it socialization and shibboleths to filter out kooks? They aren't exactly the same thing...)

Reperasure: Replication Protocol using Erasure-code in Peer-to-Peer Storage Network 2004-08-02 dura-link v1.0.0 entry above

They suggest an interesting notion: "statistical consistency". I think that means "normal consistency protocol but with erasure coding instead of replication". I didn't really understand the details of that part. I like Figure 1, which is one answer to a frequently asked question about Mnet: "Why use this erasure coding stuff instead of simple replication?". (Kubiatowicz, Weatherspoon, et al. have already answered that question quantitatively, but the graph in this paper is a more dramatic representation.) They chose M = 4 * K, just like the Mnet v0.7 file system. Recently, while playing with erasure codes, I decided that Mnet ought to use 84/254 -- M ~= 3 * K instead of 64/254 -- M ~= 4 * K.

The Impact of DHT Routing Geometry on Resilience and Proximity 2003-09-18 dura-link v1.2.0 entry above

This is exactly the topic that I have been wondering about recently. It surveys a few of the most prominent DHT geometries, comparing them in terms of how much flexibility they offer in peer selection and route selection. Their "peer selection" is equivalent to Topology-Aware Routing's "proximity neighbor selection", and their "route selection" is equivalent to its "proximity routing". They don't mention the third kind of flexibility that Topology-Aware Routing does: "topology-based nodeId assignment".

??? Isn't it the case that Circle can have either nice peer selection or nice route selection among short (log(N)) routes, but not both?

??? Isn't XOR really the same as Tree?

??? What the heck is this PRS-XOR doing? Routing to a peer who doesn't have a longer matching prefix (i.e. is not in a smaller common subtree) is not going to get you any closer to the target, even if the peer's Id is closer to the target Id in XOR distance. If you can't route to a longer-prefix, how is routing to a closer-by-XOR any better than routing to a random peer?

Skip graphs 2003-09-18 dura-link v2.0.2 entry above

Like Koorde, this is a new emergent network with novel properties, and like Koorde the concept is simple enough to be sketched in a few sentences.

skip lists

First, you ought to understand traditional skip lists. (If you already do, you can skip (ha ha) this paragraph.) Take a normal old linked list, and add a randomized "height" to each node. About half of the nodes have height 1, a quarter of them have height 2, and in general about 2^-n of the nodes have height n. Now make each node which is at least 2 tall have a link to the next node which is at least 2 tall, in addition to having a link to the very next node in the list. (You can see that by following those level-2 links, you can "skip" about half of the nodes while searching the list.) Likewise, make sure that each node which is at least l tall has a link to the next node which is at least l tall, in addition to its other links. Now you have a skip list, which is probabilistically about as efficient for searching and inserting as is a balanced tree, while being pleasingly simple in concept.

decentralized skip lists

Now this paper is about translating the skip list concept into a decentralized data structure. Make each node in the list be a separate node (machine) in the network. Now you have a decentralized data structure which is as efficient as a DHT (e.g. log(n) hops per lookup), while having the novel property that the nodes do not need to be evenly distributed throughout some space. (All current DHTs of which I am aware require that the nodes be evenly distributed throughout a certain space, and typically achieve this even distribution by using a hash function to determine the node's placement in the space. Some of them such as CAN -- Content Addressable Network -- have apparently tried with limited success to ease this requirement.)

robust decentralized skip lists -- skip graphs

But the structure that we just designed is not very robust -- any single node that disappears partitions the network! So now add redundancy. The naive way to add redundancy, to my way of thinking, is to add k extra successor links to each node (as DHTs like Chord typically do). However, the designers of the skip graph want each node to participate in more of the "higher level" linked lists -- they don't want half of the nodes to be of height 1 and to participate solely in the basic linked list. So in essence they just make k redundant skip lists starting from the same basic linked list.

The way they actually build the redundant skip lists is pleasingly elegant, and the paper has a concrete implementation with pseudocode, plus extensive analysis.

what I think about it

I haven't yet grokked this concept in its fullness. The freedom of placement of the nodes is constrained only by the one-dimensionality of the space! In the terms of Topology-Aware Routing, skip graphs offer topology-based nodeId assignment. In terms of The Impact of DHT Routing Geometry on Resilience and ProximityAt the same time, skip graphs offer a high degree of free choice in peer selection. The only constraints are probabilistic and system-wide -- there are no hard constraints on any individual node, but the network of all of the nodes has to fit the probabilistic distribution of a skip graph if the (proofs of the) emergent properties are to hold. This combination of freedoms is intriguing.

There is one major point in this paper that I don't understand: they say that skip graphs require log(n) links per resource, which compares unfavorably with the log(n) links per machine of DHTs, but I don't see why skip graphs couldn't implement machine-based topology nor why DHTs couldn't implement resource-based topology, so the issue seems separate to me.

addendum

Having let the ideas ferment in my mind for a few days, I now think that skip graphs are rather like a more flexible Chord. Or equivalently, that Chord is rather like a more rigid decentralized-redundant-skip-list. Imagine that the skip graph would be a circle instead of an interval, and that it would route only forward instead of bi-directionally. See how it is almost like Chord now? The remaining differences are quite intriguing! To choose the n'th peer of a Chord node, you go to the 2^n'th position counting forward from your starting position on the circle, and then take the first peer. To choose the n'th peer of a skip graph node, you walk forward through the list of nodes until you hit a node with the appropriate (randomized) flags to be your peer. The intriguing thing is that in the skip graph algorithm there is no reference to a node's position in some id space, and therefore there is no requirement for the skip graph nodes to be evenly distributed throughout such an id space.

Exploiting Network Proximity in Distributed Hash Tables dura-link v1.0.0 entry above

This is a better paper than the authors know. It is a survey and comparison of the currently-known ways to exploit network proximity in DHTs. That's a useful thing to have, especially if you are designing a DHT which is intended to have good performance, but it also offers much more, although the possibility is never mentioned in this paper.

What the authors don't state in this paper (although they did state in the original Pastry paper) is that optimizing for network proximity is only one particular use of these techniques. The techniques described are in fact all general, dynamic-graph-theoretic, techniques which could be used to optimize, not network proximity, but network reachability (i.e., to circumnavigate firewalls, NATs, Internet routing failures, etc.), uptime (as Kademlia does), server load, some kind of trust metric or measure of vulnerability, a decentralized incentive measure (i.e., micropayments, trust metric stamps, etc.), and probably more.

An Excess-Based Economic Model for Resource Allocation in Peer-to-Peer Networks dura-link v1.1.0 entry above

This is a reasonable stab at the problem from an engineering approach. The ideas are actually implemented, as I understand it, in GNUnet. It is completely decentralized, and they've given thought to the detailed problems: newbies, transitive operations, systemic attack resistance. It feels like exploratory work, and the author admits as much. He includes, for example, a "proof" that the attack resistance of the system as a whole is bounded by the attacker's bandwidth, but it is so specific to GNUnet's design and its assumptions that it isn't an exciting general result. These GNUnet-specific assumptions include that all traffic is request-response pairs, the requests are the only things that trigger consumption of resources, the responses are verifiable, and so on.

Still, to his credit the author is clear that this is exploratory, and he clearly indicates his plans for future work.

On a personal note, it is bittersweet to see some ideas that we implemented in Mojo Nation being reinvented here, such as the central notion of "excess-based" accounting, in which services are free when the server is idle anyway. I don't feel pride about this -- rather I feel regret that we didn't document what we did, and gratitude that the GNUnet folks are doing the world a better service by documenting their ideas. Also, of course, GNUnet is not identical to Mojo Nation, as GNUnet lacks the centralized component that Mojo Nation had.

The basic approach that GNUnet takes to decentralized incentives is very like what I've been planning for Mnet v0.7. For anyone out there who is familiar with Mojo Nation, it is pretty much what you get by leaving the "KEEP_RUNNING_WHEN_TOKENSERVER_IS_DOWN" flag set and then adding some tweaks to favor the most useful of your peers. If you are interested in decentralized incentives in Mnet v0.7, you should read this paper.

A Transport Layer Abstraction for Peer-to-Peer Networks dura-link v1.0.0 entry above

Another piece of good explatory engineering that closely parallels our work at Mojo Nation (and in fact, some unpublished work that we did at DigiCash in 1997). This paper includes a few microbenchmarks of UDP, TCP, and SMTP, which I read carefully. Ironically, I am somewhat less interested in transport layer abstraction for Mnet v0.7 since the DHT-style routing scheme can also do firewall-circumnavigation.

In general, these two papers give me a good feeling about GNUnet. It's reassuring to know that there are plenty of independent researchers, hackers, and groups pursuing these sorts of ideas.

Tweakable Block Ciphers dura-link v1.0.0 entry above

This seems like a really good idea, from a "crypto-engineering" point of view. It also seems to dovetail with the idea of "Luby Rackoff Backwards" -- one could design a new block cipher which was tweakable and non-invertible. It seems like such a block cipher, used in a CTR-type mode, could be both more efficient and more "convincingly" secure than e.g. AES used in CTR mode.

A more immediate question is:

Isn't AES-CTR a secure tweakable block cipher if the attacker is constrained to be nonce-respecting?

This notion of a nonce-respecting attacker, which I got from Relayed-Key Attacks, formalizes the fact that in modes that use nonces -- such as CTR mode and CBC (the IV) -- the higher layer must carefully ensure that no nonce is used twice with the same key. As long as everyone who knows the secrets (the key) maintains this guarantee, then the attacker can't have access to a nonce-disrespecting oracle.

If AES-CTR is a secure tweakable block cipher when the attacker is deprived of a nonce-disrespecting oracle, then it is much faster, simpler, and (? sort of?) better-studied than the newly proposed tweakable modes e.g. CMC.

Update: David Wagner tells me in personal correspondence that the notion of a Tweakable Block Cipher is supposed to include re-using tweaks! So nevermind using CTR mode for a tweakable cipher. David also agreed with me that the idea of a block cipher designed to be tweakable and non-invertible is interesting.

Luby Rackoff Backwards dura-link v1.0.0 entry above

coming eventually...

Something Wicked This Way Comes dura-link v1.0.0 entry above

This was really enjoyable. I stayed up late to finish it. I disagreed with the moralistic premise, that wanting to increase or decrease your age is one of those unhealthy desires whose satisfaction entails your spiritual destruction. Nonetheless, I was delighted by the rich, sometimes unbalanced, prose and the unapologetic nostalgia for the limitless horizons of boyhood.

Sandman: Preludes and Nocturnes dura-link v1.0.0 entry above

I really didn't like it this time around. I think I enjoyed it pretty much all the way through, and I certainly stayed up late to read it, but the final chapter -- little more than a poem in praise of the peace and freedom that death brings -- disgusted me, and in retrospect I don't think whole thing was really worth the time it took to read. My favorite part was the visit to Hell. That part of the story had good tension, the art seemed to "clean up" a bit, and since the setting was overtly fantastic I didn't have the creepy feeling that some New Age religious freak was trying to convert me.

Koorde dura-link v1.0.0 entry above

Koorde is the simplest, most elegant emergent network idea I have seen. The core idea is so simple that I can explain it to you right now.

Suppose we could assign every node in our network a unique id K bits long, and there were exactly 2^K nodes in our network. (The only tricky part of Koorde is how to relax this unrealistic assumption, but I won't get into that.) Now we want to route from a start node s to an end node e. Let r be s||e, the binary string that results from s concatenated with e. Now take a "window" K bits wide and place it over r. The window starts at the leftmost position, covering bits 0 through K-1, which means that the window currently shows the value s. Now each hop consists of sliding the window one bit to the right.

So, to make the first hop from s to the next node, you slide the window one bit to the right, so now you are looking at bits 1 through K of r. Each node n is required to have two outgoing links, one to the node whose id is equal to n, minus its leftmost bit, plus a new 0 bit on the right, and one to the node whose id is equal to n, minus its rightmost bit, plus a new 1 bit on the right.

That's it. It is obvious that it will take K hops to get from any node to any other node on the network, for a network of size 2^K.

The only subtle bit left is how to make this work in a network where we don't have the convenient assumption that there are 2^K nodes, each of which has a unique id K bits long. Read the Koorde paper for details on that.

Kademlia dura-link v2.1.0 entry above

I'm looking at this paper again while trying to figure out how to discover new peers that are in the right part of the address space for ent.

Ent is based on Kademlia, which is pretty much the same as Pastry, modulo some tweaks. But the Kademlia paper is so much more readable to me than the Pastry paper. For example, Kademlia describes the network organization as a binary tree where the routing algorithm is choosing the subtree that contains the target. Pastry describes same system as a one-dimensional circular space where routing is replacing the prefix of an id with the prefix of the target's id. For some reason, the former is appealing, and helps me understand the algorithms, and the latter is unappealing.

Actually I'm not perfectly sure that they are the same, because I don't quite understand the Pastry algorithm...

Pastry deserves academic props for being first. It also seems to have a sophisticated added feature of linking-to-nearest-neighbors to counteract the problem of exponentially shrinking options that plagues Kademlia. Ent will almost certainly grow a feature like that, thus making ent more like Pastry than like Kademlia.

Topology-Aware Routing dura-link v1.2.0 entry above

A good examination of how emergent networks can be aware of the underlay graph having hetergeneous edge costs. If I had similar results for how emergent networks can be aware of the underlay graph having some edges with infinite cost -- i.e., the underlay graph is not fully connected, I would publish! Such results would show how emergent networks can run on the real Internet (which, unlike the systems scientist's ideal, is not fully connected) as well as how friendnet and attack-resistance properties can be integrated with emergent networks.

Hm. That last part constitutes yet another reminder that I ought to read Raph Levien's thesis.

In any case, I hope to use what I learn from this paper in ent (the emergent network that we're implementing in Mnet right now).

Hmm. They make a good point about nodeId-selection, which was a technique that I was planning to implement in ent. It changes the "uniform distribution through the id space" assumption, on which some arguments of robustness rely.

Okay, I skimmed a lot of it. Basically, proximity neighbor-selection, which is another technique that I was planning to use in ent, works well.

This is by the authors of Pastry, by the way.

Capability Myths Demolished dura-link v1.0.0 entry above

This paper attempts to rehabilitate capability access control in the minds of other security researchers. In doing so, it has clarified for me some long-standing questions of what are the precise differences between capability access control and access control lists.

Night Shift dura-link v1.0.0 entry above

The focus is on the story, which I like. I like genre fiction in general, if the author is respectful of the story. (The thing I don't like is knock-off genre fiction, where the author has no respect for the genre, for the reader, or for his own story. Most science fiction movies seem like that to me.) Stephen King is serious about his stories without being pretentious, so I like it, even though almost every story has enough weak spots to break my verisimilitude. (Note to Stephen King: hidden things, unexplained things, eerie coincidences -- those are scary. Real things like murderers, religious wackos, and people who destroy their most precious relationships because they are trapped by anger and guilt -- those are very scary. Magical realism -- ghosts, eyeballs growing out of your hands: sometimes scary. Big creatures with glowing eyes? Not scary. In fact, they are so not-scary that an appearance by one instantly breaks the spell, breaks verisimilitude, and banishes all the scariness that had already built up.)


Zooko O'Whielacronx
Last modified: Tue Apr 11 12:15:59 ADT 2006