Dec 30 2008

Renormalizing Denormalized Data

Yummy – that’s a fun tongue twister.  It doesn’t quite mean “synchronizing data”, but it’s in the same family.  I don’t have a better phrase yet for “renormalizing denormalized data”, but there is probably an existing term for it that someone reading this blog can tell me about (or invent – here’s your chance to replace the TLA RDD with something else).

Two companies – Gnip and Brightkite – that I’m involved with (there – full disclosure!) made announcements today about data integrations.

Here’s a bizarre use case.

  1. I register my location with Brightkite, take a picture, and write a note.
  2. Brightkite saves this data and then starts pushing it out to the services I’ve integrated with.
    1. Service 1: Twitter
    2. Service 2: Flickr
    3. Service 3: Facebook
  3. I’m running a Twitter / Facebook status synchronizer.  So – when I twitter something it shows up on Facebook.  When I put a status on Facebook, it shows up on Twitter.

Theoretically, Brightkite, Twitter, and Facebook should know enough about each other not to repost the same thing in cases 2.1, 2.3, and 3 (recursive).  But I don’t think they’re going to do the right thing.  Let’s try and see how many tweets we get!  I’m guessing three.

Nope – it only showed up once.  I’m guessing this is either because (a) Brightkite takes care of the issue (smart people at Brightkite), (b) something isn’t set up correctly (I just tried twice with two different configurations), or (c) something unknown and mysterious is going on somewhere.
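For the curious, here is a rough sketch of what option (a) could look like. I have no idea how Brightkite, Twitter, or Facebook actually handle this; the `Service` class, its `receive` method, and the wiring in `build_network` are all made up for illustration. The assumption is simply that every post carries one ID from its origin and each service refuses to store the same ID twice.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class Service:
    """A hypothetical service that stores posts and forwards them to peers."""
    name: str
    peers: list = field(default_factory=list)
    posts: list = field(default_factory=list)
    seen_ids: set = field(default_factory=set)

    def receive(self, post_id: str, text: str) -> None:
        # Drop anything already stored; this check is what breaks the loop.
        if post_id in self.seen_ids:
            return
        self.seen_ids.add(post_id)
        self.posts.append(text)
        # Forward with the ORIGINAL id so downstream services can dedupe too.
        for peer in self.peers:
            peer.receive(post_id, text)


def build_network():
    """Wire up the use case above (an assumed topology, not real APIs)."""
    brightkite = Service("Brightkite")
    twitter = Service("Twitter")
    flickr = Service("Flickr")
    facebook = Service("Facebook")
    brightkite.peers = [twitter, flickr, facebook]  # steps 2.1, 2.2, 2.3
    twitter.peers = [facebook]                      # step 3: status synchronizer
    facebook.peers = [twitter]                      # step 3, the other direction
    return brightkite, twitter, flickr, facebook


brightkite, twitter, flickr, facebook = build_network()
brightkite.receive(str(uuid.uuid4()), "Checked in, took a picture, wrote a note")
print(len(twitter.posts), len(facebook.posts), len(flickr.posts))
```

Take out the `seen_ids` check and the Twitter / Facebook pair will bounce the same post back and forth until Python gives up with a recursion error.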

I’m going to bet on (a).  However, not everyone is going to be as tuned in to the issue as the guys at Brightkite, and as this dynamic proliferates, the RDD problem will get worse.  Remember that this is bad:

10 PRINT "Hello"
20 GOTO 10

I can’t wait until an integration between three services gets stuck in an infinite loop, brings down the entire Internet, and sucks us all into a black hole.