March 7, 2007

BBC Story about Palimpsest up...

One of my favorite projects at Google is one that we don't talk much about, but that I think is pretty important. Darren Waters from the BBC came by, and I'm guessing he liked the project as much as I do :-)

7 comments:

Andrew Hitchcock said...

Interesting. Last quarter when I was sick, I went through and watched a bunch of the engEDU videos. The Palimpsest video was really fascinating (I watched it after just seeing the Dead Sea Scrolls, so I was getting my fix of ancient writings), but I never heard any follow-up about Google's involvement.

Anonymous said...

I read with interest the story about Google using its techie mojo to help make this planet just a little better - it rang in the same vein as something nice coming out of NASA from the Apollo project - velcro or something equally cool and useful. Kudos to Google for making it possible and to you for being the human bean that plugs the ends together.

It made me wonder if Google might have some insight into a problem we are working on at the American Disability Association, where we are trying to identify different nations' mean and average longevity (quality of life) for people with disabilities, specifically to pick programs that work better than the American (US) disability policy.

As for your posts about the love of our gadgets, what can I say. I've still not fully recovered because Clairol has discontinued my favorite Herbal Essences shampoo and replaced it with something that smells completely different. We get used to things, and then they change. Life is funny that way!

Georg said...

I find it interesting that the article claims that

"The networks aren't basically big enough and you don't want to ship the data in this manner, you want to ship it fast."

(I hope this was paraphrased, and not what was actually said)

"You want to ship it sometimes on a hard drive. What if you have these huge data sets - 120 terabytes - how do you get them from point A to point B for these scientists?"

I am involved with the data management for a large Particle Physics experiment. We're going to be creating ~5PB of data a year, which will be shipped to multiple sites world-wide. All over networks, no driving of hard disks.

I can (and do) happily pull in to my local site, minutes after giving the request, files of ~TB size.

The networks can cope, you just need the right networks and the right tools!

Chris DiBona said...

You shouldn't confuse your fantastic resources with those of the average site. You do have cool resources, for sure :-)

Georg said...

That is very true, our resources aren't exactly comparable to those available to most. What this points to, though, is the opening of these tools and systems to all - look what happened when the World Wide Web (the direct result of Particle Physics too, I should add!) took off...

Of course these systems require money ( = hardware, = time), but that's one for the research councils.

lloyd said...

I have an SBIR proposal under review at DOE for a similar concept, which I called the "data suitcase". I'd like to talk with you in more detail. Lloyd at blueskyelectronics dot com.

Chris said...

I think the Palimpsest project is interesting, but I hope you don't mind if we try to make it redundant. Right now we deploy anywhere from 1GB to multiple lambdas to research sites through the National LambdaRail (http://www.nlr.net/). Using multihost-to-multihost third-party transfer tools like GridFTP, we should be able to move 120TB in around 2 days. Since these resources are available to hundreds of participant institutions, they may be fantastic but they are hardly rare.

While some people might consider this to be extreme technology, I expect it to be the standard inside of a few years. What is interesting is that for the most part the networks aren't the bottleneck. The operating systems, file systems, and transfer tools are. Fortunately, with the growth of easy-to-assemble RAIDs that really are inexpensive, autotuning TCP stacks (Linux 2.6 and Vista), and better tuned applications (GridFTP and HPN-SSH (my personal project)), we're making multi-terabyte transfers an everyday occurrence.
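
For anyone who wants to check the numbers in this thread, here is a rough back-of-the-envelope sketch of what "120TB in around 2 days" implies for sustained throughput. It assumes decimal terabytes (10^12 bytes) and ignores protocol overhead, disk speed, and retransmits; the function names are made up for illustration and are not part of any transfer tool.

```python
SECONDS_PER_DAY = 86_400


def required_gbps(terabytes: float, days: float) -> float:
    """Sustained rate in Gbit/s needed to move `terabytes` within `days`.

    Assumes decimal units (1 TB = 1e12 bytes) and zero overhead.
    """
    bits = terabytes * 1e12 * 8
    return bits / (days * SECONDS_PER_DAY) / 1e9


def transfer_days(terabytes: float, gbps: float) -> float:
    """Days needed to move `terabytes` at a sustained `gbps` Gbit/s."""
    bits = terabytes * 1e12 * 8
    return bits / (gbps * 1e9) / SECONDS_PER_DAY


if __name__ == "__main__":
    # The 120TB-in-2-days figure quoted in the comment above.
    print(f"120 TB in 2 days needs about {required_gbps(120, 2):.2f} Gbit/s")
    # The same data set over a single fully used 1 Gbit/s link.
    print(f"120 TB at 1 Gbit/s takes about {transfer_days(120, 1):.1f} days")
```

Under these assumptions the two-day target works out to a bit over 5.5 Gbit/s sustained, which is consistent with the point that a single 10 Gbit/s lambda is enough only if the hosts, file systems, and transfer tools can actually keep it filled.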