Big Data, “Mass” Data

Yesterday, just a few days after I started my Big Data blog series, was Big Data in Massachusetts day. I was fortunate to be in the crowd at the Stata Center when MIT, Intel and the Commonwealth of Massachusetts launched a whole series of Big Data initiatives in Boston and Massachusetts. The MassTLC has a nice summary blog post here.

As a region, we are staking a claim to be the place for big data innovation. I think this is a well-founded claim and I think Boston will rise to the opportunity.

My first big data post talked about the rise of behavioral data as the driver of big data. I also alluded to systems being observed in finer detail and instrumented in real time. A broader look at this deluge includes sources that are not necessarily based on behavior, although finer granularity really does elucidate system behavior much more readily. One example is data generated by genetic sequencing and other life science research such as high-throughput screening. Another example is medical imaging, where image file sizes are now huge because of resolution improvements and the massive number of frames (rather than a single X-ray) per study. This crosses over into behavioral data of living systems when you think of those hi-res videos of a beating heart, or a digital “movie” of radiology-guided surgery. In another industry, I was told by someone working in oil and gas seismology that similar digital imaging technology is used on drill cores, and that each cubic centimeter produces hundreds of gigabytes of image data. A drill core is 45 meters long, and apparently the total amount of data for a single core can reach up to an exabyte. Talk about big data!

On a final note, I received an email from my brother-in-law who said he read the blog post yesterday and had never heard of big data before. He went on:

I didn't understand what you meant when you wrote “expect to read or hear about it 3 more times in the next few days”, but as I write this I guess you're referring to the predictive ability of big data. Anyway, having never heard the term I was just reading up on <another company>, when bingo came across “big data overview”.
