Cheap Big Data

(Bits, The New York Times's technology blog -- "The Business of Technology" -- is good to follow.)
"Big data," according to the redoubtable Wikipedia, is
. . . a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions."
. . . Data sets grow in size in part because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks.
This article in The New York Times "Bits" technology column/blog lays out the prospect of big data capability moving beyond the domain of well-resourced enterprises (companies, universities).
Some new products impress for what they say about the future. Win or lose, they show where the world is going with near certainty.
In this case, the product is Big Data computing at near consumer prices.
http://www.cleanbreak.ca/2013/02/15/big-data-is-the-key-to-unlocking-big-gains-in-energy-productivity/
Violin Memory is an eight-year-old company that makes large-scale data storage systems for computer centers. Its boxes fetch information uncommonly fast. Now, the company is going downmarket, with data storage for individual computer servers. These data cards create powerful machines that can do sophisticated work, at less than one-tenth the current costs of storage.
If the product works, ordinary servers costing a few thousand dollars might be deployed for sophisticated data analysis, genetic research, logistics management, or other activities that are currently done on multimillion-dollar racks of computers. It could make possible much cheaper real-time computing projects at companies and schools, bringing in more customers and experimentation.
Article continues at link.
This is just the hardware -- also needed are accessible data sets themselves and the software tools to extract, manipulate, load, manage, and analyze them. . .