Really Massive Complex Event Processing
- newyorkscot
- Feb 20, 2008
- 1 min read
Scientific American ran a feature this month on the Large Hadron Collider (LHC) being built by CERN to conduct the largest physics experiments ever. Aside from its sheer physical scale, one of the remarkable aspects of the project is the massive volume and frequency of the data generated, making it probably the most impressive combination of complex event processing and distributed grid computing ever:
The LHC will accelerate 3000 bunches of 100 billion protons to the highest energies ever generated by a machine, colliding them head-on 30 million times a second, with each collision spewing out thousands of particles at nearly the speed of light.
There will be 600 million particle collisions every second, each one called an "event".
The millions of channels of data streaming away from the detector produce about a megabyte of data from each event: a petabyte, or a billion megabytes, of it every two seconds.
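The arithmetic behind that headline rate is easy to check from the figures quoted above (600 million events per second at roughly one megabyte each); here is a quick back-of-the-envelope sketch:

```python
# Back-of-the-envelope check of the data rate quoted above.
EVENTS_PER_SECOND = 600_000_000   # 600 million collision "events" per second
BYTES_PER_EVENT = 1_000_000       # roughly one megabyte of detector data per event

bytes_per_second = EVENTS_PER_SECOND * BYTES_PER_EVENT   # 6e14 bytes/s, about 600 TB/s
seconds_per_petabyte = 1e15 / bytes_per_second           # ~1.7 s, i.e. a petabyte every ~2 seconds

print(f"{bytes_per_second / 1e12:.0f} TB/s -> one petabyte every {seconds_per_petabyte:.1f} s")
```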
This massive amount of streaming data needs to be aggregated, filtered and then processed by a tiered grid network. Starting with a few thousand computers/blades at CERN (Tier 0), the data is routed (via dedicated optical cables) to 12 major institutions around the world (Tier 1), and then finally down to a number of smaller computing centers at universities and research institutes (Tier 2). Interestingly, the raw data coming off the LHC is saved onto magnetic tape (allegedly the most cost-effective and secure format). I wonder how many nanoseconds they took to consider what CEP vendor they wanted to use for this project?!
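To make the tiered filter-and-route idea concrete, here is a minimal sketch of how such a pipeline might look. The event fields, energy threshold, and routing scheme are invented purely for illustration; they are not CERN's actual trigger or grid software.

```python
import random
from dataclasses import dataclass

# Toy model of a tiered filter-and-route pipeline in the spirit of the LHC grid
# described above. All fields and thresholds are hypothetical.

@dataclass
class Event:
    event_id: int
    energy_gev: float        # hypothetical summary measurement per event
    payload_bytes: int       # raw detector readout size (~1 MB in the article)

def tier0_filter(events, energy_threshold=100.0):
    """Tier 0 (CERN): discard uninteresting events close to the detector."""
    return (e for e in events if e.energy_gev > energy_threshold)

def tier1_route(events, n_sites=12):
    """Tier 1: spread surviving events across the 12 major regional centers."""
    for e in events:
        site = e.event_id % n_sites      # toy round-robin routing
        yield site, e

def tier2_process(site, event):
    """Tier 2: universities and research institutes run the detailed analysis."""
    return f"site {site:02d} analyzed event {event.event_id} ({event.energy_gev:.1f} GeV)"

if __name__ == "__main__":
    raw = (Event(i, random.expovariate(1 / 50), 1_000_000) for i in range(1_000))
    for site, ev in tier1_route(tier0_filter(raw)):
        print(tier2_process(site, ev))
```

The key design point the sketch tries to capture is that each tier throws away or summarizes data before passing it on, so only a small fraction of the raw stream ever reaches the smaller Tier 2 centers.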