3 Ways Big Data Thrives with Ceph Storage

The rain in Atlanta has finally stopped, and I decided to catch up on some industry reading. Sitting on the patio, I was surprised by an Accenture/GE study showing that 88% of executive respondents were working on big data projects. What surprised me wasn’t the high percentage, but rather the breadth of businesses looking to leverage big data. This study included a wide variety of industrial enterprises – including healthcare.

I shook my head. Can you imagine the headaches these monolithic initiatives are giving IT? According to Gartner and others, more than half of big data projects fail. After all, how can legacy hardware, including legacy scale-up storage, adapt to massive amounts of new data?

It brought home for me many of the advantages of scale-out storage. Ceph-based, next-generation storage can help ease IT's headache. Here are three ways scale-out storage enables big data projects when deployed properly:

  1. Beginning with the future in mind: One of the starkest features of big data is that, well, it’s big. Traditional storage systems can be designed to store large quantities of data, but proper planning is not always simple. Legacy storage systems require IT to guess how much data it will need to store over the next one to three years. The more data expected, the higher the cost of the storage system. Making a bad guess is like putting your money on the wrong horse: either you never get a return or it could cost you your job.

    Ceph-based, scale-out storage has some major advantages here. You can add capacity whenever it is needed. You only pay for what you need, and as your needs grow, you can grow accordingly. With Ceph you simply add a new server-based node, with no more complex migrations or scheduled downtime to upgrade storage capacity. Scaling this way makes much more sense when trying to predict and stay ahead of big data projects.
  2. Keeping it simple: Big data comes in many shapes and sizes. Unstructured big data can be drawn from virtually any source in your environment. It is not uncommon to find file, block and object-based storage all uniting in a big data project, nor is it unusual to find data residing in several different and fractured storage systems. I hope you have your Alka-Seltzer and aspirin cocktail ready when trying to bring it all together.

    Ceph is a unified storage system for file, block and object-based storage. Storing all your data in one cost-effective architecture reduces complexity. Additionally, as new data sources are added to the big data project, they can all be reached through the various interfaces on one Ceph-based system.
  3. Staying responsive: We all know big data has turned storage on its head. Data collected by machine sensors piles up at alarming rates. Queries crunching those numbers run constantly. Putting it mildly, responsiveness requirements and expectations are sharply on the rise.

    This is possibly the most important advantage of Ceph. While scaling to exabytes is fantastic, not losing performance along the way is even better. Traditional storage relies on controllers to access the data. When these controllers max out on RAM, CPU or network speeds, all the storage space in the world won’t help with responsiveness. Maxed is maxed.

    With a Ceph-based system, each new node adds CPU, RAM and network bandwidth along with storage space. Because responsiveness grows with the system, performance does not have to suffer as storage expands.
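To make the controller bottleneck in the third point concrete, here is a rough back-of-the-envelope sketch in Python. It is a toy model, not a Ceph benchmark: the class names, the per-node and per-shelf figures, and the 0.9 efficiency factor are all hypothetical assumptions chosen for illustration, and a real cluster will not scale perfectly linearly.

```python
from dataclasses import dataclass


@dataclass
class ScaleUpArray:
    """Toy model of controller-based storage (all figures hypothetical)."""
    shelf_tb: float               # capacity added per disk shelf
    shelf_mbps: float             # raw throughput each shelf could deliver
    controller_limit_mbps: float  # ceiling set by controller CPU/RAM/network

    def capacity_tb(self, shelves: int) -> float:
        return shelves * self.shelf_tb

    def throughput_mbps(self, shelves: int) -> float:
        # Extra shelves add space, but throughput stops at the controller cap.
        return min(shelves * self.shelf_mbps, self.controller_limit_mbps)


@dataclass
class ScaleOutCluster:
    """Toy model of node-based storage (all figures hypothetical)."""
    node_tb: float            # capacity added per node
    node_mbps: float          # throughput each node brings with it
    efficiency: float = 0.9   # assumed loss to replication and network overhead

    def capacity_tb(self, nodes: int) -> float:
        return nodes * self.node_tb

    def throughput_mbps(self, nodes: int) -> float:
        # Every node contributes its own CPU, RAM and network along with disks.
        return nodes * self.node_mbps * self.efficiency


def compare(units: list[int]) -> list[tuple[int, float, float, float]]:
    """Return (units, capacity_tb, scale_up_mbps, scale_out_mbps) per size."""
    array = ScaleUpArray(shelf_tb=100, shelf_mbps=500, controller_limit_mbps=2000)
    cluster = ScaleOutCluster(node_tb=100, node_mbps=500)
    return [
        (n, cluster.capacity_tb(n), array.throughput_mbps(n), cluster.throughput_mbps(n))
        for n in units
    ]


if __name__ == "__main__":
    for n, tb, up, out in compare([2, 4, 8, 16]):
        print(f"{n:>2} units  {tb:>6.0f} TB  scale-up {up:>6.0f} MB/s  scale-out {out:>6.0f} MB/s")
```

Under these assumed numbers, the scale-up array stops gaining throughput once the controller limit is reached, while the scale-out cluster keeps adding throughput with every node. Swap in figures from your own environment before drawing any conclusions.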

Back on the patio, I wonder how many more big data projects will have to fail before enterprises start to realize how important the underlying technology is. Thankfully, finding a storage solution that meets the requirements of the project can be cost-effective and painless to implement.

Find out more at www.concurrent.com “Powering Brighter Ideas”