Selecting a hyperconverged solution? 4 data efficiency tips to keep in mind

By Jesse St. Laurent, HPE Chief Technologist, Hyperconverged & SimpliVity

A friend of mine recently needed some pretty significant repairs done on his house. Over lunch one afternoon, he relayed to me the agonizing task of choosing contractors to get the work done. How do you know you’re choosing someone who is trustworthy, has expertise in the area, or is giving you the most for your money? It wasn’t a job I envied.

It did make me think, though, about how businesses choose a vendor when transforming their infrastructure. The task can seem just as daunting … where do you even start? When making the switch to a hyperconverged infrastructure, it’s important to approach vendor selection carefully. In the critical area of data efficiency, many vendors claim they offer deduplication, the process of eliminating redundant information in the data center. But not all deduplication technology is created equal. If implemented correctly, global inline deduplication, compression, and optimization are extremely beneficial: they improve performance, save capacity, and lower costs. But if a vendor treats deduplication as an afterthought, you may not get the data efficiency your organization really needs.

When choosing a vendor, think carefully about the following four points to determine if you’ll be getting what you need in the area of data efficiency:

  1. Consider the scope of the data pool that is being deduplicated. If a hyperconverged infrastructure vendor limits deduplication to the disk group, cluster, or data center level without extending it over the WAN (as many do), you lose the benefits of deduplication when moving large data sets between remote sites. This matters most because of the high cost of WAN bandwidth and the expense of special-purpose WAN optimization appliances.
  2. Understand the primary and backup data process. When a vendor sells both hyperconverged infrastructure and backup solutions, it will sometimes include backup software as a discounted add-on. While this may sound appealing from a cost perspective, the deduplication technology is often isolated at the primary storage tier, resulting in the additional overhead of repeatedly deduplicating, rehydrating, and dehydrating data as it moves between modules in the infrastructure. Plus, backup performance will not benefit from the data efficiencies of deduplication.
  3. Identify whether deduplication occurs at the beginning or end of the backup process (or both). Some vendors offer only post-process deduplication, which helps with storage capacity but does nothing to remove I/O as a performance bottleneck.
  4. Recognize how the vendor addresses the impact of deduplication functions on running applications. Without some form of acceleration, inline deduplication will often slow applications down when turned on. Be wary of vendors who suggest turning off deduplication while applications are running, or who make it available only in an all-flash mode. These suggestions typically mean that storage efficiency comes at the cost of performance, a trade-off you shouldn’t have to make.
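
To make the first and third points concrete, here is a minimal Python sketch of inline, block-level deduplication. It is an illustration under stated assumptions, not how any particular vendor implements it: the `DedupStore` class, the 4 KB `BLOCK_SIZE`, and the synthetic VM images are all hypothetical. Each block is fingerprinted before it is stored, so a duplicate block never triggers a physical write, and the comparison at the end shows how widening the deduplication scope from per-site to global reduces what has to be stored.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, in bytes


class DedupStore:
    """Toy content-addressed block store (illustrative only)."""

    def __init__(self):
        self.blocks = {}          # fingerprint -> block bytes
        self.physical_writes = 0  # blocks actually written to "disk"

    def write(self, data):
        """Inline dedup: fingerprint each block before storing it."""
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in self.blocks:
                # Only previously unseen blocks cost a physical write.
                self.blocks[fingerprint] = block
                self.physical_writes += 1
            recipe.append(fingerprint)
        return recipe  # ordered fingerprints needed to rebuild the data

    def read(self, recipe):
        """Rehydrate the original data from its block fingerprints."""
        return b"".join(self.blocks[fp] for fp in recipe)


def make_vm_image(seed):
    """Synthetic VM image: 8 shared 'OS' blocks plus 1 unique block."""
    shared_os = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
    unique = bytes([200 + seed]) * BLOCK_SIZE
    return shared_os + unique


def compare_scopes():
    """Return (per-site writes, global writes) for two similar VMs."""
    vm1, vm2 = make_vm_image(1), make_vm_image(2)

    # Scope limited to each site: the shared blocks are stored twice.
    site_a, site_b = DedupStore(), DedupStore()
    site_a.write(vm1)
    site_b.write(vm2)
    per_site = site_a.physical_writes + site_b.physical_writes

    # Global scope: the shared blocks are stored once.
    global_store = DedupStore()
    global_store.write(vm1)
    global_store.write(vm2)
    return per_site, global_store.physical_writes


if __name__ == "__main__":
    per_site, global_scope = compare_scopes()
    print("per-site scope writes:", per_site)
    print("global scope writes:", global_scope)
```

In this toy setup, the per-site stores write 18 blocks between them while the global store writes only 10, because the eight shared blocks are stored once. A post-process design would instead write every block first and remove duplicates later, which recovers capacity but not the I/O.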

As these points show, many vendors cut corners when it comes to deduplication, putting organizations at risk of lackluster data efficiency. That raises the question: is top-notch data efficiency really important, and why?

My response is: absolutely! In addition to cost savings, focusing on data efficiency improves performance while saving on storage. For example, the average efficiency ratio for an HPE SimpliVity hyperconverged solution is 40:1 when you include primary and backup data, with more than a third of customers averaging 100:1 or more (TechValidate). At 100:1, 100 TB of data is reduced to just one TB stored on disk. And by eliminating redundant I/O, more than half of customers see an increase in application performance of 50% or more (TechValidate). Additionally, when all data is globally deduplicated inline across all storage tiers (including backup), incredible things can happen. For example, you can perform a full backup or restore of a local one TB VM in 60 seconds or less. This is only made possible through a focused approach to data efficiency.
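
For readers who want to translate an efficiency ratio into capacity, the arithmetic is simple enough to sketch in a few lines of Python. The function names here are hypothetical helpers, and the only inputs are the 40:1 and 100:1 ratios quoted above; your own ratio will depend on your data.

```python
def stored_tb(logical_tb, ratio):
    """Physical capacity needed for logical_tb at an efficiency ratio of ratio:1."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return logical_tb / ratio


def capacity_saved_pct(ratio):
    """Percentage of logical capacity that never has to be stored."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return round((1 - 1 / ratio) * 100, 2)


if __name__ == "__main__":
    for ratio in (40, 100):
        print(f"{ratio}:1 ->", stored_tb(100, ratio), "TB stored,",
              capacity_saved_pct(ratio), "% saved")
```

By this arithmetic, 100 TB of logical data occupies 2.5 TB on disk at 40:1 and 1 TB at 100:1.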

Just like when looking for contractors to help repair a house, knowledge is key when making a decision as important as transforming your data center. If hyperconvergence is the next step for your business, be careful when talking with vendors about their deduplication capabilities. Always make sure to understand the impact their technology will have on your organization and ensure that they can provide data efficiency guarantees. When possible, connect with current customers to see whether the benefits were delivered in a real-world setting, especially if the customer’s environment is similar to your own.

Finally, make sure to ask the following important questions:

  • What is the scope of deduplication and does it extend to remote sites?
  • Does it include both primary and backup data?
  • Is the technology inline or post-process?

The answers to these questions will matter to you once you’ve actually deployed the solution, so it’s better to ask at the outset – before it’s too late.

About Jesse St. Laurent

Jesse St. Laurent is the Chief Technologist for HPE Hyperconverged and SimpliVity. He uses his 20 years of experience to engage channel partners, evaluate emerging technologies, and shape innovative technology solutions involving data center modernization. For more information on how hyperconverged infrastructure can elevate your hybrid IT environment, download the free HPE SimpliVity edition of the Hyperconverged Infrastructure for Dummies ebook.