HDFS Erasure Coding

Erasure coding provides redundancy by breaking the data into smaller pieces, adding extra coded pieces, and storing all of them in different locations. The key is that you can recover the data from any sufficiently large subset of those pieces, so some pieces can be lost without losing data.
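To make the idea concrete, here is a minimal sketch in Python using the simplest possible erasure code: a single XOR parity piece. It is an illustration only, not the code used by HDFS or by any particular library. The function names (`encode`, `recover`) and the sample data are made up for this example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_pieces: List[bytes]) -> List[bytes]:
    """Return the data pieces followed by one XOR parity piece."""
    parity = reduce(xor_bytes, data_pieces)
    return list(data_pieces) + [parity]


def recover(pieces: List[Optional[bytes]]) -> List[Optional[bytes]]:
    """Rebuild the single missing piece (marked None) from the survivors."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost piece")
    repaired = list(pieces)
    if missing:
        survivors = [p for p in pieces if p is not None]
        # XOR of all surviving pieces equals the lost piece.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


# Three equal-length data pieces plus one parity piece (hypothetical data).
stored = encode([b"HDFS", b"data", b"demo"])

# Simulate losing the second piece, then rebuild it from the other three.
damaged = [stored[0], None, stored[2], stored[3]]
assert recover(damaged) == stored
```

With three data pieces and one parity piece, any three of the four stored pieces are enough to get the data back. Production codes such as Reed-Solomon follow the same principle but add several parity pieces, so they can survive more than one simultaneous loss.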

Erasure coding technology

Erasure coding consumes less storage than replication and lets you configure the robustness of your system (how many device failures it can recover from) at minimal storage cost. It is useful when dealing with large quantities of data and for any application or system that needs to tolerate failures, e.g. disk array systems, object stores and archival storage.

The erasure coding framework in the Hadoop Distributed File System (HDFS) is a key new feature and one of the driving features behind the Hadoop 3.0.0 release. It can typically reduce storage costs by 50% compared to 3x replication while maintaining the same or better durability. Additionally, erasure coding adds minimal overhead and can improve read and write performance under some conditions.

Erasure codes can cut storage in half, but processing speed is crucial for delivering a fast service to users. Chocolate Cloud provides its customers with the HDFS-EC Plugin, which is faster than Intel’s ISA-L in most configurations and is the only library to provide hardware acceleration on both Intel and ARM processors. Our goal is to exploit the power of erasure coding to deliver the ultimate performance and reliability to our customers.

Want to know more about Chocolate Cloud’s HDFS Plugin? You are always welcome to contact us at info@chocolate-cloud.cc, and we will answer all your questions.