GIRAFFE: A Scalable Distributed Coordination Service for Large-scale Systems

Title: GIRAFFE: A Scalable Distributed Coordination Service for Large-scale Systems
Publication Type: Report
Year of Publication: 2014
Authors: Shi, X, Lin, H, Jin, H, Zhou, BBing, Yin, Z, Di, S, Wu, S
Other Numbers: ANL/MCS-P5157-0714
Abstract

The scale of cloud services keeps increasing, which poses significant challenges for system manageability and reliability. Designing coordination services for the cloud is a promising way to address these problems. However, existing coordination services (e.g., Chubby and ZooKeeper) perform well only in read-intensive scenarios and at small ensemble scales. To this end, we propose Giraffe, a scalable distributed coordination service. Our design makes three main contributions. (1) Giraffe organizes coordination servers using interior-node-disjoint trees for better scalability. (2) Giraffe employs a novel Paxos protocol for strong consistency and fault tolerance. (3) Giraffe supports hierarchical data organization and in-memory storage for high throughput and low latency. We evaluate Giraffe on a high-performance computing test bed. The experimental results show that Giraffe achieves much better write performance than ZooKeeper when the server ensemble is large: Giraffe is nearly 300% faster than ZooKeeper on update operations when the ensemble size is 50 servers. Experiments also show that Giraffe reacts to and recovers from node failures more quickly than ZooKeeper.
 

PDF: http://www.mcs.anl.gov/papers/P5157-0714.pdf