GIRAFFE: A Scalable Distributed Coordination Service for Large-scale Systems

Publication Type: Conference Paper
Year of Publication: 2014
Authors: Shi, X, Lin, H, Jin, H, Zhou, BBing, Yin, Z, Di, S, Wu, S
Conference Name: IEEE Cluster 2014
Date Published: 09/2014
Conference Location: Madrid, Spain
Other Numbers: ANL/MCS-P5157-0714
Abstract: The scale of cloud services keeps increasing over time, which introduces significant challenges in system manageability and reliability. Designing coordination services for the cloud is the right way to address these problems. However, existing coordination services (e.g., Chubby and ZooKeeper) perform well only in read-intensive scenarios and at small ensemble scales. To this end, we propose Giraffe, a scalable distributed coordination service. Our design makes three important contributions. (1) Giraffe organizes coordination servers using interior-node-disjoint trees for better scalability. (2) Giraffe employs a novel Paxos protocol for strong consistency and fault tolerance. (3) Giraffe supports hierarchical data organization and in-memory storage for high throughput and low latency. We evaluate Giraffe on a high-performance computing test bed. The experimental results show that Giraffe achieves much better write performance than ZooKeeper when the server ensemble is large: Giraffe is nearly 300% faster than ZooKeeper on update operations when the ensemble size is 50 servers. Experiments also show that Giraffe reacts to and recovers from node failures more quickly than ZooKeeper.
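The abstract names interior-node-disjoint trees but does not say how Giraffe builds them. As a rough illustration of the property only, the Python sketch below builds k spanning trees over one set of servers such that a server has children in at most one tree. The construction (striping servers into k backbone groups and attaching everyone else as leaves) is an assumption chosen for simplicity, not the paper's algorithm, and the names `build_interior_disjoint_trees`, `interior_nodes`, `k`, and `fanout` are hypothetical.

```python
def build_interior_disjoint_trees(servers, k, fanout=2):
    """Build k spanning trees over `servers` with pairwise-disjoint interior nodes.

    Illustrative construction only (an assumption, not Giraffe's method):
    tree i uses servers[i::k] as its backbone; all other servers are leaves.
    """
    if k < 1 or fanout < 1:
        raise ValueError("k and fanout must be positive")
    if len(set(servers)) != len(servers):
        raise ValueError("server names must be unique")
    if len(servers) < k:
        raise ValueError("need at least k servers")

    trees = []
    for i in range(k):
        backbone = servers[i::k]  # the only servers allowed to have children here
        backbone_set = set(backbone)
        leaves = [s for s in servers if s not in backbone_set]
        children = {s: [] for s in servers}

        # Link the backbone as a heap-shaped tree with `fanout` backbone children.
        for idx in range(1, len(backbone)):
            parent = backbone[(idx - 1) // fanout]
            children[parent].append(backbone[idx])

        # Attach every remaining server as a leaf, round-robin over the backbone.
        # Note: `fanout` bounds backbone links only, not these leaf attachments.
        for j, leaf in enumerate(leaves):
            children[backbone[j % len(backbone)]].append(leaf)

        trees.append({"root": backbone[0], "children": children})
    return trees


def interior_nodes(tree):
    """Return the servers that have at least one child in `tree`."""
    return {node for node, kids in tree["children"].items() if kids}


if __name__ == "__main__":
    names = [f"s{i}" for i in range(10)]
    for t in build_interior_disjoint_trees(names, k=3):
        print(t["root"], sorted(interior_nodes(t)))
```

Because each server forwards traffic in at most one tree, the forwarding load is spread across the ensemble instead of concentrating on a single root, which is the usual motivation for this kind of layout.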