To ensure data security, high service availability, and good access performance, more and more large-scale distributed systems are deployed across multiple Internet data centers (IDCs). When the different parts of such a system cooperate, critical data such as configuration and control information is frequently exchanged across IDCs.
To address the latency and reliability challenges introduced by cross-IDC data distribution, we propose Tensor, a Redis-based low-latency data publish/subscribe framework. To deal with the data loss and duplication caused by network anomalies or cluster node failures, we design a transaction-oriented information transfer mechanism in Tensor that guarantees eventual consistency in cross-IDC data distribution. To improve data synchronization performance, we optimize Redis’s replication mechanism so that it better suits the unstable network links between IDCs in different regions. In addition, we design a system bottleneck prediction method based on intelligent log analysis and a system failover strategy based on service discovery to ensure the high availability of the data distribution service. An extensive set of tests on Tensor in a production environment demonstrates the low latency and high reliability of its data distribution service.