GitHub Engineering Adopts New Architecture for MySQL High Availability

| by Hrishikesh Barua on Jul 08, 2018. Estimated reading time: 3 minutes |

GitHub uses MySQL as a backbone for many of its critical services like the API, authentication and the website itself. GitHub's engineering team replaced its previous DNS and Virtual IP (VIP)-based setup with one based on Orchestrator, Consul and the GitHub Load Balancer (GLB) in order to get around split-brain and DNS caching issues.

GitHub runs multiple MySQL clusters for different services and tasks, making it imperative to have them highly available (HA). GitHub's infrastructure is spread out across multiple datacenters, consisting of around 15 clusters, close to 150 production servers and 15 TB of MySQL tables. Each MySQL cluster has a single master, which responds to write requests, and multiple replicas, which serve read requests. The master node forms a single point of failure, and without it writes would completely fail. The HA requirements for this setup include auto-detection of failure, auto-promotion of a replica node to a master and auto-advertisement of the new master node to client applications.

GitHub's engineering team has employed several strategies for HA over the years, gradually moving towards uniformity across the organization. Since this is not restricted to MySQL, requirements for an HA solution also include cross-datacenter availability and split brain prevention. There are different possible approaches for MySQL master discovery. Previously, GitHub utilized DNS and VIP for discovery of the MySQL master node. The client applications would connect to a fixed hostname, which would be resolved by DNS to point to a VIP. A VIP allows traffic to be routed to different hosts to provide mobility without tying it down to a single host. The VIP would always be owned by the current master node. However, there were potential issues with the VIP acquire-and-release process during failover events, including split-brain situations. When this happens, two different hosts can have the same VIP and traffic can be routed to the wrong one. In addition, DNS changes have to occur to handle a master node that is in a different data center, and that can take time to propagate due to DNS caching at clients.
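The DNS caching problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `CachingResolver` class, the hostname `mysql-writer.example.internal`, the IP addresses and the 60-second TTL are all hypothetical and are not taken from GitHub's setup. It shows how a client that cached the old answer keeps resolving to the old master until the TTL expires, even though the authoritative record has already changed.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CachedRecord:
    address: str
    expires_at: float  # time (in seconds) at which the cached answer goes stale


class CachingResolver:
    """Toy client-side resolver that honors a TTL (hypothetical, for illustration)."""

    def __init__(self, authoritative: Dict[str, str], ttl: float) -> None:
        self.authoritative = authoritative  # stands in for the authoritative DNS zone
        self.ttl = ttl
        self.cache: Dict[str, CachedRecord] = {}

    def resolve(self, name: str, now: float) -> str:
        record = self.cache.get(name)
        if record is not None and now < record.expires_at:
            return record.address  # cached answer, possibly outdated
        address = self.authoritative[name]
        self.cache[name] = CachedRecord(address, now + self.ttl)
        return address


name = "mysql-writer.example.internal"   # hypothetical fixed hostname
authoritative = {name: "10.0.1.10"}      # hypothetical address of the old master
resolver = CachingResolver(authoritative, ttl=60.0)

before = resolver.resolve(name, now=0.0)   # client caches the old master
authoritative[name] = "10.0.2.20"          # failover: DNS now points at the new master
during = resolver.resolve(name, now=10.0)  # cache still valid: old master returned
after = resolver.resolve(name, now=61.0)   # TTL expired: new master returned

print(before, during, after)
```

In this model, writes sent between the failover and the TTL expiry go to the wrong host, which is the window the DNS-based approach could not close on its own.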

The latest setup at GitHub includes the Orchestrator toolkit, Consul for service discovery and the GitHub Load Balancer. In this architecture, when a client application looks up the master’s IP on DNS via its name, it is resolved via Anycast. The advantage of using Anycast is that while the name is resolved to the same IP address in every data center, the client traffic to that IP will be routed to the nearest master. The nearest master is the one that is co-located in the same data center. This routing is taken care of by GLB, which knows the current active MySQL master backends.

Figure: GitHub MySQL HA architecture

Orchestrator, also a GitHub engineering open source project, is responsible for master failure detection and the failover process. It utilizes collective knowledge drawn from all MySQL nodes, including the replicas, to arrive at an informed decision about the master’s state. When a write master fails, the Orchestrator leader node detects the failure and starts the failover process to choose a new MySQL master. The rest of the Orchestrator cluster nodes notice this change and update their local Consul daemon with the new master details. Consul, a service discovery tool from HashiCorp, keeps track of the master nodes by storing them as key-value pairs. Consul can run in a distributed mode across datacenters, but in GitHub's case each Consul cluster is independent at a datacenter level. On a failover event, the GLB is notified of the master change through Consul Template, which queries the Consul clusters and updates the GLB state; the GLB in turn routes traffic to the new master.
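The flow described above (detect the failure using the replicas' view, promote a replica, then record the new master in each datacenter's key-value store) can be sketched roughly as follows. This is a simplified illustration and not Orchestrator's actual code or API: the `Replica` fields, the rule "the observer cannot reach the master and no replica can either", the choice of the most up-to-date replica, and the key name `mysql/master/<cluster>` are all assumptions made for this example. Plain dictionaries stand in for the independent per-datacenter Consul KV stores.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Replica:
    host: str
    sees_master: bool         # is replication from the master still working?
    replicated_position: int  # how far this replica has applied the master's changes


def master_is_down(observer_reaches_master: bool, replicas: List[Replica]) -> bool:
    """Declare failure only when the observer AND every replica agree (assumed rule)."""
    if observer_reaches_master or not replicas:
        return False
    return all(not r.sees_master for r in replicas)


def pick_new_master(replicas: List[Replica]) -> Replica:
    """Pick the most up-to-date replica (a simplification of real promotion rules)."""
    return max(replicas, key=lambda r: r.replicated_position)


def fail_over(cluster: str,
              observer_reaches_master: bool,
              replicas: List[Replica],
              kv_stores: List[Dict[str, str]]) -> Optional[str]:
    """Promote a replica and record it in every datacenter's KV store."""
    if not master_is_down(observer_reaches_master, replicas):
        return None
    new_master = pick_new_master(replicas)
    for kv in kv_stores:
        kv[f"mysql/master/{cluster}"] = new_master.host  # hypothetical key layout
    return new_master.host


# Scenario 1: a network blip between the observer and the master, replicas still fine.
healthy = [Replica("db-2", True, 120), Replica("db-3", True, 118)]
no_failover = fail_over("cluster1", False, healthy, [{}, {}])

# Scenario 2: neither the observer nor any replica can reach the master.
broken = [Replica("db-2", False, 120), Replica("db-3", False, 118)]
dc_east: Dict[str, str] = {}
dc_west: Dict[str, str] = {}
promoted = fail_over("cluster1", False, broken, [dc_east, dc_west])

print(no_failover, promoted, dc_east, dc_west)
```

Requiring agreement from the replicas is what lets a detector avoid failing over on a false positive, such as a network problem that affects only the observing node. In the real setup, a templating step would then read the stored key and regenerate the load balancer's backend configuration.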

In the article, Shlomi Noach, senior infrastructure engineer at GitHub, mentions that although the new setup keeps the maximum outage time to "between 10 to 13 seconds" in most cases, some scenarios need more work, such as data center isolation leading to split-brain, or a Consul outage at the time of failover. GitHub’s new setup is a move away from traditional techniques based on networking to ones based on proxying and service discovery. It completely replaces the VIP-based setup, but there is debate around whether it would have been easier to adopt a different approach utilizing the Border Gateway Protocol (BGP).

Welcome to 2018 by Robert Van Dell II

Leader/follower is the modern preferred terminology over master/slave (e.g. "leader election" from failure detection).

Re: Welcome to 2018 by Daniel Bryant

Thanks for your comment Robert, and I understand your concern (I too personally prefer the modern terminology).

I've looked in the MySQL docs and they do still use the older terminology, and so I can understand why Hrish chose the original words for this news piece. However, the source article from GitHub used the terms "master-replica" and so I have updated this piece to reflect their choice.

Best wishes,

InfoQ News Manager
