Tuesday, April 29, 2014

MySQL Fabric: Tales and Tails from Percona Live

Going to Percona Live and presenting MySQL Fabric gave me the opportunity to meet a lot of people and get plenty of good feedback. I talked to developers from many different companies, and what I heard will affect the priorities we set, so to everyone I spoke to: a big "Thank you!" for the interesting discussions we had. Your feedback is very valuable. It was also interesting to read the comments on MySQL Fabric on the MySQL Performance Blog. That article discusses the current version of MySQL Fabric distributed with MySQL Utilities and briefly covers some of its features. I think it is worth giving some context to the points raised there, both to elaborate on them and show what they mean in practice, and to give some background on how we were thinking about them.

The Art of Framing the Fabric

It was a deliberate decision to make MySQL Fabric extensible, so it is not surprising that it has the feel of a framework. By making MySQL Fabric extensible, we allow the community and users to explore ideas or add user-specific support.

In the MySQL Team at Oracle we are strong believers in the open-source model and are working hard to keep it that way. There are many reasons why we believe in this model, but one of them is that we do not believe one size fits all. For every user there are always minor variations or tweaks required by that user's specific needs, so the ability to adapt the solution to those needs is very important. Without MySQL being open source, this would not be possible. As you can see from WebScaleSQL, this is not just a theoretical exercise; this is how companies really use MySQL.

From the start, we therefore focused on building a framework and implemented sharding and high availability as plugins; granted, they are very important plugins, but they are plugins nevertheless. This took a little more effort, and a little more thinking, but by doing it this way we can ensure that the system is truly extensible for everybody.

Hey! I've got a server in my farm!

As noted, many of the issues related to high availability and sharding require server-side support to be really solid. This is something we recognized quite early; the alternative would be to place the logic in the connectors or in the Fabric node. We concluded that the right place to solve this is in the server, not in the connector layer, since that would put a lot of complexity in the wrong place. Even if it were possible to handle everything in the connector, there is still a chance that something goes wrong if the constraints are not enforced in the server. This could be because of bugs, because of mistakes in the administration of the server, or for any number of other reasons, so to build a solid solution, constraints on the data should be enforced by the servers and not in the connectors or in a proxy.

One example given is that there is no way to check that a row ends up in the right shard, which is very true. A generic solution would be to add CHECK constraints on the server but, unfortunately, that is a very big change to the server code base. Adding triggers to the tables on the server is probably a good short-term solution, but it requires managing and deploying extra code on all servers, which in turn is an additional burden on managing them, something we would like to avoid (the more "special" things you have to do with the servers, the higher the risk of something going wrong). A sketch of such a trigger is shown below.
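To make the trigger idea concrete, here is a minimal sketch, assuming a hypothetical table employees sharded by RANGE on emp_no where the local shard owns the half-open range [10000, 20000); the table, column, bounds, host names, and credentials are all invented for illustration.

```python
import mysql.connector

# Hypothetical shard boundaries for this particular server; in a real
# deployment they would be derived from the Fabric sharding metadata.
LOWER_BOUND = 10000
UPPER_BOUND = 20000

# One statement: the semicolons inside BEGIN ... END are parsed by the
# server, so no DELIMITER trickery is needed outside the mysql client.
TRIGGER_DDL = """
CREATE TRIGGER check_shard_key BEFORE INSERT ON employees
FOR EACH ROW
BEGIN
  IF NEW.emp_no < {lower} OR NEW.emp_no >= {upper} THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'row does not belong to this shard';
  END IF;
END""".format(lower=LOWER_BOUND, upper=UPPER_BOUND)

cnx = mysql.connector.connect(host="shard1.example.com", user="admin",
                              password="secret", database="employees")
cur = cnx.cursor()
cur.execute("DROP TRIGGER IF EXISTS check_shard_key")
cur.execute(TRIGGER_DDL)  # SIGNAL requires MySQL 5.5 or later
cnx.close()
```

Note that a trigger like this has to be generated and deployed per shard, and kept in sync with the sharding metadata whenever shards are split or moved, which is exactly the management burden described above.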

On the proximity of things...

One of the central components of MySQL Fabric is the high-availability group (or just group, when that is clear from the context), which was discussed in an earlier post. The central idea is that the servers in a group all manage the same piece of data, and MySQL Fabric is designed to handle and coordinate multiple groups into a federation of databases. Being able to manage multiple groups is critical for creating a sharded system. One thing that is quite often raised is that it should be possible for a server to belong to multiple groups, but I think this comes from a misunderstanding of what a group represents. It is not a "replica set", which gives information about the topology, that is, how replication is set up; nor does it say anything about how the group is deployed. It is perfectly OK to have members of the group in different data centers (for geographical redundancy), and it is perfectly OK to have replication between groups to support, for example, functional partitioning. If a server belonged to two different groups, it would mean that it manages two different sets of data at the same time. The sketch below shows how a connector addresses a single group.
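To make the role of a group concrete, here is a minimal sketch using the Fabric support in Connector/Python; the group name, host names, and credentials are invented, and the exact property names may vary between connector versions.

```python
from mysql.connector import fabric

# Connect through the Fabric node (32274 was the default XML-RPC port);
# host names and credentials here are invented.
cnx = fabric.connect(
    fabric={"host": "fabric.example.com", "port": 32274,
            "username": "admin", "password": "secret"},
    user="app", password="secret", database="employees")

# Route every statement on this connection to one high-availability
# group: one group, one set of data.
cnx.set_property(group="my_group", mode=fabric.MODE_READWRITE)

cur = cnx.cursor()
cur.execute("SELECT COUNT(*) FROM employees")
print(cur.fetchone()[0])
cnx.close()
```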

The fact that group members can be located in different data centers raises another important aspect, something that was often mentioned at Percona Live: managing the proximity of components in the system. There is some support for this in Hadoop, where you have rack-awareness, but we need a slightly more flexible model. Imagine that you have a group set up with two servers in different data centers, and you further have scale-out slaves attached locally. You have connectors deployed in both data centers, but when reading data you do not want to go to the other data center to execute the transaction; it should always be done locally. So, is it sufficient to have just a simple grouping of the components? No, because you can have multiple levels of proximity, for example, data centers, continents, and even rooms or racks within a data center. You can also have different facets that you want to model, such as latency, throughput, or other properties that are interesting for particular uses. For that reason, whatever proximity model we deploy needs to support a hierarchy and also a more flexible cost model where you can capture different aspects. Given that this problem was raised several times at Percona Live and also by others, it is likely to be something we need to prioritize. A toy sketch of such a hierarchical cost model is shown below.
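MySQL Fabric does not have such a model yet, so the following is purely a sketch under assumed names and weights: locations form a hierarchy, and the cost between two locations is decided by the outermost level at which they differ.

```python
# A toy proximity model: all level names and weights are invented.
LEVELS = ["rack", "room", "datacenter", "continent"]  # inner to outer
LEVEL_COST = {"rack": 1, "room": 2, "datacenter": 10, "continent": 100}

def distance(a, b):
    """Cost between two locations, each a dict mapping level to name."""
    cost = 0
    for level in LEVELS:
        if a[level] != b[level]:
            cost = LEVEL_COST[level]  # outer mismatches overwrite inner
    return cost

connector = {"rack": "r1", "room": "a",
             "datacenter": "dc-east", "continent": "eu"}
slaves = [
    ("slave-1", {"rack": "r2", "room": "b",
                 "datacenter": "dc-east", "continent": "eu"}),
    ("slave-2", {"rack": "r1", "room": "a",
                 "datacenter": "dc-west", "continent": "na"}),
]

# Send reads to the closest scale-out slave.
name, _ = min(slaves, key=lambda s: distance(connector, s[1]))
print("route reads to", name)  # slave-1: same data center
```

A real model would add facets such as latency or throughput as separate cost functions, which is why a single flat grouping of components is not enough.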

The crux of the problem

As most of you have already noted, there is a single Fabric node running that everybody talks to. Isn't this a single point of failure? It is indeed, but there is more to the story than just that. A single point of failure is a problem because if it goes down, so does the system... but in this case, the system does not really go down; it keeps running most of the time.

The Fabric node does a lot of things: it keeps track of the status of all the components of the farm, executes procedures to handle fail-over, and delivers information about the farm on request. However, it is the connectors that route transactions to the correct place, and to avoid having to ask the Fabric node for information each time, the connectors maintain caches. This means that in the event of a Fabric node failure, connectors might not even notice that it is gone unless they need to re-fill their caches, and if you restart the Fabric node, it will be able to serve the information again. The sketch below illustrates this caching behavior from the connector's side.
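The following toy sketch shows the idea; the XML-RPC method name dump.servers and the TTL value are assumptions for illustration, not the exact protocol.

```python
import time
import xmlrpc.client  # the Fabric node exposes an XML-RPC interface

CACHE_TTL = 60  # seconds; the value is an arbitrary assumption
_cache = {"servers": None, "fetched": 0.0}

def servers(fabric_url="http://fabric.example.com:32274"):
    """Return routing information, preferring the local cache."""
    now = time.time()
    if _cache["servers"] is not None and now - _cache["fetched"] < CACHE_TTL:
        return _cache["servers"]  # fresh enough: no round-trip at all
    try:
        proxy = xmlrpc.client.ServerProxy(fabric_url)
        _cache["servers"] = proxy.dump.servers()  # assumed method name
        _cache["fetched"] = now
    except OSError:
        if _cache["servers"] is None:
            raise  # cache never filled: the Fabric node is needed now
        # Fabric node is down: keep serving stale data until it is back.
    return _cache["servers"]
```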

Another thing that stops when the Fabric node goes down is fail-over handling: no new fail-overs can be performed, and ongoing procedures are stopped in their tracks, which could potentially leave the farm in an unknown state. However, the state of execution of any ongoing procedure is stored in the backing store, so when you bring the Fabric node up again, it will restore the procedures from the backing store and continue executing them. This feature alone does not help against a complete loss of the machine where the Fabric node and the backing store are placed, but MySQL Fabric does not rely on specific storage-engine features (any transactional engine will do), so by using MySQL Cluster as the storage engine it is possible to ensure safe-keeping of the state. The sketch below illustrates the checkpoint-and-resume idea.
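The checkpoint-and-resume idea can be sketched as follows; the table layout, step names, and helper are invented, but the principle is the one described above: commit a checkpoint to the transactional backing store after every completed step, and on restart skip the steps that are already recorded.

```python
import mysql.connector

# Hypothetical procedure: an ordered list of idempotent steps.
STEPS = ["demote_old_primary", "promote_new_primary", "redirect_slaves"]

def execute_step(step):
    print("executing", step)  # stand-in for the real work

def run_procedure(cnx, proc_id):
    cur = cnx.cursor()
    # Find the last step that was committed before a possible crash.
    cur.execute("SELECT last_step FROM checkpoints WHERE proc_id = %s",
                (proc_id,))
    row = cur.fetchone()
    done = STEPS.index(row[0]) + 1 if row else 0

    for step in STEPS[done:]:
        execute_step(step)
        # Record progress in the transactional backing store so that a
        # restarted Fabric node knows where to pick up.
        cur.execute("REPLACE INTO checkpoints (proc_id, last_step) "
                    "VALUES (%s, %s)", (proc_id, step))
        cnx.commit()
```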

There are still good reasons to support multi-node Fabric instances:

  • If one Fabric node goes down, it should automatically fail over to another one that continues execution. This would prevent any downtime in handling procedures (a connector-side sketch of trying several Fabric nodes follows this list).
  • Detecting and bringing up a secondary Fabric node can become very complicated in the case of network partitions, since it requires handling split-brain scenarios reliably. It is therefore better to have this built into MySQL Fabric, since that makes deployment and management significantly simpler.
  • Management of a farm does not put any significant pressure on the database back-end, but having a single Fabric node can be a bottleneck. In this case, it would be good to be able to execute multiple independent procedures on different Fabric nodes and coordinate the updates.
  • If a lot of connectors are required to fill their caches at the same time, we have a risk of a thundering herd. Having a set of Fabric nodes for read scale-out can then be beneficial.
  • If a group is deployed in two very remote data centers, it is desirable to have a local Fabric node for read-only purposes instead of having to go to the other data center.
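As an illustration of the first and last points from the connector's perspective, here is a sketch that simply tries a list of Fabric nodes in order, nearest first; the addresses are hypothetical, and real coordination between Fabric nodes would be considerably more involved.

```python
import xmlrpc.client

# Hypothetical list of Fabric nodes, nearest first.
FABRIC_NODES = [
    "http://fabric1.dc-east.example.com:32274",
    "http://fabric2.dc-west.example.com:32274",
]

def fetch_routing_info():
    """Try each Fabric node in turn until one answers."""
    last_error = None
    for url in FABRIC_NODES:
        try:
            proxy = xmlrpc.client.ServerProxy(url)
            return proxy.dump.servers()  # assumed method name, as above
        except OSError as exc:
            last_error = exc  # this node is down: try the next one
    raise RuntimeError("no Fabric node reachable") from last_error
```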

More Fabric-aware Connectors

Currently we support Fabric-aware connectors for Python, Java, and PHP, but one point that popped up quite often (both at Percona Live and elsewhere) is the lack of a Fabric-aware C connector. The C connector is the basis for implementing both the Perl Database Interface MySQL driver DBD::mysql and the Ruby connector, but it is also desirable in itself for applications written in C or C++. All I can say at this point is that we are aware of the situation and know that it is something desired and important. As an illustration of what a Fabric-aware connector provides, the sketch below routes a statement by sharding key.
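This is a minimal sketch using Connector/Python; it differs from the earlier group example in that the connector computes the target group from the sharding key. The table, key value, and credentials are invented, and parameter names may vary between connector versions.

```python
from mysql.connector import fabric

cnx = fabric.connect(
    fabric={"host": "fabric.example.com", "port": 32274,
            "username": "admin", "password": "secret"},
    user="app", password="secret", autocommit=True)

# Name the sharded table and the key value: the connector looks up the
# shard mapping (cached from the Fabric node) and routes the statement
# to the primary of the group that owns this key.
cnx.set_property(tables=["employees.employees"], key=4711,
                 mode=fabric.MODE_READWRITE)

cur = cnx.cursor()
cur.execute("SELECT first_name, last_name FROM employees.employees "
            "WHERE emp_no = 4711")
print(cur.fetchone())
cnx.close()
```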
