MongoDB Cluster Exceptions


In the previous Mongo post, we concluded that Mongo was the better choice for our situation. Building on that, we deployed Mongo as a cluster to log users' edit and view times for fields.

Problem Statement :-

After going live, we hit an unusual exception from the Mongo cluster, which was alarming for us. The exception looked something like this :-

Failure to send “getnonce” command to database “db1” . Failure to send all requested bytes

So, getnonce is a command that client libraries send to obtain a one-time value (a nonce) from the server as part of authenticating against the mongo replica set. We realized at once that this was a connection exception thrown by our mongo-php client driver. We looked up the connection function in the driver source and studied how it works, since not much documentation was available online.

Basically, we passed a connection string to the mongo driver to connect to the replica set. In that connection string we listed the DNS names of all the servers of the replica set, as described in the driver documentation. This let us narrow the problem down to the connection string. After studying how the driver parses the connection string to connect to the servers, we deduced that passing all the DNS names in the connection string was somehow wrong.

Connection String – mongodb://authname:authpass@dns1,dns2,dns3/dbname
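As a rough illustration of how such a seed-list connection string is put together, here is a minimal Python sketch. It is not taken from the PHP driver; the function name `build_seed_list_uri` is hypothetical, and the standard `mongodb://` scheme is assumed.

```python
from urllib.parse import quote_plus


def build_seed_list_uri(user, password, hosts, database):
    """Build a mongodb:// URI that lists every replica set member as a seed.

    Hypothetical helper for illustration only.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # Percent-encode credentials so characters such as '@' or ':' in a
    # password cannot be mistaken for URI delimiters.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{','.join(hosts)}/{database}"


uri = build_seed_list_uri("authname", "authpass", ["dns1", "dns2", "dns3"], "dbname")
print(uri)  # mongodb://authname:authpass@dns1,dns2,dns3/dbname
```

Escaping the credentials matters only when they contain reserved characters; with the placeholder values above the output matches the string shown.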

Mongo replica sets do not follow the traditional fixed master-slave architecture. Ours is a three-member replica set with one primary and two secondaries. If the primary goes down, the remaining members vote to elect a new primary, which is why an odd number of servers is used.
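The reason for the odd number can be shown with a little arithmetic: an election needs a strict majority of voting members, so a fourth member adds no extra fault tolerance over three. A minimal sketch (the function names are illustrative):

```python
def majority(voting_members):
    """Votes needed to elect a primary: a strict majority of voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many members can be down while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (3, 4, 5):
    print(n, majority(n), fault_tolerance(n))
# 3 2 1
# 4 3 1
# 5 3 2
```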

Mongo Replica Set

So, for failover, we initially passed the DNS names of all servers in the connection string, so that if the primary failed and mongo elected another server as primary, the application would stop streaming data to the failed server and switch to the new primary automatically.

Learning from the driver connection function, we found that the driver was making a getnonce request (a request to authenticate) to all the servers of the replica set while streaming data, which was not what we wanted, and hence the exception.

A little insight into the Mongo driver :-

The Mongo driver selects which Mongo server to use and accepts a ReadPreference as a parameter. The ReadPreference class decides, each time, which server the driver treats as primary or secondary for the request. The selection is based on the read mode and the tag sets. The following are the read modes :-

RP_PRIMARY

RP_PRIMARY_PREFERRED

RP_SECONDARY

RP_SECONDARY_PREFERRED

RP_NEAREST

The tag sets contain the replica information and are created either from the server IPs or from a replica set name. The method that decides the read preference is static, so it keeps the replica information in global state. On any error or exception during connection, that global information is destroyed and re-created to match the new replica structure.

This helped us understand that the tag sets need to be created from the replica set name instead of the server IPs.
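To make the read modes listed above concrete, here is a simplified Python sketch of how a driver could pick a server for each mode. It is not the PHP driver's actual code: the `Member` type, the `select_server` function and the string values of the constants are assumptions, tag sets are left out, and "lowest ping wins" stands in for whatever tie-breaking a real driver uses.

```python
from dataclasses import dataclass

# Assumed values, mirroring the usual read preference mode names.
RP_PRIMARY = "primary"
RP_PRIMARY_PREFERRED = "primaryPreferred"
RP_SECONDARY = "secondary"
RP_SECONDARY_PREFERRED = "secondaryPreferred"
RP_NEAREST = "nearest"


@dataclass(frozen=True)
class Member:
    host: str
    is_primary: bool
    ping_ms: float


def select_server(members, mode):
    """Return the member to use for `mode`, or None if no member qualifies."""
    primaries = [m for m in members if m.is_primary]
    secondaries = [m for m in members if not m.is_primary]
    if mode == RP_PRIMARY:
        candidates = primaries
    elif mode == RP_PRIMARY_PREFERRED:
        candidates = primaries or secondaries  # fall back to secondaries
    elif mode == RP_SECONDARY:
        candidates = secondaries
    elif mode == RP_SECONDARY_PREFERRED:
        candidates = secondaries or primaries  # fall back to the primary
    elif mode == RP_NEAREST:
        candidates = list(members)  # role is ignored, only latency counts
    else:
        raise ValueError(f"unknown read mode: {mode}")
    if not candidates:
        return None
    # Simplification: pick the candidate with the lowest measured ping.
    return min(candidates, key=lambda m: m.ping_ms)


members = [
    Member("dns1", True, 5.0),
    Member("dns2", False, 2.0),
    Member("dns3", False, 9.0),
]
print(select_server(members, RP_PRIMARY).host)  # dns1
print(select_server(members, RP_NEAREST).host)  # dns2
```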

Solution :-

The solution was to pass the replica set name in the driver's connection string. Connecting through the replica set name automatically returns the primary server and sends an authentication request to that server. This was easy and needed just one extra parameter in the connection string, as below :-

Connection string – mongodb://authname:authpass@dns1/dbname?replicaSet=replicaName
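A quick way to check that a connection string really carries the replicaSet option is to split it into its parts. The sketch below is a hypothetical, simplified parser in Python, not the driver's own parsing code. It assumes the standard `mongodb://` scheme and a single seed host (`dns1` is a placeholder), since a Mongo URI needs at least one host.

```python
from urllib.parse import parse_qs


def parse_mongo_uri(uri):
    """Split a mongodb:// URI into hosts, database and options (simplified)."""
    scheme = "mongodb://"
    if not uri.startswith(scheme):
        raise ValueError("expected a mongodb:// URI")
    rest = uri[len(scheme):]
    # Peel off "?options" first, then "/database", then "credentials@".
    rest, _, query = rest.partition("?")
    authority, _, database = rest.partition("/")
    _credentials, at, hosts = authority.rpartition("@")
    return {
        "has_credentials": bool(at),
        "hosts": hosts.split(","),
        "database": database,
        # Keep the last value if an option is repeated.
        "options": {k: v[-1] for k, v in parse_qs(query).items()},
    }


parsed = parse_mongo_uri("mongodb://authname:authpass@dns1/dbname?replicaSet=replicaName")
print(parsed["hosts"], parsed["options"])
```

A check like `"replicaSet" in parsed["options"]` could then guard application start-up against a connection string that silently lost the option.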

This stopped the exceptions at our application servers and also gave us an added advantage.

While studying the driver's connection methods, we also found that when connecting to the mongo replica set in this fashion, the driver caches the IPs of the servers in the replica set and does not authenticate each time it makes a new connection, unlike what it was doing initially. This also gave us a lower connection time than before.

If the primary mongo node fails, the replica set will automatically promote a secondary to primary by voting, and since our connection string uses replicaSet to connect, the driver's connections are updated automatically. The application will then stream data to the new primary node, so we have failover in place.
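The effect of that caching can be pictured with a small Python sketch. This is only an illustration of the idea, not the driver's implementation: `ConnectionCache` and `fake_authenticate` are hypothetical names, and the "connection" is just a string.

```python
class ConnectionCache:
    """Reuse one authenticated connection per (host, user) pair."""

    def __init__(self, authenticate):
        self._authenticate = authenticate
        self._connections = {}

    def get(self, host, user):
        key = (host, user)
        # Authenticate only on the first request for this host and user.
        if key not in self._connections:
            self._connections[key] = self._authenticate(host, user)
        return self._connections[key]


auth_calls = []


def fake_authenticate(host, user):
    # Stand-in for the real handshake; records every authentication attempt.
    auth_calls.append(host)
    return f"connection to {host} as {user}"


cache = ConnectionCache(fake_authenticate)
for _ in range(3):
    cache.get("dns1", "authname")
print(len(auth_calls))  # 1
```

Three requests result in a single authentication, which is where the saving in connection time would come from.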

So, quite a win for us, and digging into the mongo driver's connection methods did help us after all.

Sometimes by losing a battle, you find a new way to win the war.
