Isilon and SmartConnect

Isilon Overview

Isilon is Dell EMC’s scale-out storage platform.  Running the OneFS operating system, it can serve as a large-scale file server, with capacity ranging from 16 TB to as much as 50 PB.  It is also easily scalable: more storage can be added to your cluster simply by adding a new node.  The data is rebalanced to utilize the new node, and the extra storage is added to your total available capacity, all without any downtime.

In many cases, all nodes connect to the IP network.  Each node has its own IP address, assigned from a pool of IP addresses.  Data on the Isilon can be accessed through any node in the cluster.  For example, suppose you are connected to node 4; the data you request may reside on nodes 1, 2, and 3 as well.  You could always access data through one individual node, but this is not ideal for a couple of reasons.  If you only connect to a specific node, how would you access data during periods of maintenance, node failures, or other downtime?  If only one node is used to access Isilon data, how would it handle the load from multiple users?  The answer to both issues is the SmartConnect switch.

Isilon Nodes and SmartConnect

The SmartConnect switch is a small, Isilon Cluster-only DNS server sitting on the lowest node of the Isilon Cluster.  Instead of connecting to a domain name and IP address that belong to a specific node, users connect to an Isilon Cluster name.  The SmartConnect switch will then look at the cluster configuration, see which nodes are online, review the Isilon load balancing policy, and return a node IP address from the cluster IP pool for the user to connect to.  Depending on the policy, the very next connection to the same name may return a different node IP, preventing a single node from being overburdened with I/O requests.  The Isilon can scale up to 144 nodes, and because name resolution is handled by the SmartConnect switch, only an NS record and an A record are required on the DNS server.  If nodes are added to the cluster, SmartConnect recognizes them immediately and can start serving their IP addresses to users.  If a node is down or under maintenance for any reason, the SmartConnect switch can take that node’s IP out of the rotation, so users will not resolve to a node that is out of service.

SmartConnect DNS Name Resolution

How the Isilon SmartConnect completes the name resolution is an interesting process.  To demonstrate, let’s assume the following configuration:

Six Isilon nodes in production, one of which is offline, and two more nodes to be added later:

Node Hostname    IP Address    Status
Isi-Node-01      10.10.1.10    Online
Isi-Node-02      10.10.1.11    Online
Isi-Node-03      10.10.1.12    Online
Isi-Node-04      10.10.1.13    Offline
Isi-Node-05      10.10.1.14    Online
Isi-Node-06      10.10.1.15    Online
Isi-Node-07      10.10.1.16    Not Yet Implemented
Isi-Node-08      10.10.1.17    Not Yet Implemented

There is a general purpose DNS server:

ProdDNS, 10.10.10.10

There is a user workstation on the network as well, requesting data:

User1, 10.10.20.50

There is an Isilon Cluster IP pool whose addresses are NOT registered in DNS:

Isi-Cluster, 10.10.1.10 - 10.10.1.20.

Finally, there is the SmartConnect Switch:

Isi-SMC, 10.10.1.9

User1, who knows the Isilon cluster only as Isi-Cluster, sends a request to connect to \\Isi-Cluster.  User1’s workstation sends a DNS lookup to its local DNS server, ProdDNS.

The DNS server will not have an A record for Isi-Cluster, but rather an NS record, or DNS delegation zone.  This means that while it does not know Isi-Cluster’s IP address itself, it has an NS record telling it to forward the request to another DNS server, which may have the IP for Isi-Cluster.

The DNS server will then look in the NS record and see that Isi-SMC is a host that may know who Isi-Cluster is.  ProdDNS resolves Isi-SMC to 10.10.1.9 from its own A record and sends a DNS query of its own to that address, as Isi-SMC is the server that can answer for Isi-Cluster.  The SmartConnect Switch will be sitting on the lowest node of the Isilon cluster.  Per the configuration mentioned above, Node 4 is offline for maintenance.
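The delegation step can be sketched in a few lines of Python.  This is only a toy model of the lookup logic, not how a real DNS server is implemented: the PROD_DNS_RECORDS dictionary and the find_delegated_server function are hypothetical names made up for this illustration, and the values come from the example configuration above.

```python
# Hypothetical model of ProdDNS: only two records concern the Isilon cluster.
PROD_DNS_RECORDS = {
    ("isi-cluster", "NS"): "isi-smc",  # delegation: ask Isi-SMC about Isi-Cluster
    ("isi-smc", "A"): "10.10.1.9",     # where the SmartConnect switch can be reached
}


def find_delegated_server(name, records):
    """Return (server_name, server_ip) that a query for `name` is forwarded to.

    Returns None when there is no delegation for the name, or when the
    server could answer the query directly from an A record.
    """
    key = name.lower()  # DNS names are case-insensitive
    if (key, "A") in records:
        return None  # direct answer available, no delegation involved
    server = records.get((key, "NS"))
    if server is None:
        return None  # nothing known about this name
    server_ip = records.get((server, "A"))
    if server_ip is None:
        return None  # delegation exists but the server cannot be located
    return server, server_ip


if __name__ == "__main__":
    print(find_delegated_server("Isi-Cluster", PROD_DNS_RECORDS))
```

In this model, a lookup for Isi-Cluster never yields a node address from ProdDNS itself; it only yields the name and address of the server to ask next.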

The SmartConnect Switch will get the request and resolve Isi-Cluster to an IP address.  The SmartConnect will check its internal database for a SmartConnect zone that matches the name “Isi-Cluster”, and if it finds one, it will then check the Isilon node database for an available node and load balancing policy.  With Node 4 down, it is marked as such, and therefore will not be available to service requests for Isi-Cluster.

Based on the load balancing policy, the SmartConnect switch will grab one of the online nodes in the zone and return that IP address to the DNS server, which in turn will return it to the user who made the initial request.  In this example the policy is Round Robin, so if the user makes another request immediately after the first, the SmartConnect switch will move on to the next IP in the list.  It will skip over 10.10.1.13, as that node is still down.
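The Round Robin behavior described here can be illustrated with a small Python sketch.  It is a simplified model under stated assumptions, not the actual OneFS implementation: the RoundRobinZone class and its resolve method are hypothetical names invented for this example, and the node list mirrors the six production nodes from the table above.

```python
# The six production nodes from the example, with Node 4 offline.
NODES = [
    ("Isi-Node-01", "10.10.1.10", "Online"),
    ("Isi-Node-02", "10.10.1.11", "Online"),
    ("Isi-Node-03", "10.10.1.12", "Online"),
    ("Isi-Node-04", "10.10.1.13", "Offline"),
    ("Isi-Node-05", "10.10.1.14", "Online"),
    ("Isi-Node-06", "10.10.1.15", "Online"),
]


class RoundRobinZone:
    """Toy model of a SmartConnect zone using a Round Robin policy."""

    def __init__(self, name, nodes):
        self.name = name
        self.nodes = list(nodes)
        self._next = 0  # index of the node to try on the next request

    def resolve(self, query_name):
        """Return the next online node IP, or None if nothing can answer."""
        if query_name.lower() != self.name.lower():
            return None  # no zone matches the requested name
        # Try each node at most once, starting from the rotation pointer.
        for _ in range(len(self.nodes)):
            _hostname, ip, status = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if status == "Online":
                return ip
        return None  # every node in the zone is offline


if __name__ == "__main__":
    zone = RoundRobinZone("Isi-Cluster", NODES)
    for _ in range(6):
        print(zone.resolve("Isi-Cluster"))
```

Six consecutive lookups walk through 10.10.1.10, .11, .12, .14, and .15, then wrap around to .10 again; 10.10.1.13 is never handed out while Node 4 is marked offline.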

This process will continue for each request to resolve the Isi-Cluster name to an IP address.  Let’s now say that the issue with Node 4 is resolved, and it is back online.  The SmartConnect Switch will then be able to give out the IP address of Node 4 for requests to resolve Isi-Cluster.

When more capacity is added to our Isilon cluster with Node 7 and Node 8, both nodes are added to the Isi-Cluster zone and to the node list in the SmartConnect Switch.

Note that while the SmartConnect zone membership may change, no changes are required on the DNS server as you add, remove, or repair nodes in the Isilon cluster.  The server still only needs a single NS (delegation) record and a single A record to operate correctly.
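To make this point concrete, here is a small Python sketch that repairs Node 4 and adds Nodes 7 and 8 while leaving the DNS server’s records alone.  The DNS_RECORDS dictionary, the node_status dictionary, and the eligible_ips function are all hypothetical names used only for illustration; they model the example configuration, not a real OneFS or DNS API.

```python
# The only two records ProdDNS holds for the cluster (hypothetical model).
DNS_RECORDS = {
    ("isi-cluster", "NS"): "isi-smc",
    ("isi-smc", "A"): "10.10.1.9",
}

# Node state as tracked by the SmartConnect switch, keyed by node IP.
node_status = {
    "10.10.1.10": "Online",
    "10.10.1.11": "Online",
    "10.10.1.12": "Online",
    "10.10.1.13": "Offline",
    "10.10.1.14": "Online",
    "10.10.1.15": "Online",
}


def eligible_ips(status_by_ip):
    """Return the IPs that may be handed out, in rotation order."""
    return [ip for ip, status in status_by_ip.items() if status == "Online"]


dns_before = dict(DNS_RECORDS)       # snapshot for comparison
before = eligible_ips(node_status)   # five nodes while Node 4 is down

node_status["10.10.1.13"] = "Online"  # Node 4 is repaired
node_status["10.10.1.16"] = "Online"  # Node 7 joins the zone
node_status["10.10.1.17"] = "Online"  # Node 8 joins the zone

after = eligible_ips(node_status)     # eight nodes now in rotation

if __name__ == "__main__":
    print(len(before), len(after))
    print(DNS_RECORDS == dns_before)
```

All of the changes happen in the node list kept by the SmartConnect switch; the dictionary standing in for the DNS server is never touched.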

Conclusion

This example shows how the Isilon SmartConnect allows admins to quickly and easily scale out their Isilon storage cluster and still provide a flexible, single point of entry for their users, with only two entries in their DNS server that can service up to 144 nodes in a single Isilon Cluster.  Nodes can be seamlessly added, removed, or brought down for maintenance, and the SmartConnect Switch will still be able to provide access to the user community through the remaining nodes.