Topology
This tutorial describes how network topology can be used to extend the capabilities of sFlow-RT.
Configuring Topology
The diagram shows a simple leaf and spine topology annotated with the information needed to combine sFlow telemetry with the topological data. The topology is defined by four links: link1, link2, link3, and link4. The links and their attributes are described in a JSON object:
{
  "links": {
    "link1": {
      "node1": "leaf1",
      "port1": "swp1",
      "node2": "spine1",
      "port2": "swp1"
    },
    "link2": {
      "node1": "leaf1",
      "port1": "swp2",
      "node2": "spine2",
      "port2": "swp1"
    },
    "link3": {
      "node1": "leaf2",
      "port1": "swp1",
      "node2": "spine1",
      "port2": "swp2"
    },
    "link4": {
      "node1": "leaf2",
      "port1": "swp2",
      "node2": "spine2",
      "port2": "swp2"
    }
  }
}
In order to link sFlow telemetry to the topology, sFlow-RT must be able to learn the mapping from node name and port name to sFlow agent address and ifIndex, e.g. node: leaf1 port: swp1 → agent: 192.168.1.1 ifIndex: 3
If the switches include the host_descr and port_name structures in their sFlow telemetry (see sFlow Structure Numbers), then the mapping will be learned automatically via sFlow. Switches using the Host sFlow agent include these structures.
Alternatively, SNMP can be used to discover this mapping by setting the following System Property:
snmp.ifname=yes
This setting instructs sFlow-RT to make an SNMP request to retrieve sysName and ifName for each agent and ifIndex discovered in the sFlow telemetry. By default, SNMP version 2c will be used with the public SNMP community string. Additional System Properties can be set to override these defaults: snmp.version, snmp.community, snmp.user, snmp.authprotocol, snmp.authpasswd, snmp.privprotocol, and snmp.privpasswd.
Finally, a nodes structure can be included in the topology to provide the mappings:
{
  "nodes": {
    "leaf1": {
      "agent": "192.168.1.1",
      "ports": {
        "swp1": {
          "ifindex": "3"
        },
        "swp2": {
          "ifindex": "4"
        }
      }
    },
    "leaf2": {
      "agent": "192.168.1.2",
      "ports": {
        "swp1": {
          "ifindex": "3"
        },
        "swp2": {
          "ifindex": "4"
        }
      }
    },
    "spine1": {
      "agent": "192.168.1.3",
      "ports": {
        "swp1": {
          "ifindex": "1"
        },
        "swp2": {
          "ifindex": "2"
        }
      }
    },
    "spine2": {
      "agent": "192.168.1.4",
      "ports": {
        "swp1": {
          "ifindex": "1"
        },
        "swp2": {
          "ifindex": "2"
        }
      }
    }
  },
  "links": {
    "link1": {
      "node1": "leaf1",
      "port1": "swp1",
      "node2": "spine1",
      "port2": "swp1"
    },
    "link2": {
      "node1": "leaf1",
      "port1": "swp2",
      "node2": "spine2",
      "port2": "swp1"
    },
    "link3": {
      "node1": "leaf2",
      "port1": "swp1",
      "node2": "spine1",
      "port2": "swp2"
    },
    "link4": {
      "node1": "leaf2",
      "port1": "swp2",
      "node2": "spine2",
      "port2": "swp2"
    }
  }
}
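When a nodes structure is supplied, it is easy for a link to reference a node or port that was never defined. The following is a minimal sketch of a consistency check that could be run before posting the topology. The helper name check_topology is hypothetical and is not part of sFlow-RT, and the sample topology deliberately contains one mistake.

```python
def check_topology(topology):
    """Return a list of problems found in a topology dict (empty if none).

    Hypothetical helper, not part of sFlow-RT: it only checks that each
    link end refers to a node and port listed under 'nodes'.
    """
    problems = []
    nodes = topology.get('nodes', {})
    for name, link in topology.get('links', {}).items():
        for node_key, port_key in (('node1', 'port1'), ('node2', 'port2')):
            node = link.get(node_key)
            port = link.get(port_key)
            if node not in nodes:
                problems.append('%s: unknown node %s' % (name, node))
            elif port not in nodes[node].get('ports', {}):
                problems.append('%s: unknown port %s on %s' % (name, port, node))
    return problems


topology = {
    'nodes': {
        'leaf1': {'agent': '192.168.1.1',
                  'ports': {'swp1': {'ifindex': '3'}, 'swp2': {'ifindex': '4'}}},
        'spine1': {'agent': '192.168.1.3',
                   'ports': {'swp1': {'ifindex': '1'}, 'swp2': {'ifindex': '2'}}},
    },
    'links': {
        'link1': {'node1': 'leaf1', 'port1': 'swp1',
                  'node2': 'spine1', 'port2': 'swp1'},
        # Deliberate mistake: spine2 is not defined under 'nodes'
        'link2': {'node1': 'leaf1', 'port1': 'swp2',
                  'node2': 'spine2', 'port2': 'swp1'},
    },
}

for problem in check_topology(topology):
    print(problem)
```

The check only applies when the mappings are supplied in the topology itself. If the mappings are learned via sFlow or SNMP, the nodes structure can be omitted and this check is not meaningful.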
Additional information can be attached to node, link, and port objects in the form of tags. For example, a type tag can be attached to nodes to identify their role in the topology:
{
  "nodes": {
    "leaf1": {
      "tags": {
        "type": "leaf",
        ...
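Tags can also be added programmatically while the topology object is being built. The sketch below assumes, purely for illustration, that node names begin with leaf or spine as in the example topology. The helper name tag_node_types is hypothetical.

```python
def tag_node_types(topology):
    """Add a 'type' tag to each node, derived from its name prefix.

    Assumption for illustration: node names start with 'leaf' or 'spine'.
    Nodes with any other name are left untagged.
    """
    for name, node in topology.get('nodes', {}).items():
        for role in ('leaf', 'spine'):
            if name.startswith(role):
                # Create the tags object if needed, then set the type tag
                node.setdefault('tags', {})['type'] = role
    return topology


topology = {'nodes': {'leaf1': {}, 'spine1': {}, 'fw1': {}}, 'links': {}}
tag_node_types(topology)
print(topology['nodes'])
```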
Once the topology object has been constructed, it can be posted to sFlow-RT using the REST API:
curl -X PUT -H "Content-Type: application/json" -d @topology.json http://localhost:8008/topology/json
Alternatively, the topology object can be installed using the internal JavaScript API:
setTopology(topology);
See Writing Applications for more information on using sFlow-RT's REST and JavaScript APIs.
The sflow-rt/topology application persists topology across sFlow-RT restarts and provides a dashboard to verify that all the nodes and links specified in the topology are being monitored.
Manually constructing the topology object is generally not feasible or recommended since it will likely contain errors. Ideally the network configuration and topology will be available in a centralized repository that can be queried to generate the information required by sFlow-RT. Alternatively, Link Layer Discovery Protocol (LLDP) data retrieved from network devices can be used to construct the topology.
Graphviz is a popular open source graph visualization tool that can be used to display network topologies defined using the DOT Language. The NVIDIA Cumulus Linux Prescriptive Topology Manager enforces cabling consistency using a DOT file that describes the network topology. The Python script below, dot.py, converts simple DOT topologies to JSON and posts the result to sFlow-RT.
#!/usr/bin/env python3
# Convert simple DOT topologies to JSON and post the result to sFlow-RT
import sys, re, fileinput, requests

url = sys.argv[1]
topology = {'links':{}}

def dequote(s):
    # Strip a matching pair of single or double quotes
    if (s[0] == s[-1]) and s.startswith(("'", '"')):
        return s[1:-1]
    return s

l = 1
for line in fileinput.input(sys.argv[2:]):
    # Match edges of the form node:port -- node:port (or ->)
    link = re.search(r'([\S]+):(\S+)\s*(--|->)\s*(\S+):([^\s;,]+)',line)
    if link:
        s1 = dequote(link.group(1))
        p1 = dequote(link.group(2))
        if not p1.startswith('swp'):
            continue
        s2 = dequote(link.group(4))
        p2 = dequote(link.group(5))
        if not p2.startswith('swp'):
            continue
        linkname = 'L%d' % (l)
        l += 1
        topology['links'][linkname] = {'node1':s1,'port1':p1,'node2':s2,'port2':p2}

requests.put(url,json=topology)
For example, the following command uploads topology.dot.
./dot.py http://localhost:8008/topology/json topology.dot
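To see what the script extracts from a DOT file, the following sketch applies the same regular expression and swp filter to a small in-memory example. The DOT content is made up for illustration, and parse_dot_links is a hypothetical helper that returns the parsed link ends instead of posting them.

```python
import re

# Same pattern as dot.py, compiled once
LINK = re.compile(r'(\S+):(\S+)\s*(--|->)\s*(\S+):([^\s;,]+)')

# Hypothetical DOT input, made up for illustration
dot = '''
graph G {
  "leaf1":"swp1" -- "spine1":"swp1";
  "leaf1":"swp2" -- "spine2":"swp1";
  "leaf1":"eth0" -- "mgmt":"swp9";
}
'''

def dequote(s):
    # Strip a matching pair of single or double quotes
    if (s[0] == s[-1]) and s.startswith(("'", '"')):
        return s[1:-1]
    return s

def parse_dot_links(text):
    """Return (node1, port1, node2, port2) tuples for swp-to-swp edges."""
    result = []
    for line in text.splitlines():
        m = LINK.search(line)
        if not m:
            continue
        s1, p1, s2, p2 = (dequote(m.group(i)) for i in (1, 2, 4, 5))
        # Keep only links where both ends are swp ports, as dot.py does
        if p1.startswith('swp') and p2.startswith('swp'):
            result.append((s1, p1, s2, p2))
    return result

for link in parse_dot_links(dot):
    print(link)
```

The third edge is dropped because its first port, eth0, does not start with swp.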
The Python script below converts LLDP information gathered by NVIDIA NetQ.
#!/usr/bin/env python3
# Post output of 'netq show lldp json' to sFlow-RT topology
# netq show lldp json | ./lldp-rt.py http://sflow-rt:8008/topology/json
import sys, json, requests

url = 'http://127.0.0.1:8008/topology/json'
if len(sys.argv) == 2:
    url = sys.argv[1]

links = {}
lldp = json.load(sys.stdin)['lldp']
for entry in lldp:
    if not entry['interface'].startswith('swp'):
        continue
    if not entry['peerInterface'].startswith('swp'):
        continue
    if '%s %s' % (entry['hostname'],entry['interface']) < '%s %s' % (
        entry['peerHostname'],entry['peerInterface']
    ):
        lname = '%s-%s' % (entry['hostname'],entry['interface'])
        if links.get(lname) is None:
            links[lname] = {
                'node1': entry['hostname'],
                'port1': entry['interface'],
                'node2': entry['peerHostname'],
                'port2': entry['peerInterface']
            }
    else:
        lname = '%s-%s' % (entry['peerHostname'],entry['peerInterface'])
        if links.get(lname) is None:
            links[lname] = {
                'node1': entry['peerHostname'],
                'port1': entry['peerInterface'],
                'node2': entry['hostname'],
                'port2': entry['interface']
            }

topology = {'links':links}
requests.put(url,json=topology)
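LLDP reports each link twice, once from each end. The script above avoids duplicate links by always naming a link after its lexically smaller end. The following sketch isolates that step. The helper name canonical_link and the sample entries are hypothetical, and the entries use only the fields the script above reads.

```python
def canonical_link(node_a, port_a, node_b, port_b):
    """Return (key, link) with the lexically smaller end as node1/port1."""
    end_a = '%s %s' % (node_a, port_a)
    end_b = '%s %s' % (node_b, port_b)
    if end_a < end_b:
        first, second = (node_a, port_a), (node_b, port_b)
    else:
        first, second = (node_b, port_b), (node_a, port_a)
    key = '%s-%s' % first
    link = {'node1': first[0], 'port1': first[1],
            'node2': second[0], 'port2': second[1]}
    return key, link


# Hypothetical LLDP entries: each switch reports the other end of the link
entries = [
    {'hostname': 'leaf1', 'interface': 'swp1',
     'peerHostname': 'spine1', 'peerInterface': 'swp1'},
    {'hostname': 'spine1', 'interface': 'swp1',
     'peerHostname': 'leaf1', 'peerInterface': 'swp1'},
]

links = {}
for e in entries:
    key, link = canonical_link(e['hostname'], e['interface'],
                               e['peerHostname'], e['peerInterface'])
    # The second report of the same link maps to the same key and is ignored
    links.setdefault(key, link)

print(links)
```

Both entries produce the same key, so only one link is recorded regardless of which switch reports first.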
The Python script below provides another example, making eAPI requests to Arista Networks switches to query information from each switch in the network.
#!/usr/bin/env python3
import pyeapi
import requests
import sys

# eAPI info
EAPI_TRANSPORT = 'https'
EAPI_USER = 'admin'
EAPI_PASSWD = 'arista'
SWITCHES = [
    '10.0.0.96',
    '10.0.0.97'
]

# sFlow-RT REST API
RT = 'http://127.0.0.1:8008/topology/json'

nodes = {}
links = {}
topology = {'nodes': nodes, 'links': links}
linknames = {}
portGroups = {}

def getInfo(ip):
    commands = [
        'show lldp neighbors',
        'show hostname',
        'show snmp mib ifmib ifindex',
        'show sflow',
        'show interfaces'
    ]
    node = pyeapi.connect(host=ip, transport=EAPI_TRANSPORT,
        username=EAPI_USER, password=EAPI_PASSWD, return_node=True)
    response = node.enable(commands)
    lldp = response[0]['result']['lldpNeighbors']
    hostname = response[1]['result']['fqdn']
    ifIndexes = response[2]['result']['ifIndex']
    agentAddr = response[3]['result']['ipv4Sources'][0]['ipv4Address']
    ifaces = response[4]['result']['interfaces']
    dev = {}
    nodes[hostname] = dev
    dev['agent'] = agentAddr
    ports = {}
    dev['ports'] = ports
    for p in ifIndexes:
        ports[p] = {'ifindex': str(ifIndexes[p])}
    for n in lldp:
        if '%s %s' % (hostname, n['port']) < '%s %s' % (
            n['neighborDevice'], n['neighborPort']
        ):
            lname = '%s %s' % (hostname, n['port'])
            if linknames.get(lname) is None:
                linknames[lname] = {
                    'node1': hostname,
                    'port1': n['port'],
                    'node2': n['neighborDevice'],
                    'port2': n['neighborPort'],
                }
        else:
            lname = '%s %s' % (n['neighborDevice'], n['neighborPort'])
            if linknames.get(lname) is None:
                linknames[lname] = {
                    'node1': n['neighborDevice'],
                    'port1': n['neighborPort'],
                    'node2': hostname,
                    'port2': n['port']
                }
    for iface in ifaces:
        members = ifaces[iface].get('memberInterfaces')
        if members is None:
            continue
        for member in members:
            portGroups['%s %s' % (hostname,member)] = {
                'node': hostname,
                'port':iface
            }

def getInternalLinks():
    lagnames = {}
    linkno = 1
    lagno = 1
    for n in linknames:
        entry = linknames[n]
        if nodes.get(entry['node1']) is not None and nodes.get(entry['node2']) is not None:
            links['L%s' % linkno] = entry
            linkno = linkno + 1
            portGroup1 = portGroups.get('%s %s' % (entry['node1'],entry['port1']))
            portGroup2 = portGroups.get('%s %s' % (entry['node2'],entry['port2']))
            if portGroup1 is not None and portGroup2 is not None:
                if '%s %s' % (portGroup1['node'],portGroup1['port']) < '%s %s' % (
                    portGroup2['node'],portGroup2['port']
                ):
                    lname = '%s %s' % (portGroup1['node'], portGroup1['port'])
                    if lagnames.get(lname) is None:
                        lentry = {
                            'node1':portGroup1['node'],
                            'port1':portGroup1['port'],
                            'node2':portGroup2['node'],
                            'port2':portGroup2['port']
                        }
                        lagnames[lname] = lentry
                        links['G%s' % lagno] = lentry
                        lagno = lagno + 1
                else:
                    lname = '%s %s' % (portGroup2['node'], portGroup2['port'])
                    if lagnames.get(lname) is None:
                        lentry = {
                            'node1':portGroup2['node'],
                            'port1':portGroup2['port'],
                            'node2':portGroup1['node'],
                            'port2':portGroup1['port']
                        }
                        lagnames[lname] = lentry
                        links['G%s' % lagno] = lentry
                        lagno = lagno + 1

for switch in SWITCHES:
    try:
        getInfo(switch)
    except Exception as e:
        print('Exception while connecting to %s: %r' % (switch, e))
        sys.exit(2)

getInternalLinks()

try:
    r = requests.put(RT,json=topology)
    if r.status_code != 204:
        print('Exception connecting to %s: status %s' % (RT, r.status_code))
        sys.exit(3)
except Exception as e:
    print('Exception connecting to %s: %r' % (RT, e))
    sys.exit(4)
Finally, the following Python script queries NetBox to obtain topology.
#!/usr/bin/env python3
import pynetbox
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning

requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

URL = 'https://demo.netbox.dev'
TOKEN = 'b7ebaa99b7d2da549b151c4d4821ba73154dbd00'
SITE='dm-camden'
RT='http://127.0.0.1:8008/topology/json'

nb = pynetbox.api(URL,token=TOKEN)

nodes = {}
links = {}
linknames = {}

devices = nb.dcim.devices.filter(site=SITE)
for device in devices:
    nodes[device.name] = {}
    interfaces = nb.dcim.interfaces.filter(device=device,enabled=True)
    for interface in interfaces:
        connected_endpoints = interface.connected_endpoints
        if connected_endpoints is not None and len(connected_endpoints) == 1:
            connected_interface = interface.connected_endpoints[0]
            if hasattr(connected_interface,'device'):
                connected_device = connected_interface.device
                if '%s %s' % (device.name,interface.name) < '%s %s' % (
                    connected_device.name,connected_interface.name
                ):
                    lname = '%s %s' % (device.name,interface.name)
                    if linknames.get(lname) is None:
                        linknames[lname] = {
                            'node1':device.name,
                            'port1':interface.name,
                            'node2':connected_device.name,
                            'port2':connected_interface.name
                        }
                else:
                    lname = '%s %s' % (connected_device.name,connected_interface.name)
                    if linknames.get(lname) is None:
                        linknames[lname] = {
                            'node1':connected_device.name,
                            'port1':connected_interface.name,
                            'node2':device.name,
                            'port2':interface.name
                        }

linkno = 1
for lname in linknames:
    entry = linknames[lname]
    if nodes.get(entry['node1']) is not None and nodes.get(entry['node2']) is not None:
        links['L%s' % linkno] = entry
        linkno = linkno + 1

topology = {'links':links}
requests.put(RT,json=topology)
Please share additional topology discovery scripts with the sFlow-RT community.
Using Topology
The following sFlow-RT applications are topology aware:
- Browse Metrics, select Agent: TOPOLOGY, EDGE, or CORE in the user interface.
- Browse Flows, set System Properties browse-flows.agents=TOPOLOGY and browse-flows.aggMode=edge
- Prometheus Exporter, set agent: TOPOLOGY and aggMode: edge when querying flows
- Fabric Metrics
- IXP Metrics
A number of analytics features are enabled once a topology has been installed.
Note: the examples in this section assume familiarity with Writing Applications.
The metric, table, dump, prometheus/metrics and activeflows queries understand the token TOPOLOGY in the agents field to mean the agents that are in the topology. For example, the following query identifies the port with the highest ingress utilization:
curl http://localhost:8008/metric/TOPOLOGY/max:ifinutilization/json
The tokens CORE and EDGE in the agents field further restrict the query to interfaces that are, respectively, connected or not connected by links in the topology. For example, the following query identifies the edge port with the largest number of ingress broadcasts:
curl http://localhost:8008/metric/EDGE/max:ifinbroadcastpkts/json
The following query returns metrics for connected ports in Prometheus scrape format:
curl http://localhost:8008/prometheus/metrics/CORE/ALL/txt
In addition, the aggMode=edge option in an activeflows query uses the topology to combine flow measurements and provide an accurate measure of the total amount of traffic entering the topology, i.e. summed over the access ports.
curl "http://localhost:8008/activeflows/TOPOLOGY/tcp/json?aggMode=edge"
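The same queries can be issued from a script. The following is a minimal sketch using only the Python standard library. It assumes sFlow-RT is reachable at http://localhost:8008, as in the curl examples, and the helper names metric_url, activeflows_url, and get_json are hypothetical. Only the URL construction runs as written; get_json needs a running sFlow-RT instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = 'http://localhost:8008'  # assumed sFlow-RT address, as in the curl examples

def metric_url(agents, metric, base=BASE):
    """URL for a metric query, e.g. agents='EDGE', metric='max:ifinbroadcastpkts'."""
    return '%s/metric/%s/%s/json' % (base, agents, metric)

def activeflows_url(agents, flow, base=BASE, **params):
    """URL for an activeflows query; params become the query string."""
    url = '%s/activeflows/%s/%s/json' % (base, agents, flow)
    if params:
        url += '?' + urlencode(params)
    return url

def get_json(url, timeout=5):
    """Fetch a URL and decode the JSON response (needs a running sFlow-RT)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)

print(metric_url('TOPOLOGY', 'max:ifinutilization'))
print(activeflows_url('TOPOLOGY', 'tcp', aggMode='edge'))
# Example call, commented out because it requires a live server:
# print(get_json(metric_url('EDGE', 'max:ifinbroadcastpkts')))
```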
The node:, link: and ifname: key functions are enabled when a topology is installed, see Define Flows. For example, create a flow definition with the following flow keys:
node:inputifindex,ifname:inputifindex
The definition instructs sFlow-RT to look up nodes and ports in the topology and generate records with node and port names:
spine1,swp1
leaf2,swp1
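As a sketch of how such a definition might be installed from a script, the code below builds the JSON body and sends it with an HTTP PUT. It assumes the flow definition endpoint /flow/{name}/json covered in Define Flows, a value of bytes, and a server at http://localhost:8008. The flow name nodeport and the helper names are hypothetical. Only the body construction runs as written; put_flow needs a running sFlow-RT instance.

```python
import json
from urllib.request import Request, urlopen

def flow_definition(keys, value='bytes'):
    """Build a flow definition body; keys is a list of flow key names."""
    return {'keys': ','.join(keys), 'value': value}

def put_flow(name, definition, base='http://localhost:8008'):
    """PUT the definition to /flow/{name}/json (assumed endpoint, needs a live server)."""
    request = Request(
        '%s/flow/%s/json' % (base, name),
        data=json.dumps(definition).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='PUT')
    with urlopen(request, timeout=5) as response:
        return response.status

definition = flow_definition(['node:inputifindex', 'ifname:inputifindex'])
print(json.dumps(definition))
# Example call, commented out because it requires a live server.
# 'nodeport' is a hypothetical flow name chosen for this sketch:
# put_flow('nodeport', definition)
```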
Note: Flow keys always imply a filter since all keys in the definition need to be present in order to generate a record. In this case, flows will only be generated for nodes, links and ports that are defined in the topology.
A number of JavaScript functions provide access to the topology:
setTopology()
topologyVersion()
topologyLinkNames()
topologyNodeNames()
topologyLink()
topologyNodePortNames()
topologyTags()
topologyNodeTag()
topologyPortTag()
topologyLinkTag()
topologyNodeLinks()
topologyInterfaceToLink()
topologyPortToInterface()
topologyLinkToInterfaces()
topologyNodesForAgent()
topologyAgentForNode()
ifName()
topologyLinkMetric()
topologyLocatesHostMacs()
topologyLocateHostMac()
topologyLocateHostIP()
topologyLocateHostIP6()
topologyLocateHostAgent()
topologyLocateHostUUID()
topologyDiameter()
topologyShortestPath()
topologyShortestPaths()
topologyMinimumSpanningTree()
The topologyLocate* functions query an internal table that sFlow-RT maintains, learning the MAC addresses from network traffic and the access port where they enter the topology.
Finally, Multi-tenant sFlow describes how to use topology to de-duplicate and split telemetry streams by network tenant. The /tenant/json and /tenant/{name}/json REST API and setTenant() JavaScript functions support this feature.