Monitoring Synology with Telegraf inside Docker

docker run -d --restart=always --hostname=wolverine --name=wolverine \
-v /volume1/docker/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
-v /:/hostfs:ro \
-v /etc:/hostfs/etc:ro \
-v /proc:/hostfs/proc:ro \
-v /sys:/hostfs/sys:ro \
-v /var:/hostfs/var:ro \
-v /run:/hostfs/run:ro \
-e HOST_ETC=/hostfs/etc \
-e HOST_PROC=/hostfs/proc \
-e HOST_SYS=/hostfs/sys \
-e HOST_VAR=/hostfs/var \
-e HOST_RUN=/hostfs/run \
-e HOST_MOUNT_PREFIX=/hostfs \
--net=host \
telegraf

Monitoring the host's network interfaces from inside the container does not seem to work by default, since the container otherwise only sees its own network namespace. The workaround is to run the container with --net=host.

Using Telegraf to add metadata to measurements

At my job (an ISP in NL) we use Telegraf and InfluxDB to store and monitor data from our network elements. We are also automating a lot of our processes, for obvious reasons. The same InfluxDB feeds the monitoring data shown on our customer-facing portals. To associate measurements with a customer's subscription and instance, we store service IDs and similar identifiers in the port description. However, finding a service ID inside an ifAlias requires a regex, and as you can imagine that is an “expensive” type of query in InfluxDB. We decided to add the service IDs as separate tags, so we can build efficient queries knowing only the service ID. This is how we do it.

For each port we collect the obvious tables:

  • IF-MIB::ifTable
  • IF-MIB::ifXTable
  • EtherLike-MIB::dot3StatsTable

We store things like ifAlias as a tag in InfluxDB so that measurements can be searched by it. This is also where the problem lies: to find data points by service ID we would have to run a regex over every ifAlias, which is quite expensive in terms of resources. The most efficient approach is to extract the needed metadata before it is stored in InfluxDB. Luckily, Telegraf can do this. With the processors.regex and processors.strings plugins we can extract the service IDs and sanitize them where needed.

In our case an ifAlias would look like this:

EVPN:<Customer_Name> subid:<UUID> instid:<UUID>

The subid is the service ID and the instid is the instance of the service (a subid can have more than one instance). We want to extract the service type (EVPN), the subid and the instid as separate tags. The IDs are UUIDs, so they are easy to extract.
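To make the intended result concrete, here is a minimal Python sketch of the extraction, independent of Telegraf. The function name extract_tags, the customer name and the two UUIDs are made up for illustration. The patterns are a simplified form of the ones used in the Telegraf configuration, and the final lowercase step mimics what processors.strings does.

```python
import re

# UUID version 4: the third group starts with "4", the fourth with 8, 9, a or b.
UUID_V4 = (
    r"[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-4[a-fA-F0-9]{3}"
    r"-[89aAbB][a-fA-F0-9]{3}-[a-fA-F0-9]{12}"
)

# The service type is the prefix before the first colon.
SERVICETYPE_RE = re.compile(
    r"^(?P<servicetype>AM|PTP|CU|EXIP|EX|EVPN|IP|OT|CR|L3VPN):"
)
SUBID_RE = re.compile(r"subid:(?P<subid>" + UUID_V4 + r")")
INSTID_RE = re.compile(r"instid:(?P<instid>" + UUID_V4 + r")")


def extract_tags(if_alias: str) -> dict:
    """Return the extra tags that can be derived from an ifAlias (hypothetical helper)."""
    tags = {}
    for pattern in (SERVICETYPE_RE, SUBID_RE, INSTID_RE):
        match = pattern.search(if_alias)
        if match:
            tags.update(match.groupdict())
    # Normalize the UUIDs to lowercase so queries do not depend on case.
    for key in ("subid", "instid"):
        if key in tags:
            tags[key] = tags[key].lower()
    return tags


# Example ifAlias with made-up customer name and UUIDs.
alias = (
    "EVPN:ExampleCustomer "
    "subid:0F8FAD5B-D9CB-469F-A165-70867728950E "
    "instid:7c9e6679-7425-40de-944b-e07fc1f90ae7"
)
print(extract_tags(alias))
```

An ifAlias without a recognizable prefix or a well-formed UUID simply yields fewer tags (or none), which matches the "only when applicable" behaviour we want from the processor.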

In /etc/telegraf/telegraf.d/ we created a conf file, e.g. processor-ifAlias.conf, which looks something like this:

# Apply this processor only to the "NetworkMeasurements" measurement,
# and only to metrics that carry an "ifAlias" tag
[[processors.regex]]
  namepass = ["NetworkMeasurements"]

  # tagpass is a table that maps a tag key to a list of glob patterns
  [processors.regex.tagpass]
    ifAlias = ["*"]

  # Extract the service type, if applicable, and store it in result_key = servicetype
  [[processors.regex.tags]]
    key = "ifAlias"
    pattern = "^(?P<servicetype>AM|PTP|CU|EXIP|EX|EVPN|IP|OT|CR|L3VPN):.*"
    replacement = "${servicetype}"
    result_key = "servicetype"

  # Extract the subid, if applicable, and store it in result_key = subid
  # (only when the UUID is correctly formatted)
  [[processors.regex.tags]]
    key = "ifAlias"
    pattern = ".*subid:(?P<subid>[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-4[a-fA-F0-9]{3}-[89aAbB][a-fA-F0-9]{3}-[a-fA-F0-9]{12}).*"
    replacement = "${subid}"
    result_key = "subid"

  # Extract the instid, if applicable, and store it in result_key = instid
  # (only when the UUID is correctly formatted)
  [[processors.regex.tags]]
    key = "ifAlias"
    pattern = ".*instid:(?P<instid>[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-4[a-fA-F0-9]{3}-[89aAbB][a-fA-F0-9]{3}-[a-fA-F0-9]{12}).*"
    replacement = "${instid}"
    result_key = "instid"

# Separate processor: convert the UUIDs to all lowercase if needed
[[processors.strings]]
  namepass = ["NetworkMeasurements"]

  [[processors.strings.lowercase]]
    tag = "subid"

  [[processors.strings.lowercase]]
    tag = "instid"
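To show what this buys us on the query side, here is a small Python sketch that builds the two kinds of InfluxQL query as strings. The helper names and the one-hour time window are made up for illustration, and the field name ifHCInOctets is an assumption (it is a counter from IF-MIB::ifXTable, but your field names may differ). The measurement name is the one used in the processor config.

```python
MEASUREMENT = "NetworkMeasurements"
FIELD = "ifHCInOctets"  # assumption: adjust to the field you actually store


def query_by_alias_regex(subid: str) -> str:
    """Old approach: regex match on the ifAlias tag (hypothetical helper)."""
    return (
        f'SELECT "{FIELD}" FROM "{MEASUREMENT}" '
        f'WHERE "ifAlias" =~ /subid:{subid}/ AND time > now() - 1h'
    )


def query_by_subid_tag(subid: str) -> str:
    """New approach: exact match on the subid tag, split per instance."""
    return (
        f'SELECT "{FIELD}" FROM "{MEASUREMENT}" '
        f"WHERE \"subid\" = '{subid}' AND time > now() - 1h "
        'GROUP BY "instid"'
    )


# Made-up service ID, already lowercase as the strings processor would store it.
subid = "0f8fad5b-d9cb-469f-a165-70867728950e"
print(query_by_alias_regex(subid))
print(query_by_subid_tag(subid))
```

The second query only needs an exact tag match, and grouping by instid gives one series per instance of the service.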

With this in place we can easily build queries in our customer portals based on subscriptions and split them by instance and/or other metadata. Hope this helps. If you have any comments, suggestions or questions, do not hesitate to drop me a note.