To what degree can we monitor VoIP services with our existing tools?
I’ll be spending some time over the next few weeks or months evaluating the VoIP service monitoring products exhibited at CIPTUG. These products are end-to-end solutions and can give us a solid look at what’s going on in the VoIP network. In the meantime, what more can we do with what we’ve got? The rest of this post is a lot of “thinking aloud.”
We’re using three main products right now to do the bulk of our system monitoring and trend-reporting: Castlerock SNMPc, Cisco Internet Telephony Monitor, and Cricket. I’m speaking mostly about monitoring and trend-reporting related to the VoIP system and network infrastructure that supports it. There are many other systems out there monitoring other parts of the network.
Castlerock and Cricket perform service monitoring by SNMP and configurable service polling. With these we can get any statistics exposed by SNMP as well as service availability information by connecting to a service’s network socket. This takes care of server vitals (CPU, memory, I/O, process stats), application availability (CallManager, SQL database, web services), and some network stats (up/down, byte counts). We’ve also got gateway channel usage.
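To make the "connect to a service's network socket" idea concrete, here is a minimal sketch of that kind of availability check in Python. It is not how SNMPc or Cricket implement their pollers; the function name and timeout are my own choices, and any host or port you pass in is up to you.

```python
import socket

def service_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves the socket accepts connections; it says nothing
    about whether the application behind it is answering correctly.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A poller would call something like this on a schedule for each service and record the result, which is roughly what "up/down" monitoring amounts to.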
Those are the low-hanging fruit, and many of those stats are already being monitored and recorded. Next come the PBX statistics, which will depend on how well we can query CallManager and Unity using SNMP and Perfmon. I know these data are exposed, but integrating them into our existing tools may be challenging. Other systems that may be added to the VoIP environment, such as Asterisk and OpenSER, will have to expose data in similar ways.
We can get network performance data including latency and jitter by using the Smokeping tool, which we already use to measure the core backbone.
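As a rough illustration of what "latency and jitter" mean when derived from a burst of probe round-trip times, here is a small sketch. It is a simplification of my own, not Smokeping's internals: jitter is taken as the mean absolute difference between consecutive samples, which is simpler than the smoothed estimator RTP uses. The function and key names are made up for the example.

```python
from statistics import mean, median

def summarize_rtts(rtts_ms):
    """Summarize a burst of round-trip times given in milliseconds.

    Jitter here is the mean absolute difference between consecutive
    samples, a simplified stand-in for more formal definitions.
    """
    if len(rtts_ms) < 2:
        raise ValueError("need at least two samples to estimate jitter")
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return {"median_ms": median(rtts_ms), "jitter_ms": mean(diffs)}
```

For the samples 20, 24, 22, 28 ms this gives a median of 23 ms and a jitter of 4 ms.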
What do we need to monitor at the endpoints? ITM tells us when endpoints disappear in groups (e.g. a network segment goes down). I’d like to be able to place automated test calls between two randomly chosen endpoints for spot checking.
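The spot-check idea could be sketched like this. Only the random pairing is real here; `place_call` is a hypothetical callback standing in for whatever mechanism would actually place and grade a test call, which I don't have yet.

```python
import random

def pick_test_pair(endpoints, rng=random):
    """Pick two distinct endpoints at random from a list."""
    if len(endpoints) < 2:
        raise ValueError("need at least two endpoints")
    caller, callee = rng.sample(endpoints, 2)
    return caller, callee

def spot_check(endpoints, place_call, rng=random):
    """Run one spot check and return (caller, callee, result).

    place_call(caller, callee) is a hypothetical hook; it is assumed
    to place the call and return whatever result we want to record.
    """
    caller, callee = pick_test_pair(endpoints, rng)
    return caller, callee, place_call(caller, callee)
```

Running this on a timer over the full endpoint list would, over time, touch most segments of the network without having to test every pair.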
With current tools, we probably can’t get live call-quality information such as the mean opinion score (MOS).
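For context, monitoring products usually estimate MOS rather than survey listeners: they compute an R-factor per call using the ITU-T G.107 E-model and convert it to MOS. The conversion itself is simple, as the sketch below shows (the function name is mine). The per-call measurements that feed R are the part our current tools probably don't give us.

```python
def r_to_mos(r):
    """Convert an E-model R-factor to an estimated MOS (ITU-T G.107).

    R at or below 0 maps to 1.0 and R at or above 100 maps to 4.5;
    in between, the standard cubic conversion applies.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)
```

An R of 93.2, the E-model's default value, works out to a MOS of about 4.41.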
The third-party VoIP monitoring tools excel by putting all these data into one console and by providing live call-quality information. A piecemeal solution may turn out to be too cumbersome to live with on a continuous basis.