BSTM-Monitor :
The BotStream deb package includes a command-line utility for monitoring system status. It displays the following details in real time:
Licence issued and licence usage
Load across all BotStream instances in the cluster
Physical SIP Gateways defined and their usage statistics
Logical Gateways defined and their usage statistics
To invoke it, run the following from a terminal on the BotStream machine:
/usr/local/epi/scripts/monitor.sh
Press Ctrl+C to stop the monitoring.
Sample Result :
Log files and directories to monitor :
MP3 Conversion
Typically runs every minute.
Log file path: /var/log/epi/process_recordings.log
Azure Upload
Typically runs once every 10 minutes.
Log file path: /var/log/epi/azure_upload.log
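Since both jobs run on a fixed schedule, one quick health signal is how recently each log was modified. The following is a minimal sketch, assuming each job writes to its log on every run (not confirmed here) and that GNU `stat` is available. The function names `log_is_fresh` and `report_log` and the thresholds are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes each job appends to its log on every run, so a stale
# modification time hints that the job may have stopped.

# log_is_fresh FILE MAX_AGE_MINUTES
# Succeeds (exit 0) if FILE exists and was modified within MAX_AGE_MINUTES.
log_is_fresh() {
    local file=$1 max_age_min=$2
    [ -f "$file" ] || return 1
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 1   # GNU stat: mtime as epoch seconds
    [ $(( now - mtime )) -le $(( max_age_min * 60 )) ]
}

# report_log FILE MAX_AGE_MINUTES
# Prints one status line for the given log file.
report_log() {
    if log_is_fresh "$1" "$2"; then
        echo "OK    $1"
    else
        echo "STALE $1 (missing or not updated in the last $2 minutes)"
    fi
}

# Example usage, allowing some slack over the typical intervals:
#   report_log /var/log/epi/process_recordings.log 5
#   report_log /var/log/epi/azure_upload.log 20
```

A STALE result is only a hint to look closer; a job that runs but has nothing to process may legitimately leave its log untouched.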
| Server | Log/Directory | Keywords |
| --- | --- | --- |
| BotStream Servers | botstream_yyyy-mm-dd.log | “declined” |
| BotStream Servers | cdr_upload_script.log | Look for errors |
| BotStream Servers | azure_upload.log | Look for errors |
| BotStream Servers | /usr/local/freeswitch/recordings (runs every minute) | Are the recordings being processed? Recordings older than 5 to 10 minutes should not be present. |
| BotStream Servers | /data/recordings (runs every minute) | Are the recordings being processed? Recordings older than 5 to 10 minutes should not be present. |
| BotStream Servers | /home/epic/queue_utils_log | Look for errors |
| RecordingProcessor Servers | process_recordings.log | Look for errors |
| BotStream Servers | /usr/local/freeswitch/cdr | Look for any files older than 10 minutes which are not being processed. |
| BotStream Servers | /usr/local/freeswitch/cdr/unprocessed | Look for any unprocessed files in these directories. |
| RecordingProcessor Servers | /home/epic/recordings/unprocessed | Look for any unprocessed files in these directories. |
| | /data/audio/unprocessed | Look for any unprocessed files in these directories. |
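The directory and keyword checks listed above can be scripted with standard tools. The following is a minimal sketch, assuming GNU `find` and `grep`. The function names `stale_files` and `count_keyword` are hypothetical, and the 10-minute threshold is taken from the guidance above rather than from any BotStream setting.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and thresholds are illustrative.

# stale_files DIR MINUTES
# Lists regular files directly inside DIR last modified more than MINUTES ago.
# Subdirectories (for example "unprocessed") are not descended into, so each
# directory can be checked separately.
stale_files() {
    local dir=$1 minutes=$2
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    find "$dir" -maxdepth 1 -type f -mmin +"$minutes"
}

# count_keyword FILE KEYWORD
# Prints the number of lines in FILE that contain KEYWORD (case-insensitive,
# matched as a fixed string).
count_keyword() {
    [ -r "$1" ] || { echo "cannot read: $1" >&2; return 2; }
    grep -ciF -- "$2" "$1" || true   # grep exits 1 on zero matches; still print 0
}

# Example usage (the log directory is a placeholder, not a known path):
#   stale_files /usr/local/freeswitch/recordings 10
#   stale_files /usr/local/freeswitch/cdr 10
#   stale_files /usr/local/freeswitch/cdr/unprocessed 0
#   count_keyword /path/to/botstream_yyyy-mm-dd.log declined
```

Any output from `stale_files` indicates files that have been waiting longer than expected. A rising count from `count_keyword` across successive runs is worth investigating in the log itself.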
Web-monitor :
In the campaigns view, check the count of on call/streams. There should not be a large difference between these counts. If the difference is large, check the BotStream log on the BotStream server for errors.
If the "web errors" and "Bot errors" counts increase, it indicates that web-sockets and bots are failing on the customer side.
Verify the CPS percentage for the cluster. Calls must be dialled according to the CPS configuration.
Verify that calls are scheduled whenever a channel becomes free and the campaign has a queue, and check the calls being handled in the "usage" column.
Check whether the load distributor is allocating the calls across the data centres based on the weightage given for the campaign.
The queue should be in active state for the clusters.
The "Dialer IP" servers should be in active state and handling calls. If a server goes inactive, investigate the cause:
a. Is the BotStream server down?
b. Is the iraswitch service crashing? Check with: sudo systemctl status iraswitch
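For repeated checks, the status command above can be wrapped in a small helper that reports only the unit state. The following is a minimal sketch using `systemctl is-active`. The function name `check_service` and its messages are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the unit name comes from the step above; messages are illustrative.

# check_service UNIT
# Prints whether the systemd unit is active; returns non-zero when it is not.
check_service() {
    local unit=$1 state
    # `systemctl is-active` prints the unit state (active, inactive, failed, ...)
    # and exits non-zero unless the unit is active.
    state=$(systemctl is-active "$unit" 2>/dev/null) || true
    if [ "$state" = "active" ]; then
        echo "$unit is active"
    else
        echo "$unit is ${state:-unknown}; inspect with: sudo systemctl status $unit"
        return 1
    fi
}

# Example usage:
#   check_service iraswitch
```

A unit that reports active but keeps restarting will not be caught by this check alone, so the full status output remains the place to look for recent crashes.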
The "gateway" should be in active state and handling calls. Monitor the "error" column, which counts errors while dialling out. If the error count increases, the remote end may not be handling requests. Check the "failure_phrase" field of the hangup event for further details.
Monitor the channel allocation for the campaigns. If a large number of contacts are added to the queue and the “limit” is set too low, it must be increased.