Guidelines for Creating Action Policies
Action Policies bundle Actions and Notifications. Actions and Notifications are typically triggered by a condition or threshold observed by a monitor. You can combine Action and Notification building blocks as a way to implement a policy that governs selected devices and device roles.
In WhatsUp Gold Action Policies typically trigger any of the following:
Examples of site actions bundled to implement action policies might be:
- Notifying application management engineers when production devices transition state (for example from maintenance mode to up)
- Applying approved configuration to a device found out of compliance (a default password is detected, anomalous port opened or traffic detected)
- "Activating" (reconfigure and restart) a backup host device whenever a primary fails
- Restarting "non-responsive" critical services (an FTP or log server, for example)
: You can also create and apply powerful policies which enforce configuration record versioning and device configuration alignment using WhatsUp Gold Configuration Management policies.
Creating Action Policies
Begin by creating a list or matrix of critical state change events, appropriate actions, and the chain of responsible individuals for your site.
Example: Device Recovery Matrix
Policy: Internal Service Level Agreement
Device recovery actions
The following matrix shows the sequence of WhatsUp Gold managed controls (thresholds and states coupled with actions) rolled into an example action policy called "Internal Service Level Agreement."
Event
|
Actions
|
Notification
|
Device crash and reboot or forced reboot after n seconds.
|
Run recovery action scripts.
- Verify network (NIC) connectivity.
- Verify application service connectivity.
- If application service verification fails, trigger failed web service role policy.
|
- Data center team notification.
- On call services engineers notification.
- Notification attachments
- Event log report.
- Follow on with availability reports.
|
Failed web service role policy.
(No response or after n milliseconds to application service active monitors from WhatsUp Gold pollers.)
|
Remove web service role endpoint from the "load balancer" configuration.
Add to WhatsUp Gold maintenance mode.
|
- Data center team notification.
- On call services engineers notification.
- Product owner notification with availability reports.
|
Failed node.
Symptoms: External NIC or management NIC not responding, remote execution failure, remote write failure, kernel panic.
|
Run failed node recovery action.
|
- Data center team notification.
- Notify closest data center engineer.
|
- Create an Action type from the Actions Library.
Example: Test Connectivity/Remote Execution
- Click the new button, then select (for Linux/UNIX, select ).
- Add syntax for remote login to the device in the script text box.
- Create the Action Policy.
Example: Combine with Notification Schedule and Roll Test into Action Policy
- Add a notification action you created according to your site's notification hierarchy.
- Add the PowerShell action you created.