Integrating ServiceNow with Datadog the Right Way

Utilize ServiceNow and Datadog to build a robust and efficient incident management system
5 min read | by Sid Nigam | October 16, 2024

Integrating ServiceNow with Datadog can significantly improve your organization's IT operations by turning monitor alerts into incidents that are created promptly and carry accurate context. In this blog post, we’ll walk through how to set up this integration the right way, focusing on five areas: host-based monitors, tag propagation, ServiceNow CMDB lookup, transform scripts, and automated creation of ServiceNow incidents from Datadog monitors.

1. Host-Based Monitors: The Backbone of Your Integration

Infrastructure monitors in Datadog track the health and performance of individual hosts. Each monitor evaluates specific metrics or conditions per host and triggers an alert when a predefined threshold is breached.

When integrating with ServiceNow, these monitors serve as the foundation for incident management. By setting up robust host-based monitors, you ensure that potential issues are caught early and can be addressed through ServiceNow's incident management workflows.

Best Practices:

  • Granular Monitoring: Define monitors for critical metrics like CPU usage, memory consumption, disk I/O, and network latency. This ensures that you get detailed insights into the health of each host.
  • Custom Thresholds: Adjust thresholds based on the specific roles and workloads of your hosts. This reduces false positives and ensures that alerts are meaningful.
  • Alerting Strategy: Configure alerts to escalate based on the severity of the issue. For example, warning alerts might only generate emails or notifications, while critical alerts could automatically create incidents in ServiceNow. Use conditional variables in the monitor message (for example, {{#is_warning}} and {{#is_alert}} blocks) to notify different recipients for warning and critical alerts from the same monitor.
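
As a rough illustration of the alerting strategy above, the sketch below assembles a monitor message that uses Datadog's {{#is_warning}} and {{#is_alert}} conditional blocks. The helper name buildMonitorMessage, the CPU wording, and the email handle are hypothetical placeholders; only the conditional variable syntax and the “@servicenow-<instance>” handle format come from Datadog itself.

```javascript
// Hypothetical helper: builds a monitor notification message where the
// recipient depends on the alert severity. Handles are placeholders.
function buildMonitorMessage(warningHandle, criticalHandle) {
  return [
    '{{#is_warning}}',
    'CPU usage on {{host.name}} is elevated. ' + warningHandle,
    '{{/is_warning}}',
    '{{#is_alert}}',
    'CPU usage on {{host.name}} is critical. ' + criticalHandle,
    '{{/is_alert}}'
  ].join('\n');
}

// Warnings go to an (assumed) email recipient, critical alerts to ServiceNow.
var message = buildMonitorMessage('@ops-team@example.com', '@servicenow-<instance>');
console.log(message);
```

The resulting text can be pasted into the monitor's notification message. Note that {{host.name}} only resolves when the monitor is grouped by host.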

2. Propagating Tags: Ensuring Consistency Across Systems

Tags are a powerful feature in Datadog, allowing you to categorize and filter your metrics, logs, and traces. When integrating with ServiceNow, it’s crucial to propagate these tags into the incident so that both systems share the same context throughout the incident management process.

How to Propagate Tags:

  • Tag Standardization: Establish a standardized tagging system across your organization. This might include tags for environment (e.g., env:production), application (e.g., app:checkout-service), and team ownership (e.g., team:payments).
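  • Automated Tagging: Apply tags automatically rather than by hand, for example through the Datadog Agent configuration or tags inherited from your cloud provider integrations, so that every host and service is tagged consistently.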
  • Tag Mapping: Ensure that these tags are mapped appropriately when incidents are created in ServiceNow. To do this, group monitors by host when creating them in Datadog so that all the tags associated with the triggering host are sent to ServiceNow in the payload. ServiceNow can then filter, route, and prioritize incidents based on the same tags, providing a seamless experience.
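
To make the tag mapping concrete, here is a minimal sketch of how a tag string could be split into key/value pairs on the ServiceNow side. It assumes the tags arrive as a single comma-separated string of key:value entries; the function name parseTags is hypothetical, and your payload format may differ.

```javascript
// Hypothetical helper: turns "env:production,team:payments" into
// { env: ['production'], team: ['payments'] }.
function parseTags(tagString) {
  var tags = {};
  if (!tagString) {
    return tags;
  }
  String(tagString).split(',').forEach(function (raw) {
    var tag = raw.trim();
    if (!tag) {
      return; // skip empty entries such as a trailing comma
    }
    // Split on the first colon only, so values may contain colons.
    var idx = tag.indexOf(':');
    var key = idx === -1 ? tag : tag.slice(0, idx);
    var value = idx === -1 ? '' : tag.slice(idx + 1);
    if (!Object.prototype.hasOwnProperty.call(tags, key)) {
      tags[key] = [];
    }
    tags[key].push(value);
  });
  return tags;
}

var parsed = parseTags('env:production, app:checkout-service,team:payments');
console.log(parsed.team[0]);
```

Values are kept as arrays because a host can carry the same tag key more than once.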

3. ServiceNow CMDB Lookup: Leveraging Your Configuration Management Database

ServiceNow's CMDB (Configuration Management Database) is a valuable asset in your IT environment, containing detailed information about your IT assets and their relationships. When integrating with Datadog, leveraging the CMDB can enhance incident context and resolution.

Steps for CMDB Lookup:

  • CI Matching: When a Datadog monitor triggers an alert, use the host or service information to perform a lookup in the ServiceNow CMDB. This associates the alert with a specific Configuration Item (CI).
  • Enrichment: Once matched, you can enrich the incident in ServiceNow with additional details from the CMDB, such as the asset's owner, support group, and historical incidents.
  • Automated Updates: Ensure that any changes or updates to the CI in the CMDB are automatically reflected in related incidents, keeping the information accurate and up-to-date.
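
The CI matching and enrichment steps above might look roughly like the sketch below. It uses ServiceNow's GlideRecord API, and it assumes that CIs live in the cmdb_ci table, that the Datadog hostname matches the CI's name field exactly, and that the CI has a support_group populated. The function name findCiByHostname and the optional second parameter (which exists only so the lookup can be exercised outside a ServiceNow instance) are hypothetical.

```javascript
// Hypothetical helper: looks up a CI by hostname and returns the fields
// needed to enrich an incident, or null when nothing matches.
function findCiByHostname(hostname, GlideRecordClass) {
  // Inside ServiceNow the global GlideRecord is used; a stand-in can be
  // injected elsewhere (for example in tests).
  var GR = GlideRecordClass || GlideRecord;
  if (!hostname) {
    return null;
  }
  var ci = new GR('cmdb_ci');
  ci.addQuery('name', String(hostname).trim());
  ci.setLimit(1);
  ci.query();
  if (!ci.next()) {
    return null; // no matching CI: leave the incident unenriched
  }
  return {
    sysId: ci.getUniqueValue(),
    supportGroup: ci.getValue('support_group')
  };
}

// Possible use inside a transform script (field names are assumptions):
//   var match = findCiByHostname(source.hostname);
//   if (match) {
//     target.cmdb_ci = match.sysId;
//     target.assignment_group = match.supportGroup;
//   }
```

Returning null on a miss keeps the decision about unmatched hosts (skip, log, or assign to a default group) with the caller.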

4. Transform Scripts: Customizing Data for Precise Incident Creation

Transform scripts in ServiceNow manipulate data during the integration process. They let you customize the data coming from Datadog before it is stored in ServiceNow, and they are a natural place to enrich incident tickets with additional detail.

Using Transform Scripts Effectively:

  • Data Normalization: Use transform scripts to normalize data from Datadog so it fits the structure and requirements of your ServiceNow environment. For example, you might need to convert timestamps, reformat tags, or adjust field names. Priority is a common case: Datadog sends a single monitor priority, but ServiceNow tickets expect separate Impact and Urgency values. The sample snippet below derives impact and urgency from the monitor priority coming in from Datadog.

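    // Define incident impact and urgency from the Datadog monitor priority.
    // Loose equality (==) is intentional: the source field may arrive as a string.
    gs.info('Running transform');
    if (source.monitor_priority == 1) {
      target.impact = 1;
      target.urgency = 1;
    } else if (source.monitor_priority == 2) {
      target.impact = 2;
      target.urgency = 1;
    } else if (source.monitor_priority == 3) {
      target.impact = 2;
      target.urgency = 2;
    } else {
      // Any other (or missing) priority falls back to the lowest impact and urgency.
      target.impact = 3;
      target.urgency = 3;
    }
    // Mark the incident as raised by monitoring (no trailing space in the value).
    target.contact_type = 'monitoring';
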
  • Conditional Logic: Implement conditional logic to handle different types of alerts or monitors differently. For instance, you might want to route high-priority incidents to a specific support group.
  • Error Handling: Ensure that your transform scripts include error handling to manage any issues that arise during the data transformation process. This prevents data corruption and ensures reliable incident creation.
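
The conditional logic and error handling points can be sketched as a small, defensive variant of the priority mapping. This is only an illustration: it assumes the priority may arrive as a number, a numeric string, or a "P1"-style label, and the function names and assignment group names are hypothetical.

```javascript
// Same mapping as the transform snippet, expressed as a lookup table.
var PRIORITY_MAP = {
  1: { impact: 1, urgency: 1 },
  2: { impact: 2, urgency: 1 },
  3: { impact: 2, urgency: 2 }
};
var LOWEST = { impact: 3, urgency: 3 };

// Hypothetical helper: never throws; unparseable input falls back to the
// lowest impact/urgency and is flagged so the caller can log it.
function mapPriority(rawPriority) {
  var match = /^\s*P?(\d+)\s*$/i.exec(String(rawPriority));
  if (!match) {
    return { impact: LOWEST.impact, urgency: LOWEST.urgency, recognized: false };
  }
  var mapping = PRIORITY_MAP[parseInt(match[1], 10)] || LOWEST;
  return { impact: mapping.impact, urgency: mapping.urgency, recognized: true };
}

// Hypothetical routing rule: the group names are placeholders.
function chooseAssignmentGroup(mapping) {
  if (mapping.impact === 1 && mapping.urgency === 1) {
    return 'Major Incident Team';
  }
  return 'Service Desk';
}

// Possible use inside a transform script:
//   var m = mapPriority(source.monitor_priority);
//   if (!m.recognized) { gs.warn('Unrecognized priority: ' + source.monitor_priority); }
//   target.impact = m.impact;
//   target.urgency = m.urgency;
```

Keeping the mapping in a table makes it easier to adjust than a chain of if/else branches, and the recognized flag lets the transform log bad input instead of silently creating a low-priority incident.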

5. Automating Incident Creation: From Datadog Monitor to ServiceNow Incident

One of the most powerful aspects of the integration is the ability to automatically create incidents in ServiceNow based on Datadog monitors. This ensures that critical issues are tracked and managed without manual intervention.

Steps for Automation:

  • Monitor Configuration: In Datadog, configure your monitors with the necessary conditions to trigger incident creation. Ensure that these monitors are tightly aligned with your organization's incident management policies.
  • Webhook Setup: Set up a webhook in Datadog that sends alerts to ServiceNow. The webhook payload should include all relevant information, such as the monitor name, alert status, and any tags or metadata. With webhooks, you can select multiple monitors to report to ServiceNow at once. The typical alternative, adding “@servicenow-<instance>” to the monitor notification message, has to be configured on each monitor individually.
  • Incident Creation Rules: In ServiceNow, define rules that dictate how incidents are created from incoming alerts. This includes setting the incident type, priority, assignment group, and any other relevant fields.
  • Verification: Regularly verify that incidents are being created as expected. Monitor for any discrepancies or missed alerts, and adjust your setup as needed.
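
As one possible shape for the webhook setup and verification steps, the sketch below defines a custom webhook payload and a simple completeness check. The keys on the left are hypothetical names that would need to match your ServiceNow import set columns; the $-prefixed values are Datadog webhook template variables as we understand them, so confirm them against the current Datadog webhook documentation before relying on them.

```javascript
// Custom payload template for the Datadog webhook. Datadog substitutes the
// $VARIABLES when the monitor fires; the key names are assumptions.
var webhookPayload = {
  monitor_name: '$ALERT_TITLE',
  alert_status: '$ALERT_TRANSITION',
  monitor_priority: '$ALERT_PRIORITY',
  hostname: '$HOSTNAME',
  tags: '$TAGS',
  link: '$LINK'
};

// Fields an incident cannot reasonably be created without (an assumption).
var REQUIRED_FIELDS = ['monitor_name', 'alert_status', 'hostname'];

// Hypothetical verification helper: returns the required fields that are
// missing or empty in a received payload.
function findMissingFields(received) {
  return REQUIRED_FIELDS.filter(function (field) {
    var value = received[field];
    return value === undefined || value === null || String(value).trim() === '';
  });
}

// Print the template so it can be pasted into the webhook's payload field.
console.log(JSON.stringify(webhookPayload, null, 2));
```

Running a check like findMissingFields against incoming payloads, and logging any non-empty result, is one lightweight way to spot missed or malformed alerts during regular verification.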

Conclusion

Integrating ServiceNow with Datadog the right way involves careful planning and configuration. By focusing on key areas like host-based monitors, tag propagation, CMDB lookup, transform scripts, and automated incident creation, you can build a robust and efficient incident management system. This integration not only helps in detecting and resolving issues faster but also ensures that your IT operations are streamlined and effective.

Take the time to implement these best practices, and your organization will be well-equipped to handle the complexities of modern IT environments.

Have questions? Reach out to our team at chat@rapdev.io.

Written by
Sid Nigam
Boston, MA
Boston-based Senior Cloud Engineer with expertise in DevOps and AWS. My hobbies include watching and playing soccer, F1, traveling, and cooking.