Cisco Prime Network 4.3 User Guide
Cisco Prime Network 4.3.2 User Guide
Chapter 10: How Prime Network Handles Incoming Events

These topics explain how Prime Network handles incoming events and provide information about events and tickets in the GUI clients:

- How Events Flow Through Prime Network Components
- Standard and Upgraded Events
- How Prime Network Correlates Incoming Events
- How Prime Network Calculates and Reports Affected Parties (Impact Analysis)
- Clearing, Archiving, and Purging and the Oracle Database
- Checking An Event’s Registry Settings

How Events Flow Through Prime Network Components

Figure 10-1 illustrates how Prime Network responds to incoming notifications from devices. The exact flow depends on how Prime Network is configured in your network. The flow is described in detail in How Prime Network Correlates Incoming Events.

Note: Figure 10-1 illustrates the logical flow of events through Prime Network. The actual network communication is subject to the transport configuration between the gateway server and units.

Figure 10-1 Logical Flow of Incoming Events Received By Prime Network

Elements shown in the figure:

- Units: Event Collector (AVM 100), which receives raw events from devices and can apply optional raw event filters that drop noise (or specified syslogs and traps); VNEs, which perform correlation and can apply an optional global event filter that drops events when system load is heavy; Fault Agent (AVM 25).
- Oracle Fault DB: standard events, upgraded events, and tickets.
- Gateway: Fault Manager and Event Notification Service.
- External OSS applications (SNMP and BQL): new tickets and related events, updates to existing tickets, upgraded events with correlation, system internal events, e-mail notifications, standard events, events from specified unmanaged devices (if enabled), SNMP V1/V2C/V3 trap notifications, EPM-Notification-MIB.

The main components involved in fault processing are described in the following table.

Component: Event Collector (AVM 100)
Located on: Gateway or unit(s) (see footnote 1)
Description: Examines events for basic information, then associates and distributes the events to the corresponding VNEs. If handling events from unmanaged devices is enabled, saves these events to the database; if an Event Notification Service is enabled, forwards these events to the gateway. If a raw event (noise) filter is enabled, drops the events that match the filter.
Note: Users can create AVMs in the range 101-999 automatically by using the AUTO-VNE Assignment feature. See Chapter 4 of the Cisco Prime Network Administrator Guide.

Component: VNEs
Located on: Hosting unit
Description: Parses events and associates them with specific components in NEs; if the NE is a physical interface, checks whether alarms are disabled on the interface. Determines whether events are standard or upgraded (see Standard and Upgraded Events). Attempts to correlate the event, depending on its configuration, and enriches the event with additional information (category, nature). Forwards events to AVM 25. If a global event filter is configured and system load is high, drops any events that match the filter (by default, no filters are implemented; see the Cisco Prime Network 4.3.2 Administrator Guide).

Component: Fault Agent (AVM 25)
Located on: All gateways and units
Description: Opens new alarms and tickets, and persists (saves) information in the database.
- Uncorrelated events that are ticketable: Opens new alarms and tickets and saves the information in the database (active partition).
- Uncorrelated events that are not ticketable: Saves the information in the database as archived.
- Correlated events: Updates the ticket and saves the information in the database.
AVM 25 requires a database connection to store information in the Oracle database. If a direct connection is not available, configure Prime Network to forward events to another AVM 25 that has a database connection (this is called using a proxy AVM 25 and is described in the Cisco Prime Network 4.3.2 Administrator Guide).

Component: Ticket Agent
Located on: Oracle database
Description: Associates new events with existing alarms and tickets.

Component: Database
Located on: Oracle database
Description: Stores all tickets, alarms, and events, which can be viewed from:
- Events client: Tickets, Service, Audit, Provisioning, Security, System, Standard, All events
- Vision client: Tickets, Network Events, Provisioning Events, Latest Events

Component: Fault Manager
Located on: Gateway
Description: If an Event Notification Service is configured, retrieves information for e-mail and trap forwarding and forwards information to external OSS applications.

Footnote 1: By default, the Event Collector is installed on the gateway. All supported configurations are described in the event monitoring topics in the Cisco Prime Network 4.3.2 Administrator Guide.

For more details about what each component does, see How Prime Network Correlates Incoming Events.

Standard and Upgraded Events

If the VNE cannot extract adequate information about an event, it performs some basic parsing and saves the event in the database. These events are called standard events. A standard event is an event that Prime Network cannot match with any of the rules that define events of interest. Standard events are not processed for correlation. They are immediately saved to the database and marked as archived.

Standard events can be viewed from the following clients:

- From the Events client, under the Standard tab.
- From the Vision client, under the Network Events tab in a device inventory view. If enabled from the Administration client, standard events are also displayed in the Latest Events tab in a map view.

An upgraded event is an event that a VNE can match with the rules that determine events of interest. Upgraded events are parsed and, if they are enabled for correlation, the VNE begins the correlation process. Not all upgraded events are enabled for correlation.

For an illustration of how Prime Network handles standard events, see How Prime Network Correlates Incoming Events.

How Prime Network Correlates Incoming Events

Note: An event can have many additional correlation and metadata attributes that determine how Prime Network processes the event. Examples are provided in Event Correlation Examples, page C-1.

The correlation process determines the causality for events, event sequences, and tickets. Causality is represented in a ticket’s correlation tree, with a root cause event at the top (for an example, see Figure 11-6 on page 11-16). The process begins when Prime Network receives an incoming event.
The Prime Network Event Collector (AVM 100) receives all incoming events, that is, external events such as traps and syslogs. The Event Collector performs some basic parsing to associate the event with the appropriate VNE.

- If handling events from unmanaged devices is enabled, AVM 100 saves these events in the database.
- If a raw event (noise) filter is enabled, AVM 100 drops the events that match the filter.

You can configure the Event Notification Service to forward these events to OSSs or e-mail recipients. This is done from the Administration client and is described in the Cisco Prime Network 4.3.2 Administrator Guide.

The following figures illustrate how Prime Network handles events that are:

- Enabled for correlation, in Figure 10-2.
- Not enabled for correlation, in Figure 10-3.

Figure 10-2 Event Processing: Events With Correlation Enabled

Decision points shown in the figure, in order: AVM 100 (Event from managed device?); VNE (Can identify and associate? Flapping? Correlate? Local or Network? Cause found? Sequence found? Ticketable?); AVM 25 (save the information, create new alarm and ticket); Database (Ticket Agent updates alarm and ticket; active partition and archive partition). AVM 25 runs on a gateway or unit, and the database also receives information from AVM 25s running on other units and gateways.

Figure 10-3 Event Processing: Events With No Correlation

Decision points shown in the figure, in order: AVM 100 (Event from managed device?); VNE (Can identify and associate? Flapping? Correlate? Sequence found? Ticketable?); AVM 25 (save the information, create new alarm and ticket); Database (Ticket Agent updates alarm and ticket; active partition and archive partition). AVM 25 runs on a gateway or unit, and the database also receives information from AVM 25s running on other units and gateways.

Parse the Event To Identify It, Associate It With a Source, And Determine If It Is a Standard or Upgraded Event

The VNE begins the event identification process by extracting and parsing the following information from the raw event:

- Event Functionality Type: Trap, syslog, or Service event.
- Event Type and Subtype: Identifier describing the fault, such as Link Down (the subtype provides further information).
- Event description strings: Content of the notification message and a short description.
- Event Severity: The event’s importance, derived from the setting for the event’s severity registry key:
  - Flagging: Indicates a fault. Critical (red), Major (orange), Minor (yellow), or Warning (sky blue).
  - Clearing: Indicates a fault that is resolved. Cleared (green).
  - Informational: Information only (dark blue).

If the VNE cannot extract adequate information, it performs some basic parsing and saves the event in the database. These events are considered standard events. No further processing is performed on standard events. They are immediately saved to the database and marked as archived.

If the VNE can extract the information listed above, the event is considered an upgraded event and the VNE begins event association (the next step).

Some traps and syslogs may expedite polling, which means that the VNE polls the device for more information without waiting for the device’s usual polling cycle. This is the case for traps and syslogs that are likely indicators of a Service event, allowing quicker detection of any problem. (If a VNE is in the maintenance state, it does not expedite events but it will correlate events.)

The VNE continues parsing the event to identify the source location (for example, associating a port down to a device’s physical interface). In rare cases, the event source may not yet be in the VNE model, such as when a new module is installed and Prime Network has not finished polling the device interfaces and building (populating) the model. A retry mechanism minimizes this occurrence, but if it persists, the association logic falls back to the network element that is the source of the new event.
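
The identification step above can be pictured with a short Python sketch. It is a simplified illustration only: the `ParsedEvent` class, the `classify` and `severity_category` functions, and the lowercase severity labels are assumptions made for this example and are not Prime Network identifiers. It treats an event as upgraded when all four pieces of information were extracted, which ignores the rule matching that a real VNE performs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SeverityCategory(Enum):
    FLAGGING = "flagging"            # indicates a fault
    CLEARING = "clearing"            # indicates a resolved fault
    INFORMATIONAL = "informational"  # information only


# Severity labels from the list above; the lowercase spelling is an assumption.
SEVERITY_CATEGORY = {
    "critical": SeverityCategory.FLAGGING,
    "major": SeverityCategory.FLAGGING,
    "minor": SeverityCategory.FLAGGING,
    "warning": SeverityCategory.FLAGGING,
    "cleared": SeverityCategory.CLEARING,
    "informational": SeverityCategory.INFORMATIONAL,
}


@dataclass
class ParsedEvent:
    """Hypothetical container for the fields a VNE tries to extract."""
    functionality_type: Optional[str] = None  # "trap", "syslog", or "service"
    event_type: Optional[str] = None          # for example "Link Down"
    description: Optional[str] = None
    severity: Optional[str] = None


def classify(event: ParsedEvent) -> str:
    """Return "upgraded" when every field was extracted, else "standard"."""
    fields = (event.functionality_type, event.event_type,
              event.description, event.severity)
    return "upgraded" if all(fields) else "standard"


def severity_category(severity: str) -> SeverityCategory:
    """Map a severity label to its flagging/clearing/informational group."""
    return SEVERITY_CATEGORY[severity.strip().lower()]
```

For example, `classify(ParsedEvent("trap", "Link Down", "Interface down", "Major"))` yields an upgraded event whose severity falls in the flagging group, while an event with only a functionality type is treated as standard.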

To check a Trap, Syslog, or Service event’s default severity setting, see Checking An Event’s Registry Settings.

Optimize the Expedite Polling

You can optimize the expedite polling that is triggered when traps and syslogs are received. The optimization can be enabled at two levels: the VNE level and the system level. To enable it, use the runRegTool command and then restart the VNEs.

When optimized-expedite is enabled, Prime Network waits for a delay time before it expedites. The delay time is the registration delay of the registration that is being expedited or the window-length (20 seconds by default, and configurable), whichever is greater. If the same event occurs multiple times within the delay time, the expedite occurs only once, after the delay time of the last event received.

Table 10-1 describes the runRegTool commands for the different levels.

Table 10-1 Enabling Optimized Expedite Polling

Level: VNE
Command: runRegTool.sh -gs localhost set 127.0.0.1 avm/agents/da//optimized-expedite/enabled

Level: System
Command: runRegTool.sh -gs localhost set 127.0.0.1 site/agentdefaults/da/optimized-expedite/enabled
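
The delay rule described in this section behaves like a debounce timer. The following Python sketch shows one way to read it; the function names and the millisecond-based interface are assumptions for illustration and do not correspond to Prime Network code.

```python
from typing import Iterable, List

DEFAULT_WINDOW_LENGTH_MS = 20_000  # default window-length of 20 seconds


def expedite_delay_ms(registration_delay_ms: int,
                      window_length_ms: int = DEFAULT_WINDOW_LENGTH_MS) -> int:
    """Delay time: the greater of the registration delay and the window-length."""
    return max(registration_delay_ms, window_length_ms)


def expedite_times_ms(arrivals_ms: Iterable[int],
                      registration_delay_ms: int,
                      window_length_ms: int = DEFAULT_WINDOW_LENGTH_MS) -> List[int]:
    """Return the times at which an expedited poll would fire.

    arrivals_ms holds the arrival times of the same event. Each arrival
    inside the pending delay restarts the timer, so a burst of repeats
    produces a single expedite after the last event of the burst.
    """
    delay = expedite_delay_ms(registration_delay_ms, window_length_ms)
    fire_times: List[int] = []
    pending = None  # time at which the current timer would fire
    for arrival in sorted(arrivals_ms):
        if pending is not None and arrival >= pending:
            fire_times.append(pending)  # the earlier timer already fired
        pending = arrival + delay       # start or restart the timer
    if pending is not None:
        fire_times.append(pending)
    return fire_times
```

With a 3-second registration delay and the default window-length, three repeats arriving at 0, 5, and 10 seconds lead to a single expedite at 30 seconds, that is, 20 seconds after the last repeat.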

Note: In Table 10-1, ‘true’ enables the expedite optimization and ‘false’ disables it.

You can also override the window-length of the time span by using the following runRegTool command and then restarting the VNEs:

runRegTool.sh -gs 127.0.0.1 set 0.0.0.0 agentdefaults/da/optimized-expedite/window-length

Note: The time-value should be in milliseconds. For a flapping or duplicate event, the inventory update happens only after the flapping or duplication stops and the delay time has then elapsed.

Examine Event for Flapping

Note: Flapping detection is enabled for certain events and disabled for others (the flapping registry key is set to true or false). If an event is not configured for flapping, the VNE skips this step. To check a Trap or Syslog event’s default flapping setting, see Checking An Event’s Registry Settings.

After the event is associated with a source location, the VNE examines it to see if it is a flapping event. Flapping is a flood of consecutive event notifications related to the same alarm. It can occur when a fault causes repeated event notifications (for example, a cable with a loosely fitting connector). Prime Network represents the new notifications as a single event with a flapping subtype.

The VNE identifies a sequence of events as flapping if:

- All events are of the same event type and are associated with the same source.
- The event occurs more than 5 times with less than 1 minute between events (default).

If the event is part of a flapping sequence, it is suppressed (not saved in the database or displayed in the clients), and the event’s duplication count in the alarm is incremented. During flapping, the fault management logic generates periodic event notifications with a Flapping Update subtype that also become part of the event sequence.
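
The default flapping criteria can be sketched in Python as follows. This is one possible reading of "more than 5 times with less than 1 minute between events"; the `FlappingDetector` class and its parameters are hypothetical names for this example, and the sketch does not model the Flapping Update notifications.

```python
from typing import Dict, Tuple


class FlappingDetector:
    """Tracks runs of closely spaced events per (event type, source)."""

    def __init__(self, threshold: int = 5, max_gap_seconds: float = 60.0) -> None:
        self.threshold = threshold              # flapping when the run exceeds this
        self.max_gap_seconds = max_gap_seconds  # gap that still continues a run
        # (event_type, source) -> (timestamp of last event, length of current run)
        self._runs: Dict[Tuple[str, str], Tuple[float, int]] = {}

    def observe(self, event_type: str, source: str, timestamp: float) -> bool:
        """Record an event and return True if it belongs to a flapping sequence."""
        key = (event_type, source)
        previous = self._runs.get(key)
        if previous is not None and timestamp - previous[0] < self.max_gap_seconds:
            run_length = previous[1] + 1  # close enough to extend the run
        else:
            run_length = 1                # first event, or the gap was too long
        self._runs[key] = (timestamp, run_length)
        return run_length > self.threshold
```

In this reading, six Link Down events from the same port that arrive 10 seconds apart cause the sixth event to be reported as flapping, while a later event that arrives after a quiet period of a minute or more starts a new run.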

After the fault stabilizes and the new event notification frequency returns to normal, the fault management logic terminates the alarm’s flapping mode by generating a final event notification (with either the Flapping Stopped Cleared or the Flapping Stopped Non-cleared subtype), based on the last received new event notification.

Determine If Event Is Enabled for Correlation

The VNE examines the event to see if it is enabled for correlation, that is, whether Prime Network should attempt to find a root cause for the event. In this example, the event is called Event A:

Event Registry Key: correlation
If set to true, Prime Network will: Try to find Event A’s root cause.
If set to false, Prime Network will: Not try to find Event A’s root cause.

Event Registry Key: is-correlation-allowed
If set to true, Prime Network will: Allow other events to correlate to (be caused by) Event A.
If set to false, Prime Network will: Not allow other events to correlate to (be caused by) Event A.

An example of an event with a correlate=false registry setting is a Link Down Due To Oper Down event, where the event is its own cause. An example of an event with an is-correlation-allowed=false registry setting is a syslog that does not cause other events.

The VNE attempts to identify an event sequence (see Identify Event Sequences and Hierarchies). Because clearing events are associated with their predecessor, there is no need to correlate clearing events.

To check a Trap, Syslog, or Service event’s default correlation and is-correlation-allowed settings, see Checking An Event’s Registry Settings.

Wait for New Incoming Events

The VNE suspends its correlation process for the event for 2 minutes so other related events can be detected. During this time, the VNE does not perform processing for the new event. (Although this means event updates to the Oracle database and the Vision client are delayed by 2 minutes, the events are immediately displayed in the Vision client Network Events tab.)

Check VNE for Correlated Events (Local and Network Correlation) and Identify Root Cause

When the 2-minute suspension period has expired, the VNE begins the process of local correlation or network correlation. This is controlled by a setting in the registry.

- If an event’s activate-flow registry key is set to true, the VNE performs network (flow) correlation. Examples of events that use network correlation are LSP Down, MPLS TE Tunnel Down, and OSPF Neighbor State Change.
- If an event’s activate-flow registry key is set to false, the VNE performs local (key) correlation.

To check a Trap, Syslog, or Service event’s default activate-flow setting, see Checking An Event’s Registry Settings.

Local (Key) Correlation

In local correlation (key correlation), the event source VNE is examined.
In other words, correlation is performed on the local VNE only. Most trap and syslog events use the local correlation process.

The correlation logic examines the local VNE for possible causing events. These potential causing events must fall within the new event’s examination time: the 7 minutes before the examination process begins, or the 2 minutes after the examination process finishes. After this 9-minute period has passed, the new event expires (meaning it cannot be considered a causing event for a new incoming event). In addition, potential causing events must be configured to allow correlation, and must contain a correlation key that matches one of the new event’s correlation keys.

Network (Flow) Correlation

In network correlation (flow correlation), the VNE examines events that occurred on different VNEs to see if they may be the cause of the local problem. Network correlation uses historic snapshots of the VNE model to search both the local and other VNEs for correlated events that meet the following criteria:

- Are configured to allow correlation.
- Arrived within the 7 minutes before the event and up to 2 minutes after the event.
- Exist on VNE components that appear on a flow path traversed according to the forwarding information of the new event.
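
The candidate checks for local (key) correlation can be sketched in Python as a simple filter. This is an illustration under stated assumptions: the `Event` class and `local_cause_candidates` function are hypothetical, and the time window is measured from the new event’s timestamp rather than from the start and end of the examination process. Flow-path matching for network correlation is not modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List

LOOK_BACK_SECONDS = 7 * 60   # causes may precede the new event by up to 7 minutes
LOOK_AHEAD_SECONDS = 2 * 60  # or follow it by up to 2 minutes


@dataclass(frozen=True)
class Event:
    """Hypothetical event record used only for this example."""
    name: str
    timestamp: float                    # seconds
    correlation_keys: FrozenSet[str]
    is_correlation_allowed: bool = True


def local_cause_candidates(new_event: Event,
                           stored_events: Iterable[Event]) -> List[Event]:
    """Return stored events that could be a cause of new_event."""
    earliest = new_event.timestamp - LOOK_BACK_SECONDS
    latest = new_event.timestamp + LOOK_AHEAD_SECONDS
    return [
        candidate for candidate in stored_events
        if candidate is not new_event
        and candidate.is_correlation_allowed                      # may act as a cause
        and earliest <= candidate.timestamp <= latest             # inside the window
        and candidate.correlation_keys & new_event.correlation_keys  # shared key
    ]
```

A stored event that shares a correlation key with the new event but arrived 8 minutes earlier is rejected, as is an event inside the window whose is-correlation-allowed setting is false.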

The correlation is based on a flow that runs across the Prime Network model and topology. Network correlation is most successful if the event holds forwarding information, such as the IP address of a Border Gateway Protocol (BGP) neighbor, or a Frame Relay virtual connection.

Network correlation is well suited for the following scenarios:

- The event represents a failure in a connection or service that spans multiple devices. For example, an MPLS traffic engineering (TE) Tunnel Down event tries to correlate to faults on the path that the tunnel traverses.
- Logically, the new event can result from events that occurred in other devices. For example, Prime Network tries to find the root cause for a Device Unreachable event in other devices by performing a flow to the management IP address.

Identifying the Root Cause

If the VNE finds more than one potential causing event, the root cause is determined using event weight. The heavier an event’s weight, the more likely it is to be chosen as the cause. This is controlled by the weight registry key.

To check a Trap, Syslog, or Service event’s default weight setting, see Checking An Event’s Registry Settings.

Identify Event Sequences and Hierarchies

Next, the VNE attempts to identify event sequences (alarms). Events that have the same type and the same source are considered part of an event sequence. VNEs use the predecessor/successor relationship to properly handle incoming duplicates without either discarding them or creating new tickets.

When an event arrives, Prime Network searches its stored alarms for a possible predecessor. It identifies possible predecessors and finds the correct predecessor by matching it against the incoming alarm according to the following rules:

- The predecessor and successor both come from the same OID.
- The predecessor and successor are of the same alarm type.
- The predecessor is not archived.

The VNE forwards to AVM 25 the information it has gathered thus far (including uncorrelated events).

Save Information to Database, and Update or Open New Alarm and Ticket

AVM 25 saves all of the information it has received to the database. The actions that Prime Network takes depend on whether Prime Network could find the event’s root cause and whether the event is ticketable (is-ticketable registry setting):

Root Cause/Ticketable: Root cause was found (the event was correlated to another event). It does not matter whether the event is ticketable.
Prime Network does the following: AVM 25 saves the information in the database active partition. The database Ticket Agent updates the event and ticket information (severity, last modification time, event counter).

Root Cause/Ticketable: No root cause was found (the event was not correlated to another event), and the event is ticketable.
Prime Network does the following: AVM 25 opens a new alarm and ticket and saves the information in the database active partition.

Root Cause/Ticketable: No root cause was found (the event was not correlated to another event), and the event is not ticketable.
Prime Network does the following: AVM 25 saves the information in the database archive partition. This includes events that are enabled for correlation but for which no root cause was found.
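
The weight-based choice of a root cause and the three outcomes in the table above can be summarized in a short Python sketch. The class and function names, the sample weights, and the tie-breaking behavior (the first candidate with the highest weight wins) are assumptions for illustration; the guide does not specify how ties are resolved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Partition(Enum):
    ACTIVE = "active"
    ARCHIVE = "archive"


@dataclass(frozen=True)
class Candidate:
    """Hypothetical potential causing event with its weight registry value."""
    name: str
    weight: int


@dataclass(frozen=True)
class Outcome:
    partition: Partition
    opens_new_ticket: bool
    updates_existing_ticket: bool


def select_root_cause(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the heaviest candidate, or None when there are no candidates."""
    if not candidates:
        return None
    return max(candidates, key=lambda candidate: candidate.weight)


def persistence_outcome(root_cause_found: bool, is_ticketable: bool) -> Outcome:
    """Mirror the three rows of the table above."""
    if root_cause_found:
        # Correlated event: saved as active, existing ticket is updated.
        return Outcome(Partition.ACTIVE, False, True)
    if is_ticketable:
        # Uncorrelated and ticketable: a new alarm and ticket are opened.
        return Outcome(Partition.ACTIVE, True, False)
    # Uncorrelated and not ticketable: saved to the archive partition.
    return Outcome(Partition.ARCHIVE, False, False)
```

For example, an event with no candidates has no root cause, so `persistence_outcome(False, True)` describes a new ticket in the active partition, while `persistence_outcome(False, False)` describes an archived event.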