Nagios Business Process View and Nagios Business Impact Analysis
----------------------------------------------------------------

The software and documents in this package have been produced by
Sparda-Datenverarbeitung eG, Nuernberg, Germany (Bernd Stroessenreuther)
and are available to the community under the conditions of the
GNU General Public License Version 2, see LICENSE.


Short overview
--------------

The AddOn "Business Process View" takes the results of the individual Nagios
checks out of an NDO or IDO backend (NDO database, IDO database, ndo2fs,
Merlin, mk_livestatus, Icinga-API) and builds up aggregated states. How they
are associated is described in one or more config files. You can combine them
with "and" conjunctions, "or" conjunctions and others.
A business process (as defined by such a formula) can be used as a part of
another business process, so you can build up a hierarchical structure to
describe the state of your application.

The AddOn "Business Impact Analysis" allows you to simulate outages.
You can manually set each single component to any state you like and see how
this would impact your applications.


Help
----

If you have problems installing or using these AddOns, please visit the
Support page on our homepage:
http://nagiosbp.projects.nagiosforge.org/support.shtml
There you will find a FAQ and some helpful mailing lists.

A very active community of users can also be found at
http://www.nagios-portal.org/
(this one is in German only)


What is it?
-----------
(in a little more detail)

You are running a lot of applications for your customers.
(I use the word customers because as a system administrator you always have
customers, no matter if they are employees or customers of the company you are
working for.)

Each application needs a few or a lot of components (like web servers,
application servers, DNS or mail servers, LAN or WAN connections ...) to work
properly.
There are components you need for only one application, and of course there
are components which are important for several applications (e. g. DNS
servers).

You are probably already running Nagios or Icinga to monitor all of these
components. (Otherwise you would not be thinking about this AddOn.)

If you are the only system administrator of your company, you probably know
all your applications very well and know which application needs which
components - then you will not need this AddOn.

If there are more admins, you will probably share the work. This means each
admin knows a few applications very well and the other applications only a
little bit. So maybe you would find it useful to visualize how all these
components work together. If one or more components fail, you want to see on
one single page which applications are unavailable for your customers - in
this case: Install this AddOn.

It has two modes:
1. Nagios Business Process View
   It shows the current state of your applications.
2. Nagios Business Impact Analysis
   This is a simulation mode. You can set each of your components to any
   state you like. So if you want to know: What would happen if my web server
   failed now? Just click the state of your web server and set it to CRITICAL.
   Return to the overview page and see which applications are now in state
   CRITICAL.

The states of the single services and hosts defined in Nagios or Icinga are
taken from the NDO or IDO backend (NDO database, IDO database, ndo2fs, Merlin,
mk_livestatus or Icinga-API).


How it works
------------

You have one or more config files in which you define your applications. You
define which components are needed and how they are related.

So go and set up a config file called
    etc/nagios-bp.conf
There you type some simple formulas to define business processes,
e. g.
    loadbalancers = loadbalancer1;System Health | loadbalancer2;System Health
    website_webserver1 = webserver1;HTTP & webserver1;HTTPD Slots

The first string is the name you want to give to the business process.
On the right side you have strings in the form
    <host>;<service>

The example above means: You have a loadbalancer cluster. If one of the
loadbalancers is in OK state, the application is available for the customer.
So you define an "or" conjunction for your business process.
To see whether webserver1 works well, you normally look at the check "HTTP"
and also at the check "HTTPD Slots". If both are in OK state, you know that
webserver1 is working well. So we put these two together with an "and"
conjunction.

The next step is to give a display name to each business process you defined,
so type
    display 0;loadbalancers;Loadbalancer Cluster
    display 0;website_webserver1;WebServer 1

The digit after the keyword display is the priority class in which this
business process is displayed in the top level view.
0 means: no display
1, 2, ...: display in the given priority class
As you can use single business processes again in other processes, display 0
is very useful if you do not want to display each sub-process in the top level
view.
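
To make the two line types above more concrete, here is a minimal Python
sketch of how such lines could be split into their parts. It is only an
illustration, not the AddOn's own parser. The function names parse_definition
and parse_display are hypothetical, and the sketch assumes that a formula uses
only one kind of operator ("&" or "|") and that an operand without ";" refers
to another business process.

```python
def parse_definition(line):
    """Split 'name = a;b | c;d' into (name, operator, operands)."""
    name, formula = (part.strip() for part in line.split("=", 1))
    # Assumption for this sketch: a formula never mixes "&" and "|".
    if "|" in formula:
        operator, raw_operands = "or", formula.split("|")
    else:
        operator, raw_operands = "and", formula.split("&")
    operands = []
    for operand in raw_operands:
        operand = operand.strip()
        if ";" in operand:
            # host;service pair as defined in Nagios/Icinga
            host, service = operand.split(";", 1)
            operands.append((host.strip(), service.strip()))
        else:
            # no ";": treated as the name of another business process
            operands.append(operand)
    return name, operator, operands


def parse_display(line):
    """Split 'display 0;name;Long Name' into (priority, name, long_name)."""
    keyword, rest = line.split(None, 1)
    if keyword != "display":
        raise ValueError("not a display line: " + line)
    priority, name, long_name = rest.split(";", 2)
    return int(priority), name.strip(), long_name.strip()


print(parse_definition(
    "loadbalancers = loadbalancer1;System Health | loadbalancer2;System Health"))
print(parse_display("display 0;loadbalancers;Loadbalancer Cluster"))
```

The tuple for a host/service operand keeps the spelling exactly as typed,
because it has to match the names used in Nagios/Icinga.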

Let's have a complete example:

    internetconnection = internetconnection;Provider 1 | internetconnection;Provider 2
    display 0;internetconnection;Internet Connection

    loadbalancers = loadbalancer1;System Health | loadbalancer2;System Health
    display 0;loadbalancers;Loadbalancer Cluster

    dns = dns1;DNS | dns2;DNS | dns3;DNS
    display 0;dns;DNS Cluster

    website_webserver1 = webserver1;HTTP & webserver1;HTTPD Slots
    website_webserver2 = webserver2;HTTP & webserver2;HTTPD Slots
    website_webservers = website_webserver1 | website_webserver2
    website = internetconnection & loadbalancers & dns & website_webservers
    display 0;website_webserver1;WebServer 1
    display 0;website_webserver2;WebServer 2
    display 0;website_webservers;WebServer Cluster
    display 1;website;WebSite

If these lines are the only ones in your nagios-bp.conf file, this should work.
You have defined your first business process! Congratulations!
Go and view
    http://your-host/nagiosbp/cgi-bin/nagios-bp.cgi
(I just assume you have all these services and hosts defined in your Nagios or
Icinga configuration, or that you adapted the example.)

Take care with the spelling! The <host> and the <service> must exactly match
the spelling you used when defining them in Nagios/Icinga. Watch for correct
upper and lower case.

More examples can be found in etc/nagios-bp.conf


Syntax check
------------

To make sure the syntax of your nagios-bp.conf is correct, run
    bin/nagios-bp-consistency-check.pl
Called like this, it checks your default nagios-bp.conf.
If you want to check some other file, call
    bin/nagios-bp-consistency-check.pl


Some more keywords
------------------

In the top level view (http://your-host/nagiosbp/cgi-bin/nagios-bp.cgi) the
right column is empty at the moment. It can be used to display some short
information about the business process. This can be a static or dynamic
string.
E. g. you may want to display how many users are currently logged into your
webshop if you defined a business process WebShop. Or you want to display a
short announcement, ...
Just write a little script that prints the information you like (just one line
to stdout) and then configure:
    external_info website;echo 'Please note: Today maintenance on WebServer1, Production only on WebServer2'
or
    external_info website;/path/to/your/script.sh

Maybe one little string is not enough for all the information you have. Then
    info_url website;/more_info/website.html
or
    info_url website;http://some.other.site.com/more_info/website.html
would be of value for you. It simply links to a web page with all the
information. In the first syntax, the page is located on the Nagios/Icinga
machine. In the second syntax, it can be anywhere in the world.
If you define an info_url, a little info icon appears, which can be clicked by
your users.
Maybe you want to use it for some emergency documentation or the like.


Complete syntax description
---------------------------

PLEASE NOTE: The order of your definitions is important!
If you use a (sub level) business process in the definition of another
business process, make sure you define the sub level process BEFORE you use it.

<name> = <host>;<service> [& <host>;<service>]+
    The services are joined by an "and" conjunction. All of them are needed
    for the application to be available to the customer.
    Or in other words: If all of the given services are OK, the defined
    business process has state OK. If at least one is WARNING and none is
    CRITICAL, the process has state WARNING. If at least one is CRITICAL, the
    process gets CRITICAL.

<name> = <host>;<service> [| <host>;<service>]+
    The services are joined by an "or" conjunction. This is often used if you
    have redundant systems. If one of them is working, the application is
    available to the customer.
    If at least one service is OK, the process gets state OK. If no service is
    OK but at least one is WARNING, the process gets state WARNING. Only if
    all of the services are CRITICAL, the process gets CRITICAL.

<name> = <n> of: <host>;<service> + <host>;<service> [+ <host>;<service>]+
    Use this one if you have a number of application servers running the same
    application and you know you need at least <n> of the servers active for
    load reasons.
    e. g.
    appserver_cluster = 2 of: appserver1;WebShop + appserver2;WebShop + appserver3;WebShop + appserver4;WebShop
    So if at least 2 of the given services are in state OK, the process is OK.
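
The three aggregation rules described so far can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the code
the AddOn runs: the function names are hypothetical, the "and" rule is read as
"the worst state wins", the "or" rule as "the best state wins", and the state
UNKNOWN is left out. The text only says when an "of" process is OK, so the
WARNING/CRITICAL fallback in state_of is an assumption.

```python
# Severity order assumed for this sketch; UNKNOWN is deliberately left out.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def state_and(states):
    """'and' conjunction: the worst state of all operands wins."""
    return max(states, key=SEVERITY.__getitem__)


def state_or(states):
    """'or' conjunction: the best state of all operands wins."""
    return min(states, key=SEVERITY.__getitem__)


def state_of(minimum, states):
    """'<n> of:' rule: OK if at least `minimum` operands are OK."""
    ok = sum(1 for state in states if state == "OK")
    if ok >= minimum:
        return "OK"
    # Assumption (not stated in the text): WARNING if at least `minimum`
    # operands are OK or WARNING, otherwise CRITICAL.
    usable = sum(1 for state in states if state in ("OK", "WARNING"))
    return "WARNING" if usable >= minimum else "CRITICAL"


# Hypothetical check results, combined bottom-up like nested processes.
loadbalancers = state_or(["CRITICAL", "OK"])
webserver1 = state_and(["OK", "WARNING"])
appserver_cluster = state_of(2, ["OK", "CRITICAL", "OK", "CRITICAL"])
print(loadbalancers, webserver1, appserver_cluster)
print(state_and([loadbalancers, webserver1, appserver_cluster]))
```

Because every function returns a plain state again, the result of one process
can be passed on as an operand of the next one, which mirrors how a sub level
process is used inside another business process.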

<host>;Hoststatus
    You can also use the results of Nagios/Icinga host checks in your business
    processes. In this case, use this syntax.

Instead of <host>;<service> you can always use <name> too, where <name> is the
name of a business process you defined BEFORE.

display <x>;<name>;<long name>
    The digit x is the priority of the process. The process is displayed in
    this priority class in the top level view.
    0 means: This process is not displayed in the top level view.
    <long name> is the name or description used when displaying the process.
    (The user never sees <name> in the GUI, always <long name>.)

external_info ;