The Doppler Quarterly Summer 2016 | Page 28

• Consume information from each component, deliver that information to the correct target, and produce that information for the target.
• Maintain high-performance data rates to instantly react to service performance alerts and responses.
• Log all communications for use by current and future analytics.
• Recover quickly from communications failures, including rolling forward and rolling back.
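The responsibilities listed above (deliver to the correct target, log every communication, recover from failures) can be illustrated with a minimal in-memory sketch. It is not the article's implementation: the class `MessageRouter`, its methods, and the "alerts" target are hypothetical names chosen for illustration, and only the roll-forward half of recovery is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


@dataclass
class MessageRouter:
    """Hypothetical in-memory router: delivers, logs, and retries messages."""
    handlers: Dict[str, Handler] = field(default_factory=dict)
    log: List[Tuple[str, dict, str]] = field(default_factory=list)  # (target, payload, status)
    failed: List[Tuple[str, dict]] = field(default_factory=list)

    def register(self, target: str, handler: Handler) -> None:
        self.handlers[target] = handler

    def send(self, target: str, payload: dict) -> bool:
        """Deliver payload to target and log the outcome either way."""
        try:
            self.handlers[target](payload)
        except Exception:
            # Unknown target (KeyError) or handler error: keep the message for replay.
            self.log.append((target, payload, "failed"))
            self.failed.append((target, payload))
            return False
        self.log.append((target, payload, "delivered"))
        return True

    def replay_failed(self) -> int:
        """Roll forward: retry each failed message once; return how many succeeded."""
        pending, self.failed = self.failed, []
        return sum(self.send(target, payload) for target, payload in pending)


# Usage: the first send fails because no handler exists yet, then is replayed.
router = MessageRouter()
received: List[dict] = []
router.send("alerts", {"service": "orders", "status": "slow"})
router.register("alerts", received.append)
recovered = router.replay_failed()
```

Keeping the log separate from the retry queue means the full history stays available for later analysis even after a failed message has been redelivered.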
Performance Analytics Engine
The performance analytics engine is a “pluggable” software component that provides “embedded” analytical services. You leverage these analytics to dynamically manage service performance during production. It’s the job of the performance analytics engine to:
• Provide real-time analytical services around the performance of all connected services and recommend changes in threshold, capacity, or behavior. For instance, if a service runs under-threshold and the agent generates an alert, the analytics engine can determine a course of automatic action based on the current performance data of that service from the time series database, and the profile of that service from the service repository. The resulting actions could be to dynamically increase the cache size of the database, reroute to another server, or alert a human.
• Provide ad hoc reporting on service performance and trending over time.
• Dynamically learn as it gathers data, understanding cause and effect as performance issues are identified and resolved.
• Provide the administrative console as well as APIs for integration with other system management consoles.
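The example in the first bullet (choosing between a larger cache, a reroute, or a human alert) can be sketched as a small rule-based function. This is only an illustration under stated assumptions: `ServiceProfile`, its fields, and the decision rules are hypothetical, and a real engine would learn from history rather than use fixed rules.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical profile record, as might be held in the service repository."""
    name: str
    max_response_ms: float    # response-time threshold for the service
    cache_tunable: bool       # whether the database cache can be resized
    has_standby_server: bool  # whether traffic can be rerouted


def recommend_action(profile: ServiceProfile,
                     db_ms: Sequence[float],
                     network_ms: Sequence[float]) -> str:
    """Pick a corrective action from recent samples (illustrative rules only)."""
    if not db_ms or not network_ms:
        return "alert_human"  # not enough data to decide automatically
    avg_db, avg_net = mean(db_ms), mean(network_ms)
    if avg_db + avg_net <= profile.max_response_ms:
        return "no_action"
    # Database time dominates: try a larger cache if the profile allows it.
    if avg_db >= avg_net and profile.cache_tunable:
        return "increase_cache"
    # Network time dominates: move traffic if a standby server exists.
    if avg_net > avg_db and profile.has_standby_server:
        return "reroute"
    return "alert_human"


# Usage with made-up samples (milliseconds).
orders = ServiceProfile("orders", max_response_ms=200.0,
                        cache_tunable=True, has_standby_server=True)
slow_db = recommend_action(orders, db_ms=[180, 220], network_ms=[20, 30])
slow_net = recommend_action(orders, db_ms=[40, 60], network_ms=[300, 340])
```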
Time Series Database
The time series database deals with both structured and unstructured complex data. This database stores all raw data that is recorded around service performance, such as time, service response, database response, network latency, and other information that could be used in the service’s performance profile. There are two key roles of the time series database:
• Storing massive amounts of time series data to actively monitor and analyze performance
• Recording all performance issues (e.g., alerts) and the solutions to those issues so that the system can respond right away the next time they occur
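The two roles above can be sketched with a small in-memory store: one structure for timestamped samples and one for remembered issue resolutions. This is a sketch, not a real time series database; `TimeSeriesStore`, its method names, and the metric name "orders.response_ms" are hypothetical.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class TimeSeriesStore:
    """Hypothetical in-memory stand-in for the time series database."""

    def __init__(self) -> None:
        self._times: Dict[str, List[float]] = defaultdict(list)
        self._values: Dict[str, List[float]] = defaultdict(list)
        self._resolutions: Dict[Tuple[str, str], str] = {}

    def record(self, metric: str, timestamp: float, value: float) -> None:
        """Insert a sample, keeping each metric ordered by timestamp."""
        i = bisect.bisect_right(self._times[metric], timestamp)
        self._times[metric].insert(i, timestamp)
        self._values[metric].insert(i, value)

    def window(self, metric: str, start: float, end: float) -> List[float]:
        """Return values whose timestamps satisfy start <= t <= end."""
        times = self._times.get(metric, [])
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        return self._values.get(metric, [])[lo:hi]

    def record_resolution(self, service: str, issue: str, resolution: str) -> None:
        """Remember what fixed a given issue for a given service."""
        self._resolutions[(service, issue)] = resolution

    def known_resolution(self, service: str, issue: str) -> Optional[str]:
        """Look up a previously recorded fix, or None if the issue is new."""
        return self._resolutions.get((service, issue))


# Usage: samples may arrive out of order; queries still see them sorted.
store = TimeSeriesStore()
store.record("orders.response_ms", 3.0, 250.0)
store.record("orders.response_ms", 1.0, 120.0)
store.record("orders.response_ms", 2.0, 130.0)
recent = store.window("orders.response_ms", 1.5, 3.0)
store.record_resolution("orders", "slow_response", "increase_cache")
```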
Alert Management
The alert management system is a piece of software that handles services placed into an alert status by their respective agents, making sure the alerts are dealt with per the predetermined policies stored in the service repository. It’s the job of the alert management system to:
• Capture alerts transmitted through the communication manager from the agents; typically, these are alerts generated by services falling out of thresholds, or failing altogether.
• Evaluate each alert in terms of severity and connect to the analytics engine for an analysis of the issue and potential automatic corrective action. The alert management system then generates corrective action, if instructed to do so by the analytics engine. It can also alert humans.
• Record each alert, including cause and resolution, in the time series database to aid in future analysis and determine the right path to fix future performance problems.
• Trace through paths to better determine the origin of the alert and other services that should be dealt with in the resolution of the problem.
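The capture, evaluate, act, and record steps above can be sketched as follows. All names here (`Alert`, `AlertManager`, `severity_policy`, `analyze`) are hypothetical, the analytics engine is replaced by a plain callable, and the rule that severity 3 or higher always goes to a human is an assumption made for the example, not something the article specifies. Path tracing is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Alert:
    service: str
    kind: str             # e.g. "threshold" or "failure"
    resolution: str = ""  # filled in once the alert has been handled


@dataclass
class AlertManager:
    """Hypothetical alert handler driven by a severity policy."""
    severity_policy: Dict[str, int]            # alert kind -> severity, higher is worse
    analyze: Callable[[Alert], Optional[str]]  # stand-in for the analytics engine
    history: List[Alert] = field(default_factory=list)

    def handle(self, alert: Alert) -> str:
        severity = self.severity_policy.get(alert.kind, 0)
        action = self.analyze(alert)
        # Assumption: escalate when no automatic action is proposed
        # or when the alert is severe (3 or higher).
        if action is None or severity >= 3:
            alert.resolution = "alert_human"
        else:
            alert.resolution = action
        self.history.append(alert)  # record the alert together with its resolution
        return alert.resolution


# Usage with a toy engine that only knows how to fix threshold alerts.
manager = AlertManager(
    severity_policy={"threshold": 1, "failure": 3},
    analyze=lambda a: "reroute" if a.kind == "threshold" else None,
)
first = manager.handle(Alert("orders", "threshold"))
second = manager.handle(Alert("orders", "failure"))
```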