
Senser Extends AIOps Reach to Manage SLOs and SLAs

Senser is extending the reach of its artificial intelligence for IT operations (AIOps) platform to include the ability to define and maintain service level objectives (SLOs) and service level agreements (SLAs). SLOs are internal performance goals that require access to telemetry data from service level indicators (SLIs), while an SLA is a formal commitment to maintain specific levels of service. Senser CEO Amir Krayden said the company’s AIOps platform collects data from SLIs and then applies predictive AI models to enable IT teams to achieve SLOs and SLAs.

The Senser platform leverages extended Berkeley Packet Filter (eBPF) and graph technology to gain visibility into the entire IT environment without requiring IT teams to deploy agent software. Machine learning algorithms then aggregate and analyze that data to define thresholds for predicting performance and to recommend benchmarks for tracking SLOs and SLAs. That approach provides a single source of truth for identifying the actual level of service being delivered, based on a topology of the infrastructure, network, applications and application programming interfaces (APIs) that makes it possible to identify the root cause of issues and the potential impact of an outage or degradation of performance.

IT teams have been attempting to achieve and maintain SLAs and SLOs for decades, but given all the dependencies that exist in a distributed computing environment, that goal is difficult to reach. Senser is making the case that applying AI within a platform that automates the management of IT makes it possible to consistently define and maintain SLOs and SLAs while reducing the cognitive load that would otherwise be required. Senser is also working toward adding generative AI capabilities that provide summaries explaining what IT events have occurred.
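Senser has not published the details of its models, but the basic arithmetic connecting SLIs, SLOs and error budgets can be sketched in a few lines. The following is a minimal illustration, assuming a simple availability SLI derived from request success counts; the function names are hypothetical, not Senser's API:

```python
# Minimal sketch of SLO/error-budget math from SLI data.
# Assumes an availability SLI (fraction of successful requests).
# All names here are illustrative, not Senser's implementation.

def availability_sli(successful: int, total: int) -> float:
    """SLI: the measured fraction of successful requests."""
    return successful / total if total else 1.0

def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent.

    The budget is (1 - slo_target); the spend is (1 - sli).
    1.0 means untouched, 0.0 means exhausted, and a negative
    value means the SLO (and possibly an SLA) has been breached.
    """
    budget = 1.0 - slo_target
    spent = 1.0 - sli
    return 1.0 - spent / budget if budget else 0.0

# Example: 99.9% SLO, 999,500 of 1,000,000 requests succeeded.
sli = availability_sli(999_500, 1_000_000)       # 0.9995
remaining = error_budget_remaining(sli, 0.999)   # 0.5 -> half the budget left
```

The useful property of framing SLOs this way is that "remaining error budget" is a single number an AI model can forecast and alert on before a formal SLA commitment is actually breached.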
Collectively, the goal is to provide IT teams with a more efficient, holistic approach to monitoring and observability than legacy platforms can achieve, said Krayden. At the core of that capability is eBPF, a technology that allows sandboxed programs to run within the Linux kernel. That capability enables networking, storage and observability software to scale to much higher levels of throughput because data no longer has to be copied out to user space for processing. That’s especially critical for any application that needs to dynamically process massive amounts of data in near-real-time. As the number of organizations running the latest versions of Linux continues to increase, more hands-on experience with eBPF will be gained. IT teams may not need to concern themselves with what is occurring inside the kernel of the operating system, but they do need to understand how eBPF ultimately reduces the total cost of running IT at scale. Ultimately, the goal is to reduce the complexity that today makes highly distributed computing environments all but impossible for IT teams to manage manually, in an era where the pace at which applications are built and deployed only continues to accelerate.

Read More

Grafana Labs Acquires Asserts.ai to Bring AI to Observability

At its ObservabilityCON event, Grafana Labs today announced it has acquired Asserts.ai to automate the configuration and customization of dashboards. In addition, the company is previewing an ability to apply artificial intelligence (AI) to incident management to make it simpler to surface the root cause of an issue. Sift is a diagnostic assistant in Grafana Cloud that automatically analyzes metrics, logs and tracing data, while Grafana Incident is a generative AI tool that summarizes incident timelines with a single click, creates metadata for dashboards and simplifies the writing of PromQL queries. Grafana Labs is also making generally available an Application Observability module for Grafana Cloud to provide a more holistic view of IT environments.

Finally, Grafana Beyla, an open source auto-instrumentation project that makes use of the extended Berkeley Packet Filter (eBPF), is now also generally available. That tool enables DevOps teams to collect telemetry data for an IT environment from a sandbox running within the kernel of the operating system. That approach makes it simpler to automatically instrument an IT environment, but there are instances where DevOps teams managing complex applications will still need to collect telemetry data via the user space of an application.

Richi Hartmann, director of community for Grafana Labs, said collectively, these additional capabilities will make it simpler to apply observability across increasingly complex IT environments. For example, the AI technologies developed by Asserts.ai will make it possible for DevOps teams to start sending data to Grafana Labs that will enable the cloud service to identify the applications and infrastructure being used. AI models will then be able to automatically generate a custom dashboard for that environment that DevOps teams can extend as they see fit, said Hartmann.
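The dashboard-generation flow Hartmann describes can be pictured as a templating step: given the services detected in telemetry, emit a dashboard definition with one panel per service. The sketch below is illustrative only; it follows Grafana's dashboard JSON model loosely (real dashboards require many more fields), and the metric name and function are assumptions, not Asserts.ai's implementation:

```python
import json

def dashboard_for(services: list[str]) -> dict:
    """Build a bare-bones dashboard definition with one time-series
    panel per detected service. Illustrative sketch only; real
    Grafana dashboard JSON has many additional required fields."""
    return {
        "title": "Auto-generated service overview",
        "panels": [
            {
                "title": f"{svc} request rate",
                "type": "timeseries",
                # A PromQL query scoped to the detected service
                # (metric name http_requests_total is an assumption).
                "targets": [
                    {"expr": f'rate(http_requests_total{{service="{svc}"}}[5m])'}
                ],
            }
            for svc in services
        ],
    }

dash = dashboard_for(["checkout", "inventory"])
print(json.dumps(dash, indent=2))
```

The point of the sketch is the division of labor: the AI model's job is to infer the `services` list (and sensible queries) from incoming telemetry, after which dashboard generation itself is mechanical.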
In general, machine learning algorithms and generative AI are starting to be more widely applied to observability. The ultimate goal is to automatically identify issues in ways that reduce the cognitive load required to manage complex IT environments, while also making it easier to launch queries that identify bottlenecks that could adversely impact application performance and availability. It’s not clear to what degree observability tools might eliminate the need for monitoring tools that track pre-defined metrics, but most DevOps teams will likely be using a mix of both for the foreseeable future.

In the meantime, IT environments are only becoming more complex as various types of cloud-native applications are deployed alongside existing monolithic applications that are continuously being updated. The challenge is that the overall size of DevOps teams is not expanding, so there is a greater need for tools that streamline the management of DevOps workflows. AI will naturally play a larger role in enabling organizations to achieve that goal, but it’s not likely to replace the need for DevOps engineers, said Hartmann. At the same time, many DevOps teams will naturally gravitate toward organizations that make the tools they need to succeed available. Today, far too many manual tasks are increasing turnover as DevOps teams burn out, so organizations that want to hire and retain the best DevOps engineers will need to invest in AI. Of course, DevOps, at its core, has always been about ruthlessly automating as many manual tasks as possible. AI is only the latest in a series of advances that, over time, continue to make DevOps more accessible to IT professionals of […]

Read More

DevOps Halloween: Tricks and Treats

The world of DevOps is like a labyrinth, filled with choices at every turn. Some paths lead to efficiency and success, while others may lead to unexpected challenges and delays. In the spirit of Halloween, let’s explore the tricks and treats of DevOps choices to ensure your team ends up with a bag full of treats rather than some nasty tricks.

Version Control: Treats of Consistency, Tricks of Complexity

Treat: Implementing a robust version control system is crucial for any DevOps team. Tools like Git provide a reliable way to track changes, collaborate on code and maintain a history of your project’s evolution.

Trick: However, the complexity of these systems can lead to confusion and errors if not properly understood. Branching strategies, for instance, need to be clearly defined to avoid chaotic merges and lost work.

Continuous Integration & Continuous Deployment (CI/CD): Speedy Treats, Tricky Configurations

Treat: CI/CD pipelines automate the process of code integration, testing and deployment, speeding up release cycles and ensuring more reliable software.

Trick: Setting up these pipelines can be complex and error-prone. Misconfigurations can lead to failed builds, delayed releases or even the deployment of buggy code to production.

Containerization: The Sweetness of Isolation, Beware of the Overhead

Treat: Containers provide isolated environments for applications, ensuring consistency across development, testing and production. Tools like Docker and Kubernetes have revolutionized application development and deployment.

Trick: Containerization adds an additional layer of complexity to your infrastructure. Mismanagement of containers can lead to resource inefficiencies, and Kubernetes itself has a steep learning curve.

Monitoring and Logging: The Treat of Visibility, The Trick of Overload

Treat: Comprehensive monitoring and logging give teams visibility into system performance and behavior, enabling proactive issue resolution and performance optimization.

Trick: The sheer volume of logs and metrics can be overwhelming. Without proper tools and strategies for filtering and analysis, important information can be lost in the noise.

Infrastructure-as-Code (IaC): Sweet Automation, Sour Complexity

Treat: IaC tools like Terraform and AWS CloudFormation allow teams to automate and version infrastructure setup, ensuring consistency and reducing manual errors.

Trick: IaC scripts can become complex and difficult to maintain. Errors in these scripts can lead to misconfigured infrastructure, potential security issues and resource waste.

Collaboration and Communication: Treats of Teamwork, Tricks of Misunderstanding

Treat: DevOps emphasizes the importance of collaboration between development and operations teams, fostering a culture of shared responsibility and continuous improvement.

Trick: Miscommunication and lack of alignment between teams can lead to inefficiencies, mistakes and a breakdown in the collaborative process.

Security: The Unseen Specter

Treat: DevSecOps enables a “shift left” on security, integrating security checks and practices early in the development life cycle, ensuring safer, more secure applications.

Trick: But this integration requires continuous attention and maintenance. Outdated dependencies, misconfigured settings and inadequate security practices can leave your applications vulnerable, turning the unseen specter of security issues into a ghastly reality.

What about you? What areas of DevOps are ripe for trick or treat this Halloween season? In the grand scheme, DevOps offers a treasure trove of benefits, from faster releases and improved collaboration to higher-quality software. However, it’s not without its challenges. Navigating the DevOps landscape requires a careful balance between embracing automation and maintaining control. By being aware of the potential tricks and focusing on the treats, teams can build efficient, reliable and […]
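The "trick of overload" in monitoring is usually tamed with filtering before analysis. As a minimal sketch, the snippet below keeps only log lines at or above a severity threshold; the "LEVEL message" line format and the function names are assumptions for illustration, since real pipelines typically parse structured (JSON) logs:

```python
# Severity-threshold filter: drop log noise before analysis.
# The "LEVEL message" line format is an assumption for this sketch;
# production pipelines usually work on structured (JSON) logs.

SEVERITY = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}

def filter_logs(lines: list[str], min_level: str = "WARNING") -> list[str]:
    """Keep only lines whose leading severity token is at or
    above min_level; unknown tokens are treated as noise."""
    threshold = SEVERITY[min_level]
    kept = []
    for line in lines:
        level = line.split(" ", 1)[0]
        if SEVERITY.get(level, 0) >= threshold:
            kept.append(line)
    return kept

logs = [
    "DEBUG cache miss for key user:42",
    "INFO request served in 12ms",
    "ERROR upstream timeout after 3 retries",
]
print(filter_logs(logs))  # only the ERROR line survives
```

Even a crude threshold like this illustrates the point of the section: without some filtering strategy, the one line that matters is buried under the lines that do not.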

Read More