NPW Insights (Free): Week 4/4 for DevOps Engineer

Confidential containers on ACI. Future of service mesh. New job roles in coming years. Updates in Amazon Inspector, Azure Logic Apps telemetry, Amazon Neptune. AlloyDB global.

NPW Research

Orchestration

Confidential containers on ACI

The service is now in public preview. It lets Azure Container Instances protect data in use by processing it in encrypted memory, using a hardware-managed key unique to each container group. These instances also support policies that verify the code processing the data is trusted before execution. Another useful feature is remote guest attestation, which lets a client review the attestation report before sending sensitive data to the application within the container group. DETAILS

Future of service mesh

Bill Mulligan of Isovalent says it will just become another part of networking, “allowing us to have a complete connectivity story from Layer 1 to Layer 7 for microservices, applications, and workloads.” Service mesh solves the networking complications of dynamic distributed computing, but because it is limited to the application layer, it has no visibility below that layer. What’s needed is observability and control on Layer 3/4, and API communication on Layer 7. That’s what’s happening, and eBPF is accelerating this shift, he says. ARTICLE

Career

New job roles in the coming years

The next generation of cloud will be defined by modern compute, polyglot DBs, AI/ML, and hybrid multicloud. Consequently, DevOps will branch out into Cloud-native Ops, Edge-native Ops, AI/MLOps, and multi-cloud architecture roles. Here are links to Google Cloud learning pathways and free courses for the skill sets these roles will need. GOOGLE BLOG

Security

Threat Detection and Response Report

A key theme from a Google Cloud survey of 400 SecOps practitioners was the comparison between on-prem and cloud security. 25% more respondents said cloud offers more “opportunities to learn” because of richer telemetry and more automation. 84% believed they need to automate more to manage evolving threats better, and respondents named crypto mining and data leakage as bigger threats on cloud than on-prem. They also considered the skills and knowledge of SecOps teams inadequate, even though the majority thought those teams were well-staffed. REPORT

Open source security report

The Synopsys report looked at findings from 1,703 commercial codebase audits. 96% contained open source code, and 89% contained open source code that was more than 4 years out of date, an increase of 5 percent over 2022. 91% didn’t apply updates because they weren’t aware an update was available. RECOMMENDATIONS

Code scan Lambda functions

The Amazon Inspector feature, in preview, will enable vulnerability and best practice scans for custom proprietary code within a Lambda function. UPDATE

Primer on Cloud DevSecOps

DevSecOps makes security a requirement for all stakeholders, and Cloud DevSecOps applies it specifically to cloud. It speeds up application security improvements and minimizes delays from security problems. To implement it, tag cloud resources and automate cloud security operations. Threat model in the Plan phase, use DAST and SAST tools in Build and Test, use SIEM for monitoring, and in the Operate phase collect logs and use WAF and RASP. CNCF BLOG

Observability

Updated Azure Logic Apps telemetry

The recently updated Application Insights for Azure Logic Apps simplifies how observability data is queried, accessed and stored. It reduces the cost of storing telemetry data, reduces verbosity of traces, offers more descriptive exceptions, reduces redundant data, and provides better filtering. The blog is a detailed guide to major actions in the Requests, Traces, Exceptions and Dependencies tables, and to querying and filtering. AZURE BLOG

CloudWatch Internet Monitor enters GA

The feature lets you monitor internet availability and performance metrics between end users and AWS applications. UPDATE

ML-based telemetry analytics in AWS

The post describes an architecture that collects telemetry from data pipeline jobs to identify abnormal runtimes and slow-running jobs, detect insider threats, and monitor proactively. Automated monitoring metrics are collected from AWS analytics services and sent to CloudWatch, an alarm is set for event detection, and notifications go to an SNS topic. CloudWatch provides anomaly detection on metrics, and OpenSearch Service is used to combine query access times with employee data to detect insider threats. AWS ARCHITECTURE BLOG
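
To make the idea of flagging abnormal job runtimes concrete, here is a minimal sketch in plain Python. It is not CloudWatch's anomaly detection model and not the architecture from the post; it assumes a much simpler stand-in, a band of mean plus or minus three standard deviations over past runtimes. The function names and the sample runtimes are hypothetical.

```python
from statistics import mean, stdev

def runtime_band(history, width=3.0):
    """Expected range: mean +/- width * sample standard deviation of past runtimes."""
    mu, sigma = mean(history), stdev(history)
    return mu - width * sigma, mu + width * sigma

def is_anomalous(runtime, history, width=3.0):
    """True when a runtime falls outside the expected band."""
    low, high = runtime_band(history, width)
    return runtime < low or runtime > high

# Hypothetical runtimes (minutes) of a nightly pipeline job.
past_runs = [41, 39, 42, 40, 43, 38, 41, 40]

for latest in (42, 75):
    flag = "ANOMALY" if is_anomalous(latest, past_runs) else "ok"
    print(f"runtime {latest} min: {flag}")
```

In a real pipeline the flagged value would trigger the alarm and notification step rather than a print statement.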

SaaS outages and observability

Modern apps depend on multiple cloud services which, when combined, lower the reliability of the application. Lack of visibility into these dependencies slows incident response teams as status pages are usually manually updated. How to fix? Treat cloud dependencies as your network and infrastructure, and build observability into their performance, availability, and functionality. ARTICLE
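
A quick way to see why combined dependencies lower reliability: if an app needs every service up at the same time, and failures are assumed to be independent (a simplification), the availabilities multiply. The sketch below uses hypothetical figures of five services at 99.9% each; the function name is illustrative.

```python
from math import prod

def composite_availability(availabilities):
    """Availability of an app that needs all dependencies up at once,
    assuming independent failures (a simplifying assumption)."""
    return prod(availabilities)

# Hypothetical figures: five cloud services, each at 99.9% availability.
deps = [0.999] * 5
combined = composite_availability(deps)

hours_per_year = 24 * 365
expected_downtime_hours = (1 - combined) * hours_per_year

print(f"combined availability: {combined:.4%}")
print(f"expected downtime: {expected_downtime_hours:.1f} hours/year")
```

The combined figure is always lower than that of the weakest single dependency, which is the case for monitoring each dependency directly instead of relying on status pages.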

Guide to distributed tracing

Distributed tracing is a way of tracing a single transaction across multiple cloud-native services. The overview here is comprehensive and includes specifics about how to implement distributed tracing, how to analyze trace data, some best practices and the tools available. GUIDE
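
The core mechanism can be sketched in a few lines: the first service creates a trace ID, every service records a span, and the IDs travel with each downstream call so the spans can be stitched back into one transaction. This is a toy illustration under stated assumptions, not the guide's implementation or any tracing library's API; the header names, service names, and the in-memory `COLLECTED` list are all hypothetical.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    trace_id: str             # shared by every span in one transaction
    span_id: str              # unique to this unit of work
    parent_id: Optional[str]  # span that triggered this one, None for the root
    name: str
    start: float = field(default_factory=time.perf_counter)
    end: Optional[float] = None

COLLECTED = []  # stand-in for a tracing backend

def start_span(name, headers=None):
    """Continue the caller's trace if headers carry one, else start a new trace."""
    headers = headers or {}
    trace_id = headers.get("x-trace-id") or secrets.token_hex(16)
    parent_id = headers.get("x-parent-span-id")
    return Span(trace_id, secrets.token_hex(8), parent_id, name)

def finish(span):
    """Close the span and hand it to the (in-memory) backend."""
    span.end = time.perf_counter()
    COLLECTED.append(span)

def outgoing_headers(span):
    """Headers attached to downstream calls so the trace continues there."""
    return {"x-trace-id": span.trace_id, "x-parent-span-id": span.span_id}

def inventory_service(headers):
    span = start_span("inventory.check", headers)
    finish(span)

def checkout_service():
    span = start_span("checkout.request")
    inventory_service(outgoing_headers(span))  # simulated network call
    finish(span)
    return span

root = checkout_service()
for s in COLLECTED:
    print(s.name, s.trace_id, s.parent_id)
```

Both spans share one trace ID, and the inventory span's parent ID points at the checkout span, which is what lets a backend rebuild the call tree.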

App Building

Google Cloud AlloyDB global

The PostgreSQL-compatible database service is now available in 16 new regions, and coming to more soon. Google Cloud claims the service is 4 times faster for transactional workloads and up to 100 times faster for analytical queries than standard PostgreSQL. UPDATE

Amazon Neptune serverless scales down

Minimum scaling requirement reduced from 2.5 Neptune Capacity Units (NCUs) to 1, so the graph database will use 2.5 times fewer resources when it isn’t responding to queries. UPDATE

Spring Cloud GCP 4.0 announced

Brings support for Spring Boot 3.x, but migration (guide included) involves breaking changes and Java 17 is a minimum requirement. GOOGLE BLOG

Operations + Developer Tools

AWS Trusted Advisor Dashboard

AWS Trusted Advisor gives recommendations to optimize around cost, performance, security, fault tolerance and service quotas. This walkthrough of its dashboard has many tips on querying that data in useful ways, accessing various features, and setting up useful alerts. There is a section for each of the five “pillars” with specifics that can help pinpoint issues at a granular level. Good to check out, especially for features that you might have missed so far. GUIDE

Team topologies and platform engineering

Key takeaways from the interview with Manuel Pais, co-author of Team Topologies: A platform team is a stream-aligned team (one which has build-and-run ownership), and a platform can consist of multiple teams instead of being a single bucket. Platform teams can have enabling teams, which impart new skills to other teams. When scaling platform teams, there should be internal cohesion, and they should provide a consistent interface to be used by their internal consumers. This requires finding boundaries between different platforms. INTERVIEW

The cultural side of platform engineering: Spotify’s story

A Director of Engineering at Spotify uses GitHub activity and HR data to understand how developers work, interact, and make decisions. He uses this info to decide if teams should be split or expanded, and where to deploy skills, with the goal of mitigating bottlenecks at individual and team level. STORY

Which CSP products got the most attention and which topics generated keen interest, based on what 12,000+ DevOps engineers, software engineers and solution architects read the previous week.

Top reads last week for DevOps Engineers

  • The Microsoft-sponsored GigaOm report on SQL Server benchmarking on Azure VMs versus EC2 got high attention. Not unexpected, since the study showed Azure VMs outperforming EC2 on both speed and cost.
  • The Google-IDC data and AI trends report was widely read. The findings came from a survey of 800 organizations (but you need to sign up to access it).
  • Also trending was the Azure service in preview that provides high-bandwidth, low-latency read-write access for HPC workloads.