NPW Insights (Free): Week 3/4 for DevOps Engineer

Kata Containers in AKS, Kubernetes 1.25 in EKS, Wasm future, CloudNativeSecurityCon 2023 highlights, WAF for App Runner, AWS Security Hub new controls, AKS node cost optimization, GitHub repo on building a DevOps career

NPW Research

Orchestration

Pod sandboxing support now in AKS using Kata Containers

What it does: Enables maintaining isolation boundaries between multiple Kata containers within a single guest VM in a shared Azure Kubernetes Service cluster. This isolation is useful for containers that process sensitive information.
What is Kata Containers: Open source project to build secure container runtimes with lightweight VMs through workload isolation.
How isolation is achieved: Kata Containers on AKS uses a security-hardened Azure hypervisor. Pod-level isolation is achieved through a nested Kata VM, which carves resources from the parent VM node. Each Kata pod gets its own kernel.

The future of WebAssembly (Wasm)

WASI, system interface for WebAssembly, extending the reach of Wasm beyond browsers
Wasm yet to achieve the requisite maturity for backend apps
Kubernetes and Wasm to grow in tandem, with the latter solving problems related to application runtime

Also: Amazon EKS, EKS Distro, and EKS Anywhere now support Kubernetes 1.25. See prerequisites for upgrade, and key updates in Kubernetes 1.25.

Security

Highlights of CloudNativeSecurityCon 2023

Software supply chain security: A new project that aggregates software supply chain data; approaches to build trust between software supply chain artifacts; emerging Supply Chain Levels for Software Artifacts (SLSA) standard that maps relationships between artifacts.
Spotlight: Yahoo demonstrated its approach to software supply chain security across 700+ K8s clusters – image signature and freshness check policy was a highlight, as they publish 5K+ container images daily.
Other themes: Using IaC for automating policy-based compliance; current state of software supply chain verification capabilities and projects like Kyverno, GUAC, and Sigstore policy controller.

WAF support for AWS App Runner

Lets you implement web access control lists (ACLs) in front of App Runner endpoints.
ACLs can be created with custom rules, or use Managed Groups for AWS WAF.

Configuring Google Cloud Run for least privilege access

For internal users, disallow unauthenticated access, create a custom service account, and grant it only the Cloud Run permissions it needs.
When Cloud Run accesses other services, don’t use the default Compute Engine service account. Create Cloud Run service identities and grant them minimal permissions.
Use the IAM Recommender service to remove excess permissions.

Also: Seven new controls in AWS Security Hub automate security checks against best practices for Amazon ElastiCache.

Cost Management

How to optimize AKS Node Cost with on-demand & spot VMs

A baseline number of pods is deployed on on-demand VMs for reliability, and the spot node pool is scaled by load for cost savings.
Use node affinity to constrain the nodes on which the scheduler places pods, and topology spread constraints to distribute pods across node pools in a cluster.
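As a rough illustration of the two scheduling controls above, here is a minimal Python sketch that builds a pod spec as a plain dictionary. The node label key `example.com/pool-type` is hypothetical (something you would apply to both node pools yourself), and the spot taint key is an assumption about AKS spot node pools that should be checked against your cluster. It is a sketch under those assumptions, not the configuration from the linked article.

```python
# Hypothetical node label applied to both pools ("on-demand" or "spot").
POOL_LABEL_KEY = "example.com/pool-type"
# Assumption: AKS taints spot nodes with this key; verify on your cluster.
SPOT_TAINT_KEY = "kubernetes.azure.com/scalesetpriority"


def build_pod_spec(app_label: str, image: str) -> dict:
    """Return a pod spec that prefers spot nodes and spreads pods across pools."""
    return {
        "affinity": {
            "nodeAffinity": {
                # Soft preference: favour spot nodes, fall back to on-demand.
                "preferredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "weight": 100,
                        "preference": {
                            "matchExpressions": [
                                {
                                    "key": POOL_LABEL_KEY,
                                    "operator": "In",
                                    "values": ["spot"],
                                }
                            ]
                        },
                    }
                ]
            }
        },
        # Allow scheduling onto tainted spot nodes.
        "tolerations": [
            {
                "key": SPOT_TAINT_KEY,
                "operator": "Equal",
                "value": "spot",
                "effect": "NoSchedule",
            }
        ],
        # Keep the pod count per pool type within a skew of 1 where possible.
        "topologySpreadConstraints": [
            {
                "maxSkew": 1,
                "topologyKey": POOL_LABEL_KEY,
                "whenUnsatisfiable": "ScheduleAnyway",
                "labelSelector": {"matchLabels": {"app": app_label}},
            }
        ],
        "containers": [{"name": app_label, "image": image}],
    }
```

Calling `build_pod_spec("web", "nginx:1.25")` returns a dictionary that can be placed under `spec.template.spec` of a Deployment and serialized to YAML or JSON. Using `ScheduleAnyway` keeps the spread a soft rule, so pods still schedule if the spot pool is evicted.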

14 tactics to keep GKE clusters secure, cost-efficient, and highly available

What’s covered: Autoscaling best practices, mixed-instance strategy, deciding between regional and zonal topology, using spot VM groups to minimize cost. For security, using CIS benchmarks with built-in tool, implementing RBAC, securing node metadata, avoiding IP overlap, and using Cloud DNS.
What stood out: A strategy called bin packing, which goes against the idea of even pod distribution across nodes and instead maximizes node utilization. Pods are added to nodes in a compacted way, but some buffer is left for the shared CPU.
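To make the bin packing idea concrete, here is a small Python sketch using the first-fit decreasing heuristic, with a CPU buffer held back on every node. The heuristic, the function name, and the numbers are illustrative assumptions; this is not how the GKE scheduler or the linked article implements it.

```python
def bin_pack(pod_cpu_millis, node_cpu_millis, buffer_millis):
    """Pack pod CPU requests onto as few nodes as first-fit decreasing finds.

    Each node keeps `buffer_millis` of CPU unallocated as shared headroom.
    Returns a list of nodes, each a list of the pod requests placed on it.
    """
    usable = node_cpu_millis - buffer_millis
    nodes = []
    # Place the largest pods first so small pods can fill the remaining gaps.
    for request in sorted(pod_cpu_millis, reverse=True):
        if request > usable:
            raise ValueError(f"pod requesting {request}m does not fit on any node")
        for node in nodes:
            if sum(node) + request <= usable:
                node.append(request)
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([request])
    return nodes


placement = bin_pack([2000, 1500, 1000, 500, 500, 250],
                     node_cpu_millis=4000, buffer_millis=400)
print(placement)  # [[2000, 1500], [1000, 500, 500, 250]]
```

With 4000m nodes and a 400m buffer, the six pods fit on two nodes, where an even spread over a larger pool would leave every node mostly idle.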

Tagging strategy for granular cloud cost visibility in AWS

What it does: Requires creating a tag taxonomy, documenting the tagging strategy (to attribute spend to cost centers), and enforcing it across teams.
What’s covered: How to create tag policies for tags in AWS Organizations (top-down or child organizations driven); attach policy to organizational units to enforce them across organization; and use of Service Control Policies for stricter enforcement.
Also: Use Tag Editor to identify untagged resources, and AWS Config to support ongoing compliance.
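The enforcement step can be pictured as a check of each resource's tags against the documented taxonomy. The Python sketch below is a simplified, offline illustration: the tag keys, allowed values, and function name are hypothetical, and it does not call Tag Editor, AWS Config, or any AWS API.

```python
# Hypothetical taxonomy: required tag keys and their allowed values.
REQUIRED_TAGS = {
    "CostCenter": {"1001", "1002"},
    "Environment": {"prod", "staging", "dev"},
}


def tag_violations(resource_tags, required=REQUIRED_TAGS):
    """Return a list of problems for one resource's tags (empty if compliant)."""
    problems = []
    for key, allowed in required.items():
        if key not in resource_tags:
            problems.append(f"missing tag: {key}")
        elif resource_tags[key] not in allowed:
            problems.append(f"invalid value for {key}: {resource_tags[key]}")
    return problems


print(tag_violations({"CostCenter": "1001", "Environment": "prod"}))  # []
print(tag_violations({"Environment": "qa"}))
# ['missing tag: CostCenter', 'invalid value for Environment: qa']
```

In practice the input would come from a resource inventory export, and any resource with a non-empty result could not be attributed to a cost center until its tags are fixed.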

2 new AWS Cost Categories features for grouping resources

Now group AWS resources by Region, and use the OR operator to define cost category rules.
With ‘OR’, rules can now be more inclusive across dimensions (Linked Account, Charge Type, Service, Cost Allocation Tags, Region, Cost Category).
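The effect of OR can be shown with a toy rule evaluator: a line item falls into a category as soon as any one of the rule's dimension conditions matches. This Python sketch is a simplified model with hypothetical category names and line items. It is not the AWS Cost Categories API, and it assumes rules are evaluated in order with the first match winning.

```python
def matches_any(line_item, conditions):
    """OR semantics: True if any (dimension, allowed values) pair matches."""
    return any(line_item.get(dim) in values for dim, values in conditions)


def categorize(line_item, rules, default="Uncategorized"):
    """Return the first category whose OR-rule matches the line item."""
    for category, conditions in rules:
        if matches_any(line_item, conditions):
            return category
    return default


# Hypothetical rule: anything in eu-west-1 OR any ElastiCache spend.
rules = [
    ("EU Platform", [("Region", {"eu-west-1"}),
                     ("Service", {"Amazon ElastiCache"})]),
]

print(categorize({"Region": "eu-west-1", "Service": "Amazon EC2"}, rules))
# EU Platform
print(categorize({"Region": "us-east-1", "Service": "Amazon EC2"}, rules))
# Uncategorized
```

Under AND semantics the first line item would not match, since it satisfies the Region condition but not the Service one.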

Career

GitHub repo: Step-by-step guide to becoming a DevOps Engineer in 2023

About the guide: Milan Milanovich has created this excellent repository with a comprehensive game plan for becoming a DevOps pro. It lists resources such as ebooks, articles, and video courses, along with the key technologies in use in the industry.
The learning roadmap: Broken into segments, starting with Git, a coding language, and Linux as the basics. These are followed by networking, security, and server management, then container orchestration, Infrastructure as Code, CI/CD, observability, and software engineering practices, and finally building familiarity with one cloud environment.

Also: AWS Gallup APAC Digital Skills Report released.

Operations + Developer Tools

Site Reliability Engineering: where it's headed in 2023

Increased adoption of OpenTelemetry, and SLO codification with OpenSLO.
Observability to become foundation of SRE, with increased collaboration from Devs and QA teams
Increased ownership of code by software engineers
Policy-as-Code adoption alongside self-serve provisioning to guardrail resource usage.

Also: AWS Transfer Family resources for AS2 can now be configured and managed with CloudFormation templates

Provisioning + Runtime

Azure Managed Lustre enters public preview

Lustre is an open-source parallel file system for large-scale cluster computing, ideal for HPC and AI workloads.
Built on Azure Managed Disks, two SSD-based SKUs will be offered with 125MBps and 250MBps per TB of capacity, scalable up to 768TB.

SQL Server on Azure VMs better price-performance than on EC2: GigaOm report

Runs 57% faster and costs 54% less than on EC2 with 3-year commitment and Azure Hybrid Benefit (study commissioned by Microsoft)
Azure Ebdsv5 VMs are optimized for database workloads, as is Premium SSD v2 Disk Storage

Azure portal updates from Jan 2023

New VM Scale Sets use Flexible orchestration mode by default instead of Uniform orchestration.
Force delete for VMs and VM Scale Sets, which bypasses graceful shutdown and some cleanup operations.

Also: Azure HPC Cache Premium Read-Write, which provides up to 84TB capacity for a single cache and 20GBps read throughput at low latency, is now in preview; the price of Azure HPC Cache Standard has dropped.

Observability

How to use recently added in-product metrics in Azure Logic Apps to monitor workflow performance and health.

What CSP products got the highest attention. Topics that generated keen interest. Based on what was read by 12,000+ DevOps engineers, software engineers and solution architects the previous week.

CSP trends last week

  • Databases saw the most important announcements, both from Azure
  • In fact, Azure updates accounted for 52% of total attention on stories
  • Google Cloud and Azure had important updates in VMs, the second most active topic

Products that trended last week

  • Caching becoming possible in Azure Container Registry instance
  • Confidential GKE Nodes' availability in confidential VMs
  • Azure SQL updates including automatic key rotation for CMKs
  • Azure Cache for Redis allowing enhanced passive geo-replication
  • Stateful firewall rules for tag-based resources in AWS Network Firewall

Trending topics last week

  • Provisioning related news accounted for 45% of total attention on stories
  • App building related updates accounted for another 27%