Dataplex, Google’s data governance service, has announced data profiling and AutoDQ in public preview.
What it does: Automates data profiling and data quality scans with flexible data models.
Key features: Data Scans in Dataplex are serverless, require zero data copy, and can be scheduled or triggered on demand by data consumers, producers and governors. Data profile scan results are shown in the UI with rich insights, and the service recommends rules with passing thresholds for data quality dimensions.
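To make "recommended rules with passing thresholds" concrete, here is a minimal sketch in plain Python. It is not the Dataplex API: `ColumnProfile`, `profile_column`, `recommend_non_null_rule`, `passes` and the 5% slack are all hypothetical names and values chosen for illustration. The idea is to profile a column, then suggest a completeness rule whose threshold sits just below the ratio observed in the profile.

```python
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Hypothetical profile summary for one column (not a Dataplex type)."""
    name: str
    row_count: int
    null_count: int
    distinct_count: int


def profile_column(name, values):
    """Count rows, nulls and distinct non-null values for a column."""
    non_null = [v for v in values if v is not None]
    return ColumnProfile(
        name=name,
        row_count=len(values),
        null_count=len(values) - len(non_null),
        distinct_count=len(set(non_null)),
    )


def recommend_non_null_rule(profile, slack=0.05):
    """Suggest a completeness rule with a threshold slightly below the observed ratio.

    The slack value is an assumption for this sketch, not a documented default.
    """
    if profile.row_count == 0:
        return None
    observed = 1 - profile.null_count / profile.row_count
    return {
        "column": profile.name,
        "dimension": "COMPLETENESS",
        "rule": "non_null",
        "threshold": max(0.0, round(observed - slack, 2)),
    }


def passes(rule, values):
    """Return True when the share of non-null values meets the rule's threshold."""
    if not values:
        return True
    ratio = sum(v is not None for v in values) / len(values)
    return ratio >= rule["threshold"]
```

A later batch of the same column can then be checked with `passes(rule, new_values)`; a batch that is mostly nulls would fail the rule derived from a mostly complete profile.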
➝ AsyncEventsReceived measures total events successfully queued for processing.
➝ AsyncEventAge measures the time between successful queuing and function invocation.
➝ AsyncEventsDropped measures events dropped without successful execution.
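For readers who want to pull these three metrics programmatically, the sketch below builds a query in the shape that the boto3 CloudWatch client's `get_metric_data` call accepts. It assumes the metrics are published in the `AWS/Lambda` namespace with a `FunctionName` dimension; the statistic chosen for each metric and the helper names `build_metric_queries` and `fetch_async_metrics` are illustrative choices, not recommendations from the announcement.

```python
from datetime import datetime, timedelta, timezone

# Metric names come from the announcement; the statistics are assumptions.
ASYNC_METRICS = {
    "AsyncEventsReceived": "Sum",
    "AsyncEventAge": "Maximum",
    "AsyncEventsDropped": "Sum",
}


def build_metric_queries(function_name, period_seconds=60):
    """Build one CloudWatch MetricDataQuery per async invocation metric."""
    queries = []
    for index, (metric, stat) in enumerate(ASYNC_METRICS.items()):
        queries.append({
            "Id": f"m{index}",  # query ids must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",  # assumed namespace
                    "MetricName": metric,
                    "Dimensions": [
                        {"Name": "FunctionName", "Value": function_name},
                    ],
                },
                "Period": period_seconds,
                "Stat": stat,
            },
            "ReturnData": True,
        })
    return queries


def fetch_async_metrics(cloudwatch_client, function_name, minutes=15):
    """Fetch the last `minutes` of data using a CloudWatch client passed in by the caller."""
    end = datetime.now(timezone.utc)
    return cloudwatch_client.get_metric_data(
        MetricDataQueries=build_metric_queries(function_name),
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
    )
```

With boto3 installed and credentials configured, the client would be created with `boto3.client("cloudwatch")` and passed in; keeping it as a parameter lets the query builder be exercised without network access.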
➝ MTA supports large-scale Java app modernization and migration projects by providing line-by-line recommendations for your source code.
➝ Azure’s contributions include rulesets that provide guidance for configuring data sources and for using Java Key Store and file systems.
The service, which offers private clouds powered by VMware vSphere clusters on bare-metal Azure infrastructure, has brought four features to general availability:
➝ Azure Log Analytics for AVS with prebuilt queries
➝ New Node SKUs powered by Intel Xeon and NVMe-based SSDs
➝ Customer Managed Keys with Azure Key Vault
➝ Azure NetApp Files volumes as file share for AVS
Also: Stretched clusters, which assure 99.99% uptime for critical applications through automatic failover, are entering preview.
Must-read Analysis & Advice
Agenda: Google Cloud’s Kelsey Hightower initiated the discussion based on a user query. He said it is like running a database on a VM, but Postgres on Kubernetes is not the same as Cloud SQL.
What others had to say:
➝ Kubernetes does not provide high availability for applications; it only provides automatic recovery.
➝ Traditional databases were not designed with the assumption that machines will fail.
➝ So for proper scaling, backups and upgrades, you will need a Kubernetes expert who is also a database expert.
➝ What that means is having additional knowledge of StatefulSets, and a domain-specific understanding of how Kubernetes handles storage.
Conclusion: Most thought running a database on Kubernetes was not such a great idea.
➝ Despite operating in a secured perimeter, compromise of a single service within a microservices architecture can offer an entry point into the entire application.
➝ Authenticating inter-service communications and encrypting connections between services are key to preventing unauthorized access.
Beranger Netanelic, a veteran Cloud Functions user, provides a layer-by-layer list, with code snippets.
Key takeaways: Don’t use the default runtime service account. Don’t create a public HTTP cloud function; instead, configure a background function that triggers on events. If an HTTP function is required, it should have an authentication process. If a function is called by another function, the caller needs to be identified and authorized. If the function needs to be accessed by an external caller, note that API key authentication is not supported by Cloud Functions, so set up an API Gateway instead.
➝ Using session ID or access token breaks the least privilege principle, and exposes sensitive information beyond the organizational perimeter.
➝ Identity distribution enables continuous data verification by ensuring each service in an API performs informed authorization based on signed certificates or tokens.
➝ Securing all traffic, encrypting connections, using established standards, and token sharing techniques are a few approaches to identity distribution.
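As a rough illustration of "informed authorization based on signed tokens", the sketch below issues a short-lived token carrying a caller identity and scopes, and has the receiving service verify the signature, expiry and scope before acting. It uses a shared HMAC key from the Python standard library as a stand-in for the asymmetric signatures that established standards such as JWT typically use; `issue_token`, `verify_token` and the claim names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    """URL-safe base64 encode bytes to text."""
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(body: str, key: bytes) -> str:
    """Compute an HMAC-SHA256 signature over the encoded token body."""
    return _b64(hmac.new(key, body.encode("utf-8"), hashlib.sha256).digest())


def issue_token(claims, key, ttl_seconds=300, now=None):
    """Return 'body.signature' where body holds the claims plus an expiry."""
    issued_at = time.time() if now is None else now
    payload = dict(claims, exp=issued_at + ttl_seconds)
    body = _b64(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return f"{body}.{_sign(body, key)}"


def verify_token(token, key, required_scope, now=None):
    """Return the claims if signature, expiry and scope all check out, else None."""
    try:
        body, signature = token.split(".")
    except ValueError:
        return None  # not exactly two dot-separated parts
    expected = _sign(body, key)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    if claims["exp"] < current:
        return None
    if required_scope not in claims.get("scopes", []):
        return None
    return claims
```

Because each service checks the scope it needs rather than trusting a session ID passed along the chain, a token minted for reading inventory cannot be replayed against a service that requires a different scope.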
Google Cloud adds multi-architecture support to fix the issue of deploying multi-architecture container images to Cloud Run.
Automatic backup support for web apps deployed to Azure App Service Environment V2 & V3 and functions deployed to Azure Functions dedicated hosting enters general availability.
Azure Application Gateway adds support for mTLS and Online Certificate Status Protocol (OCSP). See when to use mTLS, and how to verify mTLS setup.
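For context on what mTLS adds over ordinary TLS, here is a minimal server-side sketch using Python's standard `ssl` module. It is not Application Gateway configuration, and it does not cover OCSP checks; the function name `build_mtls_server_context` and the file-path parameters are placeholders. The key difference from one-way TLS is that the server also demands and verifies a client certificate.

```python
import ssl


def build_mtls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Create a server-side TLS context that requires a client certificate.

    All three paths are hypothetical placeholders; pass real files in practice.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is what makes the handshake mutual: a client that presents
    # no certificate, or one that fails verification, is rejected.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        # The server's own identity, presented to clients.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The CA bundle used to verify certificates presented by clients.
        context.load_verify_locations(cafile=client_ca_file)
    return context
```

A quick way to verify such a setup is to connect once with a valid client certificate and once without one; only the first handshake should succeed.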
Apache Spark Structured Streaming Connector for Google Cloud Pub/Sub Lite now generally available; see supported configurations, and how to use it on Dataproc.
Google Cloud shares best practices in hybrid API management to determine size and placement of Kubernetes clusters, handling upgrades and security, and monitoring setups.
Tech layoffs are not the end of the world. Non-tech companies are advertising significantly more tech jobs.