
Thursday

Bot State with Azure

We can build an Azure bot application with FastAPI that integrates with Azure Cache for Redis for session management and uses Azure Cosmos DB for state management. Here are the steps to achieve this:

  1. State Management with Azure Cosmos DB:

    • Why do you need state?
      • Maintaining state allows your bot to have more meaningful conversations by remembering certain things about a user or conversation.
      • For example, if you’ve talked to a user previously, you can save previous information about them, so that you don’t have to ask for it again.
      • State also keeps data for longer than the current turn, so your bot retains information over the course of a multi-turn conversation.
  2. Storage Layer:

    • The backend storage layer is where the state information is actually stored.
    • You can choose from different storage options:
      • Memory Storage: For local testing only; volatile and temporary.
      • Azure Blob Storage: Connects to an Azure Blob Storage object database.
      • Azure Cosmos DB Partitioned Storage: Connects to a partitioned Cosmos DB NoSQL database.
      • Note: The legacy Cosmos DB storage class has been deprecated.
  3. State Management:

    • State management automates reading and writing bot state to the underlying storage layer.
    • State is stored as state properties (key-value pairs).
    • The Bot Framework SDK abstracts the underlying implementation.
    • You can use state property accessors to read and write state without worrying about storage specifics.
  4. Setting Up Azure Cosmos DB for Bot State:

    • Create an Azure Cosmos DB Account (globally distributed, multi-model database service).
    • Within the Cosmos DB account, create a database (SQL/NoSQL API) and container to store your bot's state.
  5. Implementing in Your Bot:

    • In your bot code, use the appropriate storage provider (e.g., Cosmos DB) to manage state.
    • Initialize state management and property accessors.
    • Example (using FastAPI):
      import azure.functions as func
      from WrapperFunction import app as fastapi_app
      from bot_state import BotState, CosmosDbPartitionedStorage
      
      # Initialize Cosmos DB storage
      cosmos_db_storage = CosmosDbPartitionedStorage(
          cosmos_db_endpoint="your_cosmos_db_endpoint",
          cosmos_db_key="your_cosmos_db_key",
          database_id="your_database_id",
          container_id="your_container_id"
      )
      
      # Initialize bot state
      bot_state = BotState(cosmos_db_storage)
      
      # Example: Writing a user-specific property
      async def save_user_preference(turn_context, preference_value):
          user_id = turn_context.activity.from_property.id
          await bot_state.user_state.set_property(turn_context, f"user_preference_{user_id}", preference_value)
      
      # Example: Reading a user-specific property
      async def get_user_preference(turn_context):
          user_id = turn_context.activity.from_property.id
          preference_value = await bot_state.user_state.get_property(turn_context, f"user_preference_{user_id}")
          return preference_value
      
      # Usage in your bot logic
      async def on_message_activity(turn_context):
          # Get user preference
          preference = await get_user_preference(turn_context)
          await turn_context.send_activity(f"Your preference: {preference}")
      
          # Set user preference
          await save_user_preference(turn_context, "New Preference Value")
          await turn_context.send_activity("Preference updated!")
      
      app = func.AsgiFunctionApp(app=fastapi_app, http_auth_level=func.AuthLevel.ANONYMOUS)
  6. Testing Locally and Deployment:

    • Test your bot locally using VS Code or Azure CLI.
    • Deploy your bot to Azure using the VS Code Azure Functions extension or Azure CLI.
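
For reference, here is a minimal sketch of the same state pattern using the Bot Framework SDK's own classes (botbuilder-core and botbuilder-azure) rather than the wrapper module shown in step 5. The endpoint, key, database, and container values are placeholders, and the exact constructor arguments are assumptions to verify against your SDK version.

    # Hedged sketch: Bot Framework user state backed by Cosmos DB (placeholder settings).
    from botbuilder.core import TurnContext, UserState
    from botbuilder.azure import CosmosDbPartitionedStorage, CosmosDbPartitionedConfig

    storage = CosmosDbPartitionedStorage(
        CosmosDbPartitionedConfig(
            cosmos_db_endpoint="https://<your-account>.documents.azure.com:443/",
            auth_key="<your_cosmos_db_key>",
            database_id="bot-db",
            container_id="bot-state",
        )
    )

    user_state = UserState(storage)
    preference_accessor = user_state.create_property("user_preference")

    async def on_message_activity(turn_context: TurnContext):
        # Read the stored preference (None on the first turn).
        preference = await preference_accessor.get(turn_context, lambda: None)
        await turn_context.send_activity(f"Your preference: {preference}")

        # Write a new value and persist it to Cosmos DB.
        await preference_accessor.set(turn_context, "New Preference Value")
        await user_state.save_changes(turn_context)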

Monday

Azure Data Factory, ADLS Gen2 Blob Storage and Syncing Data from a SharePoint Folder

 

Photo by Manuel Geissinger

Today we are going to discuss data sync between an on-premises SharePoint folder and Azure Blob Storage.

When we need to upload or download files between a SharePoint folder on the home network and Azure, we must also consider the best way to auto-sync. Let's discuss this step by step.

Azure Data Factory (ADF) is a powerful cloud-based service provided by Microsoft Azure. Let me break it down for you:

  1. Purpose and Context:

    • In the world of big data, we often deal with raw, unorganized data stored in various systems.
    • However, raw data alone lacks context and meaning for meaningful insights.
    • Azure Data Factory (ADF) steps in to orchestrate and operationalize processes, transforming massive raw data into actionable business insights.
  2. What Does ADF Do?:

    • ADF is a managed cloud service designed for complex data integration projects.
    • It handles hybrid extract-transform-load (ETL) and extract-load-transform (ELT) scenarios.
    • It enables data movement and transformation at scale.
  3. Usage Scenarios:

    • Imagine a gaming company collecting petabytes of game logs from cloud-based games.
    • The company wants to:
      • Analyze these logs for customer insights.
      • Combine on-premises reference data with cloud log data.
      • Process the joined data using tools like Azure HDInsight (Spark cluster).
      • Publish transformed data to Azure Synapse Analytics for reporting.
    • ADF automates this workflow, allowing daily scheduling and execution triggered by file arrivals in a blob store container.
  4. Key Features:

    • Data-Driven Workflows: Create and schedule data-driven workflows (called pipelines).
    • Ingestion: Ingest data from disparate data stores.
    • Transformation: Build complex ETL processes using visual data flows or compute services like Azure HDInsight Hadoop, Azure Databricks, and Azure SQL Database.
    • Publishing: Publish transformed data to destinations like Azure Synapse Analytics for business intelligence applications.
  5. Why ADF Matters:

    • It bridges the gap between raw data and actionable insights.
    • Businesses can make informed decisions based on unified data insights.

Learn more about Azure Data Factory on Microsoft Learn.

Azure Data Factory (ADF) can indeed sync data between on-premises SharePoint folders and Azure Blob Storage. Let’s break it down:

  1. Syncing with On-Premises SharePoint Folder:

    • ADF allows you to copy data from a SharePoint Online List (which includes folders) to various supported data stores.
    • Here’s how you can set it up:
      • Prerequisites:
        • Register an application with the Microsoft identity platform.
        • Note down the Application ID, Application key, and Tenant ID.
        • Grant your registered application permission in your SharePoint Online site.
      • Configuration:
  2. Syncing with Azure Blob Storage:

  3. Combining Both:

    • To sync data between an on-premises SharePoint folder and Azure Blob Storage:
      • Set up your SharePoint linked service.
      • Set up your Azure Blob Storage linked service.
      • Create a pipeline that uses the Copy activity to move data from SharePoint to Blob Storage.
      • Optionally, apply any necessary transformations using the Data Flow activity.

Remember, ADF is your orchestration tool, ensuring seamless data movement and transformation across various data sources and sinks.
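
As a rough illustration of driving such a pipeline from code, the sketch below uses the azure-mgmt-datafactory management SDK to start a pipeline run. The subscription, resource group, factory, and pipeline names are placeholders, and the SDK surface may differ between versions, so verify it against the current documentation.

    # Hedged sketch: start an existing ADF pipeline run from Python (placeholder names).
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    subscription_id = "<subscription-id>"
    resource_group = "my-rg"                   # hypothetical
    factory_name = "my-data-factory"           # hypothetical
    pipeline_name = "CopySharePointToBlob"     # hypothetical pipeline built in the ADF UI

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

    # Kick off the Copy pipeline; pipeline parameters (if any) are passed as a dict.
    run = adf_client.pipelines.create_run(resource_group, factory_name, pipeline_name, parameters={})
    print(f"Started pipeline run: {run.run_id}")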

On the other hand, Azure Data Lake Storage Gen2 (ADLS Gen2) is a powerful service in the Microsoft Azure ecosystem. Let’s explore how to use it effectively:

  1. Overview of ADLS Gen2:

    • ADLS Gen2 combines the capabilities of a data lake with the scalability and performance of Azure Blob Storage.
    • It’s designed for handling large volumes of diverse data, making it ideal for big data analytics and data warehousing scenarios.
  2. Best Practices for Using ADLS Gen2:

    • Optimize Performance:
      • Consider using a premium block blob storage account if your workloads require low latency and a high number of I/O operations per second (IOPS).
      • Premium accounts store data on solid-state drives (SSDs) optimized for low latency and high throughput.
      • While storage costs are higher, transaction costs are lower.
    • Reduce Costs:
      • Organize your data into data sets within ADLS Gen2.
      • Provision separate ADLS Gen2 accounts for different data landing zones.
      • Evaluate feature support and known issues to make informed decisions.
    • Security and Compliance:
      • Use service principals or access keys to access ADLS Gen2.
      • Understand terminology differences (e.g., blobs vs. files).
      • Review the documentation for feature-specific guidance.
    • Integration with Other Services:
      • Mount ADLS Gen2 to Azure Databricks for reading and writing data.
      • Compare ADLS Gen2 with Azure Blob Storage for different use cases.
      • Understand where ADLS Gen2 fits in the stages of analytical processing.
  3. Accessing ADLS Gen2:

    • You can access ADLS Gen2 in three ways:
      • Mounting it to Azure Databricks using a service principal or OAuth 2.0.
      • Directly using a service principal.
      • Using the ADLS Gen2 storage account access key directly.
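
To make the third option concrete, here is a small sketch using the azure-storage-file-datalake package with the account access key; the account, container, and path names are placeholders, and the same client also accepts an Azure AD credential (service principal) instead of a key.

    # Sketch: upload a local file to ADLS Gen2 using the account access key (placeholder values).
    from azure.storage.filedatalake import DataLakeServiceClient

    account_name = "<storage-account>"
    service = DataLakeServiceClient(
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential="<access-key>",
    )

    # File systems map to Blob containers when the hierarchical namespace is enabled.
    file_system = service.get_file_system_client("raw")             # hypothetical container
    file_client = file_system.get_file_client("landing/sample.csv")

    with open("sample.csv", "rb") as data:
        file_client.upload_data(data, overwrite=True)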

Remember, ADLS Gen2 empowers you to manage and analyze vast amounts of data efficiently. Dive into the documentation and explore its capabilities! 

Learn more about Azure Data Lake Storage Gen2 on Microsoft Learn.

Let’s set up a data flow that automatically copies files from an on-premises SharePoint folder to Azure Data Lake Storage Gen2 (ADLS Gen2) whenever new files are uploaded. Here are the steps:

  1. Prerequisites:

    • Ensure you have the following:
      • An Azure subscription (create one if needed).
      • An Azure Storage account with ADLS Gen2 enabled.
      • An on-premises SharePoint folder containing the files you want to sync.
  2. Create an Azure Data Factory (ADF):

    • If you haven’t already, create an Azure Data Factory using the Azure portal.
    • Launch the Data Integration application in ADF.
  3. Set Up the Copy Data Tool:

    • In the ADF home page, select the Ingest tile to launch the Copy Data tool.
    • Configure the properties:
      • Choose Built-in copy task under Task type.
      • Select Run once now under Task cadence or task schedule.
  4. Configure the Source (SharePoint):

    • Click + New connection.
    • Select SharePoint from the connector gallery.
    • Provide the necessary credentials and details for your on-premises SharePoint folder.
    • Define the source dataset.
  5. Configure the Destination (ADLS Gen2):

    • Click + New connection.
    • Select Azure Data Lake Storage Gen2 from the connector gallery.
    • Choose your ADLS Gen2 capable account from the “Storage account name” drop-down list.
    • Create the connection.
  6. Mapping and Transformation (Optional):

    • If needed, apply any transformations or mappings between the source and destination.
    • You can use the Data Flow activity for more complex transformations.
  7. Run the Pipeline:

    • Save your configuration.
    • Execute the pipeline to copy data from SharePoint to ADLS Gen2.
    • You can schedule this pipeline to run periodically or trigger it based on events (e.g., new files in SharePoint).
  8. Monitoring and Alerts:

    • Monitor the pipeline execution in the Azure portal.
    • Set up alerts for any failures or anomalies.
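
For the monitoring step, a pipeline run can also be checked from Python with the same management SDK used for triggering runs; this is a hedged sketch with placeholder names, so confirm the method names against your SDK version.

    # Hedged sketch: poll the status of an ADF pipeline run (placeholder names).
    import time
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    run_id = "<run-id-returned-when-the-pipeline-was-started>"
    while True:
        run = adf_client.pipeline_runs.get("my-rg", "my-data-factory", run_id)
        print(f"Pipeline status: {run.status}")
        if run.status not in ("Queued", "InProgress"):
            break
        time.sleep(30)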

Remember to adjust the settings according to your specific SharePoint folder and ADLS Gen2 requirements. With this setup, your files will be automatically synced from SharePoint to ADLS Gen2 whenever new files are uploaded! 

Learn more about loading data into Azure Data Lake Storage Gen2 on Microsoft Learn.

Tuesday

Data Masking When Ingesting Into Databricks

 

Photo by Alba Leader

Data masking is a data security technique that involves hiding data by changing its original numbers and letters. It's a way to create a fake version of data that's similar enough to the actual data, while still protecting it. This fake data can then be used as a functional alternative when the real data isn't needed. 



Unity Catalog is Databricks' governance layer and works alongside Delta Lake; together they provide data governance capabilities such as row filters and column masking.

Unity Catalog in Databricks allows you to apply data governance policies such as row filters and column masks to sensitive data. Let’s break it down:

  1. Row Filters:

    • Row filters enable you to apply a filter to a table so that subsequent queries only return rows for which the filter predicate evaluates to true.
    • To create and apply a row filter, follow these steps:
      1. Write a SQL user-defined function (UDF) to define the filter policy:
         CREATE FUNCTION <function_name> (<parameter_name> <parameter_type>, ...) RETURN {filter clause that returns a boolean};
      2. Apply the row filter to an existing table using the following syntax:
         ALTER TABLE <table_name> SET ROW FILTER <function_name> ON (<column_name>, ...);
      3. You can also specify a row filter during the initial table creation.
    • Each table can have only one row filter, and it accepts input parameters that bind to specific columns of the table.
  2. Column Masks:

    • Column masks allow you to transform or mask specific column values before returning them in query results.
    • To apply column masks:
      1. Create a function that defines the masking logic.
      2. Apply the masking function to a table column using an ALTER TABLE statement.
      3. Alternatively, you can apply the masking function during table creation.
  3. Unity Catalog Best Practices:

    • When setting up Unity Catalog, consider assigning a managed storage location at the catalog level. For example:
      CREATE CATALOG hr_prod
      MANAGED LOCATION 'abfss://mycompany-hr-prod@storage-account.dfs.core.windows.net/unity-catalog';

You can apply column masks to transform or conceal specific column values before returning them in query results. Here’s how you can achieve this:

  1. Create a Masking Function:

    • Define a function that specifies the masking logic. This function will be used to transform the column values.
    • For example, say you want to mask a credit card number so that only its last four digits remain visible. You can create a masking function that replaces everything except the last four digits with asterisks.
  2. Apply the Masking Function to a Column:

    • Use an ALTER TABLE statement to apply the masking function to a specific column.
    • For instance, if you have a column named credit_card_number, you can apply the masking function to it:
      ALTER TABLE my_table ALTER COLUMN credit_card_number SET MASK my_masking_function;
      
  3. Example Masking Function:

    • Suppose you want to replace all but the last four digits of a credit card number with asterisks. You can create a masking function like this:
      CREATE FUNCTION my_masking_function(credit_card_number STRING)
      RETURNS STRING
      RETURN CONCAT('************', RIGHT(credit_card_number, 4));
      
  4. Query the Table:

    • When querying the table, the masked values will be returned instead of the original values.
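
Putting steps 1 through 4 together, the hedged sketch below runs the corresponding Unity Catalog SQL from a notebook; the table, column, and function names are placeholders, and the syntax should be checked against your Databricks runtime.

    # Hedged sketch: define a masking function and attach it to a column via SQL from PySpark.
    # Assumes the notebook-provided `spark` session.
    spark.sql("""
        CREATE OR REPLACE FUNCTION mask_credit_card(credit_card_number STRING)
        RETURNS STRING
        RETURN CONCAT('************', RIGHT(credit_card_number, 4))
    """)

    spark.sql("ALTER TABLE my_table ALTER COLUMN credit_card_number SET MASK mask_credit_card")

    # Queries now return the masked value instead of the original column value.
    spark.sql("SELECT credit_card_number FROM my_table").show(5)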

Let’s focus on how you can achieve column masking in Databricks using Delta Lake:

  1. Column Masking:

    • Delta Lake allows you to apply column-level transformations or masks to sensitive data.
    • You can create custom masking functions to modify specific column values before returning them in query results.
  2. Creating a Masking Function:

    • Define a user-defined function (UDF) that specifies the masking logic. For example, you can create a function that hides all but the last four digits of a credit card number.
    • Here’s an example of a masking function that replaces everything except the last four digits with asterisks:
      def mask_credit_card(card_number):
          return "************" + card_number[-4:]
      
  3. Applying the Masking Function:

    • Use the withColumn method to apply the masking function to a specific column in your DataFrame.
    • For instance, if you have a DataFrame named my_table with a column named credit_card_number, you can apply the masking function as follows:
      from pyspark.sql.functions import udf, col
      from pyspark.sql.types import StringType
      
      # Wrap the Python function as a Spark UDF
      mask_credit_card_udf = udf(mask_credit_card, StringType())
      
      # Apply the masking UDF to the credit card column
      masked_df = my_table.withColumn("masked_credit_card", mask_credit_card_udf(col("credit_card_number")))
      
  4. Querying the Masked Data:

    • When querying the masked_df, the transformed (masked) values will be returned for the masked_credit_card column.

You can find other related articles here; kindly search.


GenAI Speech to Sentiment Analysis with Azure Data Factory

 

Photo by Tara Winstead

Azure Data Factory (ADF) is a powerful data integration service, and it can be seamlessly integrated with several other Azure services to enhance your data workflows. Here are some key services that work closely with ADF:

  1. Azure Synapse Analytics:

    • Formerly known as SQL Data Warehouse, Azure Synapse Analytics provides an integrated analytics service that combines big data and data warehousing. You can use ADF to move data into Synapse Analytics for advanced analytics, reporting, and business intelligence.
  2. Azure Databricks:

    • Azure Databricks is a fast, easy, and collaborative Apache Spark-based analytics platform. ADF can orchestrate data movement between Databricks and other data stores, enabling you to process and analyze large datasets efficiently.
  3. Azure Blob Storage:

    • ADF can seamlessly copy data to and from Azure Blob Storage. It’s a cost-effective storage solution for unstructured data, backups, and serving static content.
  4. Azure SQL Database:

    • Use ADF to ingest data from various sources into Azure SQL Database. It’s a fully managed relational database service that supports both structured and semi-structured data.
  5. Azure Data Lake Store:

    • ADF integrates well with Azure Data Lake Store, which is designed for big data analytics. You can use it to store large amounts of data in a hierarchical file system.
  6. Amazon S3 (Yes, even from AWS!):

    • ADF supports data movement from Amazon S3 (Simple Storage Service) to Azure. If you have data in S3, ADF can help you bring it into Azure.
  7. Amazon Redshift (Again, from AWS!):

    • Similar to S3, ADF can copy data from Amazon Redshift (a data warehouse service) to Azure. This is useful for hybrid scenarios or migrations.
  8. Software as a Service (SaaS) Apps:

    • ADF has built-in connectors for popular SaaS applications like Salesforce, Marketo, and ServiceNow. You can easily ingest data from these services into your data pipelines.
  9. Web Protocols:

    • ADF supports web protocols such as FTP and OData. If you need to move data from web services, ADF can handle it.

Remember that ADF provides more than 90 built-in connectors, making it versatile for ingesting data from various sources and orchestrating complex data workflows. Whether you’re dealing with big data, relational databases, or cloud storage, you can harness its power.

Let’s tailor the integration of Azure Data Factory (ADF) for your AI-based application that involves speech-to-text and sentiment analysis. Here are the steps you can follow:

  1. Data Ingestion:

    • Source Data: Identify the source of your speech data. It could be audio files, streaming data, or recorded conversations.
    • Azure Blob Storage or Azure Data Lake Storage: Store the raw audio data in Azure Blob Storage or Azure Data Lake Storage. You can use ADF to copy data from various sources into these storage services.
  2. Speech-to-Text Processing:

    • Azure Cognitive Services - Speech-to-Text: Utilize the Azure Cognitive Services Speech SDK or the REST API to convert audio data into text. You can create an Azure Cognitive Services resource and configure it with your subscription key.
    • ADF Pipelines: Create an ADF pipeline that invokes the Speech-to-Text service. Use the Web Activity or Azure Function Activity to call the REST API. Pass the audio data as input and receive the transcribed text as output.
  3. Data Transformation and Enrichment:

    • Data Flows in ADF: If you need to perform additional transformations (e.g., cleaning, filtering, or aggregating), use ADF Data Flows. These allow you to visually design data transformations.
    • Sentiment Analysis: For sentiment analysis, consider using Azure Cognitive Services - Text Analytics. Similar to the Speech-to-Text step, create a Text Analytics resource and configure it in your ADF pipeline.
  4. Destination Storage:

    • Azure SQL Database or Cosmos DB: Store the transcribed text along with sentiment scores in an Azure SQL Database or Cosmos DB.
    • ADF Copy Activity: Use ADF’s Copy Activity to move data from your storage (Blob or Data Lake) to the destination database.
  5. Monitoring and Error Handling:

    • Set up monitoring for your ADF pipelines. Monitor the success/failure of each activity.
    • Implement retry policies and error handling mechanisms in case of failures during data movement or processing.
  6. Security and Authentication:

    • Ensure that your ADF pipeline has the necessary permissions to access the storage accounts, Cognitive Services, and databases.
    • Use Managed Identity or Service Principal for authentication.
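
To make the speech-to-text and sentiment steps more tangible, here is a hedged sketch of the two Cognitive Services calls that an ADF Web or Azure Function activity might wrap. Keys, region, endpoint, and the audio file name are placeholders; verify the classes against the current azure-cognitiveservices-speech and azure-ai-textanalytics packages.

    # Hedged sketch: transcribe one audio file, then score its sentiment (placeholder keys/endpoints).
    import azure.cognitiveservices.speech as speechsdk
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    # 1. Speech-to-text
    speech_config = speechsdk.SpeechConfig(subscription="<speech-key>", region="<region>")
    audio_config = speechsdk.audio.AudioConfig(filename="call_recording.wav")  # hypothetical file
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    transcript = recognizer.recognize_once().text

    # 2. Sentiment analysis on the transcript
    text_client = TextAnalyticsClient(
        endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
        credential=AzureKeyCredential("<language-key>"),
    )
    result = text_client.analyze_sentiment(documents=[transcript])[0]
    print(transcript)
    print(result.sentiment, result.confidence_scores)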

Get more details here: Introduction to Azure Data Factory - Azure Data Factory | Microsoft Learn

Friday

Databricks with Azure Past and Present

 


Let's dive into the evolution of Azure Databricks and its performance differences.

Azure Databricks is a powerful analytics platform built on Apache Spark, designed to process large-scale data workloads. It provides a collaborative environment for data engineers, data scientists, and analysts. Over time, Databricks has undergone significant changes, impacting its performance and capabilities.

Previous State:

In the past, Databricks primarily relied on an open-source version of Apache Spark. While this version was versatile, it had limitations in terms of performance and scalability. Users could run Spark workloads, but there was room for improvement.

Current State:

Today, Azure Databricks has evolved significantly. Here’s what’s changed:

  1. Optimized Spark Engine:

    • Databricks now offers an optimized version of Apache Spark. This enhanced engine is advertised as up to 50 times faster than the standard open-source version.
    • Users can leverage GPU-enabled clusters, enabling faster data processing and higher data concurrency.
    • The optimized Spark engine ensures efficient execution of complex analytical tasks.
  2. Serverless Compute:

    • Databricks embraces serverless architectures. With serverless compute, the compute layer runs directly within your Azure Databricks account.
    • This approach eliminates the need to manage infrastructure, allowing users to focus solely on their data and analytics workloads.
    • Serverless compute optimizes resource allocation, scaling up or down as needed.

Performance Differences:

Let’s break down the performance differences:

  1. Speed and Efficiency:

    • The optimized Spark engine significantly accelerates data processing. Complex transformations, aggregations, and machine learning tasks execute faster.
    • GPU-enabled clusters handle parallel workloads efficiently, reducing processing time.
  2. Resource Utilization:

    • Serverless compute ensures optimal resource allocation. Users pay only for the resources consumed during actual computation.
    • Traditional setups often involve overprovisioning or underutilization, impacting cost-effectiveness.
  3. Concurrency and Scalability:

    • Databricks’ enhanced Spark engine supports high data concurrency. Multiple users can run queries simultaneously without performance degradation.
    • Horizontal scaling (adding more nodes) ensures seamless scalability as workloads grow.
  4. Cost-Effectiveness:

    • Serverless architectures minimize idle resource costs. Users pay only for active compute time.
    • Efficient resource utilization translates to cost savings.


Currently, Azure Databricks does not use plain Blob containers for its workspace storage; it uses Azure Data Lake Storage Gen2 (ADLS Gen2), a powerful solution for big data analytics built on Azure Blob Storage. Let’s dive into the details:

  1. What is a Data Lake?

    • A data lake is a centralized repository where you can store all types of data, whether structured or unstructured.
    • Unlike traditional databases, a data lake allows you to store data in its raw or native format, without conforming to a predefined structure.
    • Azure Data Lake Storage is a cloud-based enterprise data lake solution engineered to handle massive amounts of data in any format, facilitating big data analytical workloads.
  2. Azure Data Lake Storage Gen2:

    • Convergence: Gen2 combines the capabilities of Azure Data Lake Storage Gen1 with Azure Blob Storage.
    • File System Semantics: It provides file system semantics, allowing you to organize data into directories and files.
    • Security: Gen2 offers file-level security, ensuring data protection.
    • Scalability: Designed to manage multiple petabytes of information while sustaining high throughput.
    • Hadoop Compatibility: Gen2 works seamlessly with Hadoop and frameworks using the Apache Hadoop Distributed File System (HDFS).
    • Cost-Effective: It leverages Blob storage, providing low-cost, tiered storage with high availability and disaster recovery capabilities.
  3. Implementation:

    • Unlike Gen1, Gen2 isn’t a dedicated service or account type. Instead, it’s implemented as a set of capabilities within your Azure Storage account.
    • To unlock these capabilities, enable the hierarchical namespace setting.
    • Key features include:
      • Hadoop-compatible access: Designed for Hadoop and frameworks using the Azure Blob File System (ABFS) driver.
      • Hierarchical directory structure: Organize data efficiently.
      • Optimized cost and performance: Balances cost-effectiveness and performance.
      • Finer-grained security model: Enhances data protection.
      • Massive scalability: Handles large-scale data workloads.
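
As a small illustration of the Hadoop-compatible (ABFS) access pattern from a Databricks notebook, the PySpark sketch below reads a CSV over an abfss:// URI using the storage account access key. Account, container, and path names are placeholders; in practice a service principal or other Azure AD credential is usually preferred over an account key.

    # Hedged PySpark sketch: read from ADLS Gen2 via the ABFS driver (placeholder names).
    # Assumes the notebook-provided `spark` session.
    storage_account = "<storage-account>"

    # Session-scoped access key configuration (OAuth / service principal is the preferred option).
    spark.conf.set(
        f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
        "<access-key>",
    )

    df = (
        spark.read
        .option("header", "true")
        .csv(f"abfss://raw@{storage_account}.dfs.core.windows.net/landing/sample.csv")
    )
    df.show(5)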

Conclusion:

Azure Databricks has transformed from its initial open-source Spark version to a high-performance, serverless analytics platform. Users now benefit from faster processing, efficient resource management, and improved scalability. Whether you’re analyzing data, building machine learning models, or running complex queries, Databricks’ evolution ensures optimal performance for your workloads. 


Sunday

Azure CLI

 account : Manage Azure subscription information.

  acr         : Manage private registries with Azure Container Registries.

  ad          : Manage Azure Active Directory Graph entities needed for Role Based Access Control.

  advisor       : Manage Azure Advisor.

  afd         : Manage Azure Front Door Standard/Premium.

  ai-examples     : Add AI powered examples to help content.

  aks         : Manage Azure Kubernetes Services.

  ams         : Manage Azure Media Services resources.

  apim        : Manage Azure API Management services.

  appconfig      : Manage App Configurations.

  appservice     : Manage App Service plans.

  aro         : Manage Azure Red Hat OpenShift clusters.

  backup       : Manage Azure Backups.

  batch        : Manage Azure Batch.

  bicep        : Bicep CLI command group.

  billing       : Manage Azure Billing.

  bot         : Manage Microsoft Azure Bot Service.

  cache        : Commands to manage CLI objects cached using the `--defer` argument.

  capacity      : Manage capacity.

  cdn         : Manage Azure Content Delivery Networks (CDNs).

  cloud        : Manage registered Azure clouds.

  cognitiveservices  : Manage Azure Cognitive Services accounts.

  config       : Manage Azure CLI configuration.

  configure      : Manage Azure CLI configuration. This command is interactive.

  connection     : Commands to manage Service Connector local connections which allow local environment to connect Azure Resource. If you want to manage connection for compute service, please run 'az webapp/containerapp/spring connection'.

  consumption     : Manage consumption of Azure resources.

  container      : Manage Azure Container Instances.

  containerapp    : Manage Azure Container Apps.

  cosmosdb      : Manage Azure Cosmos DB database accounts.

  databoxedge     : Support data box edge device and management.

  deployment     : Manage Azure Resource Manager template deployment at subscription scope.

  deployment-scripts : Manage deployment scripts at subscription or resource group scope.

  disk        : Manage Azure Managed Disks.

  disk-access     : Manage disk access resources.

  disk-encryption-set : Disk Encryption Set resource.

  dla         : Manage Data Lake Analytics accounts, jobs, and catalogs.

  dls         : Manage Data Lake Store accounts and filesystems.

  dms         : Manage Azure Data Migration Service (classic) instances.

  eventgrid      : Manage Azure Event Grid topics, domains, domain topics, system topics, partner topics, event subscriptions, system topic event subscriptions and partner topic event subscriptions.

  eventhubs      : Eventhubs.

  extension      : Manage and update CLI extensions.

  feature       : Manage resource provider features.

  feedback      : Send feedback to the Azure CLI Team.

  find        : I'm an AI robot, my advice is based on our Azure documentation as well as the usage patterns of Azure CLI and Azure ARM users. Using me improves Azure products and documentation.

  functionapp     : Manage function apps. To install the Azure Functions Core tools see https://github.com/Azure/azure-functions-core-tools.

  group        : Manage resource groups and template deployments.

  hdinsight      : Manage HDInsight resources.

  identity      : Managed Identities.

  image        : Manage custom virtual machine images.

  interactive     : Start interactive mode. Installs the Interactive extension if not installed already.

  iot         : Manage Internet of Things (IoT) assets.

  keyvault      : Manage KeyVault keys, secrets, and certificates.

  kusto        : Manage Azure Kusto resources.

  lab         : Manage Azure DevTest Labs.

  lock        : Manage Azure locks.

  logicapp      : Manage logic apps.

  login        : Log in to Azure.

  logout       : Log out to remove access to Azure subscriptions.

  managed-cassandra  : Azure Managed Cassandra.

  managedapp     : Manage template solutions provided and maintained by Independent Software Vendors (ISVs).

  managedservices   : Manage the registration assignments and definitions in Azure.

  maps        : Manage Azure Maps.

  mariadb       : Manage Azure Database for MariaDB servers.

  ml         : Manage Azure Machine Learning resources with the Azure CLI ML extension v2.

  monitor       : Manage the Azure Monitor Service.

  mysql        : Manage Azure Database for MySQL servers.

  netappfiles     : Manage Azure NetApp Files (ANF) Resources.

  network       : Manage Azure Network resources.

  policy       : Manage resource policies.

  postgres      : Manage Azure Database for PostgreSQL servers.

  ppg         : Manage Proximity Placement Groups.

  private-link    : Private-link association CLI command group.

  provider      : Manage resource providers.

  redis        : Manage dedicated Redis caches for your Azure applications.

  relay        : Manage Azure Relay Service namespaces, WCF relays, hybrid connections, and rules.

  resource      : Manage Azure resources.

  resourcemanagement : Resourcemanagement CLI command group.

  rest        : Invoke a custom request.

  restore-point    : Manage restore point with res.

  role        : Manage user roles for access control with Azure Active Directory and service principals.

  search       : Manage Azure Search services, admin keys and query keys.

  security      : Manage your security posture with Microsoft Defender for Cloud.

  servicebus     : Servicebus.

  sf         : Manage and administer Azure Service Fabric clusters.

  sig         : Manage shared image gallery.

  signalr       : Manage Azure SignalR Service.

  snapshot      : Manage point-in-time copies of managed disks, native blobs, or other snapshots.

  sql         : Manage Azure SQL Databases and Data Warehouses.

  ssh         : SSH into resources (Azure VMs, Arc servers, etc) using AAD issued openssh certificates.

  sshkey       : Manage ssh public key with vm.

  stack        : A deployment stack is a native Azure resource type that enables you to perform operations on a resource collection as an atomic unit.

  staticwebapp    : Manage static apps.

  storage       : Manage Azure Cloud Storage resources.

  survey       : Take Azure CLI survey.

  synapse       : Manage and operate Synapse Workspace, Spark Pool, SQL Pool.

  tag         : Tag Management on a resource.

  term        : Manage marketplace agreement with marketplaceordering.

  ts         : Manage template specs at subscription or resource group scope.

  upgrade       : Upgrade Azure CLI and extensions.

  version       : Show the versions of Azure CLI modules and extensions in JSON format by default or format configured by --output.

  vm         : Manage Linux or Windows virtual machines.

  vmss        : Manage groupings of virtual machines in an Azure Virtual Machine Scale Set (VMSS).

  webapp       : Manage web apps.

Thursday

Cloud Resources for Python Application Development

  • AWS:

- AWS Lambda:

  - Serverless computing for executing backend code in response to events.

- Amazon RDS:

  - Managed relational database service for handling SQL databases.

- Amazon S3:

  - Object storage for scalable and secure storage of data.

- AWS API Gateway:

  - Service to create, publish, and manage APIs, facilitating API integration.

- AWS Step Functions:

  - Coordination of multiple AWS services into serverless workflows.

- Amazon DynamoDB:

  - NoSQL database for building high-performance applications.

- AWS CloudFormation:

  - Infrastructure as Code (IaC) service for defining and deploying AWS infrastructure.

- AWS Elastic Beanstalk:

  - Platform-as-a-Service (PaaS) for deploying and managing applications.

- AWS SDK for Python (Boto3):

  - Official AWS SDK for Python to interact with AWS services programmatically.
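
Since Boto3 appears in the list above, here is a minimal hedged example of using it to upload a file to S3; the bucket name is a placeholder and credentials are assumed to come from the standard AWS credential chain.

    # Minimal Boto3 sketch: upload a local file to S3 (placeholder bucket name).
    import boto3

    s3 = boto3.client("s3")
    s3.upload_file("report.csv", "my-example-bucket", "reports/report.csv")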


  • Azure:

- Azure Functions:

  - Serverless computing for building and deploying event-driven functions.

- Azure SQL Database:

  - Fully managed relational database service for SQL databases.

- Azure Blob Storage:

  - Object storage service for scalable and secure storage.

- Azure API Management:

  - Full lifecycle API management to create, publish, and consume APIs.

- Azure Logic Apps:

  - Visual workflow automation to integrate with various services.

- Azure Cosmos DB:

  - Globally distributed, multi-model database service for highly responsive applications.

- Azure Resource Manager (ARM):

  - IaC service for defining and deploying Azure infrastructure.

- Azure App Service:

  - PaaS offering for building, deploying, and scaling web apps.

- Azure SDK for Python (azure-sdk-for-python):

  - Official Azure SDK for Python to interact with Azure services programmatically.
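
Correspondingly, here is a minimal hedged example with the Azure SDK for Python, uploading a file to Blob Storage with azure-storage-blob; the connection string and container name are placeholders.

    # Minimal azure-storage-blob sketch: upload a local file to Blob Storage (placeholder values).
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    blob = service.get_blob_client(container="reports", blob="report.csv")

    with open("report.csv", "rb") as data:
        blob.upload_blob(data, overwrite=True)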


  • Cloud Networking, API Gateway, Load Balancer, and Security for AWS and Azure:


AWS:

- Amazon VPC (Virtual Private Cloud):

  - Enables you to launch AWS resources into a virtual network, providing control over the network configuration.

- AWS Direct Connect:

  - Dedicated network connection from on-premises to AWS, ensuring reliable and secure data transfer.

- Amazon API Gateway:

  - Fully managed service for creating, publishing, and securing APIs.

- AWS Elastic Load Balancer (ELB):

  - Distributes incoming application traffic across multiple targets to ensure high availability.

- AWS WAF (Web Application Firewall):

  - Protects web applications from common web exploits by filtering and monitoring HTTP traffic.

- AWS Shield:

  - Managed Distributed Denial of Service (DDoS) protection service for safeguarding applications running on AWS.

- Amazon Inspector:

  - Automated security assessment service for applications running on AWS.


Azure:


- Azure Virtual Network:

  - Connects Azure resources to each other and to on-premises networks, providing isolation and customization.

- Azure ExpressRoute:

  - Dedicated private connection from on-premises to Azure, ensuring predictable and secure data transfer.

- Azure API Management:

  - Full lifecycle API management with features for security, scalability, and analytics.

- Azure Load Balancer:

  - Distributes network traffic across multiple servers to ensure application availability.

- Azure Application Gateway:

  - Web traffic load balancer that enables you to manage traffic to your web applications.

- Azure Firewall:

  - Managed, cloud-based network security service to protect your Azure Virtual Network resources.

- Azure Security Center:

  - Unified security management system that strengthens the security posture of your data centers.

- Azure DDoS Protection:

  - Safeguards against DDoS attacks on Azure applications.

 

Wednesday

Cloud Computing Roles

So, what kind of roles currently exist within cloud computing, and what do they do?

 
There are many different roles; let’s look at some.


Cloud engineers design, implement, and maintain cloud and hybrid networking environments. It is a hands-on role and often involves a significant amount of service orchestration, planning, and monitoring. 
 
Cloud security engineers focus on ensuring the integrity, confidentiality, and availability of data and resources in the cloud. It is also a hands-on role and involves coding and problem solving.

Data-center technicians are very hands on. They provide hardware and network diagnostics followed by physical repair. Data-center technicians install equipment, create documentation, innovate solutions, and fix problems within the data-center space.

As a cloud administrator, you work with information technology, known as IT, and information systems, or IS, teams to deploy, configure, and monitor hybrid and cloud solutions. This is a hands-on role and can include planning and document writing.

Cloud software developers work with IT or IS teams to develop, maintain, and re-engineer hybrid and cloud-based applications. It is a hands-on role and includes coding and problem solving. (Source: AWS)

A few more roles exist, depending on the requirements of a particular organization. As an example, when I started with AWS in 2012, I was a Technical/Solutions Architect developing REST API server-based applications.

Gradually I learned and, in later years, started developing microservices architectures and serverless applications on AWS and other cloud service providers.

For the last 6 years I have been working mainly on artificial intelligence, machine learning, #iot, and #microservices applications in the cloud.

Last year I had to jump into generative AI, but the cloud remains the home of all types of applications.

Now I am pursuing a cloud-specialized M.Tech from IIT Patna, due to my love for both cloud computing and artificial intelligence.

Azure Data Factory Transform and Enrich Activity with Databricks and Pyspark

In #azuredatafactory, the #transform and #enrich part can be done automatically or written manually with #pyspark; two examples below, one data so...