Cloud Case Studies

Case studies for real-world investigations for cloud platform compromises across AWS, Azure, and GCP. Includes unauthorized accesses to cloud storage buckets and kubernetes clusters, as well as cryptomining incidents.

Cloud Storage Misconfiguration and Compromise
AWS

Incident: In a critical incident involving an improperly configured Amazon S3 bucket, sensitive data was inadvertently exposed to unauthorized access. The bucket, intended for internal use, was misconfigured to allow public read access, leading to potential data leakage. The exposure was detected during a routine security audit.

Detection and Initial Response: The incident was initially detected through an alert from AWS CloudTrail, which indicated unexpected access patterns to the S3 bucket. Immediate actions were taken to restrict access by modifying the bucket's permissions settings to private. This quick response prevented further unauthorized access.

Investigation and Analysis: A thorough investigation was conducted using detailed logs from AWS CloudTrail and AWS Access Logs. These logs were instrumental in tracing the source and extent of the access. By analyzing the access patterns and IP addresses involved, we could confirm that the exposed data was accessed by external entities.

Mitigation Strategies: Post-detection, the primary focus was on damage control and preventing future incidents. The following steps were undertaken:

  • Immediate Revocation of Public Access: The S3 bucket permissions were corrected to ensure that only authenticated users with specific roles had access.
  • Audit and Review of All S3 Buckets: A comprehensive audit of all S3 buckets under our management was conducted to ensure proper configurations were in place and to rectify any other potential misconfigurations.
  • Enhanced Monitoring Setup: Implementation of additional monitoring tools to detect and alert any improper modification of permission settings on all storage resources.
  • Policy Updates and Training: Updated the security policies regarding cloud storage and conducted training sessions for the team to prevent similar incidents in the future.

Resolution and Lessons Learned: The incident was resolved with no evidence of malicious exploitation of the exposed data, though the potential for data leakage had been high. This incident highlighted the critical need for rigorous monitoring and validation of cloud storage configurations. Lessons learned from this incident have led to strengthened security practices and policies around cloud resource management to ensure such exposures do not recur.

Technologies Used:

  • AWS CloudTrail
  • AWS S3 Access Logs
  • Custom monitoring tools for real-time security alerts

Outcome: This incident served as a pivotal learning experience, reinforcing the importance of secure cloud storage configurations and proactive security practices. By rapidly addressing the issue and implementing strategic changes, we ensured the integrity and security of our cloud environments against future vulnerabilities.


Unauthorized Cryptomining Operation
Microsoft

Incident: In this high-impact security incident, a threat actor compromised an Azure environment and illicitly spun up approximately 1200 Virtual Machines (VMs) for cryptomining activities. This unauthorized activity led to substantial resource consumption and increased operational costs.

Detection and Initial Response: The anomalous activity was initially detected by Azure Monitor, which flagged an unusual spike in resource usage and network traffic inconsistent with the typical organizational patterns. Immediate steps were taken to isolate the affected VMs and halt all suspicious processes. This rapid containment helped prevent further exploitation of resources.

Investigation and Analysis: A detailed investigation was launched using Azure Security Center and Azure Log Analytics to trace the origin and scope of the breach. By examining system logs and network traffic data, we identified the entry point of the attack—a set of compromised user credentials that allowed the attacker to access our Azure management portal. See below for a sample Azure Log Analytics record for how the entry point of the attack was identified.


"TimeGenerated": "2024-05-01T14:30:00Z",
"ResourceType": "Microsoft.Compute/virtualMachines",
"OperationName": "Create or Update Virtual Machine",
"ActivityStatus": "Succeeded",
"CallerIpAddress": "185.220.100.242",
"ResourceGroupName": "RG-Production",
"SubscriptionId": "abcd1234-5678-90ab-cdef-1234567890ab",
"Identity": {
    "Authorization": {
        "action": "Microsoft.Compute/virtualMachines/write",
        "scope": "/subscriptions/abcd1234-5678-90ab-cdef-1234567890ab/resourceGroups/RG-Production/providers/Microsoft.Compute/virtualMachines/HackedVM01"
    },
    "Claims": {
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier": "john.doe@company.com",
        "http://schemas.microsoft.com/identity/claims/scope": "user_impersonation",
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name": "John Doe",
        "http://schemas.microsoft.com/identity/claims/tenantid": "9876efab-c123-4567-89ab-cdef12345678",
        "http://schemas.microsoft.com/claims/authnmethodsreferences": "password"
    }
},
"Properties": {
    "statusCode": "Accepted",
    "serviceRequestId": "9abc1234-def5-6789-ghij-0123456789ab",
    "statusMessage": "{\"status\":\"Succeeded\"}"
},
"Caller": "john.doe@company.com",
"Region": "East US"

Explanation of Key Fields

  • TimeGenerated: The UTC timestamp when the VM creation operation was logged.
  • OperationName: Indicates the operation type (creation or updating of a virtual machine).
  • CallerIpAddress: The IP address from which the operation was initiated, critical for identifying potential external unauthorized access.
  • ResourceGroupName and SubscriptionId: Identify the scope within the organization’s Azure infrastructure.
  • Identity.Claims: Contains claims about the user who performed the operation. (john.doe@company.com is identified as the user, with the operation being authenticated via a password).
  • Properties.statusCode and Properties.statusMessage: Provide additional details about the operation’s execution status.
  • Records have been modified from their original form per NDA restrictions.

Mitigation Strategies: To address the incident and fortify our systems against future attacks, we implemented several key measures:

  • Credential Reset and Role Re-assignment: All compromised credentials were immediately reset. We reviewed and tightened the role-based access controls (RBAC) for Azure resources, ensuring that only necessary permissions were granted to users.
  • Enhanced Monitoring and Alerts: We enhanced our monitoring capabilities by configuring more granular alert rules in Azure Monitor to detect any unusual activity quickly.
  • Security Hardening of VMs: All virtual machines were subjected to a security hardening process, including the application of the latest security patches and strict firewall rules.
  • Education and Policy Updates: Conducted an organization-wide security awareness training focusing on phishing defense and safe credential management. Updated the security policies to include stricter controls on resource provisioning and management.

Resolution and Lessons Learned: The quick response and comprehensive investigation allowed us to terminate the unauthorized cryptomining operation without significant damage or data loss. Key lessons learned included the necessity of continuous monitoring for unusual resource usage, the importance of employing strong, multi-factor authentication for all cloud-based accounts, and the need for regular security audits of cloud environments.

Technologies Used:

  • Azure Security Center
  • Azure Monitor
  • Azure Log Analytics
  • Custom scripts for automated response

Outcome: This incident underscored the critical importance of robust security protocols in cloud environments. By swiftly addressing the unauthorized activity and enhancing our defensive measures, we reinforced our cloud infrastructure’s security and resilience against similar threats.


API Security Breach Leading to Kubernetes Compromise
GCP

Incident: This incident involved a significant security breach where exposed APIs on Google Cloud Platform (GCP) were exploited by attackers to compromise a Kubernetes pod. The vulnerability in the API configuration allowed unauthorized access, which extended to the Kubernetes node and cluster, affecting multiple systems.

Detection and Initial Response: The breach was detected when abnormal activity was noted in the Kubernetes environment, including unusual pod behavior and unexpected external communications. Our immediate response involved isolating the compromised pod and conducting a rapid assessment of the Kubernetes cluster to prevent further unauthorized activities.

Investigation and Analysis: Utilizing GCP’s Log Explorer, we conducted a deep dive into the system logs of the affected Kubernetes node and other related resources. The investigation revealed that the exposed API was accessed multiple times by unauthorized external IP addresses. By tracing these accesses and mapping the attack vectors, we identified additional compromised systems within the cluster. See below for a sample record that helped identify the accesses:


"logName": "projects/example-project/logs/cloudaudit.googleapis.com%2Factivity",
"resource": {
    "type": "k8s_cluster",
    "labels": {
        "cluster_name": "example-cluster",
        "location": "us-central1"
    }
},
"severity": "ERROR",
"timestamp": "2024-05-01T15:20:00Z",
"jsonPayload": {
    "actor": {
        "user": "anonymous"
    },
    "action": "kubernetes.io/apiservers/proxy",
    "metadata": {
        "target": "api/v1/namespaces/default/pods/http:web-server:",
        "method": "GET",
        "url": "http://example-cluster/api/v1"
    },
    "authorizationInfo": {
        "granted": false,
        "resourceAttributes": {
            "namespace": "default",
            "verb": "proxy",
            "group": "",
            "resource": "pods"
        }
    },
    "responseStatus": {
        "code": 200,
        "message": "OK"
    },
    "requestMetadata": {
        "callerIp": "185.220.100.242",
        "callerSuppliedUserAgent": "curl/7.58.0",
        "requestId": "abcdef1234567890"
    }
},
"insertId": "1abcd2efgh34ijkl"

Explanation of Key Fields

  • logName: Specifies the type of log recorded in GCP, here indicating an activity log.
  • resource.type: Identifies the type of resource, a Kubernetes cluster in this instance.
  • severity: The level of the log, marked as “ERROR” due to unauthorized access.
  • timestamp: The date and UTC timestamp when the log was recorded.
  • jsonPayload.actor.user: Shows that the actor was anonymous, indicating unauthorized or unauthenticated access.
  • jsonPayload.metadata: Shows that the actor was anonymous, indicating unauthorized or unauthenticated access.
  • jsonPayload.authorizationInfo: Details whether the request was authorized; here, it shows that authorization was not granted, yet the request still succeeded.
  • jsonPayload.responseStatus: Provides the HTTP status code of the request, indicating success despite being unauthorized.
  • jsonPayload.requestMetadata: Contains the IP address of the caller and the user agent, crucial for identifying the unauthorized access source.
  • Records have been modified from their original form per NDA restrictions.

Mitigation Strategies: To address the breach and secure our infrastructure, we implemented the following key actions:

  • Immediate API Restriction: Temporarily disabled the exposed APIs until proper security measures, like API Gateway controls and OAuth scopes, were implemented.
  • Enhanced Security for Kubernetes Clusters: Applied stricter security configurations, including network policies and role-based access controls (RBAC), to all Kubernetes clusters. This also involved updating Kubernetes to the latest stable version with patched vulnerabilities.
  • Comprehensive System Audits: Initiated a full audit of all GCP resources using GCP Security Command Center to identify and remediate any other potential vulnerabilities or misconfigurations.
  • Forensic Analysis: Continued detailed forensic analysis using GCP's security tools to understand the depth of the compromise and to ensure that all malicious footholds were removed.

Resolution and Lessons Learned: The incident was resolved with the containment and eradication of all threats from our systems. Key lessons included the importance of securing APIs with appropriate authentication and authorization checks, the necessity of regular security reviews for cloud-native environments, and the value of proactive monitoring and logging to detect and respond to incidents swiftly.

Technologies Used:

  • GCP Log Explorer
  • GCP Security Command Center
  • GCP API Gateway
  • Kubernetes Engine

Outcome: The breach highlighted critical security considerations for cloud environments, particularly in relation to API exposure and Kubernetes security. Our prompt and effective response not only mitigated the incident without significant damage but also strengthened our security posture against future cyber threats.