Security Inspection and Optimization for Amazon Web Services

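Infrastructure protection is the foundation of information security and is critical for any enterprise. Its core purpose is to protect the enterprise from threats such as unauthorized access, malicious intrusion, and vulnerability exploitation. As digital transformation advances, enterprises depend more and more on cloud computing and network infrastructure, which significantly increases the security risks they face. Customers need to take active measures to manage their cloud configurations. The following points highlight why infrastructure protection matters:

· Preventing data leaks: infrastructure protection keeps unauthorized parties from accessing sensitive data, protecting the enterprise's trade secrets and customer privacy.

· Guarding against vulnerability exploitation: regular security scanning and patch management find and fix vulnerabilities in systems and applications promptly, so that attackers cannot use them to break in.

· Ensuring compliance: many industries are subject to strict data protection and privacy regulations (such as GDPR and HIPAA). Infrastructure protection measures help enterprises meet these requirements and avoid heavy fines and reputational damage.

· Strengthening customer trust: a good security record and strong infrastructure protection increase customers' trust in the enterprise, which improves its market competitiveness and brand image.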

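eCloudrover (伊克罗德) recognizes the key role that infrastructure protection and secure cloud configuration management play in information security, and has developed a comprehensive security protection offering. Using data analysis and advanced security technology, eCloudrover helps customers improve their cloud security and reduce the risk of their business being compromised. Since 2023, eCloudrover has handled a number of security incidents reported by customers, including leaked AK/SK (access key / secret key) credentials, S3 bucket security problems, and network security groups left open to the public. Most of these problems stemmed from insufficient cloud security awareness among operations staff. In response, the eCloudrover after-sales team has drawn up detailed security inspection and optimization plans to keep customers' cloud environments in the best possible state of protection.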

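(1) AWS Service Inspection Plan

As an advanced partner of Amazon Web Services, eCloudrover offers an AWS service inspection plan that checks the status and utilization of a customer's AWS resources in a timely manner. The plan collects and analyzes resource utilization and billing data through APIs, and covers commonly used services such as AWS EBS, EC2, VPC, RDS, S3, EKS, ElastiCache, Redshift, and ECS.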

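Inspection content:

1. Comprehensive security and compliance checks

· AWS Security Top 10: the inspection plan follows the AWS Security Top 10 to help keep the environment secure, compliant, and performant.

· Identity and access management: IAM user and role permissions are checked regularly to confirm that the principle of least privilege is applied, reducing the potential attack surface.

· Logging and monitoring: CloudTrail configuration is reviewed to confirm that all critical operations are recorded, with timely alerting and monitoring in place.

· Data protection: the security of EBS data storage is checked, and the data protection measures on S3 and the security management of KMS are reviewed, to safeguard data confidentiality and integrity.

2. Resource optimization and cost control

· Identifying unused resources: regular checks for unused resources (such as idle EBS volumes) help customers optimize resource configuration and cut unnecessary cost.

· Bill analysis: Cost Explorer data is used to analyze resource usage, identify resources with abnormal cost, and propose optimizations that avoid waste.

3. Performance monitoring and optimization

· CPU utilization monitoring: instance CPU utilization is monitored continuously so that users can identify heavily loaded instances and tune performance in time, keeping the business running efficiently.

· Regular inspections and reports: regular inspections and detailed reports give customers a clear view of the health of their AWS environment and its potential risks.

4. Customized visual Dashboard

· Intuitive data visualization: with a custom Dashboard, customers can view resource usage at a glance. The interface is clean and easy to understand, and clearly shows resource trends.

· Flexible analysis tools: custom code and charting tools provide a flexible overview of resource usage, helping customers monitor and optimize their AWS resources in real time.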
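As an illustration of the least-privilege check mentioned under identity and access management, the following is a minimal sketch, not eCloudrover's actual implementation. It assumes the policy has already been fetched and parsed into a Python dict in the standard IAM policy JSON shape; the function names and the sample policy are hypothetical.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list for Statement/Action/Resource; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_admin_statements(policy_document: dict) -> list[dict]:
    """Return the Allow statements that grant every action on every resource."""
    flagged = []
    for statement in _as_list(policy_document.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        if "*" in actions and "*" in resources:
            flagged.append(statement)
    return flagged


# Hypothetical policy used only to exercise the function.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

flagged_statements = find_admin_statements(sample_policy)
print(len(flagged_statements))
```

A statement flagged here is only a candidate for review; narrower wildcards such as `s3:*` are deliberately not covered by this sketch.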

Permission configuration:

  1. Using a role: configure a SwitchRole role with ReadOnly permissions;

  2. Using SSO: provide the sso_start_url and a SwitchRole role with ReadOnly permissions.
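On the inspecting side, either option usually ends up as a named profile in the AWS CLI/SDK config file. The sketch below renders both profile styles with Python's `configparser`. The keys (`role_arn`, `source_profile`, `sso_start_url`, `sso_region`, `sso_account_id`, `sso_role_name`) are standard AWS config settings; the profile names, account ID, role name, and start URL are placeholders assumed for illustration.

```python
import configparser
import io


def build_aws_config(role_arn: str, sso_start_url: str, sso_region: str,
                     sso_account_id: str, sso_role_name: str) -> str:
    """Render two AWS config profiles: one that assumes a role, one that uses SSO."""
    # interpolation=None keeps any '%' in a value from being treated as a template.
    config = configparser.ConfigParser(interpolation=None)
    config["profile inspection-role"] = {
        "role_arn": role_arn,
        "source_profile": "default",
    }
    config["profile inspection-sso"] = {
        "sso_start_url": sso_start_url,
        "sso_region": sso_region,
        "sso_account_id": sso_account_id,
        "sso_role_name": sso_role_name,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# All values below are placeholders, not real accounts or URLs.
config_text = build_aws_config(
    role_arn="arn:aws:iam::123456789012:role/InspectionReadOnly",
    sso_start_url="https://example.awsapps.com/start",
    sso_region="us-east-1",
    sso_account_id="123456789012",
    sso_role_name="ReadOnlyAccess",
)
print(config_text)
```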

Applicable to:

All AWS Global/CN customers

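Inspection workflow:

Frequency:

· Security settings check: at least once every 24 hours

· Cost anomalies: based on alerts from AWS native cost anomaly detection

· Idle resources: at least once a week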
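One simple way to enforce these intervals is to record when each check last ran and compute which ones are due. The sketch below assumes that approach; the check names and the `due_checks` helper are hypothetical. Cost anomalies are left out because they rely on AWS native alerts rather than a schedule.

```python
from datetime import datetime, timedelta

# Maximum time allowed between runs, taken from the frequencies listed above.
CHECK_INTERVALS = {
    "security_settings": timedelta(hours=24),
    "idle_resources": timedelta(days=7),
}


def due_checks(last_run: dict[str, datetime], now: datetime) -> list[str]:
    """Return the checks that have never run or whose last run is too old."""
    due = []
    for name, interval in CHECK_INTERVALS.items():
        previous = last_run.get(name)
        if previous is None or now - previous >= interval:
            due.append(name)
    return due


now = datetime(2024, 8, 21, 9, 0)
last_run = {
    "security_settings": datetime(2024, 8, 20, 8, 0),  # 25 hours before `now`
    "idle_resources": datetime(2024, 8, 18, 9, 0),     # 3 days before `now`
}
due_now = due_checks(last_run, now)
print(due_now)
```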

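Advantages:

  1. Reviews cloud security configuration on a schedule, following the AWS Security Top 10

  2. Identifies and reviews cloud resources to reduce the attack surface

  3. Checks cloud resource usage regularly to reduce waste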
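Reducing the attack surface often starts with finding security groups that expose sensitive ports to the whole internet. The following sketch assumes the input has the shape of one entry from the EC2 `DescribeSecurityGroups` response (`IpPermissions`, `IpRanges`, `Ipv6Ranges`). The port list is a small illustrative subset and the function name is hypothetical.

```python
# Illustrative subset of ports that should not be reachable from anywhere.
HIGH_RISK_PORTS = {
    22: "SSH",
    1521: "Oracle",
    6379: "Redis",
    11211: "Memcached",
    27017: "MongoDB",
}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_high_risk_ports(security_group: dict) -> list[int]:
    """Return the high-risk ports that the group's ingress rules open to any address."""
    exposed: set[int] = set()
    for rule in security_group.get("IpPermissions", []):
        sources = {r.get("CidrIp") for r in rule.get("IpRanges", [])}
        sources |= {r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])}
        if not sources & OPEN_CIDRS:
            continue  # rule is restricted to specific networks
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            exposed.update(HIGH_RISK_PORTS)
        elif protocol in ("tcp", "udp"):
            low, high = rule.get("FromPort"), rule.get("ToPort")
            if low is not None and high is not None:
                exposed.update(p for p in HIGH_RISK_PORTS if low <= p <= high)
    return sorted(exposed)


# Hypothetical security group: SSH open to the world, Redis limited to a private range.
sample_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 6379, "ToPort": 6379,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ],
}
exposed_ports = open_high_risk_ports(sample_group)
print(exposed_ports)
```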

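(2) eCloudrover AWS Inspection Plan Examples

eCloudrover tailors the inspection service to customer needs. The service works as follows:

Example 1: Cost optimization

Billing information is retrieved through the API, and the items whose cost this month exceeds last month's by more than 10% are filtered out and displayed by category.

An excerpt of the query follows; the base_month and prev_month definitions and the FROM clause are not shown:

Code Snippet: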
select
  case
    when prev_month.service_name is null then 'arn:' || base_month.partition || ':::' || base_month.account_id || ':cost/' || base_month.service
    else 'arn:' || prev_month.partition || ':::' || prev_month.account_id || ':cost/' || prev_month.service
  end as resource,
  case
    when base_month.cost is null then 'skip'
    when prev_month.cost is null then 'ok'
    -- adjust this value to change threshold for the alarm
    when (prev_month.cost - base_month.cost) between (base_month.cost * 0.1)  and 50 then 'info'
    when (prev_month.cost - base_month.cost) > (base_month.cost * 0.1) and (prev_month.cost - base_month.cost) > $1 then 'alarm'
    else 'ok'
  end as status
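The status logic of the query excerpt can be restated in plain Python to make the thresholds easier to follow. This is a simplified reading rather than an exact translation. `baseline` stands for the earlier month (`base_month` in the SQL) and `latest` for the later one (`prev_month`). The 50-unit cutoff is taken from the `info` branch and reused as an assumed value for the `$1` parameter.

```python
from typing import Optional


def classify_cost_change(baseline: Optional[float], latest: Optional[float],
                         pct_threshold: float = 0.10,
                         min_increase: float = 50.0) -> str:
    """Classify a service's month-over-month cost change as skip, ok, info or alarm."""
    if baseline is None:
        return "skip"   # no earlier month to compare against
    if latest is None:
        return "ok"     # the service has no cost in the later month
    increase = latest - baseline
    if increase <= baseline * pct_threshold:
        return "ok"     # growth stays within the percentage threshold
    # Growth is above the percentage threshold: small absolute increases are
    # only informational, larger ones raise an alarm.
    return "alarm" if increase > min_increase else "info"


statuses = [
    classify_cost_change(100.0, 105.0),    # +5%
    classify_cost_change(100.0, 130.0),    # +30%, but only 30 in absolute terms
    classify_cost_change(1000.0, 1200.0),  # +20% and 200 in absolute terms
    classify_cost_change(None, 10.0),
    classify_cost_change(10.0, None),
]
print(statuses)
```

Using two thresholds keeps cheap services from raising alarms when a small absolute increase happens to be a large percentage.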

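Results (the following is inspection data from an AWS CN account):

Example 2: Graphical output of inspection data

Depending on customer needs, data is imported on a schedule into an internally developed charting tool that generates visual boards, giving customers a clear view of resource status. The charts let customers analyze AWS resource usage and cost at a glance, spot abnormal resources and potential waste quickly, and improve resource management.

Example 3: eCloudrover visual monitoring panel

Based on each customer's AWS resource usage, eCloudrover provides a custom Dashboard solution. Code identifies idle resources (such as unused EBS volumes, EIPs, and NAT gateways) and analyzes instance CPU utilization, which helps optimize resource configuration, reduce cost, and improve performance. Monitoring instance CPU utilization helps users identify heavily loaded instances and tune performance in time. Charts visualize the data in a clean, easy-to-understand interface that clearly shows how resources change.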
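As a rough illustration of how idle resources can be identified in code, the sketch below filters records shaped like the EC2 `DescribeVolumes` and `DescribeAddresses` responses. A volume with an empty `Attachments` list is unattached, and an Elastic IP without an `AssociationId` is unassociated. The function name and sample records are hypothetical, and fetching the data from the API is left out.

```python
def find_idle_resources(volumes: list[dict], addresses: list[dict]) -> dict[str, list[str]]:
    """Return IDs of unattached EBS volumes and unassociated Elastic IPs."""
    unattached_volumes = [v["VolumeId"] for v in volumes if not v.get("Attachments")]
    unassociated_eips = [a["AllocationId"] for a in addresses if "AssociationId" not in a]
    return {
        "unattached_volumes": unattached_volumes,
        "unassociated_eips": unassociated_eips,
    }


# Hypothetical records in the shape returned by the EC2 API.
sample_volumes = [
    {"VolumeId": "vol-aaa", "Attachments": [{"InstanceId": "i-111", "State": "attached"}]},
    {"VolumeId": "vol-bbb", "Attachments": []},
]
sample_addresses = [
    {"AllocationId": "eipalloc-111", "AssociationId": "eipassoc-111", "PublicIp": "203.0.113.10"},
    {"AllocationId": "eipalloc-222", "PublicIp": "203.0.113.11"},
]

idle = find_idle_resources(sample_volumes, sample_addresses)
print(idle)
```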

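  1. Panel sample

The dashboard has three parts: an overview of basic AWS resources at the top, analysis, and performance and utilization. Their contents are:

· Basic resource overview: EBS, EC2, unattached EBS, VPC, RDS, S3.

· Analysis views: instances by state, region, and account.

· Performance and utilization: the 10 EC2 and RDS instances with the highest CPU utilization over the past 7 days.

  2. Configuration

The example above is a simple Dashboard built from basic AWS service resources. Its configuration is shown below: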

Code Snippet:
mod "local" {
  title = "Insight-mod-ECR"
}
dashboard "dashboard_total_ec2" {
  title = "ECR Dashboard"
  text {
    value = "ECR will use this dashboard to show you account resource usage and help you optimize your observation resource usage!"
  }
  container {
    title = "AWS Basic Resources Overview"
    # Overview cards
    card {
      query = query.ebs_volume_count
      width = 2
    }
    card {
      query = query.ec2_instance_count
      width = 2
    }
    card {
      query = query.ebs_volume_unattached_count
      width = 2
    }
    card {
      query = query.vpc_count
      width = 2
    }
    card {
      query = query.rds_db_cluster_count
      width = 2
    }
    card {
      query = query.s3_bucket_count
      width = 2
    }
  }
  container {
    title = "Analysis"
    chart {
      title = "Instances by State"
      query = query.ec2_instance_by_state
      type  = "donut"
      width = 4
    }
    chart {
      title = "Instances by Region"
      query = query.ec2_instance_by_region
      type  = "column"
      width = 4
    }
    chart {
      title = "Instances by Account"
      query = query.ec2_instance_by_account
      type  = "column"
      width = 4
    }
  }
  container {
    title = "Performance & Utilization"
    chart {
      title = "EC2 Top 10 CPU - Last 7 days"
      query = query.ec2_top10_cpu_past_week
      type  = "line"
      width = 6
    }
    chart {
      title = "RDS Top 10 CPU - Last 7 days"
      query = query.rds_db_instance_top10_cpu_past_week
      type  = "line"
      width = 6
    }
  }
}

#AWS Basic Resources Overview
query "ebs_volume_count" {
  sql = <<-EOQ
    select
      count(*) as "Volumes"
    from
      aws_ebs_volume;
  EOQ
}
query "ec2_instance_count" {
  sql = <<-EOQ
    select count(*) as "Instances" from aws_ec2_instance
  EOQ
}
query "ebs_volume_unattached_count" {
  sql = <<-EOQ
    select
      count(*) as value,
      'Vol Not In-Use' as label,
      case count(*) when 0 then 'ok' else 'alert' end as "type"
    from
      aws_ebs_volume
    where
      jsonb_array_length(attachments) = 0;
  EOQ
}
query "vpc_count" {
  sql = <<-EOQ
    select count(*) as "VPCs" from aws_vpc;
  EOQ
}
query "rds_db_cluster_count" {
  sql = <<-EOQ
    select count(*) as "DB Clusters" from aws_rds_db_cluster;
  EOQ
}
query "s3_bucket_count" {
  sql = <<-EOQ
    select count(*) as "Buckets" from aws_s3_bucket;
  EOQ
}
#Analysis
query "ec2_instance_by_region" {
  sql = <<-EOQ
    select
      region,
      count(i.*) as total
    from
      aws_ec2_instance as i
    group by
      region
  EOQ
}
query "ec2_instance_by_account" {
  sql = <<-EOQ
    select
      a.title as "Account",
      count(i.*) as "total"
    from
      aws_ec2_instance as i,
      aws_account as a
    where
      a.account_id = i.account_id
    group by
      a.title
    order by
      count(i.*) desc;
  EOQ
}
query "ec2_instance_by_state" {
  sql = <<-EOQ
    select
      instance_state,
      count(instance_state)
    from
      aws_ec2_instance
    group by
      instance_state
  EOQ
}

#Performance & Utilization
query "ec2_top10_cpu_past_week" {
  sql = <<-EOQ
    with top_n as (
      select
        instance_id,
        avg(average)
      from
        aws_ec2_instance_metric_cpu_utilization_daily
      where
        timestamp >= CURRENT_DATE - INTERVAL '7 day'
      group by
        instance_id
      order by
        avg desc
      limit 10
    )
    select
      timestamp,
      instance_id,
      average
    from
      aws_ec2_instance_metric_cpu_utilization_hourly
    where
      timestamp  >= CURRENT_DATE - INTERVAL '7 day'
      and instance_id in (select instance_id from top_n)
    order by
      timestamp;
  EOQ
}
query "rds_db_instance_top10_cpu_past_week" {
  sql = <<-EOQ
    with top_n as (
      select
        db_instance_identifier,
        avg(average)
      from
        aws_rds_db_instance_metric_cpu_utilization_daily
      where
        timestamp  >= CURRENT_DATE - INTERVAL '7 day'
      group by
        db_instance_identifier
      order by
        avg desc
      limit 10
    )
    select
      timestamp,
      db_instance_identifier,
      average
    from
      aws_rds_db_instance_metric_cpu_utilization_hourly
    where
      timestamp  >= CURRENT_DATE - INTERVAL '7 day'
      and db_instance_identifier in (select db_instance_identifier from top_n)
    order by
      timestamp;
  EOQ
}
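
The two Top 10 queries share one idea: average each instance's CPU over the window, then keep the instances with the highest averages. A minimal Python sketch of that selection step follows, assuming the data points are already available as `(instance_id, cpu_percent)` pairs; the function name and sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def top_n_by_average_cpu(datapoints: list[tuple[str, float]], n: int = 10) -> list[tuple[str, float]]:
    """Return the n instances with the highest average CPU, highest first."""
    per_instance: dict[str, list[float]] = defaultdict(list)
    for instance_id, cpu_percent in datapoints:
        per_instance[instance_id].append(cpu_percent)
    averages = {instance_id: mean(values) for instance_id, values in per_instance.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up data points for three instances.
sample_points = [
    ("i-aaa", 80.0),
    ("i-aaa", 60.0),
    ("i-bbb", 20.0),
    ("i-ccc", 95.0),
]
top_two = top_n_by_average_cpu(sample_points, n=2)
print(top_two)
```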

(3) Sample Inspection Checklist

Check items (Security)

  1. Accurate account information

· Ensure security contact information is registered ☑

  2. Use multi-factor authentication (MFA)

· IAM root user MFA should be enabled ☑
· IAM users with console access should have MFA enabled ☑
· IAM administrator users should have MFA enabled ☑

  3. No hard-coding secrets

· EC2 auto scaling group launch configurations user data should not have any sensitive data
· CloudFormation stacks outputs should not have any secrets ☑
· CodeBuild project plaintext environment variables should not contain sensitive AWS values ☑
· EC2 instances user data should not have secrets ☑
· ECS task definition containers should not have secrets passed as environment variables ☑

  4. Limit security groups

· EC2 instances should not be attached to 'launch wizard' security groups ☑
· VPC default security group should not allow inbound and outbound traffic ☑
· VPC security groups should only allow unrestricted incoming traffic for authorized ports ☑
· VPC security groups should restrict ingress from 0.0.0.0/0 or ::/0 to Cassandra ports 7199 or 9160 or 8888 ☑
· VPC security groups should restrict ingress from 0.0.0.0/0 or ::/0 to Memcached port 11211 ☑
· VPC security groups should restrict ingress from 0.0.0.0/0 or ::/0 to MongoDB ports 27017 and 27018 ☑
· VPC security groups should restrict ingress from 0.0.0.0/0 or ::/0 to Oracle ports 1521 or 2483 ☑
· VPC security groups should restrict ingress Kafka port access from 0.0.0.0/0 ☑
· VPC security groups should restrict ingress Redis access from 0.0.0.0/0 ☑
· VPC security groups should restrict ingress SSH access from 0.0.0.0/0 ☑
· VPC security groups should restrict ingress TCP and UDP access from 0.0.0.0/0 ☑
· Security groups should not allow unrestricted access to ports with high risk ☑

  5. Intentional data policies

· API Gateway REST API endpoint type should be configured to private ☑
· Ensure the S3 bucket CloudTrail logs to is not publicly accessible ☑
· EBS snapshots should not be publicly restorable ☑
· EC2 AMIs should restrict public access ☑
· EC2 instances should not have a public IP address ☑
· ECR repositories should prohibit public access ☑
· EFS file systems should restrict public access ☑
· EKS clusters endpoint should restrict public access ☑
· ELB load balancers should prohibit public access ☑
· EMR public access should be blocked at account level ☑
· KMS CMK policies should prohibit public access ☑
· Lambda functions should restrict public access ☑
· RDS DB instances should prohibit public access ☑
· Redshift clusters should prohibit public access ☑
· S3 bucket policy should prohibit public access ☑
· AWS S3 permissions granted to other AWS accounts in bucket policies should be restricted ☑
· S3 buckets should prohibit public read access ☑
· S3 buckets should prohibit public write access ☑
· S3 public access should be blocked at account level ☑
· S3 public access should be blocked at account and bucket levels ☑
· SNS topic policies should prohibit public access ☑
· SQS queue policies should prohibit public access ☑
· SSM documents should not be public ☑

  6. Centralize CloudTrail logs

· At least one multi-region AWS CloudTrail should be present in an account ☑
· At least one trail should be enabled with security best practices ☑
· At least one enabled trail should be present in a region ☑

  7. Validate IAM roles

· Ensure that IAM Access analyzer is enabled for all regions ☑
· IAM Access analyzer should be enabled without findings ☑
· IAM roles should not have read only access for external AWS accounts ☑
· IAM roles that have not been used in 60 days should be removed ☑
· IAM role trust policies should prohibit public access ☑

  8. Rotate keys

· Ensure there is only one active access key available for any single IAM user ☑
· IAM user access keys should be rotated at least every 90 days ☑
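
The two key-rotation checks can be sketched as a small audit over access key metadata. The example assumes records shaped like the `AccessKeyMetadata` entries returned by IAM `ListAccessKeys` (`UserName`, `AccessKeyId`, `Status`, `CreateDate`). The function name, key IDs, and users are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def audit_access_keys(keys: list[dict], now: datetime, max_age_days: int = 90) -> dict[str, list[str]]:
    """Flag active keys older than max_age_days and users with more than one active key."""
    active = [k for k in keys if k.get("Status") == "Active"]
    cutoff = now - timedelta(days=max_age_days)
    stale_keys = sorted(k["AccessKeyId"] for k in active if k["CreateDate"] < cutoff)

    active_per_user: dict[str, int] = {}
    for key in active:
        active_per_user[key["UserName"]] = active_per_user.get(key["UserName"], 0) + 1
    multi_key_users = sorted(user for user, count in active_per_user.items() if count > 1)

    return {"stale_keys": stale_keys, "users_with_multiple_active_keys": multi_key_users}


# Hypothetical key metadata; the IDs are placeholders, not real access keys.
utc = timezone.utc
sample_keys = [
    {"UserName": "alice", "AccessKeyId": "KEY-OLD", "Status": "Active",
     "CreateDate": datetime(2024, 1, 1, tzinfo=utc)},
    {"UserName": "alice", "AccessKeyId": "KEY-NEW", "Status": "Active",
     "CreateDate": datetime(2024, 8, 1, tzinfo=utc)},
    {"UserName": "bob", "AccessKeyId": "KEY-BOB", "Status": "Inactive",
     "CreateDate": datetime(2023, 1, 1, tzinfo=utc)},
]
key_report = audit_access_keys(sample_keys, now=datetime(2024, 8, 21, tzinfo=utc))
print(key_report)
```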

Check items (Cost)

EC2

· Application load balancers having no targets attached should be deleted ☑
· Gateway load balancers having no targets attached should be deleted ☑
· EC2 instances should not use older generation t2, m3, and m4 instance types ☑
· Network load balancers having no targets attached should be deleted ☑

EBS

· EBS volumes attached to stopped instances should be reviewed ☑
· Are there any EBS volumes with low usage? ☑
· Still using gp2 EBS volumes? Should use gp3 instead ☑
· Are there any unattached EBS volumes? ☑

VPC

· Unattached elastic IP addresses (EIPs) should be released ☑
· Unused NAT gateways should be deleted ☑

RDS

· Are there RDS instances using previous gen instance types? ☑
· RDS DB instances with a low number of connections per day should be reviewed ☑

S3

· Buckets should have lifecycle policies ☑
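
Two of the cost checks, older-generation instance types and gp2 volumes, reduce to simple filters over inventory data. The sketch below assumes records with the `InstanceType` and `VolumeType` fields found in the EC2 `DescribeInstances` and `DescribeVolumes` responses. The function name and sample inventory are hypothetical.

```python
# Instance families named in the checklist as older generation.
LEGACY_FAMILIES = ("t2", "m3", "m4")


def find_cost_candidates(instances: list[dict], volumes: list[dict]) -> dict[str, list[str]]:
    """Return instances on older-generation families and volumes still using gp2."""
    legacy_instances = [
        i["InstanceId"] for i in instances
        if i["InstanceType"].split(".")[0] in LEGACY_FAMILIES
    ]
    gp2_volumes = [v["VolumeId"] for v in volumes if v.get("VolumeType") == "gp2"]
    return {"legacy_instances": legacy_instances, "gp2_volumes": gp2_volumes}


# Hypothetical inventory used only to exercise the function.
sample_instances = [
    {"InstanceId": "i-111", "InstanceType": "t2.micro"},
    {"InstanceId": "i-222", "InstanceType": "t3.medium"},
    {"InstanceId": "i-333", "InstanceType": "m4.large"},
]
sample_volume_types = [
    {"VolumeId": "vol-aaa", "VolumeType": "gp2"},
    {"VolumeId": "vol-bbb", "VolumeType": "gp3"},
]
candidates = find_cost_candidates(sample_instances, sample_volume_types)
print(candidates)
```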

(4) Sample Inspection Alert Notification

Check name: Ensure MFA is enabled for the root account

Security dimension: Identity and access control

Check ID: check113

Resource type: IAM

Threat description: The root account is the most privileged user in an AWS account. MFA adds an extra layer of protection on top of a user name and password. With MFA enabled, when a user signs in to an AWS website they are prompted for their user name and password as well as for an authentication code from their AWS MFA device. When virtual MFA is used for root accounts, it is recommended that the device used is NOT a personal device but rather a dedicated mobile device (tablet or phone) that is kept charged and secured independently of any individual's personal devices ("non-personal virtual MFA"). This lessens the risk of losing access to the MFA through device loss or trade-in, or because the individual owning the device is no longer employed at the company.

Mitigation: In the IAM console, navigate to the Dashboard and expand "Activate MFA on your root account".

Reference: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_root-user.html#id_root-user_manage_mfa

Resource list: us-east-1: MFA is not ENABLED for root account
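
A notification like this one can be produced from a small structured record. The sketch below is only an assumption about how such a record might be modeled and rendered. The `Finding` dataclass, its field names, and `render_alert` are hypothetical and do not describe eCloudrover's tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One failed check, with the fields shown in the sample notification."""
    check_name: str
    dimension: str
    check_id: str
    resource_type: str
    mitigation: str
    reference: str
    resources: list[str] = field(default_factory=list)


def render_alert(finding: Finding) -> str:
    """Render a finding as plain text, one field per line."""
    lines = [
        f"Check name: {finding.check_name}",
        f"Security dimension: {finding.dimension}",
        f"Check ID: {finding.check_id}",
        f"Resource type: {finding.resource_type}",
        f"Mitigation: {finding.mitigation}",
        f"Reference: {finding.reference}",
        "Resource list:",
    ]
    lines.extend(f"  {resource}" for resource in finding.resources)
    return "\n".join(lines)


root_mfa = Finding(
    check_name="Ensure MFA is enabled for the root account",
    dimension="Identity and access control",
    check_id="check113",
    resource_type="IAM",
    mitigation="Activate MFA on your root account in the IAM console.",
    reference="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_root-user.html#id_root-user_manage_mfa",
    resources=["us-east-1: MFA is not ENABLED for root account"],
)
alert_text = render_alert(root_mfa)
print(alert_text)
```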

(5) Reference Links

https://aws.amazon.com/cn/blogs/security/top-10-security-items-to-improve-in-your-aws-account/

https://docs.aws.amazon.com/zh_cn/IAM/latest/UserGuide/getting-started-roles.html

https://docs.aws.amazon.com/zh_cn/singlesignon/latest/userguide/using-the-portal.html

If you are interested in this plan, you can get in touch as follows:

Contact email: tech-support@ecloudrover.com


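This article is reposted from: https://blog.csdn.net/eCloudrover_2014/article/details/141392154
Copyright belongs to the original author, 伊克罗德信息科技 (eCloudrover Information Technology). In case of infringement, please contact us for removal.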
