Cloud Concepts AWS Global Infrastructure AWS Well-Architected Framework
IAM AWS Organization and Control Tower EC2
High Availability and Scaling AWS networking services S3
Loosely Decouple Serverless Applications Containers
AWS database services Big Data Deployment
Monitoring Security Migration
Machine Learning Code Other Services
Billing and Pricing Support Some Shots
Conclusion


AWS has a lot of certifications, and a few of them together define a role. You can see all the journeys here. As soon as you are prepared, you can schedule your exam here. Two useful benefits: an extra 30 minutes if you are not a native English speaker, and a 50% discount on your next exam.

An optional exam is AWS Certified Cloud Practitioner (CLF-C01), and the first mandatory one is AWS Certified Solutions Architect Associate (SAA-C03).

The first step to start your journey is creating your account to do your tests. AWS has a HowTo for it. Don't forget not to use the root user for your daily tasks: create a user in the AWS IAM service through the AWS Management Console. The default region available to the user is North Virginia (us-east-1).





Cloud Concepts

Definition: Cloud computing is the on-demand delivery of IT resources over the Internet with pay-as-you-go pricing.

Cloud vs Traditional:

  • Cloud: On-demand, Broad network access, Resource pooling, Rapid elasticity, Measured Service.
  • Traditional: Requires human involvement, Internal accessibility, limited public presence, Single-tenant, can be virtualized, Limited scalability, Usage is not typically measured

Problems solved: flexibility, cost-effectiveness, scalability, elasticity, agility, high availability and fault tolerance

Benefits: Agility, Elasticity, Cost saving (trade fixed expenses for variable expenses), deploy globally in minutes

Advantages of cloud computing

  • On-Demand: Pay only when you consume computing resources, and pay only for how much you consume
  • Economies of scale: lower pay as-you-go prices
  • Elasticity: Scale up and down as required within minutes
  • Increase speed and agility: the cost and time it takes to experiment and develop is significantly lower; speed to create resources; experiment quickly; scalable compute capacity
  • Stop spending money running and maintaining data centers
  • Go global in minutes

Cloud Computing Models

  • IaaS (Infrastructure as a Service): not responsible for the underlying hardware and hypervisor, but responsible for the operating system (OS), data, and applications. Ex: EC2, CloudFormation.
  • PaaS (Platform as a Service): responsible for the applications and data. Customers only need to upload their code/data to create the application. Ex: AWS Elastic Beanstalk; Azure WebApps; Google App Engine.
  • SaaS (Software as a Service): you don't manage anything; you only use the service (Facebook, Salesforce). Just sign up for an account.

Cloud Computing Deployment Models

  • Public Cloud (AWS, Azure, GCP): the resources are owned and operated by the provider, and the services are delivered over the internet
  • Hybrid Cloud: keep some services on-premises. It combines control over sensitive assets with the flexibility of the public cloud.
  • Private Cloud (on-premises): not exposed; it allows automating some processes, but all the management of the stack is the responsibility of the company. It must include self-service, multi-tenancy, metering, and elasticity. Benefits: complete control, security (keep the data and applications in house)
  • Multicloud - use private/public clouds from multiple providers
  • Multicloud - use private/public from multiple providers

Serverless: technologies for running code, managing data, and integrating applications, all without managing servers. Serverless technologies feature automatic scaling, built-in high availability, and a pay-for-use billing model to increase agility and optimize costs. It eliminates infrastructure management tasks like capacity provisioning, patching, and OS maintenance. It does not mean there are no servers.





AWS Global Infrastructure

AWS Global Infrastructure: makes global applications possible (decreased latency, disaster recovery, attack protection)

  • Availability Zones (AZ): one or more discrete data centers with redundant power, networking, and connectivity. Each AZ has independent power, cooling, and physical security and is connected via redundant, ultra-low-latency networks. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center. All traffic between AZs is encrypted. AZs are physically separated by a meaningful distance. A minimum of two AZs is needed to achieve high availability.
  • AWS Regions: physical locations around the world where AWS clusters data centers. Each AWS Region is isolated and consists of multiple physically separated AZs within a geographic area. Minimum of three AZs per region. Criteria for choosing a region: compliance, proximity to the customer, available services (List of AWS Services Available by Region), and pricing.
  • Local Zones: place compute, storage, database, and other select AWS services closer to end users. Each AWS Local Zone location is an extension of an AWS Region.
  • Edge Locations: Content Delivery Network (CDN) endpoints for CloudFront. Deliver content closer to the user.
  • Regional Edge Caches: between your CloudFront Origin servers and the Edge Locations
  • Architecture: Single Region + SingleAZ; Single Region + Multi AZ; Multi Region + Active-Passive; Multi Region + Active-Active
  • In Active-Passive failover it is possible to apply the Failover routing policy

AWS Outposts

  • Virtually any on-premises or edge location
  • It brings AWS data center close to on-premises (racks)
  • Hybrid cloud, fully managed infra, consistency
  • Outposts Racks: Complete Rack (42 U rack)
  • Outposts servers: 1U or 2U
  • Low latency, local data, data residency, easier migration, fully managed service





AWS Well-Architected Framework

The AWS Well-Architected Framework helps you build secure, high-performing, resilient, and efficient infrastructure [1][2][3][4].

AWS Best Practices

  • Scalability (vertical and horizontal)
  • Disposable Resources
  • Automation (serverless, IaC, etc.)
  • Loose Coupling
  • Services, not servers
  • Design for failure -> Distributing workloads across multiple Availability Zones
  • Don't provision capacity for peak load; use elasticity instead

Principles

  • Stop guessing the capacity needs
  • Test systems at production scale
  • Automate
  • Evolutionary architecture
  • Drive architecture using data
  • Simulate applications for flash sale days

Pillars:

  • Operational Excellence: run and monitor system
    • Design Principles: perform operations as code (IaC); annotate documentation; make frequent, small, reversible changes; refine operations procedures; anticipate failure; learn from failures
    • Best Practices: create and use procedures, and validate them; collect metrics; continuous change
  • Security: protect information, systems and assets
    • Design Principles: strong identity foundation; traceability; apply at all layers; automate; protect data in transit and at rest; keep people away from data; prepare for security events
    • Best Practice: control who can do what; identify incidents; maintain confidentiality and integrity of data
  • Reliability: system recover from infra or service disruptions
    • Design Principles: test recovery procedures; automatically recover from failure; scale horizontally; stop guessing capacity; manage change in automation
    • Best Practices: Foundations, Change Management, Failure Management
    • Foundation Services: Amazon VPC, AWS Service Quotas, AWS Trusted Advisor
    • Change management: CloudWatch, CloudTrail, AWS Config
  • Performance Efficiency: use compute resources efficiently
    • Design Principles: democratize advanced technology; go global in minutes; experiment more often; Mechanical sympathy
    • Best practices: data-driven approach; review the choices; make trade-offs
  • Cost Optimization: run systems to deliver value at the lowest price point
    • Design Principles: adopt a consumption model; measure overall efficiency; stop spending money on data center operations; analyze and attribute expenditure; use managed and application-level services to reduce cost
    • Best Practices: using the appropriate services, resources, and configurations for the specific workloads
  • Sustainability (shared responsibility): minimizing the environmental impacts of running cloud workloads
    • Design Principles: understand impacts; establish sustainability goals; maximize utilization; anticipate and adopt new solutions; use managed services; reduce downstream impact




IAM - AWS Identity and Access Management

AWS Identity and Access Management (IAM) is the AWS service used to specify how people, tools, and applications access AWS services and data [1][2][3].

  • IAM is a Global service (it does not apply per region) used to control access to AWS resources (authentication/authorization). It can be used to manage users, groups, access policies, user credentials, user password policies, MFA, and API keys.
  • Root has full administrative permissions and complete access to all AWS services and resources. Actions allowed only to root: change account settings, close the account, restore IAM permissions, change or cancel the AWS support plan, register as a seller, configure an S3 bucket to enable MFA Delete, edit/delete S3 bucket policies. SCPs can limit the root account.
  • When Identity Federation (AD, Facebook, SAML, OpenID) is configured, an IAM user account is not necessary
  • Power User has broad permissions, but not to manage groups and users in IAM.
  • IAM Security Tool:
    • IAM Credential Report (account-level): account's users and their credential status. Access it by IAM menu Credential Report
    • IAM Access Advisor (user-level): services permissions and last access. Access it by IAM User menu. Use it to identify unnecessary permissions that have been assigned to users.

A user is an entity (person or service) created without permissions (by default) with access to an AWS account. Users are created with NO access to any AWS services; they can only log in to the AWS console. Permissions must be explicitly granted. Users log in with a user name and password. They can change configurations or delete resources in your AWS account (once granted permission). Users created to represent an application are known as "service accounts". You can have up to 5,000 users per AWS account.

Groups are a way to organize users (only) and apply policies (permissions) to a collection of users at the same time. A user can belong to multiple groups. Groups contain only users and cannot be nested (no groups within groups). A group is not an identity, so it cannot be referenced in policies.

Roles delegate permissions. Roles are assumed by users, applications, and services. A role can provide temporary security credentials (via STS - Security Token Service) for each role session. IAM roles also make it possible to access cross-account resources. A role is assumed by a trusted entity.

A policy manages access and can be attached to users, groups, roles, or resources. When it is associated with an identity or resource, it defines their permissions. It is a document written in JSON. The policy is evaluated when a user or role makes a request, and the permissions inside it determine whether the request is allowed or denied. Best practice: least privilege. The types of policies are: identity-based policies (users, groups, roles), resource-based policies (resources), permissions boundaries (maximum permissions), AWS Organizations service control policies (SCPs) (maximum permissions for an organization), access control lists (ACLs), and session policies (AssumeRole* API actions). Policy main elements (see the sketch after this list):

  • Version
  • Effect: allow/deny
  • Action: type of action that should be allowed or denied
  • Resource: specifies the object or objects that the policy statement covers
  • Condition: circumstances under which the policy grants permission
  • Principal: account, user, role, or federated user
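
To make the elements concrete, a minimal sketch (assuming boto3; the policy name and bucket are hypothetical) that creates an identity-based policy from a JSON document:

```python
import json
import boto3

iam = boto3.client("iam")

# Hypothetical identity-based policy: read-only access to one bucket.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",                            # Effect: allow/deny
            "Action": ["s3:GetObject", "s3:ListBucket"],  # actions allowed
            "Resource": [                                 # objects the statement covers
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
            # Condition: circumstances under which permission is granted
            "Condition": {"Bool": {"aws:SecureTransport": "true"}},
        }
    ],
}

iam.create_policy(
    PolicyName="ExampleS3ReadOnly",   # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)
```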

IAM database authentication is another way to authenticate a user's credentials when accessing a database.

Access keys are used for programmatic access (API/SDK/CLI). They are generated through the AWS Console.

SSH keys are an IAM feature that allows developers to access certain AWS services (such as AWS CodeCommit) through SSH or the AWS CLI.

Best Practices

  • Use an IAM user instead of the root user for regular activities
  • Add users into groups
  • Strong password policy
  • Use MFA
  • Create roles for permissions to AWS services
  • Use access keys for programmatic access (CLI/SDK)
  • Audit permissions through IAM Credential Reports and IAM Access Advisor
  • Protect your access keys
  • Prefer customer managed policies (AWS managed policies cannot be edited)
  • Use roles for applications that run on EC2 and delegate permissions
  • Rotate credentials
  • Grant only the permissions that are really needed (least privilege)





AWS Organization and Control Tower

AWS Organization[1][2]:

  • it is a collection of AWS accounts where it is possible to manage these accounts centrally, apply policies, delegate responsibilities, apply SSO, share resources within the organization, and use CloudTrail across the accounts.
  • RAM (AWS Resource Access Manager) makes it easy to share resources across AWS accounts. Free to use; you pay only for the shared resources. Participants cannot modify shared resources.
  • OU: an Organizational Unit is a logical group of accounts
  • For cross-account access it is better to create a role instead of a new IAM user. Roles provide temporary credentials.
  • It provides volume discounts for EC2 and S3 aggregated across the member AWS accounts.
  • Consolidated billing: one bill for multiple accounts and volume discounts as usage in all accounts is combined; easy tracking of charges across accounts; combined usage across accounts and sharing of volume pricing discounts, Reserved Instance discounts, and Savings Plans.
  • Service Control Policies (SCPs) belong to AWS Organizations and can restrict the permissions available in member AWS accounts, but NOT grant permissions. They can be used to apply restrictions across multiple member accounts (deny rules). They affect only IAM users and roles (not resource policies)

Control Tower:

  • It sits on top of Organizations and supports some additional features, such as creating a Landing Zone (a multi-account baseline) that Control Tower deploys.
  • It sets up and governs a secure and compliant multi-account AWS environment.
  • Monitor compliance through a dashboard. Supports Preventive Guardrails using SCPs (e.g., restrict regions across accounts) and Detective Guardrails using AWS Config (e.g., identify untagged resources).
  • Features:
    • Landing Zone: well-architected, multi-account environment based on compliance and security best practices.
    • Guardrails: high-level rules providing continuous governance -> Preventive (ensures accounts maintain governance by disallowing violating actions; leverages service control policies; status of enforced or not enabled; supported in all Regions) and Detective (detects and alerts within all accounts; leverages AWS Config rules; status of clear, in violation, or not enabled; applies to some Regions)
    • Account Factory: configurable account template
    • CloudFormation StackSet: automated deployment of templates
    • Shared accounts: three accounts used by Control Tower created during landing zone creation




EC2 - Elastic Compute Cloud

Amazon EC2 (Elastic Compute Cloud)[1][2] is a virtual machine that is managed by AWS.

  • IaaS (Infrastructure as a service)
  • A new instance combines CPU, memory, storage, and networking. The different types were created to optimize different use cases. `t2.micro` is an example of an instance type: the letter 't' represents the instance class, the number '2' the generation, and the last part the size. Each category balances different characteristics:
    • compute: require high performance (batch, media transcoding, HPC, machine learning, gaming) - Ex. C8g
    • memory: process large data sets in memory (relational/non-relational database, distributed cache, in-memory database for BI, real-time unstructured data) - Ex. R8g
    • storage: high, sequential read and write access to large data sets on local storage (OLTP, relational/non-relational databases, cache, data warehouse, distributed file systems) - Ex. I4g
    • networking
  • Secure, resizable compute capacity in the cloud. Designed to make web-scale cloud computing easier for developers
  • It can run virtual server instances in the cloud
  • Each instance can run Windows/Linux/MacOS
  • It can store data (EBS/EFS), distribute load (ELB), and scale services (ASG)
  • Volumes: EBS (persist) and Instance Store (Non-Persistent)
  • Bootstrap scripts: scripts that run when the instance first boots (EC2 User Data scripts). They can install updates, software, etc. These scripts run as the root user (see the launch sketch after this list).
  • Instance metadata is information about the instance. User data and metadata are not encrypted. The metadata is available at http://169.254.169.254/latest/meta-data. To review the scripts used to bootstrap the instance at runtime, access http://169.254.169.254/latest/user-data
  • When the instance is stopped and started again, the public IP will change. The private IP does not change.
  • If you have a legacy application, an EC2 instance is a good solution for migrating it to the cloud right-sized (with the right amount of resources for the application)
  • Key pair to access EC2: public key (stored in AWS) + private key file (stored locally). It is used to connect to the EC2 instance.
  • Get metrics in CloudWatch, logs in CloudTrail
  • Shared Responsibility
    • AWS: Infrastructure (global network security), isolation on physical hosts, replacing faulty hardware, compliance validation
    • Customer: Security Group rules, OS patches and updates, software and utilities installed on the EC2 instance, IAM roles assigned to EC2 and IAM user access management, data security on your instance.
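
A minimal launch sketch with boto3, assuming a default VPC; the AMI ID and key pair name are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Bootstrap script (EC2 user data) runs as root on first boot.
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI ID
    InstanceType="t2.micro",          # class 't', generation '2', size 'micro'
    KeyName="my-key-pair",            # hypothetical key pair for SSH access
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)
print(response["Instances"][0]["InstanceId"])
```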

VMWare on AWS: for hybrid cloud, cloud migration, disaster recovery, leverage AWS.

Amazon EC2 Spot Instances

  • Let you take advantage of unused EC2 capacity in the AWS cloud. Spot Instances are available at up to a 90% discount compared to On-Demand prices. You can use Spot Instances for various stateless, fault-tolerant, or flexible applications such as big data, containerized workloads, CI/CD, web servers, high-performance computing (HPC), and test & development workloads.
  • Useful for workloads resilient to failure (batch, data analysis, Image processing, distributed workload, CI/CD and testing). However, it is not suitable for critical jobs or persistent workload and databases.
  • Useful when workload is not immediate and can be stopped for a moment and continue from that point after
  • Spot Fleet: a collection of Spot and On-Demand Instances. It will try to match the target capacity within your price constraints. Strategies: capacityOptimized, lowestPrice, diversified, InstancePoolsToUseCount

EC2 Image Builder

  • It creates virtual machine or container images
  • Automates the creation, maintenance, validation, and testing of EC2 AMIs
  • The execution can be scheduled, and after the process the AMI can be distributed (to multiple regions)

Amazon AMI (Amazon Machine Image)

  • Template of root volume + launch permissions + block device mapping for the volumes to attach
  • Used to launch one or more pre-configured EC2 instances
  • It can be customized
  • It is built for a specific region. The AMI must be in the same region as the EC2 instance to be launched, but it can be copied to another region where you want to create another instance.
  • It can be copied to other regions via the console, command line, or the API
  • An EBS snapshot is created when an AMI is built
  • Category:
    • Amazon EBS: created from an Amazon EBS snapshot. The instance can be stopped. Data is not lost on stop or reboot. By default, the root device volume is deleted on termination
    • Instance Store: created from a template stored in S3. If you delete (terminate) the instance, the volume is deleted as well. If the instance fails you lose the data; on reboot the data is not lost. The instance cannot be stopped.

EC2 Hibernate: suspend to disk. Hibernation saves the contents of RAM to the EBS root volume. On start, the EBS root volume is restored and the RAM contents are reloaded. Faster boot. Maximum time an instance can stay in hibernation: 60 days.

Storage:

  • EC2 Instance Store is an alternative to EBS with high-performance hardware disks and better I/O performance. However, the data is lost when the instance stops (but not when it reboots). The best scenarios for it are, e.g., buffers, caches, temporary content.
    • AWS: Infrastructure, replication of data for EBS volumes and EFS drives, replacing faulty hardware, ensuring their employees cannot access your data.
    • Customer: backup and snapshot procedures, data encryption, risk analysis
  • EBS - Amazon Elastic Block Store [1][2][3][4]
    • EBS Volume: attached to one instance.
    • Designed for mission-critical workload
    • High Availability: Automatically replicated within a single AZ
    • Scalable: dynamically increase capacity and change the volume type with no impact
    • EBS volumes do not need to be attached to an instance. There is the root volume. It is a good practice to create your own (separate) data volume.
    • It allows the instance to persist data even after termination; however, root EBS volumes are deleted on termination by default
    • EBS volumes cannot be accessed simultaneously by multiple EC2 instances (only with constraints: Amazon EBS Multi-Attach can attach a volume to multiple instances in the same AZ, only for SSD volumes, allowed only in some regions, and with other restrictions)
    • It can be mounted to one instance at a time and can be attached and detached from one EC2 instance to another quickly. However, it is locked to an AZ. To move it to another AZ, create a snapshot, which can be copied across AZs or Regions.
    • A snapshot is a backup of the EBS volume at a point in time. Snapshots are stored on Amazon S3 and they are incremental. EBS snapshot features include EBS Snapshot Archive and the Recycle Bin for EBS snapshots. Snapshot processes (creation, deletion, updates) can be automated with DLM (Data Lifecycle Manager). See the snapshot sketch after this Storage list.
    • It has a limited performance.
    • Pricing: Volumes type (performance); storage volume in GB per month provisioned; Snapshots (data storage per month); Data Transfer (OUT)
    • EBS Volume Types:
      • gp2/gp3 (SSD): general purpose; balance between price and performance; 3K IOPS and 125 MB/s (up to 16K IOPS and 1K MiB/s). Use cases: high performance at a low cost (MySQL, virtual desktops, Hadoop).
      • io1 (SSD): high performance and most expensive. 64K IOPS per volume, 50 IOPS per GiB. Critical low-latency or high-throughput workloads. Use cases: large databases, legacy applications
      • io2 (SSD): higher durability. 500 IOPS per GiB (same price as io1). I/O-intensive apps, large databases
      • st1 (HDD): low cost, frequently accessed and throughput-intensive workloads (big data, data warehouses)
      • sc1 (HDD): lowest cost; less frequently accessed data. Only SSDs can be boot volumes.
    • Encryption: uses KMS; if the volume is created encrypted, the data in transit is encrypted, the snapshots are encrypted, and volumes created from those snapshots are encrypted. A copy of an unencrypted volume can be encrypted.
  • EFS - Amazon Elastic File System [1][2][3]
    • Network File System (NFS) for Linux instances and linux-based applications in multi-AZ.
    • Shared file storage service for EC2 instances.
    • It is considered highly available, scalable, and pay-per-use (expensive).
    • Tiers: frequent access (Standard) and not frequent access (IA)
    • EFS Infrequent Access (EFS-IA) is a storage class that is cost-optimized for files not accessed and has lower cost than EFS standard. It is based on the last access. You can use a policy to move a file from EFS Standard to EFS-IA.
    • Encryption at rest using KMS
  • Amazon FSx
    • for Windows File Server: fully managed Microsoft Windows file servers with a native Microsoft Windows file system. Makes migration easy
    • for Lustre: managed file system optimized for compute-intensive workloads (HPC, machine learning, media data processing). For the Linux file system. It can store data in S3.
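
To make the snapshot workflow concrete, a minimal boto3 sketch; the volume ID and regions are hypothetical:

```python
import boto3

# Snapshots are stored in S3 (managed by AWS) and are incremental.
ec2_east = boto3.client("ec2", region_name="us-east-1")

snapshot = ec2_east.create_snapshot(
    VolumeId="vol-0123456789abcdef0",   # hypothetical EBS volume
    Description="Point-in-time backup before maintenance",
)
ec2_east.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

# To move a volume across AZs/Regions, copy the snapshot and
# re-create the volume from the copy in the destination.
ec2_west = boto3.client("ec2", region_name="us-west-2")
copy = ec2_west.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId=snapshot["SnapshotId"],
    Encrypted=True,   # a copy of an unencrypted volume can be encrypted
)
print(copy["SnapshotId"])
```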

Network:

  • ENI (Elastic Network Interface): Virtual Network card
    • Attributes: Primary and secondary IPv4 address, Elastic IPv4, public IPv4/IPv6, security group, MAC address, source/destination check flag, description
    • Attached to instances
    • eth0 is the primary ENI, created when an EC2 instance is launched
    • Use cases: create a management network; use network and security compliances in your VPC; create dual-homed instances with workloads/roles on distinct subnets; create a low-budget, high-availability solution
  • EN (Enhanced Networking): uses single root I/O virtualization (SR-IOV) to provide high performance (1 Gbps - 100 Gbps)
    • ENA (Elastic Network Adapter): enables Enhanced Networking, which provides higher bandwidth, higher packet-per-second (PPS) performance, and consistently lower inter-instance latencies (up to 100 Gbps)
    • VF (Intel 82599 Virtual Function Interface): For older instances (up to 10 Gbps)
  • EFA (Elastic Fabric Adapter): ENA with more capabilities. It is a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS.

Security

Security Group (SG):

  • Virtual firewall to ENI/EC2 instance
  • Instance level (can be attached to multiple instances)
  • Applied to the network security, controlling the traffic into or out of the EC2 instance
  • By default, inbound traffic is blocked and outbound traffic is authorised
  • ALLOW rules only: a security group contains only allow rules, and these rules can reference IP ranges or other security groups.
  • Stateful: if an inbound rule allows the traffic, the outbound response is automatically allowed without a rule; if the outgoing request is made by the instance and a rule allows the outbound traffic, the inbound return traffic is automatically allowed. Supports only allow rules. Security Groups can be associated with a NAT instance. All rules are evaluated before deciding whether to allow traffic.
  • Locked down to a region/VPC combination.
  • Protect against low-level network attacks like UDP floods.
  • They regulate access to ports and authorised IP ranges.
  • A good practice is to create a separate security group for SSH access (see the sketch after this list).
  • Tips: a timeout error is a security group issue; a connection refused error can be an application error.
  • Security groups can be changed for an instance when the instance is in the running or stopped state.
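
A minimal sketch of opening SSH from a single admin IP range; the group ID and CIDR are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# Allow inbound SSH (port 22) only from a known admin CIDR.
# Outbound traffic is authorised by default, and return traffic
# is allowed automatically because security groups are stateful.
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",   # hypothetical SSH-only security group
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": "203.0.113.0/24", "Description": "admin network"}],
        }
    ],
)
```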

Performance

Placement groups: a placement optimization strategy to meet the needs of your workload; you launch a group of interdependent EC2 instances into a placement group to influence their placement. It can be:

  • Cluster: grouping of instances within a single AZ. Low latency and high throughput. It can't span multiple AZs
  • Partition: a set of racks, each with its own network and power source. Multiple EC2 instances
  • Spread: group of instances that are each placed on distinct underlying hardware. Small number of critical instances that should be separate from each other. A spread placement group can span multiple Availability Zones in the same Region. You can have a maximum of seven running instances per Availability Zone per group.

EC2 Pricing: the price depends on the instances (number, type), load balancers, IP addresses, etc. You can use the AWS Pricing Calculator to estimate the cost.

  • On-Demand: short workloads, predictable pricing, billing per second/hour, pay for what you use, no long-term commitment, highest cost, no discount. Best for short-term, uninterrupted workloads.
  • Reservations (1-3 years): predicted workloads. Available for various services like EC2, DynamoDB, ElastiCache, RDS, and Redshift. Pay upfront. The remaining term of Reserved Instances can be sold on the Marketplace
    • Reserved instances (RI): long workloads; has a big discount and has as scope Regional or Zonal. Indicated for steady-state usage application. It cannot be interrupted (up to 72% off the on-demand price)
    • Convertible Reserved Instances: long workload with flexible instances; gives a big discount. This model change the attributes of the RI as long as the exchange results in the creation of RIs of equal or greater value (up to 54% off the on-demand price)
  • EC2 Savings Plan: reduce compute cost based on long term (1-3y). Locked to a specific instance family and region. Lot of flexibility (EC2, Fargate, Lambda). No Upfront or Partial Upfront or All Upfront Payments
  • Spot Instance: high discount (up to 90%). The most cost-efficient instances in AWS. Urgent capacity; flexible; cost sensitive. Use for apps with flexible start and end times and workloads that need cheap compute (image rendering, genomic sequencing). Do not use if you need a guarantee of run time.
  • Dedicated host (single customer, your VPC): physical server with EC2 instance dedicated, can use your own licenses. It can be purchasing On-Demand or Reserved. It is the most expensive.
  • Dedicated Instance: single customer, isolated hardware dedicated to your application, but this hardware can be shared with other instances in the same account. Compliance, Licensing, on-Demand, Reserved.
  • Minimum charge: one-minute for Linux based EC2 instances.





High Availability and Scaling

These are features to be used to ensure elasticity and high availability. They can be used together.

Scalability

  • Handle greater loads by adapting
  • Scale Up: scale by adding more power (CPU/RAM) to an existing machine/node. The operation runs on only one computer.
  • Scale Out: scale by adding more instances to the existing pool of resources. Fault tolerance is achieved by scale-out operations.
  • Scale In: decrease the number of instances.
  • Vertical: increase the size of the instance. Common for non-distributed systems. Limited, e.g., by hardware.
  • Horizontal [1]: increase the number of instances. Distributed systems. Common for web applications. Auto Scaling Group and Load Balancer [2]. Instances launched by your Auto Scaling group are automatically registered with the load balancer [3].
  • High Availability: direct relationship with horizontal scalability. No interruption even during failover. Run across multiple AZs, at least 2 AZs

Set Up

  • Launch Template: all the settings needed to build an EC2 instance; for all EC2 auto scaling features; supports versioning; more granularity; recommended. The specification of a network interface has considerations and limitations that need to be taken into account in order to avoid errors. [1]
  • Launch Configuration: only certain EC2 Auto Scaling feature; immutable; limited configuration options; specific use cases

Auto Scaling: creates and removes instances as necessary. It can use a launch configuration (an instance configuration template) that an Auto Scaling Group uses to launch Amazon EC2 instances. [1][2][3]

  • ASG contains a collection of EC2 instances (logical group)
  • Replace unhealthy instances.
  • Only run at an optimal capacity.
  • AWS EC2 Auto Scaling provides elasticity and scalability.
  • High availability can be achieved with Auto Scaling balancing your EC2 count across the AZs
  • Scaling Policies: minimum, maximum and desired capacity
    • Step Scaling Policy: launch resources in response to demand. It does not guarantee the resources are ready when necessary
    • Simple Scaling Policy: Relies on metrics for scaling needs, e.g., add 1 instance when CPU utilization metric > 70%.
    • Target Tracking Policy: uses a scaling metric and value that the ASG should maintain at all times, e.g., maintain ASGAverageCPUUtilization at 50%
  • Instance Warm-Up: prevents instances from being placed behind the load balancer before they are ready, failing the health check, and being terminated prematurely
  • Cooldown: pauses Auto Scaling for a set amount of time so it does not launch or terminate more instances
  • Avoid thrashing: launching and terminating instances very fast
  • Scaling types:
    • Reactive scaling: Monitors and automatically adjusts the capacity; predictable performance at the lowest possible cost. It, e.g, add/remove (Scale out/in) EC2 instances when the load is increased/decreased.
    • Scheduled Scaling (predictable workflow) can be configured for known increase in app traffic.
    • Predictive Scaling: uses daily and weekly trends to determine when to scale
  • Strategy: Manual or Dynamic (1. Simple/Step Scaling (CloudWatch); 2. Target Tracking Scaling; 3. Scheduled Scaling). See the policy sketch after this list.
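
A minimal target tracking sketch with boto3; the ASG name is hypothetical:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: keep the group's average CPU at 50% at all times;
# the ASG scales out/in automatically to maintain the target.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # hypothetical ASG
    PolicyName="keep-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```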

Scaling Relational Database

  • Vertical Scaling: resize the database
  • Scaling Storage: storage can be resized up, but it is not able to scale back down (RDS, Aurora)
  • Read Replicas: read-only copies to spread out the read workload. Use multiple AZs.
  • Aurora Serverless: offload scaling to AWS. Unpredictable workloads. Aurora is the only engine that offers a serverless scaling option.

Scaling Non-Relational Database

  • AWS does this for you
  • Types:
    • Provisioned: predictable workflow; need to review past usage to set upper and lower scaling bounds; most cost-effective model
    • On-Demand: sporadic workflow; less cost effective;
  • Concepts:
    • Read Capacity Unit (RCU): DynamoDB unit of measure for reads per second for an item up to 4KB in size. Example: for 7KB objects, 2 RCUs are needed for 1 strongly consistent read per second (1 RCU covers one strongly consistent read of up to 4KB, so 7KB rounds up to 8KB = 2 RCUs)
    • Write Capacity Unit (WCU): DynamoDB unit of measure for writes per second for an item up to 1KB in size. Example: a 3KB object needs 3 WCUs (1 WCU covers one write of up to 1KB per second, so 3KB = 3 WCUs). See the helper sketch after this list.
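
A small helper capturing the rounding arithmetic above (assuming strongly consistent reads; eventually consistent reads would need half as many RCUs):

```python
import math

def rcus_needed(item_size_kb: float, reads_per_second: int) -> int:
    # 1 RCU = one strongly consistent read per second of an item up to 4KB;
    # the item size is rounded up to the next 4KB boundary.
    return math.ceil(item_size_kb / 4) * reads_per_second

def wcus_needed(item_size_kb: float, writes_per_second: int) -> int:
    # 1 WCU = one write per second of an item up to 1KB.
    return math.ceil(item_size_kb) * writes_per_second

print(rcus_needed(7, 1))  # 2 RCUs for one 7KB strongly consistent read per second
print(wcus_needed(3, 1))  # 3 WCUs for one 3KB write per second
```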

Disaster Recovery

  • RPO - Recovery Point Objective: the point in time to which you can recover (24h, 5 minutes, etc.) (how often you run backups - how far back in time you go to get the data to recover)
  • RTO - Recovery Time Objective: how fast you recover; how long the business can wait (when you are back up after a disaster)
  • Strategies
    • Backup and Restore: restore from a snapshot (cheapest but slowest)
    • Pilot Light: does not consume the same level of resources; only a minimal core of the environment is kept running, ready to provision the rest when needed (faster than backup and restore, but some downtime)
    • Warm Standby: provision the services necessary to keep the applications up (quicker recovery time than Pilot Light but more expensive)
    • Multi Site / Hot Site Approach: low RTO (expensive); full production scale running in AWS and on-premises
    • All-AWS Multi Region: the best approach for data replication is Aurora Global
    • Active/Active Failover: requires completely duplicated services (the most expensive, but no downtime and the lowest RTO and RPO)
  • AWS Elastic Disaster Recovery(DRS): recover physical, virtual and cloud-based servers into AWS

AWS Backup:

  • Supports PITR for supported services
  • It can be done on-demand or schedule
  • Cross-Region/Cross-Account backups
  • Tag-based backup policies
  • Backup Plans - frequency, retention
  • Backup Vault Lock: enforce WORM (Write Once Read Many) for backups



AWS networking services

VPC (Virtual Private Cloud): isolated network in the AWS cloud that can be fully customized. It is a virtual data center. A range of IPs [1][2] is defined when the VPC is created [3][4][5][6]

  • Subnet: partition of the network inside the VPC and an AZ. A public subnet is accessible from the internet. Instances are launched into subnets. Configure subnets in at least two AZs for high availability
  • Elastic IP: static IP for a public IP in EC2 instance
  • Route Tables: make possible the access to the internet and between subnets.
  • Internet Gateways: help the VPC connect to the internet. The public subnet has a route to the internet gateway; a private subnet does NOT have a route to the Internet Gateway.
  • NAT Gateway (AWS-managed) and NAT instance (self-managed): allow instances inside private subnets to access the internet or other services while denying inbound traffic from the internet. A NAT Gateway is automatically assigned a public IP (Elastic IP) and is redundant inside an AZ. NAT Gateways do not need to be patched. NAT instances support port forwarding and are associated with security groups
  • VPC Endpoint [1]: connect to AWS services using the private network. It can be combined with PrivateLink, and no NAT, gateways, etc. are necessary. Traffic does not leave the AWS environment. They are horizontally scaled, redundant, and highly available. Types: Interface Endpoints (an elastic network interface with a private IP - supports many services) and Gateway Endpoints (a virtual device you provision, similar to a NAT gateway - supports S3 and DynamoDB)

VPC Security

  • Network ACL (Access Control List): the first line of defense. It is subnet level: a firewall for subnets (IP-based only), controlling traffic in and out of one or more subnets. Stateless: you have to allow both inbound and outbound traffic (it checks for an allow rule in both directions). Supports allow and deny rules. The customer is responsible for configuring it. The default ACL allows all outbound and inbound traffic. A custom ACL denies inbound and outbound traffic by default. A subnet will be associated with the default ACL unless another is specified. A subnet is associated with only one ACL, but an ACL can be associated with multiple subnets. Rules are evaluated in order, starting with the lowest-numbered rule (first match wins).

VPC Connections

  • VPC Peering [1]: connect two VPCs via a direct network route using private IPs; cross-account and cross-region, no transitive peering, and the VPCs must not have overlapping CIDRs. It can reference Security Groups cross-account but not cross-region.
  • PrivateLink: provides private connectivity between VPCs, AWS services, and your on-premises networks, without exposing your traffic to the public internet. It is the best option to expose a service to many of customer VPC. It does not need VPC peering, or route tables or Gateways. It needs a Network Load Balancer (NLB) on the service VPC and an ENI on the customer VPC.
  • VPN CloudHub: multiple sites, each with its own VPN connection. The traffic is encrypted.
  • VPN - Virtual Private Network[1]:
    • Establish secure connections between your on-premises networks and VPC using a secure and private connection with IPsec and TLS. Encrypted network connectivity
    • Site-to-Site VPN: connects an on-premises network to a VPC over the public internet (encrypted). It needs a Virtual Private Gateway (VGW), the VPN concentrator on the AWS side of the connection, and a Customer Gateway (CGW) on the customer side. It can be used as a backup connection in case DX fails
    • AWS Managed VPN: tunnels from the VPC to on-premises
    • VPN Gateway: connects one VPC to the customer network
    • Customer Gateway: installed in the customer network
    • Client VPN: connect from your computer using OpenVPN. Connect to EC2 instances over private IPs.
  • Direct Connect (DX) [3][4]: physical (private) connection between on-premises and AWS. Types: dedicated network connection (physical ethernet connection associated with a single customer) or hosted connection (provisioned by a partner). No public internet. To reach multiple VPCs, the company should use AWS Transit Gateway. Using only DX, data in transit is not encrypted but is private; DX + VPN provides an IPsec-encrypted private connection. Resiliency: use two Direct Connect locations, each with two independent connections.
  • AWS Transit Gateway [1]: connect VPC and on-premise network using a central hub working as a router. It allows a transitive peering; works on a hub-and-spoke model; works on a regional basis (cannot have it across multiple regions but can use it across multiple accounts.)

5G Networking with AWS Wavelength: infrastructure embedded within the telecommunication providers' datacenters at the edge of the 5G network

VPC Flow Logs: capture information about the IP traffic going to and from network interfaces in your VPC.

Bastion Host: an instance in a public subnet that handles the communication between the internet and EC2 instances in private subnets via SSH. For this, configure the bastion host's security group to allow inbound traffic from the internet on port 22.

VPC sharing allows multiple AWS accounts to create their application resources into shared and centrally-managed Amazon VPCs. To set this up, the account that owns the VPC (owner) shares one or more subnets with other accounts (participants) that belong to the same organization from AWS Organizations. After a subnet is shared, the participants can view, create, modify, and delete their application resources in the subnets shared with them. Participants cannot view, modify, or delete resources that belong to other participants or the VPC owner. You can share Amazon VPCs to leverage the implicit routing within a VPC for applications that require a high degree of interconnectivity and are within the same trust boundaries.

Route 53 [1][2][3]:

  • Global Managed DNS supported by AWS:
    • DNS: Convert name to IP
    • TTL: time to live
    • CNAME (canonical name): maps a domain name to another domain name. Cannot be used for naked domain names
    • Alias: maps a host name to an AWS resource. You can't set the TTL, and you cannot set an Alias record to an EC2 DNS name. An alias record points a domain name to an AWS resource, such as an Elastic Beanstalk environment, an Amazon CloudFront distribution, or an Amazon S3 bucket. It can be used to create subdomains or point a domain name to a different AWS service. [1]
    • Alias records can be used for naked (apex) domain names.
    • DNS: Port 53
    • DNS does not route any traffic but responds to the DNS queries
  • Reliable and cost-effective way to route end users
  • Health checks
  • It supports hybrid architectures.
  • It's not possible to extend Route 53 to on-premises instances.
  • You pay for hosted zones, queries, traffic flow, health checks, and domain names.
  • Policies:
    • Weighted routing policy [1] is used to route traffic to multiple resources (associated with a single domain/subdomain) and to choose how much traffic is routed to each resource (traffic is split based on the assigned weights). It can be used, e.g., for load balancing purposes. Assigning 0 to a record stops the traffic to that resource; assigning 0 to all records makes all records return equally. See the sketch after this list.
    • Simple Routing Policy routes the traffic to a single resource. It allows one record with multiple IPs; if the record has multiple values, Route 53 returns all values to the user in random order.
    • Geolocation Routing Policy: choose where the traffic will be sent based on the geographic location of the users (which DNS queries originate). Also, can restrict distribution of content.
    • Geoproximity Routing: based on geographic location of the resources, and can choose to route more traffic to a given resource.
    • Latency Routing Policy: based on the lowest network latency for the end user (which regions will give them the fastest response time)
    • Failover routing policy: a primary/standby configuration that sends all traffic to the primary until it fails a health check, then sends traffic to the secondary. This solution is not designed for lowest latency. It is used when you want an active/passive setup. The primary record must be associated with a health check.
  • Health Checks: Only for public resources
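
A minimal weighted-routing sketch splitting traffic 70/30; the hosted zone ID, record name, and IPs are hypothetical:

```python
import boto3

route53 = boto3.client("route53")

def upsert_weighted(identifier: str, ip: str, weight: int) -> dict:
    # Each weighted record shares the same name/type but has its own
    # SetIdentifier and Weight; traffic is split proportionally.
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com",
            "Type": "A",
            "SetIdentifier": identifier,
            "Weight": weight,   # weight 0 stops traffic to this record
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="Z0123456789ABCDEFGHIJ",   # hypothetical hosted zone
    ChangeBatch={"Changes": [
        upsert_weighted("blue", "203.0.113.10", 70),
        upsert_weighted("green", "203.0.113.20", 30),
    ]},
)
```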

ELB (Elastic Load Balancer): distribute the traffic across healthy instances with targets. [1][2][3]

  • It can span multiple AZs but works within a single Region
  • Internal (private) or external (public)
  • Servers that handle the traffic and distribute it across targets, e.g., EC2 instances, containers, and IP addresses.
  • It has only one point of access (DNS).
  • Benefits: High availability across zones, automatic scaling and Fault Tolerance.
  • AWS responsibilities: guarantees it will work, upgrade, maintenance, high availability
  • Types:
    • ALB (Application Load Balancer / Intelligent LB): HTTP/S; static DNS name (URL); Layer 7; a single point of contact for clients. ALBs allow you to route traffic based on the contents of the requests. Distributes incoming application traffic across multiple targets in multiple AZs. Good for microservices and container-based applications
    • NLB (Network Load Balancer / Performance LB): high performance/low latency (TCP/UDP); static IP through Elastic IP; Layer 4 (connection level). It distributes traffic. When an NLB has only unhealthy registered targets, it routes requests to all registered targets, known as fail-open mode.
    • GLB (Gateway Load Balancer / Inline Virtual Appliance LB): routes traffic to firewalls managed in EC2 instances (Layer 3)
    • Classic: Layer 4 and 7; mostly used for test or dev. A 504 error means the gateway has timed out
  • Sticky Session: redirect to the same instance. Use cookies (LB or Application). Stickiness allows the load balancer to bind a user's session to a specific target within the target group. The stickiness type differs based on the type of cookie used. Can't be turned on if Cross-zone load balancing is off.
  • Shared responsibility: AWS is responsible for keeping it working, upgrading and maintaining it, and provides only a few configuration options.

Performance

AWS Global Accelerator [1][2]

  • Network service that send users' traffic through AWS's global network infrastructure via accelerators.
  • Improve global application availability and performance
  • Can help deal with IP caching issues by providing static IPs
  • By default, GA provides two static Anycast IP addresses
  • Good when you use static IPs and need deterministic, fast regional failover.
  • For TCP and UDP (Major difference from CloudFront)
  • Optimize the route to endpoints
  • Uses Edge Locations to route the traffic
  • No caching; it proxies packets at the edge
  • Integration with Shield for DDoS protection
  • Target: EC2 instances or ALB
  • Both Route 53 and Global Accelerator can create weights for application endpoints

Amazon CloudFront [1][2]:

  • Global (and fast) Content Delivery Network (CDN)
  • It works with AWS and on-site architecture
  • It can block countries, but the best place to do it is WAF
  • Replicate part of your application to AWS Edge Locations (content is served at the edge)
  • Edge location: location to cache the content
  • It can use cache at the edge to reduce latency. Improves read performance
  • It's possible to force the expiration of content or use TTL
  • Security: defaults to HTTPS connections and can add custom SSL certificates; DDoS protection, integration with Shield, Firewall
  • Custom origins: ALB, EC2 instances, S3 websites
  • S3 bucket: distribute files and cache them at the edge; security with OAC (Origin Access Control)
  • Can be integrated with CloudTrail
  • Great for static content that must be available everywhere; in contrast, S3 Cross-Region Replication is great for dynamic content that needs to be available at low latency in a few regions
  • Pricing: traffic distribution; requests; data transfer out. Prices differ per region



S3

There are three categories of storage services:

  • File storage: stores files in a hierarchy
  • Block storage: stores data in fixed-size blocks; when data changes, only the affected block is updated
  • Object storage: stores data as objects; when an object changes, the entire object is rewritten

S3 [1][2] is the AWS object storage service. It allows you to store and retrieve data from anywhere at a low cost; basically for static files. Files are stored in buckets and are highly available and highly durable. Data is stored redundantly across multiple AZs. The bucket name must be globally unique even though the bucket lives at the regional level. S3 is designed for frequent access.

As characteristics, S3 offers different storage classes for different use cases (tiered storage); it has lifecycle management to automatically move objects between storage tiers or delete them through rules, to be more cost effective; and it uses versioning to retrieve old objects. It also has strong read-after-write consistency.

  • Object store and global file system.
  • Used to store any files up to 5TB, without limits, in buckets (directories/containers)
  • These objects have a key.
  • You can have versions of the objects (bucket level)
  • Virtually unlimited amount of online highly durable object storage.
  • Each bucket is inside of a region
  • Write-once-read-many (WORM) - prevention of deletion or overwritten
  • Use cases: backup, disaster recovery, archive, application hosting, media hosting, Software delivery, static website
  • Versioning: stores all versions of an object
  • Performance: S3 Transfer Acceleration accelerates global uploads & downloads into Amazon S3 by increasing transfer speed via Edge Locations (enables fast, easy, and secure transfers of files over long distances between your client and your Amazon S3 bucket)
  • Move Objects
  • Shared Responsibility
    • AWS: Infrastructure (global security, durability, availability), configuration and vulnerability analysis, compliance validation, ensuring AWS employees cannot access customer data, separation between customers
    • Customer: versioning, bucket policies, replication, logging and monitoring, storage classes, data encryption, IAM users and roles
  • Pricing: depends on the storage class, storage quantity, number of requests, transition requests, and data transfer; there is also a Requester Pays option

Security

  • Access Control Lists (ACLs): account-level control. They define which accounts/groups can access the resource and the type of access. They can be attached to an object or a bucket.
  • Bucket Policies: account-level and user-level control. They define who and what is allowed or denied; they allow cross-account access
  • IAM Policy: It is user level control. A policy attached to a user can give permission to access S3 bucket
  • IAM Role for EC2 instance can allow EC2 instance access S3 bucket
  • Access Point
  • Encrypting S3 Object:
    • Types:
      • in transit: SSL/TLS; HTTPS (a bucket policy can force encryption)
      • at rest (server-side): SSE-S3 (AES 256-bit -> default -> all objects); SSE-KMS (advantages: user control + audit of key usage via CloudTrail); SSE-C: customer-provided keys (S3 does NOT store the encryption key you provide; must use HTTPS; the key must be provided in the header). Default encryption on a bucket encrypts all new objects stored in the bucket
      • at rest (client-side): encrypt before uploading the file
    • Enforcing server-side encryption by adding a parameter to the PUT request header (see the sketch after this list):
      • x-amz-server-side-encryption: AES256
      • x-amz-server-side-encryption: aws:kms
      • PS: you can create a bucket policy that denies any S3 PUT request without this parameter
    • CORS: needs to be enabled
    • MFA Delete: required to delete an object version or suspend versioning. Only the owner (root) can enable it.
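
A minimal upload sketch requesting server-side encryption explicitly (the bucket name is hypothetical); boto3 sends the x-amz-server-side-encryption header shown above:

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: S3-managed AES 256-bit keys (header value AES256).
s3.put_object(
    Bucket="example-bucket",   # hypothetical bucket
    Key="reports/2024/summary.csv",
    Body=b"col1,col2\n1,2\n",
    ServerSideEncryption="AES256",
)

# SSE-KMS: KMS-managed key, with key usage auditable in CloudTrail.
s3.put_object(
    Bucket="example-bucket",
    Key="reports/2024/summary-kms.csv",
    Body=b"col1,col2\n1,2\n",
    ServerSideEncryption="aws:kms",
)
```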

Buckets are private by default. To secure a bucket while serving public content, you must explicitly allow public access to the bucket and its objects. This involves Object ACLs (individual object level) and Bucket Policies (entire bucket level).

S3 is a good option for hosting a static website. It scales automatically. For that, the bucket access should be public and you can add a policy allowing read permission on the objects. Other use cases: backup and storage, disaster recovery, archive, hybrid cloud storage, data lakes and big data analytics.

Lock

  • S3 Object Lock: WriteOnce, ReadMany (WORM)
    • Governance Mode: overwrite or delete an object version only with special permissions
    • Compliance Mode: even root cannot overwrite or delete protected object version for the duration of the retention period.
    • Legal hold can be placed and removed by user with s3:PutObjectLegalHold permission
  • Glacier Vault Lock: WORM. Deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy. Once locked, the policy cannot be changed (e.g., it can permanently deny deletions).

Optimizing S3 Performance:

  • Use prefixes (folders inside the bucket) to increase performance
  • Using KMS impacts performance because S3 must call GenerateDataKey when uploading a file and call Decrypt in the KMS API when downloading a file. Prefer SSE-S3 when possible.
  • Multipart uploads can increase performance. They are recommended for files over 100MB and required for files over 5GB. Performance can be improved by parallelizing uploads (see the sketch after this list).
  • S3 Byte-Range Fetches: parallelize downloads by specifying byte ranges
  • Transfer speed can be increased by transferring data through an edge location
  • Transfer Acceleration (CloudFront)
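
A minimal parallelized multipart upload sketch with boto3's transfer manager; the bucket and file names are hypothetical:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart uploads for objects over 100MB (required over 5GB),
# uploading several parts in parallel.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # switch to multipart at 100MB
    multipart_chunksize=16 * 1024 * 1024,   # 16MB parts
    max_concurrency=10,                     # parallelize part uploads
)

s3.upload_file(
    Filename="backup.tar.gz",               # hypothetical local file
    Bucket="example-bucket",
    Key="backups/backup.tar.gz",
    Config=config,
)
```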

Backup

  • Versioning: all versions are stored; good for backup; once enabled it cannot be disabled (only suspended); it can be integrated with lifecycle rules and supports MFA. For a static webpage, the latest version is served, not the previous ones. Lifecycle Management rules can transition versions through tiers.
  • Replication [1][2]:
    • Replicates objects from one bucket to another; versioning must be enabled on both sides
    • Deleted objects are not replicated by default
    • Cross-Region Replication (CRR): compliance, lower latency access
    • Same-Region Replication (SRR): log aggregation, live replication
    • Copying is asynchronous
    • Only new objects are replicated
  • The S3 sync command can be used to copy objects between buckets; you specify the source and target buckets.
  • Amazon S3 Batch Replication provides you a way to replicate objects that existed before a replication configuration was in place, objects that have previously been replicated, and objects that have failed replication. This is done through the use of a Batch Operations job.

Classes

  • Standard: High Availability and durability; Designed for Frequent Access; Suitable for Most workflows; low latency and high throughput. Ex: Big Data analytics, mobile, gaming, content distribution.
  • Standard-IA (Infrequent Access): rapid access; you pay to access the data; better for long-term storage, backups, and disaster recovery. Compared with Glacier, it is the best option when infrequently accessed data must be retrieved immediately.
  • One Zone-Infrequent Access: data stored redundantly inside a single AZ; Costs 20% less than Standard-IA (lower cost); Better to long-lived, IA, non-critical data. Ex: secondary backup copies of on-premises data.
  • Intelligent-Tiering: Good to optimize cost. It is used to move the data into classes in a cost-efficient way if you don't know what is frequent or not. No performance impact or operational overhead. More expensive than Standard-IA
  • Glacier: archive data; pay per access; cheap storage. You can filter server-side and retrieve less data with SQL
    • Glacier Instant Retrieval (minimum duration - 90 days)
    • Glacier Flexible Retrieval -> when immediate access is not needed; retrieve large sets of data at no cost (backup, disaster recovery) (minimum duration - 90 days)
    • Glacier Deep Archive -> cheapest -> data sets kept for 7-10 years (12h retrieval) (minimum duration - 180 days)

Last notes

  • S3 Event Notifications: enable you to receive notifications when certain events happen in your bucket. To enable notifications, first add a notification configuration identifying the events you want Amazon S3 to publish and the destinations where the notifications should be sent. A notification can be set up, e.g., to notify you when objects are restored from Glacier to S3. Destinations: SNS, SQS, Lambda, EventBridge (archive, replay events, reliable delivery).
  • S3 Storage Lens: tool to analyze S3 usage
  • S3 Access Logs: a separate bucket stores the logs. The buckets must be in the same region.
  • Pre-Signed URLs: can be created via the Console or CLI (see the sketch below).
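
Pre-signed URLs can also be generated with the SDK; a minimal boto3 sketch with a hypothetical bucket and key:

```python
import boto3

s3 = boto3.client("s3")

# Pre-signed URL: grants temporary access to a private object
# using the caller's credentials; expires after one hour here.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-bucket", "Key": "reports/2024/summary.csv"},
    ExpiresIn=3600,
)
print(url)
```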




Loosely Decouple

The AWS recommendation for architecture is loose coupling. It can be achieved with an ELB and multiple instances. However, in some scenarios an ELB may not be available, and other resources can be used to achieve the same goal. Here are some services that go in this direction.

SQS (cloud native service) [1][2]:

  • Fully managed message queuing service used to decouple and scale microservices, distributed systems, and serverless applications.
  • Retention of messages: default is 4 days, maximum 14 days; the minimum can be customized (e.g., 1 minute).
  • Deletion: messages are deleted after the retention period expires or after being read and explicitly deleted (see the producer/consumer sketch after this list).
  • Pay-as-you-go pricing.
  • Asynchronous processing.
  • Settings: Delivery delay (0 up to 15 minutes)
  • Message size: up to 256KB of text
  • FIFO queues require a Message Group ID and a Message Deduplication ID
  • AWS recommend using separate queues when you need to provide prioritization of work
  • Security
    • Encryption: messages are encrypted in transit (HTTPS) by default but not at rest (possible using KMS)
    • Access Control with IAM policy - SQS API
    • SQS Access policy with resource policy - useful for cross-account or other services
  • Strategy:
    • FIFO: guaranteed ordering; no message duplication; messages with the same message group ID are processed in order; 300 transactions per second (3,000 with batching). FIFO high throughput mode processes up to 9,000 transactions per second per API action without batching, and up to 90,000 messages per second using batching APIs.
    • Standard: better performance; best-effort ordering (strict ordering must be implemented at the application level); messages can occasionally be duplicated; unlimited throughput
  • For real-time scenarios, use Kinesis instead of SQS
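
A minimal boto3 sketch of the produce/consume cycle described above (the queue URL is hypothetical); note that the consumer must delete each message explicitly after processing:

```python
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # hypothetical

# Producer: send a message (up to 256 KB of text).
sqs.send_message(QueueUrl=queue_url, MessageBody="order-created:42")

# Consumer: long-poll for messages, process them, then delete explicitly;
# an unprocessed message stays until the retention period expires.
resp = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
)
for msg in resp.get("Messages", []):
    print(msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```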

DLQ (Dead-Letter Queues):

  • Target for messages that cannot be processed successfully.
  • It works with SQS and SNS.
  • Useful for debugging applications and messaging systems.
  • Redrive capability allows moving the message back into the source queue.
  • It is itself an SQS queue; a FIFO source queue requires a FIFO DLQ.
  • Benefits: alarms can be set on it
  • Like SQS in general, it is a public service: endpoints are publicly reachable by default (access is controlled via IAM)

SNS (cloud native service) [1][2] :

  • Push-based messaging service.
  • Delivers messages to the endpoints that are subscribed.
  • Fully managed message for application-to-application (A2A) and application-to-person (A2P).
  • Cycle: Publisher -> SNS topic / Subscriber -> get all messages from the topic
  • Subscribers: Kinesis, SQS, Lambda, emails, SMS and endpoints
  • Size: 256KB of text
  • Extended Library allows sending messages up to 2GB. The payload is stored in S3 and then SNS publishes a reference to the object.
  • DLQ Support
  • FIFO or Standard
  • Security
    • Encryption in transit by default (HTTPS) and can add at rest via KMS
    • Access Policies: a resource policy can be attached; useful for cross-account access
  • Fan-out (SNS+SQS) [1]: messages published to an SNS topic are replicated to multiple subscriptions (1:N)
  • Message Filtering: send to specific subscribers only
  • Public service: endpoints are publicly reachable by default (access is controlled via IAM)
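
A minimal boto3 sketch of publishing to a topic (the ARN and attribute values are hypothetical); the message attribute is what subscription filter policies match against:

```python
import boto3

sns = boto3.client("sns")

# One publish call; SNS pushes the message to every subscription (fan-out).
sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:orders",  # hypothetical
    Message="order 42 shipped",
    MessageAttributes={
        "event_type": {"DataType": "String", "StringValue": "shipped"}
    },
)
```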

Amazon MQ [1]:

  • Message broker
  • Good for migrating existing message brokers to the cloud
  • Supports ActiveMQ or RabbitMQ
  • Topics and queues
  • 1:1 and 1:N message design
  • Amazon MQ requires a private network (VPC, Direct Connect, or VPN)
  • Does not scale as well as SNS or SQS
  • Runs on servers and can run in multiple AZs with failover

API Gateway [1][2][3]:

  • Fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale
  • RESTful and WebSocket APIs
  • Front door of the application.
  • Integrate with lambda, HTTP, endpoints, etc
  • Features:
    • Security - protect endpoints by attaching a WAF (Web Application Firewall);
    • Stop abuse - users can easily implement DDoS protection and rate limiting to curb abuse of their endpoints;
    • Ease of Use
  • Endpoint Types:
    • Edge-optimized - (default option): API requests get sent through a CloudFront edge location (best for global users);
    • Regional - ability to also leverage with CloudFront; reduce latency; can be protected with WAF
    • Private - only accessible via VPCs using interface VPC endpoint or Direct Connect
  • Securing APIs: user authentication (IAM, roles, Cognito, custom authorizer); ACM certs for edge-optimized endpoints and regional endpoints; WAF

AWS Batch

  • Run batch computing workloads within AWS (EC2 or ECS/Fargate). Fargate is recommended for jobs that require fast start times (<30 sec), 16 vCPUs or less, no GPUs, and up to 120 GiB of memory; EC2 suits jobs that need more control, require GPUs or custom AMIs, high levels of concurrency, or access to Linux parameters
  • Simple: automatically provisions and scales; no installation is required
  • AWS provisions the compute and memory. The customer only needs to submit or schedule the batch job.
  • Components: Jobs; Job definitions (how jobs will run); Job queues; Compute environments
  • Batch vs Lambda
    • Time Limits: lambda has 15 minutes execution time limit; batch does not have this
    • Disk space: Lambda has limited disk space, and EFS requires functions to live within a VPC
    • Runtime limitations: Lambda is fully serverless but supports limited runtimes; Batch can use any runtime because it uses Docker

Step Function [1]

  • Coordinate distributed apps
  • Orchestration service; graphical console;
  • Main components: state machine (a workflow with different event-driven steps) and tasks (specific states within a workflow that represent a single unit of work). A state is every single step within a workflow
  • Execution: an instance where you run your workflow in order to perform your tasks
  • Types:
    • Standard: exactly-once execution; can run for up to one year; useful for long-running workflows; auditable history; up to 2,000 executions per second; pricing based on state transitions
    • Express: at-least-once execution; can run for up to five minutes; useful for high-event-rate workloads, e.g., IoT data streaming; pricing based on number of executions, duration, and memory

AppFlow [1]:

  • Integration service for exchanging data between SaaS apps and AWS services;
  • Pulls data records from third party SaaS vendors and stores them in S3;
  • Bi-directional data transfer
  • Transfer up to 100 gibibytes per flow, and this avoids the Lambda function timeouts
  • Data mapping (how source data is stored);
  • Filter (controls which data is transferred);
  • Trigger (how the flow is started)
  • Use cases: Salesforce records to Redshift; analyzing conversations in S3; migration to Snowflake




Serverless Applications

Lambda [1][2]:

  • FaaS
  • Example of integration: ELB, API Gateway, Kinesis, DynamoDB, S3, CloudFront, CloudWatch, SNS, SQS, Cognito
  • Virtual functions
  • By default, Lambda function is launched outside of VPC, but it can be done in VPC
  • Synchronous: CLI, SDK, API Gateway
  • Asynchronous: S3, SNS, CloudWatch, etc
  • Run on-demand
  • Scaling automatically
  • Event-driven
  • Lambda needs IAM role to access AWS APIs [3]
  • Can be monitoring through CloudWatch
  • Pricing: pay per call (request) and duration (execution time). Free tier of 1,000,000 requests and 400,000 GB-seconds of compute per month. After that, pay per request.
  • Compute:
    • 1,000 concurrent executions
    • Short-lived execution (up to 900 seconds / 15 minutes). If more time is necessary, use ECS, Batch, or EC2
  • Storage:
    • 512MB-10GB disk storage (integration with EFS);
    • 4KB for all environment variables;
    • 128MB-10GB memory allocation
    • Easily set memory - up to 10,240 MB, and CPU scales proportionally with memory
  • Deployment and configuration
    • Compressed deployment package <= 50MB
    • Uncompressed package <= 250MB
    • Request and response payload size up to 6 MB
    • Streamed responses up to 20MB
  • Lambda@Edge: function attached to CloudFront to run close the user and minimize latency
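
A minimal handler sketch (the event shape shown is the standard one for an S3 trigger, the response shape is for API Gateway; the bucket contents are hypothetical):

```python
import json

def lambda_handler(event, context):
    # Asynchronous S3 trigger: each record carries the bucket and key
    # of the object that caused the invocation.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"new object: s3://{bucket}/{key}")
    # A synchronous API Gateway invocation expects a response like this:
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```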

    Serverless Application Repository

    • Allows users to find, deploy, and publish their own serverless applications
    • Privately share applications
    • Deeply integrated with Lambda service
    • Options:
      • Publish: makes apps available for others to find and deploy; SAM templates help define apps; private by default
      • Deploy: find and deploy published apps; browse public apps without an AWS account; browse within the Lambda console

    Aurora Serverless

    • On-Demand and Auto Scaling for Aurora database
    • Automation of monitoring workloads and adjusting capacity for database
    • Pricing: charged for resources consumed by DB cluster
    • Concepts: Aurora Capacity Units (ACUs - the unit in which the cluster scales); capacity allocated from AWS-managed warm pools; each ACU provides 2 GiB of memory with matching CPU and networking capability; resiliency (six copies of data across three AZs)
    • Use cases: variable workloads; multi-tenant apps (the service manages capacity for each app); new apps; dev and test of new features; mixed-use apps; capacity planning

    GraphQL: (AWS AppSync): robust, scalable GraphQL Interface for application developers; combines data from multiple sources; enable integration for developers via GraphQL (data language used by apps to fetch data from servers)

    Other serverless services

    • SWF [1]
    • AWS AppSync: store and sync data between mobile and web app



    Containers

    Here are serverless services for containers [1].

    Orchestrations

    • ECS (Elastic Container Service) [2]
      • ECS Launch Types: EC2 and Fargate
      • Components:
        • Cluster: logical grouping of tasks or services. It can have ECS Container instances in different AZ
        • Task: a running Docker container
        • Container instance: EC2 instance running the ECS agent
        • Service: defines long-running tasks. It can control the number of tasks with Auto Scaling and attach an ELB to distribute traffic across containers
      • Why: ECS manages many containers; ECS places the containers and keeps them online; containers are registered with the LB; containers can have IAM roles attached to them; easy to set up and scale
      • Best when you are all-in on AWS
      • ECS Pricing
      • Shared responsibility
        • AWS starts and stops the containers
        • The customer has to provision and maintain the infrastructure (EC2 instances, when using the EC2 launch type).
      • ECS anywhere:
        • On-premises; no orchestration; completely managed; inbound traffic has to be managed separately (no ELB support)
        • Requirements: SSM Agent, ECS Agent, Docker; register external instances; installation script within the ECS console; execute scripts on the on-premises VMs; deploy containers using the EXTERNAL launch type
    • EKS (Amazon Elastic Kubernetes Service) [3]:
      • Manage Kubernetes clusters on AWS
      • Can be used on-premises and in the cloud
      • Best when you are not all-in on AWS
      • More work to configure and integrate with AWS
      • EKS-D (EKS Distro): self-managed Kubernetes deployment maintained by the developer
      • EKS Anywhere:
        • On-premises EKS based on EKS Distro (deployment, usage, and management of clusters); full lifecycle management
        • Concepts: control plane, location, updates (manual CLI), Curated Packages, Enterprise Subscription

    Fargate [1]

    • Serverless compute engine for Docker container
    • AWS manage the infrastructure
    • Works with ECS and EKS
    • Benefits: no OS access; pay based on resources allocated and time run (pay for vCPU and memory allocated - pricing model); short-running tasks; isolated environment per container; capable of mounting EFS file systems for persistent, shared storage. In some use cases it can be an advantage compared with EC2.
    • Compared with Lambda, select Fargate when the workload is more consistent (predictable). Fargate also allows Docker use across the organization and gives developers some control. On the other hand, Lambda is better for unpredictable or inconsistent workloads and is good for a single function.
    • It is for containers and applications that need to run longer
    • Shared responsibility
      • AWS: automatically provisions resources and runs the containers for the customer.
      • The customer does not need to provision the infrastructure.

    ECR - Elastic Container Registry

    • Managed container image registry
    • Secure, scalable, reliable infrastructure
    • Private container image repository
    • Components: Registry (private); Authorization token (to push and pull images to and from registries); Repository; Images
    • Secure: permission via IAM; repository policy
    • Cross-Region; Cross-account; configured per repository and per region
    • Stores customer images to be run by ECS or Fargate
    • Integration with customized container infrastructure, ECS, EKS, locally (Linux - for development purposes)
    • Use rules to expire and remove unused images
    • Scan-on-push can identify vulnerabilities in images



    AWS database services

    It's possible to install a database on an EC2 instance. This can be necessary when full control over the instance and database is needed, or when using a third-party database engine [1][2].

    RDS - Amazon Relational Database Service[1][2][3].

    • Uses EC2 instances
    • Benefits of deploying a database on RDS instead of EC2: hardware provisioning, database setup, automated backups, and software patching. It reduces database administration tasks; there is no need to manage the OS. Note that you can only create read replicas of databases running on RDS.
    • RDS types for it: SQL Server, Oracle, MySQL, PostgreSQL, MariaDB
    • It's possible to encrypt RDS instances and snapshots using AWS Key Management Service (KMS)
    • Scales up by increasing instance size (compute and storage)
    • Read replicas are read-only; they improve database scalability.
    • Read Replica: a read-only copy of the primary database. It can be cross-AZ and cross-Region. Not used for disaster recovery, only for performance. It requires automated backups to be enabled.
    • Multi-AZ: when a Multi-AZ DB instance is provisioned, RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). RDS will automatically fail over to the standby copy.
    • It can use Auto scaling to add replicas
    • Serverless option available
    • You can't use SSH to access instances.
    • It is suited for OLTP workloads (real-time)
    • Security through IAM, Security Groups, KMS, SSL in transit
    • Support for IAM Authentication (IAM roles)
    • RDS Proxy: allows apps to pool and share DB connections, improving database efficiency and reducing failover time
    • Shared Responsibility
      • AWS: manages the underlying EC2 instance and disables SSH access; automated DB and OS patching; guarantees the hardware
      • Customer: check ports, IPs, and security group inbound rules; users and permissions for the database; create the database (public/private access); configure the DB to only allow SSL connections; database encryption settings; creating the schema (tables, indexes, etc.); schema optimization.
    • Pricing: depends on clock hours of server uptime; database characteristics (size, memory); database purchase type (on-demand, reserved instances); number of database instances; provisioned storage; additional storage; requests; deployment type; data transfer (OUT); Reserved Instances.

    Aurora[1][2][3][4]

    • Fully managed relational database from AWS.
    • Compatible with MySQL and PostgreSQL
    • Storage: data is stored in 6 replicas across 3 AZs
    • Compute: a cluster of DB instances across multiple AZs with auto scaling of read replicas; storage grows automatically up to 128 TB; automated backups enabled
    • Use case: unpredictable and intermittent workloads; no capacity planning
    • Aurora Global: up to 16 DB read instances in each region
    • Perform Machine Learning

    Amazon ElastiCache[1][2]

    • Managed Memcached
    • Managed Redis
    • Can be used in front of any database, but works best with RDS
    • Service that adds caching layers on top of your databases
    • In-Memory databases with high performance and low latency (under a millisecond)
    • Support for clustering (Redis) and Multi AZ
    • Security through IAM, Security Groups, KMS, Redis Auth
    • Shared Responsibility: AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backups

    Amazon DynamoDB[1][2][3]

    • NoSQL database
    • Key/value store (Tables, Items [maximum size 400KB] and Attributes)
    • Not generally suited to storing documents or images
    • Highly available with replication across 3 AZ.
    • Stored on SSD storage
    • High performance: reads and writes for online transaction processing (OLTP) workloads
    • Low latency retrieval
    • Eventually consistent reads (default - better performance) or Strongly consistent reads
    • Multi-Region replication; Active-Active with cross-Region support.
    • Distributed serverless database
    • Integrated with IAM for security, authorization and administration
    • Low cost and auto scaling
    • Horizontal Scaling
    • Standard and IA (Infrequent Access) Table class
    • Cache: DAX (DynamoDB Accelerator)
      • A fully managed in-memory cache that improves performance; highly scalable and available. Used only with DynamoDB
      • Lives inside the VPC
    • Pricing: throughput; indexed data storage; data transfer; global tables; reserved capacity; on-demand capacity mode; provisioned capacity mode
    • Security: Encryption at rest using KMS; Site-to-Site VPN, Direct Connect (DX), IAM policies and roles; Integrate with CloudWatch and CloudTrail; VPC endpoints to communicate directly with DynamoDB
    • ACID with DynamoDB: DynamoDB transactions work across one or more tables within a single AWS account and Region. Used when the application needs coordination. This feature needs to be enabled.
    • Backup: on-demand full backups at any time; no performance impact; same Region as the source table
    • Recovery: Point-in-Time Recovery (PITR): protects against accidental writes or deletes; restore to any point in the last 35 days; incremental; not enabled by default; the latest restorable point is up to 5 minutes in the past
    • Considering point-in-time recovery (PITR, continuous backup) for DynamoDB, the customer is responsible for configuring (turning on) the feature and AWS is responsible for the backup. An Amazon RDS database instance can be restored to a specific point in time with a granularity of 5 minutes
    • Streams: time-ordered sequence of item-level changes in a table. Stored for 24 hours
    • Global Table: managed multi-master, multi-region replication: globally distributed applications; based on DynamoDB streams; replication latency under 1 second
    • Time to Live (TTL): define when an item expire and can be automatically deleted
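
    A minimal boto3 sketch of the key/value model (the table name and attributes are hypothetical, assuming a partition key named "user_id"):

```python
import boto3

table = boto3.resource("dynamodb").Table("users")  # hypothetical table

# Write an item (a key/value document, up to 400 KB per item).
table.put_item(Item={"user_id": "u-42", "name": "Ana", "plan": "free"})

# Reads are eventually consistent by default; ConsistentRead=True asks
# for a strongly consistent read at some cost in performance.
resp = table.get_item(Key={"user_id": "u-42"}, ConsistentRead=True)
print(resp.get("Item"))
```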

    DocumentDB[1][2]: implementation of MongoDB. It is a fully managed service; storage scales automatically up to 64 TB; highly available, replicating six copies of the data across 3 AZs. Used to migrate MongoDB to the cloud. Backup to S3. Ex: user profiles.

    QLDB (Quantum Ledger Database): fully managed ledger database; no decentralization component; immutable, cryptographically verifiable database (NoSQL). Ex: review a complete history of all changes.

    Managed Blockchain: create and manage blockchain networks with open-source frameworks

    Amazon Keyspaces: runs Apache Cassandra workloads. A distributed NoSQL database. The main application is big data, but it can also be used for backends. Fully managed; pay for the resources used.

    Neptune[1][2]: Fully managed graph database. Good to app with highly connected datasets, as fraud detection and knowledge graphs. Used for analysis, build connections between identities, build knowledge, detect fraud patterns, security.

    Timestream: time series database service for IoT and operational applications. For analysis.




    Big Data

    Characteristics of big data

    • Volume: Ranges from terabytes to petabytes of data.
    • Variety: different sources and formats
    • Velocity: data must be collected, processed, and analyzed within a short period of time

    Amazon Redshift[1][2][3][4]

    • Based on PostgreSQL (but not used for OLTP)
    • Relational database for analytics purposes
    • OLAP - online analytical processing (analytics and data warehousing)
    • Parallel Query
    • Run SQL against data warehouse
    • Redshift Spectrum run queries against Amazon S3 without loading the data from Amazon S3 into data warehousing solution. Massive parallelism
    • Size: up to 16PB of data
    • Pricing: Pay as you go
    • BI tools: Amazon QuickSight or Tableau
    • High Availability: supports Multi-AZ deployments
    • Snapshots: incremental and point-in-time; always stored in S3
    • Performance: always favor large batch inserts

    Amazon EMR (Elastic MapReduce)[1][2]

    • Helps with ETL processing
    • EMR is made up of EC2 instances
    • Managed big data platform
    • Storage:
      • Hadoop Distributed File System (HDFS) - distributes stored data across instances. Used for caching results
      • EMR File System (EMRFS) - extends Hadoop to access data in S3, which stores input and output data.
      • Local file system - locally attached disks created with each EC2 instance (instance store volumes)
    • EMR has cluster and Nodes
    • Helps to create Hadoop clusters (Big Data)
    • Take care of all the provisioning and configuration
    • Auto Scaling
    • Purchasing options and cluster types: on-demand, reserved (minimum 1 year), Spot (cheapest option); long-running or transient clusters
    • Ex: machine learning and big data

    Kinesis [1][2][3][4]: a message broker for real time; a kind of big data pathway connected to an AWS account. It ingests, processes, and analyzes real-time streaming data.

    • Amazon Kinesis Data Streams (KDS) [1] - For real-time for ingesting data. It can be used to continuously collect data. The developer is responsible for creating the consumer and scaling the stream. It does not automatically scale
    • Data Firehose: data transfer tool to get information to S3, Redshift, Elasticsearch, or Splunk. Near real time (60s). It is plug-and-play with AWS architecture. It scales automatically
    • Kinesis Data Analytics and SQL: easy, serverless, and you pay only for the resources consumed. The easiest way to process data going through Kinesis using SQL. It analyzes the data after it receives it
    • It is mostly used for big data, but in scenarios that require real-time data it is a better fit than SQS.
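
    On the producer side, writing a record to a stream is a single call; a minimal boto3 sketch (stream name and payload are hypothetical):

```python
import json

import boto3

kinesis = boto3.client("kinesis")

# The partition key determines which shard receives the record;
# consumers (or Firehose) read from the shards on the other side.
kinesis.put_record(
    StreamName="clickstream",  # hypothetical stream
    Data=json.dumps({"page": "/home", "user": "u-42"}).encode(),
    PartitionKey="u-42",
)
```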

    Athena [1][2]:

    • Analyze data in S3 using SQL;
    • it is serverless (no infrastructure to manage);
    • Pricing: you pay only for the queries that you run. Ex: BI, analytics, reporting
    • Use case: querying logs -> serverless solution; the only service that allows you to directly query data that's stored in S3
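
    A minimal boto3 sketch of running a query (database, table, and output location are hypothetical); Athena executes asynchronously and writes results to S3:

```python
import boto3

athena = boto3.client("athena")

# Start a SQL query against data already sitting in S3.
q = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    QueryExecutionContext={"Database": "weblogs"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)

# Poll the execution state before fetching results.
state = athena.get_query_execution(QueryExecutionId=q["QueryExecutionId"])
print(state["QueryExecution"]["Status"]["State"])
```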

    Amazon Glue[1][2]

    • Serverless data integration
    • Discover, prepare, and combine data for analytics
    • ETL: extract, transform, and load service
    • The AWS Glue Data Catalog is a central repository to store structural and operational metadata for all your data assets.
    • Works well with Athena: Athena can work by itself, and Glue can provide the schema for the data
    • It's possible to specify the number of DPUs (data processing unit) for an ETL job. A Glue ETL job must have a minimum of 2 DPUs. AWS Glue allocates 10 DPUs to each ETL job by default

    QuickSight[1]:

    • Primarily used for analyzing log files and documents, especially within an ETL process
    • scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service
    • Data visualization service - dashboards
    • Features: visualization, ad-hoc data analytics; integrates with RDS, Aurora, Athena, S3...
    • Robust in-memory engine
    • Column-level security (CLS)
    • Pricing per-session and use-based

    AWS Data Pipeline

    • It is a managed ETL (Extract, Transform, Load ) service within AWS
    • Implement automated workflows for movement and transformation of data between different compute and storage services
    • Data sources can be on-premises
    • Define data-driven workflows
    • Define parameters for data transformation
    • Highly available and fault tolerant
    • Handling Failures - automatically retries failed activities
    • Integrate with DynamoDB, RDS, Redshift and S3
    • Works with EC2 and EMR
    • Components: pipeline definition (business logic), managed compute (creates EC2 instances), task runners, data nodes (location and types of data)
    • Use cases: processing data in EMR using Hadoop streaming; importing or exporting DynamoDB data; copying CSV files or data between S3 buckets; exporting RDS data to S3
    • Can use SNS for failure notifications, success notifications, and other event-driven workflows

    Managed Streaming for Apache Kafka (Amazon MSK)

    • Managed service to run data streaming applications that use Kafka
    • Control plane: creates, updates, and deletes clusters
    • Data plane: leverages Kafka data-plane operations for producing and consuming streaming data
    • Components: broker nodes (number of broker nodes per AZ); ZooKeeper nodes; producers, consumers, and topics; flexible cluster operations
    • Resiliency: automatic recovery and detection of broker failures; storage is reused to reduce data loss; impact time is limited; after recovery, the same broker IP is used again
    • Serverless
    • Security and Logging
      • Integration with KMS
      • Encryption at rest by default
      • TLS for encryption in transit between brokers in clusters
      • Deliver broker logs to CloudWatch, S3, Kinesis Data Firehose
      • Metrics are gathered and sent to CloudWatch
      • MSK API calls are logged to CloudTrail

    OpenSearch[1]

    • Successor of Elasticsearch
    • Managed analytics and visualization service
    • Quick analysis in clusters, usually part of an ETL process
    • Search any field
    • Can be used as a component to another database
    • Easily Scalable
    • Security: via Cognito and IAM, VPC SG (cluster can be deployed in a VPC), encryption at rest and in transit (KMS encryption, TLS), field-level security. Cannot use IP-based access policies.
    • Backup using snapshot
    • Create cluster (OS domain), specify the number of instances and type
    • Storage options: UltraWarm or Cold storage
    • Multi-AZ capable: up to three AZs
    • Flexible - SQL for BI apps. SQL support is not native but can be enabled via a plugin
    • Integration with CloudWatch, CloudTrail, S3, Kinesis
    • Use case - creating a logging solution involving visualization of log file analytics or BI report




    Deployment

    Amazon CloudFormation[1][2]

    • declarative way of outlining your AWS Infrastructure, for any resources
    • It uses templates to create a stack (security groups, EC2 instances, S3 buckets, ELB, etc.)
    • Infrastructure as Code (IaC)
    • Productivity: fast to destroy and re-create an infrastructure
    • Automated generation of diagrams
    • Declarative programming
    • Free to use
    • JSON/YAML
    • Stack is a regional resource
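
    A minimal sketch of the declarative idea: a tiny YAML template (stack and resource names are hypothetical) created through boto3; deleting the stack later tears down everything it created:

```python
import boto3

# Declarative template: you state what should exist, not how to build it.
template = """
Resources:
  NotesBucket:
    Type: AWS::S3::Bucket
    Properties:
      VersioningConfiguration:
        Status: Enabled
"""

cfn = boto3.client("cloudformation")
cfn.create_stack(StackName="demo-stack", TemplateBody=template)
# Later: cfn.delete_stack(StackName="demo-stack") removes the bucket too.
```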

    AWS Elastic Beanstalk[1][2]

    • Integrate with VPC and IAM
    • ZIP, WAR, Git
    • Platform as a Service (PaaS)
    • Monitoring, metrics, and health checks are all included
    • Easy-to-use service for deploying (on EC2) and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.
    • It can fully manage the EC2 instance or developer can control that
    • Shared Responsibility
      • AWS: performs the deployment strategy, OS maintenance, capacity provisioning, load balancing, auto scaling, health monitoring, and responsiveness
      • Customer: deployment strategy configuration, application code
    • Pricing: Free but you pay for the underlying instances

    AWS Systems Manager [1][2]

    • Provides an operations console and APIs for centralized application and resource management in hybrid environments
    • A hybrid service that manages EC2 and on-premises systems at scale
    • Operational insights about the state of your infrastructure.
    • Provides interactive browser-based shell and CLI experience
    • Run commands and apply patches on EC2 instance
    • Manage the OS and Database patches
    • Provide secure and auditable instance management without the need to open inbound ports, maintain bastion hosts, and manage SSH keys
    • Centralize operational data from multiple AWS services, automate tasks, create logical groups of resources
    • Track and resolve operational issues across your AWS applications and resources from a central place
    • Summary
      • Capabilities: Automation, run command, patch manager, parameter store, maintenance windows, session manager
      • Session Manager: logging (to CloudWatch and CloudTrail); requires the SSM Agent
      • Systems Manager Agent (SSM Agent): makes it possible for Systems Manager to update, manage, and configure resources where the agent is installed
      • Parameter Store: free to store config and secret values

    AWS Proton

    • It is a service that creates and manages infrastructure and deployment tooling
    • Automate IaC provisioning and deployments
    • Define standardized infrastructure
    • Use templates to define and manage app stacks
    • automatically provisions resources, configure CI/CD pipelines, and deploys code
    • Supports CloudFormation and Terraform

    AWS Amplify: develop and deploy scalable full stack web and mobile application

    AWS Device Farm: service to test web and mobile applications

    Amazon Pinpoint: marketing communication service (email, SMS, voice) - engage with customers

    -----

    AWS OpsWorks[1]: configuration management service providing managed Chef and Puppet





    Cloud Monitoring, Audit

    CloudWatch (Metrics, Logs, Alarms, Events) [1][2]

    • Performance Monitoring
    • It is a monitoring and observability service which provides metrics and insights; interactively search and analyze log data.
    • Regional source
    • Features: System Metrics, Application Metrics, Alarms
    • Alarms trigger notifications based on metrics.
    • Types of metrics: default (CPU utilization, Network throughput), custom (EC2 Memory utilization, EBS Storage Capacity)
    • CloudWatch Logs enables real-time monitoring and can store and give access to customer log files from EC2 instances, CloudTrail, etc. Centralize logs, query logs, audit, etc. It's possible to query logs to look for potential issues. For custom logs, use the CloudWatch Agent, including on premises. Features: filter patterns; CloudWatch Logs Insights (query logs with a SQL-like language); alarms. It cannot provide the status of the customer's resources. Adjustable retention.
    • Monitoring with Managed Service (Grafana, for Prometheus)
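
    Custom metrics (like the EC2 memory utilization mentioned above) are pushed with put_metric_data; a minimal boto3 sketch (namespace, dimension, and value are hypothetical):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish a custom metric; memory utilization is not a default EC2 metric,
# so an agent or script must report it like this.
cloudwatch.put_metric_data(
    Namespace="MyApp",  # hypothetical namespace
    MetricData=[{
        "MetricName": "MemoryUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        "Value": 73.5,
        "Unit": "Percent",
    }],
)
```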

    CloudTrail[2]:

    • Record API calls
    • It tracks events (history events/API calls).
    • Log, monitoring and retain account activity (Who, What, When). Track user activities and API requests and filter logs to assist with operational analysis and troubleshooting.
    • Governance, compliance, and audit for AWS accounts. It can be applied to all Regions or one. Encryption is enabled by default.
    • Enabling CloudTrail Insights allows CloudTrail to automatically detect unusual API activity in the customer account.

    AWS Config [1][2]:

    • Enables customer to assess, audit, and evaluate the configurations of their AWS resources.
    • Continuous monitoring.
    • Track all changes in the resources
    • Auditing and recording compliance of the AWS resources, and record configurations and changes
    • Allows automating the evaluation of recorded configurations
    • Per region service; can be aggregated across regions and accounts
    • Inventory management and control tool (it's not preventative)
    • Record configuration changes (configuration history)
    • Receive alerts via SNS for alerting (change and compliance notification)
    • EventBridge can send events from Config events to other AWS service
    • It can send alerts on changes, and the configuration history can be stored in S3.
    • Use case: discover the architecture in an account (query); create rules to monitor and receive alerts when those rules are violated (enforce); get the history (learn)
    • Pricing: pay per item and rule evaluation
    • Remediation: can be automatic via SSM automation document which can leverage Lambda function for custom logic

    EventBridge [1]

    • Serverless event bus
    • Pass events from a source to an endpoint
    • Build event-driven applications at scale: schedules (cron jobs), event patterns, trigger Lambda functions, send SQS/SNS messages, etc.
    • The event bus is the router that receives events and delivers them to targets.
    • Use it to trigger an action based on an event in AWS
    • Successor of CloudWatch Events
    • Fastest way to respond to things happening in the environment
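
    A minimal boto3 sketch of putting a custom event on the default bus (source, detail-type, and detail are hypothetical); a rule whose pattern matches these fields routes the event to its targets:

```python
import json

import boto3

events = boto3.client("events")

# Publish a custom application event; matching rules deliver it to
# targets such as Lambda functions or SQS queues.
events.put_events(Entries=[{
    "Source": "myapp.orders",  # hypothetical source
    "DetailType": "OrderPlaced",
    "Detail": json.dumps({"orderId": 42}),
    "EventBusName": "default",
}])
```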

    AWS X-Ray:

    • Serverless
    • Collect data to get insights about requests and responses
    • Traces - tracing headers, send trace data
    • Debugging in Production.
    • Benefits: performance, understand dependencies, review requests, find errors, identify users, trace requests across microservices/AWS services.
    • Concepts: segments, subsegments, service graph, traces, tracing headers

    AWS Health Dashboard:

    • Provides a dashboard for viewing both public AWS service events and account-specific events that affect resources within your AWS accounts
    • Provides you continuous visibility into your resource performance and availability of AWS services
    • Service history (Service Health Dashboard) for all Regions, or by your account (Account Health Dashboard).
    • It shows general status.

    AWS Personal Health Dashboard:

    • Personalized view of the status of the AWS services that are part of customer Cloud architecture.
    • Alerts are triggered by changes in the health of your AWS resources, giving event visibility and guidance to help quickly diagnose and resolve issues.
    • Customers can quickly assess the impact on their business when AWS services are experiencing issues.
    • It gives a personalized view of performance and availability of the services used by customer.

    Audit Manager: automated service for continuous auditing that produces reports for PCI compliance, GDPR, etc.

    Amazon Managed Service for Prometheus: is a serverless, Prometheus-compatible monitoring service for container metrics. It is perfect for monitoring Kubernetes clusters at scale

    AWS Managed Grafana: fully managed service for infrastructure for data visualizations (analytics and monitoring application). Features: query, correlate, and visualize operational metrics from multiple sources.




    Security

    Shared Responsibility makes the customer responsible for security IN the cloud (data, access, authentication, configuration, encryption, network traffic protection). AWS is responsible for the security OF the cloud, protecting and managing all AWS Global Infrastructure (software [compute, storage, database, networking] and hardware [Regions, AZs, edge locations]).

    Examples of attacks:

    • DDoS: Distributed Denial-of-Service
      • Attempts to make the application unavailable to end users
      • A Layer 4 DDoS attack is also known as a SYN flood. It works at the transport layer (TCP), abusing the 3-way handshake that establishes a connection (client sends a SYN packet to a server -> server replies SYN-ACK -> client responds ACK). The attack overwhelms the server with SYN packets whose handshakes are never completed, leaving it full of half-open connections.
    • Amplification attack: the attacker sends a third-party server a request using a spoofed IP address. The server responds to that spoofed IP (the victim), amplifying the traffic.
    • Layer 7 attack: the application receives a flood of GET or POST requests

    AWS Security Services and strategies:

    Logging API calls with CloudTrail: allows after-the-fact incident investigation, near real-time intrusion detection, and industry and regulatory compliance. Store the logs in S3.

    Protecting application with AWS Shield Standard

    • Protect AWS customers on ELB, CloudFront and Route 53
    • AWS Shield Standard: free for every customer; protects websites and applications against SYN/UDP floods, reflection attacks, and other layer 3 and 4 attacks.
    • AWS Shield Advanced: 24/7 protection with optional DDoS mitigation services; protects EC2, ELB, CloudFront, Global Accelerator, and Route 53 against more sophisticated attacks. Detection and mitigation for network layer (layer 3), transport layer (layer 4), and application layer (layer 7) attacks. Near real-time notifications of DDoS attacks.

    WAF [1][2] (Web Application Firewall):

    • Filtering traffic: filter specific requests based on rules - requests sent to CloudFront, ELB, API Gateway.
    • Protection on layer 7 (HTTP) DDoS attacks, SQL Injection and Cross-Site Scripting(XSS)
    • Define Web ACL (Web Access Control List - rate-based rules).
    • Protecting a website that is hosted outside of AWS (the on-premise IP is added to a target group).
    • Configuring a firewall in front of resources is a good practice to protect against DDoS
    • If CloudFront is in use and an IP must be blocked, the WAF must be attached to CloudFront (not the ALB), because the ALB only sees CloudFront addresses; for the same reason, an NACL is not enough.

    Firewall Manager

    • Centralized service to manage rules across multiple AWS accounts and applications in an AWS Organization
    • Simplifies the management of firewall rules across accounts

    GuardDuty [1]:

    • Threat detection service to protect AWS account
    • Monitor suspicious activity
    • It uses machine learning and checks logs.
    • Identify potential security issues.
    • Analyzes CloudTrail events, VPC Flow Logs, etc.
    • Ex: unusual API calls, malicious IPs, unauthorized deployments, compromised instances
    • Features: alerts in the GuardDuty console and CloudWatch Events; receives feeds from third parties (e.g., AWS Security informing malicious IPs); monitors CloudTrail, VPC Flow Logs, and DNS logs; centralizes detection across multiple AWS accounts; automates responses with CloudWatch Events and Lambda
    • Pricing: 30-day free trial; quantity of CloudTrail events; volume of DNS and VPC Flow Logs data

    Macie [1]:

    • Monitoring S3 bucket
    • Fully managed data security and data privacy service (Identify potential security issues).
    • Personally Identifiable Information (PII) - personal data used to establish an individual's identity
    • It uses machine learning and pattern matching to discover and protect customers sensitive data in S3
    • Alerts on unencrypted buckets, public buckets, and shared buckets

    Inspector [1]:

    • Automated security assessment service that helps improve the security and compliance of applications deployed on AWS.
    • Inspect running operating systems (OS) against known vulnerabilities
    • Exposure, vulnerabilities, and deviations from best practices
    • Reports and Integration with AWS security Hub
    • After performing the assessment, it produces a detailed list of security findings prioritized by level of severity
    • Types of assessment: network and host
    • Analyze against unintended network accessibility
    • Automated Security Assessments for EC2 instances, Container images, Lambda Functions.
    • Send findings to Amazon Event Bridge.
    • Continuous scanning of the infrastructure where needed
    • Can use an agent for monitoring
    • Cannot be used to prevent Distributed Denial-of-Service (DDoS) attack

    Encryption

    • KMS[1] - Key Management Service
      • Encryption for software
      • AWS manages the encryption keys
      • CMK - customer master key: can be generated by KMS, customer key management or CloudHSM
        • AWS managed CMK: create, managed and used on the customers behalf by AWS; used by AWS services
        • Customer Managed CMK: create, manage, use, enable or disable; rotation policy
        • AWS owned CMK: collections of CMKs owned by AWS to use in multiple accounts. The customer cannot see those keys.
        • CloudHSM Keys [1][2]: created by the device
          • enables easily generate and use your own encryption keys in AWS cloud
          • Hardware Security Module (physical device)(level 3 compliance)
          • The customer manages the encryption keys
      • Keys can be rotated
      • Create policies to access KMS CMKs
      • Control permissions: key policy; IAM + key policy; grants + key policy
      • Sometimes it is necessary to encrypt data at rest (stored or archived on a device) or in transit (moving from an origin to a destination through a network)
      • Encryption is possible in all storage and database services: EBS volumes, S3 buckets, Redshift, RDS, EFS
      • Encryption is automatically enabled for: CloudTrail logs, S3 Glacier, Storage Gateway
    • The AWS Encryption SDK is a client-side encryption library that is separate from the language-specific SDKs
    • SSE-S3 (S3 Managed Keys) is a server-side encryption where each object is encrypted with a unique key. As an additional safeguard, it encrypts the key itself with a root key that it regularly rotates.
    • SSE-KMS Key Management Service: Server-side encryption that is similar to SSE-S3, but using this service. It provides audit trail.
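
    A minimal boto3 sketch of an SSE-KMS upload (bucket, key, and KMS key ID are hypothetical); S3 encrypts the object server-side, and the KMS usage shows up in CloudTrail, which is where the audit trail comes from:

```python
import boto3

s3 = boto3.client("s3")

# Server-side encryption with a KMS key instead of S3-managed keys.
s3.put_object(
    Bucket="my-example-bucket",
    Key="secret-report.txt",
    Body=b"hello",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="1234abcd-12ab-34cd-56ef-1234567890ab",  # hypothetical key ID
)
```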

    AWS Secrets Manager

    • Storing secrets
    • Rotation of secrets
    • Integration with RDS
    • Secrets encrypted using KMS
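
    A minimal boto3 sketch of reading a secret at runtime instead of hard-coding credentials (the secret name and its JSON shape are hypothetical):

```python
import json

import boto3

secrets = boto3.client("secretsmanager")

# Fetch and decode the secret; rotation changes the stored value
# without any change to this code.
value = secrets.get_secret_value(SecretId="prod/db/credentials")
creds = json.loads(value["SecretString"])
print(creds["username"])
```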

    Parameter Store

    • Provides secure, hierarchical storage for configuration data management and secrets management
    • Pricing: FREE!!! (standard tier)
    • Limits: 10,000 parameters (standard tier)

    ACM - AWS Certificate Manager

    • Customers can provision, manage, and deploy SSL/TLS certificates
    • Provides encryption for websites (HTTPS)
    • No charge for public TLS certificates
    • Integration with ELB, CloudFront, API Gateway
    • Pricing: free

    AWS Artifact [1]: Artifact reports (AWS security and compliance document) and Artifact Agreements (AWS agreements). PS: Audits and download compliance reports. Ex: Service Organization Control (SOC) reports, Payment Card Industry (PCI)

    Cognito [1][2]

    • AWS Cognito is a fully managed service that provides secure and scalable customer identity and access management (CIAM), enabling user authentication, authorization, and data synchronization across devices and platforms for web and mobile applications, with support for both authenticated and unauthenticated users.
    • Alternative to IAM for application users.
    • Identity for your web and mobile application users (sign-up/sign-in; social identity providers like Facebook)
    • For external users, in contrast to IAM, which is for people inside your organization
    • Auth process: Authenticate and get tokens; Exchange tokens and get AWS credentials; Access AWS services using credentials.
    • Components: user pools and identity pools

    AWS Detective:

    • Deep analysis to isolate the root cause of security issues or suspicious activities (ML/graphs)
    • Machine learning, statistical analysis, and graph theory
    • Sources: VPC Flow Logs, CloudTrail logs, Kubernetes audit logs, GuardDuty findings
    • Use case: Triage security findings; Threat Hunting

    Network Firewall:

    • Managed service that makes it easy to deploy essential network firewall protection across VPCs.
    • Use cases: filter internet traffic; filter outbound traffic; inspect VPC-to-VPC traffic.
    • Scenario: Filtering the network traffic before it reaches the Internet Gateway, or intrusion requirement prevention system, or any hardware firewall requirement

    AWS Security Hub:

    • Central Security Hub to view all security alerts
    • Across multiple Accounts
    • Automate security Checks. Create a dashboard. Identify potential security issues.
    ---

    AWS Directory Service[1]:

    • A fully managed version of Microsoft Active Directory [a database of objects (users, accounts, computers, etc.)].
    • Centralized security management
    • Build AD in AWS
    • Create a tunnel between AWS and the on-premises AD
    • Standalone directory powered by Linux Samba Active Directory (Simple AD)
    • Managed Microsoft AD: migrate AD into AWS
    • AD Connector: leave AD on premises when you are not ready to migrate it yet

    Penetration Testing

    • Allowed against your own AWS infrastructure without prior approval, e.g., EC2 instances, NAT Gateways, ELB, RDS, CloudFront, Aurora, API Gateway, Lambda, Elastic Beanstalk environments, and Lightsail resources
    • Not allowed: DoS, port flooding, etc.

    AWS Abuse: Report suspected AWS resources used for abusive or illegal purposes (spam, port scanning, DoS, DDoS, etc)

    AWS STS - Security Token Service: temporary (short-term), limited-privilege credentials
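
    A minimal boto3 sketch of obtaining such short-term credentials (role ARN and session name are hypothetical); the returned keys expire after DurationSeconds and carry only the role's privileges:

```python
import boto3

sts = boto3.client("sts")

resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/ReadOnly",  # hypothetical role
    RoleSessionName="audit-session",
    DurationSeconds=3600,  # credentials valid for one hour
)
creds = resp["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken
```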

    AWS IAM Access Analyzer: identify the resources in your organization and accounts, such as Amazon S3 buckets or IAM roles, shared with an external entity. This lets you identify unintended access to your resources and data, which is a security risk.





    Migration

    AWS Cloud Adoption Framework (AWS CAF): leverages AWS experience and best practices to help you achieve business outcomes through innovative use of AWS. Perspectives: Business, People, Governance, Platform, Security, and Operations.

    Strategy:

    • Rehosting: moving applications without changes (lift-and-shift)
    • Replatforming: few cloud optimizations to realize a tangible benefit(lift, tinker, and shift)
    • Refactoring/re-architecting: reimagining how an application is architected and developed by using cloud-native features
    • Repurchasing: moving from a traditional license to a software-as-a-service model
    • Retaining: keeping applications that are critical for the business in the source environment
    • Retiring: removing applications that are no longer needed

    Snow Family

    • Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
    • Data migration:
      • Snowcone: the smallest device; send data to AWS offline or using AWS DataSync (8 TB of storage, 4 GB of memory, 2 vCPUs)
      • Snowball (Storage Optimized, 80 TB / Compute Optimized, 42 TB up to 81 TB): offline data transfer that avoids the network; pay per data transfer job (ex: disaster recovery); supports storage clustering (up to 15 nodes); encrypts the data; EC2 compute instances can be hosted natively on a Snowball.
      • Snowmobile: the most capacity (100 PB, exabyte scale); high security
    • Edge computing: Snowcone, Snowball Edge. Process data while it's being created at an edge location (ex: process data, machine learning, transcoding media streams)
    • OpsHub: software to manage Snow Family devices.
    • Snowball Pricing: per data transfer job

    Storage Gateway[1][2]

    • Hybrid cloud storage service that helps merge on-premises resources with the cloud
    • Types:
      • File Gateway: caches files on the on-premises side; it extends on-premises storage and helps with migration. Data goes through the Storage Gateway to S3
      • Volume Gateway: a kind of backup drive on premises; it can help in migration. Data goes through the Storage Gateway to S3
      • Tape Gateway: helps migration by sending data through the Storage Gateway to a tape archive (S3 Glacier) in AWS
    • Simplify storage management and reduce costs for key hybrid cloud storage use cases
    • Virtually unlimited cloud storage
    • Cannot be used for data archival
    • Ex: moving backups to the cloud, low latency access, disaster recovery

    AWS DataSync [1]:

    • Great for online data transfers to simplify, automate, and accelerate copying large amounts of data between on-premises storage, edge locations, other clouds, and AWS Storage services.
    • Agent-based solution for migrating on-premises storage to AWS.
    • Agent for self-managed locations
    • The agent is not necessary when transferring data between AWS storage services in the same AWS account
    • Move data between NFS/SMB and AWS storage solutions
    • DataSync encrypts data in-transit via TLS.
    • DataSync works best as a solution for one-time data migration.
    • DataSync can transfer data to archival storage, such as S3 Glacier Deep Archive.

    AWS Transfer Family [1]:

    • Transfer Family allows you to integrate legacy applications that use FTPS and S3 together
    • Move files in and out S3 or EFS using SFTP, FTPS, FTP.
    • The easiest option when you want to change nothing about your existing transfer workflows.

    AWS Application Discovery Service[1]

    • helps you plan your migration to the AWS cloud by collecting usage and configuration data about your on-premises servers.
    • Types: Agentless and Agent Based

    AWS Application Migration Service (MGN) [1][2]:

    • Plan migration projects
    • Agentless Discovery or Agent-based
    • This service is an automated lift-and-shift (rehost) solution for expediting migration of apps to AWS and can be used for physical, virtual, or cloud servers to avoid cutover windows or disruptions.
    • Migrating to run natively on AWS

    DMS (Database Migration Service) [1]:

    • Migrate to AWS (relational DBs, data warehouse, NoSQL, other data stores).
    • With DMS it is possible to do continuous replication (ex: send to a data warehouse)
    • The source database remains fully operational during the migration, minimizing downtime
    • Option: perform a one-time migration or continuously replicate ongoing changes
    • Types: Full Load; Full load and change Data capture (CDC); CDC only
    • Source DB -> EC2 instance running DMS -> Target DB
    • When Multi-AZ is enabled, DMS provisions a synchronous standby replica in a different AZ

    AWS Migration Hub [1]:

    • A single location to track the progress of application migrations
    • Integrates with Server Migration Service (SMS) and Database Migration Service (DMS)

    AWS Server Migration Service (SMS) is a fast, agentless service that makes it easy to migrate thousands of on-premises workloads to AWS.




    Machine Learning

    Amazon Rekognition: find objects, people, text in images and videos. Create "familiar faces" database or compare against celebrities.

    Amazon SageMaker: service to build, train, and deploy machine learning models.

    ---

    Amazon Transcribe: converts speech to text (deep learning). Can automatically remove Personally Identifiable Information (PII).

    Amazon Polly: Turn text into lifelike speech. Deep learning.

    Amazon Translate: Natural and accurate language translation

    Amazon Lex: Automatic Speech Recognition (ASR) - speech to text (chatbots, call center bots)

    Amazon Connect: receive calls, create contact flows

    Amazon Comprehend: Natural Language Processing (NLP); serverless service; analyzes and organizes text; identifies positive/negative experiences

    Amazon Forecast: predict future sales, reduce forecasting time. Ex. Financial planning.

    Amazon Kendra: document search service. Extract answers from docs. Natural Language search.

    Amazon Personalize: build apps with real-time personalized recommendation

    Amazon Textract: automatically extract text, handwriting, and data from documents using AI and ML.

    Amazon Elastic Transcoder: convert media files in S3 into different formats of media files




    Code

    • CodeGuru: automated code review and application performance recommendations
    • AWS Cloud Development Kit (CDK)
      • Open-source software development framework.
      • Define your cloud infrastructure using a familiar language.
      • The code is compiled into a CloudFormation template
      • Provisions the resources using CloudFormation
    • AWS CodeDeploy
      • Deploy automatically
      • Works with EC2 instances and on-premises servers
      • The CodeDeploy Agent is responsible for provisioning and configuring servers and instances
    • AWS CodeCommit: managed Git repositories (same Git technology)
    • AWS CodePipeline: Orchestrate the steps until production
    • AWS CodeStar: UI to manage Software Development activities.
    • AWS Cloud9: Cloud IDE
    • AWS CodeBuild
      • Compiles code, runs tests, and produces packages to be deployed by CodeDeploy
      • Pay-as-you-go pricing. Pay for build time
      • Like Jenkins
    • AWS CodeArtifact
      • Artifacts: dependencies
      • It is an artifact management
      • Like maven, gradle, npm, yarn
      • Developers and CodeBuild retrieve the dependencies using it.



    Others Services

    Amazon LightSail: low cost, easy, preconfigured virtual servers; good for beginners. However, it is not possible to deploy a scalable Node.js application into a VPC with it.

    Amazon WorkSpaces: Managed Desktop as a Service (DaaS). Integrated with KMS. Pay-as-you-go.

    Amazon AppStream: Desktop Application Streaming Service (web browser)

    AWS IoT: connect devices to the cloud

    AWS Fault Injection Simulator (FIS): based on chaos engineering; stress testing.

    AWS Ground Station: control satellite communication





    Billing and Pricing

    Notes:

    • AWS: Operational expenses (OpEx): Pay as you go, tax deductible in same year
    • Traditional: capital expenses (CapEx): Purchase server, tax deductible over depreciation lifetime
    • Cloud: Trade CapEx for OpEx
    • AWS Total Cost of Ownership (TCO) compared with on-premises TCO: include labor costs for activities that will be reduced or eliminated.
    • Free services: IAM, VPC, Consolidated Billing (one bill, easy tracking, combined usage, no extra fee). Elastic Beanstalk, CloudFormation, and Auto Scaling groups are free, but you pay for the resources they create
    • Free Tiers:
      • Always free: DynamoDB, Lambda, SNS, SQS, CloudWatch, CloudFront, Cognito, CodeXXX, Glue, Storage Gateway, X-Ray, CloudTrail, Service Catalog, CloudFormation, Control Tower, AWS Organizations
      • 12 months free: EC2, S3, RDS, API Gateway, EFS, EBS, ELB, ElastiCache, Lex, Polly, Rekognition, Transcribe, Translate, Amazon MQ, IoT, OpsWork, AppSync
      • Trials: ECS, SageMaker, Redshift, AppStream, Lightsail, Comprehend Medical, Inspector, QuickSight, Macie, GuardDuty, Detective, Secrets Manager, DocumentDB, Neptune

    Pricing Models

    • Pay-as-you-go pricing: Pay for compute time; for data stored in cloud; for data transfer OUT of the cloud (Massive economies of scale)
    • Save when you reserve: up to 75%. The more you pay upfront the greater the discount
    • Pay less by using more: volume-based discounts
    • Pay less as AWS grows

    Costing Tools:

    • Pricing calculator: Estimating costs
    • Tools for Tracking cost:
      • Billing dashboard
      • Cost Allocation Tags: Tracking cost. Tags are used to organized resources.
      • Cost and Usage Reports: set of cost and usage data available; can automatically publish the reports to S3. Tracking cost
      • Cost Explorer: Tracking costs. Visualize data as graph, understand, and manage your AWS costs and usage over time. Future cost projection. Filter by Region, AZ, tags etc. Features: Time, filter, service
    • Billing Alarms and Budgets: Monitoring against cost plans. The AWS Budget allows companies to track and categorize spending on a detailed level.
    • AWS Cost Anomaly Detection: Continuously monitor your cost and usage using ML to detect unusual spends
    • AWS Service Quotas: notifies you when a service is close to its quota (the maximum value for resources, actions, and items in an account)
    • AWS Trusted Advisor: fully managed best-practices audit tool. Analyzes the account and provides real-time best-practice recommendations (cost, performance, security, fault tolerance, and service limits). Ex: checks security groups for rules that allow unrestricted access to specific ports. It works at the account level. Check categories: Cost Optimization; Performance; Security; Fault Tolerance; Service Limits
    • AWS Compute Optimizer: analyzes configurations and utilization metrics; reports current usage optimizations and recommendations. Reduce costs and improve performance. Uses ML. Helps the customer choose optimal configurations and right-size workloads, including CPU and memory utilization. It delivers recommendations for EC2 instances, EC2 Auto Scaling groups, EBS volumes, and Lambda functions.

    AWS Service Catalog:

    • Stacks of authorized products
    • Allows organizations to create and manage catalogs of IT services
    • Lists resources like AMIs, servers, software, databases, etc.
    • Centralized service
    • EndUser can deploy preapproved catalog items within an organization
    • Catalog templates are written and listed using CloudFormation templates
    • Benefits: standardization, self-service, access control, versioning





    Support

    These support resources are available at no cost:

    • AWS Blogs
    • AWS Forums
    • AWS Whitepapers
    • AWS Partner Solutions (formerly Quick Starts): Partner Solutions are built by Amazon Web Services (AWS) solutions architects and partners to help you deploy popular technologies on AWS, based on AWS best practices for security and high availability

    APN Consulting Partner: The AWS Partner Network (APN) is the global partner program for technology and consulting businesses that leverage Amazon Web Services to build solutions and services for customers. If a company does not have in-house expertise and needs to design and build a new workload on the AWS Cloud, this program can be the ideal fit.

    APN Technology Partners provide hardware, connectivity services, or software solutions that are either hosted on, or integrated with, the AWS Cloud. It does not help with a migration process.

    AWS Professional Services assists customers with accelerating cloud adoption through paid engagements in any specialty area. It can help with migration.

    AWS Managed Services (AMS): provides infrastructure and application support on AWS; AMS operates 24/365; it takes care of the underlying instance or compute, and therefore patching and hardening. It helps you adopt AWS at scale and operate more efficiently and securely. AWS leverages standard services and offers guidance and execution of operational best practices with specialized automations, skills, and experience that are contextual to your environment and applications. You can leave a lot of the heavy lifting to AWS when you use managed services.

    AWS Marketplace is a digital catalog listing software that runs on AWS. Software can be sold as an image (AMI) or as SaaS.

    AWS Support Plans

      AWS Basic Support
    • Customer Service & Communities - 24x7
    • Documentation, Whitepapers, Support Forums
    • Core checks from the AWS Trusted Advisor Best Practice Checks
    • AWS Personal Health Dashboard
      AWS Developer Support
    • Everything in Basic
    • Email access to Customer Support: 24-hour response time on general questions and 12 hours if the customer's system is impaired
    • When: testing or doing early development on AWS
    • Best practices guidance
    • Building-block architecture support
    • Client-side diagnostic tools
      AWS Business Support
    • Everything in Basic and Developer
    • When: you have production workloads on AWS
    • 24x7 phone, email, and chat access to Cloud Support Engineers
    • Production system impaired: < 4 hours
    • Production system down: < 1 hour
    • Full access to AWS Trusted Advisor Best Practice Checks.
    • AWS Health API
    • Guidance, configuration and troubleshooting of AWS interoperability with third-party software
      AWS Enterprise Support
    • Everything in Basic, Developer, and Business
    • Technical Account Manager (TAM)
    • Concierge Support Team for billing and account best practices. Experts that specialize in working with enterprise accounts. Focus on help customer achieve their outcomes.
    • Business-critical system down: < 15 minutes (15 minutes SLA for business critical workload)
    • Online training with self-paced labs
    • 24/7 technical support
    • Consultative Architectural guidance
      AWS Enterprise On-Ramp Support
    • When: you have production/business critical workloads in AWS
    • Business-critical system down: < 30 minutes
    • Expert guidance to grow and optimize in the Cloud.
    • Workshop to cost optimization



    Some Shots

    Computing services: Batch, EC2, EC2 Image Builder, Elastic Beanstalk, Lambda, Lightsail, AWS Outposts, Serverless Application Repository, AWS SimSpace Weaver, AWS App Runner

    Serverless:

    • Compute: Lambda (Infra), Fargate
    • Application Integration: EventBridge, Step Function (orchestration), SQS, SNS, API Gateway (Infra), AppSync
    • Data Store: S3, EFS, DynamoDB, RDS, Aurora, Redshift, Neptune, OpenSearch

    Additional "as a service" types:

    • FaaS (Function as a Service): Lambda.
    • DaaS (Desktop as a Service): Amazon WorkSpace.

    Hybrid: Storage Gateway, Outposts, SSM, Route 53, Virtual Private Gateway

    Audit: CloudWatch, CloudTrail, SSE-KMS, AWS Config

    Encryption: AZ traffic (default), CloudTrail (default), Site-to-Site VPN (default), all storage (RDS, S3, EBS, Redshift, EFS)




    Conclusion

    The exam is not a big deal; however, there are a lot of services whose purpose you need to remember. Generally, the exam will not test you very deeply, but it will check whether you know what the services are and have a basic idea of where to use them.

    I consider the courses I added here complementary, so I recommend doing both. After that, do a lot of practice exams; they will help you understand what the exam is really trying to test.

    Good luck!