AWS Site-to-Site VPN
In today’s fast-paced world, businesses are increasingly relying on cloud-based services to store and process their data. As a result, secure connectivity between on-premises infrastructure and the cloud has become a critical requirement for many organizations.
AWS Site-to-Site VPN is a service that provides a secure and reliable way to connect on-premises networks to AWS VPCs (Virtual Private Clouds). This service enables organizations to establish a secure and encrypted tunnel between their on-premises network and AWS VPC, allowing them to extend their network to the cloud and access AWS resources securely.
In this blog, we will dive deep into the AWS Site-to-Site VPN service, including its benefits, use cases, and how to set up and configure a Site-to-Site VPN connection. Whether you are a network administrator, an IT professional, or a business owner, this blog will provide you with the necessary information to establish a secure and reliable connection between your on-premises network and AWS.
VPN Basics
- A VPN allows hosts to communicate privately, in encrypted form, over an untrusted intermediary network such as the internet
- AWS supports Layer 3 VPN
- VPN types
1. IPSec (IP Security) VPN, which is supported by AWS managed VPN
2. Other VPNs like GRE and DMVPN are not supported by AWS managed VPN
- VPN has 2 forms: Site-to-Site VPN and Client-to-Site VPN
1. Site-to-Site VPN connects 2 different networks
2. Client-to-Site VPN connects a client device, such as a laptop, to the private network
How Site-to-Site VPN works in AWS
- The VPN connection terminates at the Virtual Private Gateway (VGW) on the AWS end
- The VGW creates 2 tunnel endpoints in different AZs for high availability
Virtual Private Gateway
- Managed gateway endpoint for the VPC
- Only one VGW can be attached to a VPC at a time
- VGW supports static routing and dynamic routing using BGP
- For BGP, you can assign a private ASN (Autonomous System Number) to the VGW in the range of 64512 to 65534
- If you don't define an ASN, AWS assigns the default ASN (64512). The ASN cannot be modified once assigned
- VGW supports AES-256 and SHA-2 for encryption and data integrity
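The ASN rules above can be sketched as a small validation helper. This is an illustrative sketch only: `resolve_vgw_asn` is a hypothetical function name, not an AWS API, and it covers only the private range and default value listed above.

```python
# Private ASN range and default taken from the bullet points above.
PRIVATE_ASN_RANGE = range(64512, 65535)  # 64512..65534 inclusive
DEFAULT_VGW_ASN = 64512


def resolve_vgw_asn(requested_asn=None):
    """Return the ASN a VGW would end up with (hypothetical helper).

    Falls back to the default when no ASN is requested and rejects
    values outside the private range.
    """
    if requested_asn is None:
        return DEFAULT_VGW_ASN
    if requested_asn not in PRIVATE_ASN_RANGE:
        raise ValueError(
            f"ASN {requested_asn} is outside the private range 64512-65534"
        )
    return requested_asn


print(resolve_vgw_asn())       # 64512
print(resolve_vgw_asn(65000))  # 65000
```

Because the ASN cannot be changed after it is assigned, checking the value before creating the gateway avoids having to recreate it later.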
Site to Site VPN Setup steps
VPN Static and Dynamic routing
- The type of routing that you select can depend on the make and model of your customer gateway device. If your customer gateway device supports Border Gateway Protocol (BGP), specify dynamic routing when you configure your Site-to-Site VPN connection. If your customer gateway device does not support BGP, specify static routing.
- With static routing, you must pre-define the CIDR ranges on both sides of the VPN connection. If you add new network ranges on either side, the routing changes are not propagated automatically
- With dynamic routing, both ends learn new network changes automatically. On the AWS side, the new routes are automatically propagated to the route tables
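To illustrate the static routing caveat, here is a minimal sketch that lists networks not covered by any configured static route. The function name `uncovered_networks` and the sample CIDRs are assumptions made for illustration; the sketch uses only Python's standard `ipaddress` module and does not call AWS.

```python
import ipaddress


def uncovered_networks(static_routes, actual_networks):
    """Return the networks in use that no configured static route covers."""
    routes = [ipaddress.ip_network(r) for r in static_routes]
    missing = []
    for cidr in actual_networks:
        net = ipaddress.ip_network(cidr)
        covered = any(
            net.version == route.version and net.subnet_of(route)
            for route in routes
        )
        if not covered:
            missing.append(cidr)
    return missing


# A new on-premises range (10.1.0.0/16) was added but never configured
# as a static route, so traffic to it would not be routed over the VPN.
print(uncovered_networks(["10.0.0.0/16"], ["10.0.1.0/24", "10.1.0.0/16"]))
# ['10.1.0.0/16']
```

With dynamic routing this kind of check is unnecessary, since BGP advertises the new range to the other side.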
VPN Transitive Routing
From on-premises to AWS via Virtual Private Gateway:
• You cannot access the Internet through the Internet Gateway attached to the VPC
• You cannot access the Internet through a NAT Gateway in a public subnet
• You cannot access peered VPC resources through a VPC peering connection via the AWS VGW
• You cannot access S3 or DynamoDB via a VPC gateway endpoint
• You can access AWS service endpoints (e.g., API Gateway, SQS) and customer endpoint services (powered by PrivateLink) via a VPC interface endpoint
• You can access the Internet through a NAT EC2 instance in a public subnet
From AWS to on-premises via customer gateway
- You can access the Internet and other network endpoints based on the routing rules set up on the CGW in the on-premises network
VPN Tunnels and Routing
- AWS strongly recommends using customer gateway devices that support asymmetric routing
Static Routing — Active/Active Tunnels
- Active/Active tunnels may cause asymmetric routing, so asymmetric routing should be enabled on the CGW
- For traffic originating from AWS, one of the tunnels is selected randomly
Static Routing — Active/Passive Tunnels
- When only one tunnel is UP, it is used for traffic in both directions
- If a tunnel goes down, you need to manually bring it back up from the CGW end
Dynamic Routing — Active/Active Tunnels
- When the AS paths are the same length, use multi-exit discriminators (MEDs) to prefer one tunnel
- Set a lower MED value for tunnel 1 and a higher value for tunnel 2. In this case, tunnel 1 is given priority over tunnel 2.
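The tie-break described above can be sketched as follows. This is a deliberately simplified model that compares only AS path length and MED; real BGP best-path selection has more steps. `TunnelRoute` and `preferred_tunnel` are hypothetical names, and the MED values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelRoute:
    name: str
    as_path_length: int
    med: int


def preferred_tunnel(routes):
    """Pick the shortest AS path; break ties with the lowest MED."""
    return min(routes, key=lambda r: (r.as_path_length, r.med))


routes = [
    TunnelRoute("tunnel-2", as_path_length=1, med=200),
    TunnelRoute("tunnel-1", as_path_length=1, med=100),
]
print(preferred_tunnel(routes).name)  # tunnel-1
```

Both routes have the same AS path length, so the lower MED on tunnel 1 decides the outcome.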
VPN Dead Peer Detection (DPD)
- A method to detect the liveness of an IPSec VPN tunnel
- DPD is used in “IKE Phase 1”
- If DPD is enabled, AWS sends DPD messages to the customer gateway and waits for a response. If there is no response after 3 consecutive requests, DPD times out and the tunnel is considered down
- The default DPD timeout value is 30 seconds and can be changed
- DPD uses UDP 500 or UDP 4500 to send DPD messages
- You can specify the action to take after a dead peer detection (DPD) timeout occurs:
1. Clear: End the IKE session when DPD timeout occurs (stop the tunnel and clear the routes)
2. None: Take no action when DPD timeout occurs
3. Restart: Restart the IKE session when DPD timeout occurs
- Customer router/firewall must support DPD when using Dynamic routing (BGP)
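The DPD behaviour described above can be modelled with a short sketch. It is not how AWS implements DPD. It only walks a list of probe results and applies the three-missed-requests rule and the three timeout actions from the text. The function name `dpd_outcome` and the returned strings are assumptions made for illustration.

```python
MAX_MISSED_REQUESTS = 3  # from the text: 3 consecutive unanswered requests

TIMEOUT_ACTIONS = {
    "clear": "IKE session ended, routes cleared",
    "none": "no action taken",
    "restart": "IKE session restarted",
}


def dpd_outcome(probe_replies, timeout_action="clear"):
    """Return what happens after a sequence of DPD probes.

    probe_replies is an iterable of booleans, True meaning the peer
    answered that probe. A reply resets the missed counter.
    """
    if timeout_action not in TIMEOUT_ACTIONS:
        raise ValueError(f"unknown timeout action: {timeout_action}")
    missed = 0
    for replied in probe_replies:
        missed = 0 if replied else missed + 1
        if missed >= MAX_MISSED_REQUESTS:
            return TIMEOUT_ACTIONS[timeout_action]
    return "peer alive"


print(dpd_outcome([True, False, False, False]))              # IKE session ended, routes cleared
print(dpd_outcome([False, False, True, False]))              # peer alive
print(dpd_outcome([False, False, False], "restart"))         # IKE session restarted
```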
VPN Monitoring
- TunnelState
The state of the tunnels. For static VPNs, 0 indicates DOWN and 1 indicates UP. For BGP VPNs, 1 indicates ESTABLISHED and 0 is used for all other states. For both types of VPNs, values between 0 and 1 indicate at least one tunnel is not UP.
- TunnelDataIn
The bytes received on the AWS side of the connection through the VPN tunnel from a customer gateway. Each metric data point represents the number of bytes received after the previous data point. Use the Sum statistic to show the total number of bytes received during the period.
- TunnelDataOut
The bytes sent from the AWS side of the connection through the VPN tunnel to the customer gateway. Each metric data point represents the number of bytes sent after the previous data point. Use the Sum statistic to show the total number of bytes sent during the period.
- The AWS Health Dashboard provides the following types of notifications for your VPN connections:
1. Tunnel endpoint replacement notifications
2. Single tunnel VPN notifications
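As a small illustration of how the TunnelState values described above might be read, the sketch below maps a metric value to a label. The function name `describe_tunnel_state` and the label strings are assumptions; the sketch does not fetch metrics, it only interprets a number you already have.

```python
def describe_tunnel_state(value, routing="static"):
    """Translate a TunnelState metric value into a readable label."""
    if routing not in ("static", "bgp"):
        raise ValueError("routing must be 'static' or 'bgp'")
    if not 0 <= value <= 1:
        raise ValueError("TunnelState is expected to be between 0 and 1")
    if value == 1:
        return "UP" if routing == "static" else "ESTABLISHED"
    if value == 0:
        return "DOWN" if routing == "static" else "NOT ESTABLISHED"
    # A fractional value means the statistic covers tunnels in mixed states.
    return "AT LEAST ONE TUNNEL NOT UP"


print(describe_tunnel_state(1))           # UP
print(describe_tunnel_state(0, "bgp"))    # NOT ESTABLISHED
print(describe_tunnel_state(0.5, "bgp"))  # AT LEAST ONE TUNNEL NOT UP
```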
AWS Site-to-Site VPN architecture
- Single Site-to-Site VPN connections
1. With Virtual Private Gateway (VGW)
2. With Transit Gateway (TGW)
- Multiple Site-to-Site VPN connections
1. With Virtual Private Gateway (VGW)
2. With Transit Gateway (TGW)
- Redundant Site-to-Site VPN connection
AWS VPN CloudHub
- Simple hub-and-spoke model that you can use with or without a VPC. This design is suitable if you have multiple branch offices and existing internet connections and would like to implement a convenient, potentially low-cost hub-and-spoke model for primary or backup connectivity between these remote offices
- The sites must not have overlapping IP ranges
- Each customer gateway must use a unique Border Gateway Protocol (BGP) Autonomous System Number (ASN)
- By default, you can have only 10 Site-to-Site VPN connections per virtual private gateway
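The CloudHub requirements above lend themselves to a quick pre-flight check. The sketch below is a hypothetical helper (`cloudhub_problems`) with made-up site names, ASNs, and CIDRs. It checks for overlapping IP ranges, duplicate ASNs, and the default connection limit quoted above, using only the standard library.

```python
import ipaddress
from itertools import combinations

MAX_VPN_CONNECTIONS_PER_VGW = 10  # default limit quoted in the text


def cloudhub_problems(sites):
    """Return a list of problems for a planned CloudHub layout.

    sites maps a site name to a (bgp_asn, cidr) tuple.
    """
    problems = []
    if len(sites) > MAX_VPN_CONNECTIONS_PER_VGW:
        problems.append(
            f"{len(sites)} sites exceed the default limit of "
            f"{MAX_VPN_CONNECTIONS_PER_VGW} connections per VGW"
        )
    for (a, (asn_a, cidr_a)), (b, (asn_b, cidr_b)) in combinations(sites.items(), 2):
        if asn_a == asn_b:
            problems.append(f"{a} and {b} share ASN {asn_a}")
        if ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b)):
            problems.append(f"{a} and {b} have overlapping IP ranges")
    return problems


sites = {
    "new-york": (65001, "10.1.0.0/16"),
    "london": (65002, "10.2.0.0/16"),
    "singapore": (65002, "10.2.128.0/17"),
}
for problem in cloudhub_problems(sites):
    print(problem)
# london and singapore share ASN 65002
# london and singapore have overlapping IP ranges
```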
Third-party Firewall (VPN appliance) on EC2 for Site-to-Site VPN
- You can have overlapping CIDRs
- You can enable transitive routing on AWS side
- You can use other protocols for site-to-site VPN like GRE or DMVPN
- You can use the advanced threat protection features of the appliance running on EC2
- Bandwidth can exceed that of AWS Managed Site-to-Site VPN (i.e., more than 1.25 Gbps), but only if the selected EC2 instance type supports it
- You need to disable the source/destination check on the EC2 ENI used for VPN termination
- You can use auto scaling, vertical scaling, or horizontal scaling (depending on the third-party appliance)