After writing about the Mercury Tours training website last week I was feeling a little nostalgic, so I decided to install it on an Amazon EC2 instance. There is a world of difference between installing something on a local server (or inside a VM) and putting it on a public server.
Mercury Tours only runs on Windows and was always intended to just be installed locally, so it has a single-server architecture. A file-based database means that it is not possible to have a separate database server and scale out the web server tier horizontally. The only change I made to the architecture was to put an Nginx web server in front of Apache for security reasons.
- The Nginx server acts as a reverse proxy for Apache (it forwards requests), and handles HTTPS. The SSL certificate is issued by Let’s Encrypt using an ACME client for Windows. The performance of Nginx on Windows is not as good as on Linux, and third-party modules are not supported, so it is not recommended for Production use.
- Apache JServ is the servlet container, which runs Java servlets using version 1.2.2 of the JVM. The servlets communicate with a Microsoft Access database (MTours.mdb) using the JDBC-ODBC bridge (which only works on Windows).
- Microsoft Access is a bit of an odd choice for a database (even for a demo app), but I guess there was never any expectation that Mercury Tours would run on anything other than Windows. Note that MS Access is not actually installed on the server; there is just a database file in a subdirectory. It simplifies things a little, as there is no need for a separate database service (like MySQL, etc.) to run on the server.
Nginx and Apache are each set up to run as a Windows service. A scheduled task was created to automatically renew the SSL certificate.
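A scheduled renewal task usually runs often and only acts once the certificate is old enough. A minimal sketch of that decision, assuming the 60-day threshold mentioned later in this post (the function name and the dates are hypothetical, and a real ACME client makes this check itself):

```python
from datetime import date, timedelta

RENEW_AFTER_DAYS = 60  # threshold taken from the post; everything else is illustrative


def renewal_due(issued_on: date, today: date,
                renew_after_days: int = RENEW_AFTER_DAYS) -> bool:
    """Return True once the certificate is at least renew_after_days old."""
    return today >= issued_on + timedelta(days=renew_after_days)


issued = date(2021, 1, 1)  # hypothetical issue date
print(renewal_due(issued, date(2021, 2, 15)))  # 45 days old: not due yet
print(renewal_due(issued, date(2021, 3, 2)))   # 60 days old: due
```

Running the task daily and letting it decide whether anything needs doing is simpler than trying to schedule a job for exactly day 60.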
Security was my main worry when deploying Mercury Tours to a public server. The Common Vulnerabilities and Exposures (CVE) database had a long list of things that could be exploited. Putting a modern web server (Nginx) in front of it was a partial mitigation, but there were some other standard security practices to implement too.
- If someone gained access to the EC2 instance, I didn’t want them to have access to my internal network, or to be able to use the AWS API to discover or access whatever else was deployed in the AWS account. The EC2 instance was created within a dedicated VPC, and was not given an IAM instance role with any permissions.
- Nginx and Apache run as Windows services under a service account with limited permissions. The account cannot access the file system outside the Mercury Tours directory.
- Windows Firewall and the AWS security group were both configured to allow only HTTP/HTTPS and ping (plus RDP from my home IP address).
- Decompiling the servlet code shows some glaringly obvious SQL injection vulnerabilities. A Web Application Firewall (WAF) in front of the application could have filtered some of these attacks but, unfortunately, the Nginx WAF module is not supported on Windows.
- An SSL certificate from Let’s Encrypt was used to set up HTTPS, and a scheduled task was created to automatically renew it after 60 days. I did not set up HSTS or configure Nginx to redirect HTTP requests to the HTTPS URL, as I thought it would be nice to allow load testing with either HTTP or HTTPS.
- While I don’t believe in security through obscurity, a robots.txt file was created to prevent indexing by search engines. I did not suppress the version headers for Apache in HTTP responses, so anyone scanning HTTP servers will see an old/insecure Apache version (I am curious if the server will appear in the Shodan database).
- I configured Nginx to drop HTTP requests addressed to the web server’s IP address, and to respond only if the Host header was correct. In the 30 minutes the server was active before I did this, there were numerous requests for random URLs on the server from bots scanning IP address ranges for vulnerabilities to exploit. An attacker can still discover the DNS name of the server through the SSL certificate that is presented when attempting an HTTPS connection to the IP address.
- It is considered poor form not to block the Apache /server-status URL or the Nginx status page, but I left these publicly accessible so that server stats are available during load testing.
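The Host header check described above boils down to a small decision. In Nginx it is commonly done with a catch-all default server that closes the connection, but the logic can be sketched in a few lines. This is an illustration only, not the actual configuration, and the DNS name and function name are hypothetical:

```python
EXPECTED_HOST = "mercurytours.example.com"  # hypothetical DNS name


def should_respond(host_header, expected_host=EXPECTED_HOST):
    """Return True only when the Host header names the expected site."""
    if not host_header:
        return False  # missing or empty Host header: drop the request
    # Ignore an optional ":port" suffix, letter case, and a trailing dot.
    hostname = host_header.strip().lower().split(":", 1)[0].rstrip(".")
    return hostname == expected_host.lower()


print(should_respond("mercurytours.example.com"))  # request by name
print(should_respond("203.0.113.5"))               # request by IP address
```

A bot that scans IP address ranges sends the IP address (or nothing) as the Host value, so it never matches the expected name.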
These days, if I saw someone manually creating resources using the AWS Management Console, I would probably worry that they didn’t know what they were doing. Infrastructure as code is such an obvious benefit that it seems weird and alien to be deploying to AWS without using CloudFormation (or something equivalent).
The CloudFormation template for the Mercury Tours infrastructure (a single EC2 instance) creates 9 resources (a new VPC, subnet, Internet gateway, Elastic IP, security group, etc.) and takes a couple of minutes to create all the cloud components. Obviously it’s nice to have everything created automatically, and to have the ability to easily re-create the stack in another region or AWS account, but I also like the ability to cleanly delete all the resources used by an application by just deleting the CloudFormation stack.
As DNS was not managed within AWS, using an Elastic IP address meant that the CloudFormation template could be updated and re-deployed without having to update the DNS entry each time the EC2 instance was replaced.
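To give a feel for what such a template contains, here is a rough sketch that builds a comparable one as a Python dictionary and prints it as JSON. It is not the actual template: the logical names, CIDR ranges, instance type, AMI ID and home IP address are all placeholders, and the resource count does not necessarily match the 9 mentioned above.

```python
import json

HOME_IP_CIDR = "203.0.113.10/32"  # placeholder for the home IP address
AMI_ID = "ami-00000000000000000"  # placeholder for the manually built AMI
ANYWHERE = "0.0.0.0/0"


def ingress(protocol, from_port, to_port, cidr):
    """Build one security group ingress rule."""
    return {"IpProtocol": protocol, "FromPort": from_port,
            "ToPort": to_port, "CidrIp": cidr}


def resource(type_name, **properties):
    """Build one CloudFormation resource entry."""
    return {"Type": type_name, "Properties": properties}


def build_template():
    vpc, subnet, table = {"Ref": "Vpc"}, {"Ref": "Subnet"}, {"Ref": "RouteTable"}
    default_route = resource("AWS::EC2::Route", RouteTableId=table,
                             DestinationCidrBlock=ANYWHERE,
                             GatewayId={"Ref": "Gateway"})
    default_route["DependsOn"] = "GatewayAttachment"
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "Vpc": resource("AWS::EC2::VPC", CidrBlock="10.0.0.0/16"),
            "Subnet": resource("AWS::EC2::Subnet", VpcId=vpc,
                               CidrBlock="10.0.0.0/24"),
            "Gateway": {"Type": "AWS::EC2::InternetGateway"},
            "GatewayAttachment": resource(
                "AWS::EC2::VPCGatewayAttachment", VpcId=vpc,
                InternetGatewayId={"Ref": "Gateway"}),
            "RouteTable": resource("AWS::EC2::RouteTable", VpcId=vpc),
            "DefaultRoute": default_route,
            "SubnetRoute": resource(
                "AWS::EC2::SubnetRouteTableAssociation",
                SubnetId=subnet, RouteTableId=table),
            "SecurityGroup": resource(
                "AWS::EC2::SecurityGroup",
                GroupDescription="Web, ping, and RDP from home",
                VpcId=vpc,
                SecurityGroupIngress=[
                    ingress("tcp", 80, 80, ANYWHERE),         # HTTP
                    ingress("tcp", 443, 443, ANYWHERE),       # HTTPS
                    ingress("icmp", 8, -1, ANYWHERE),         # ping (echo request)
                    ingress("tcp", 3389, 3389, HOME_IP_CIDR), # RDP
                ]),
            # No IamInstanceProfile: the instance gets no AWS API permissions.
            "Server": resource("AWS::EC2::Instance", ImageId=AMI_ID,
                               InstanceType="t3.small", SubnetId=subnet,
                               SecurityGroupIds=[{"Ref": "SecurityGroup"}]),
            # The address survives instance replacement, so DNS stays valid.
            "ElasticIp": resource("AWS::EC2::EIP", Domain="vpc",
                                  InstanceId={"Ref": "Server"}),
        },
    }


template = build_template()
print(json.dumps(template, indent=2))
```

Deleting the stack removes everything listed under `Resources`, which is what makes the clean teardown possible.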
Automating the installation and configuration of the software components would have been nice, but scripting for this is slow and unpleasant on Windows. All the software was installed manually and a new AMI was created.
It should be standard practice that anything deployed to Production has basic monitoring set up: synthetics, infrastructure, logs, and APM. Fortunately, with the release of New Relic One in July 2020, a complete monitoring solution is available for free (although the free accounts have limits on data volumes and user numbers).
- The synthetic monitoring script loads the front page, logs in and searches for a flight. It does this every 10 minutes and I get an alert if it fails. This is a much more meaningful check of availability and performance than just loading the front page and checking that it returns an HTTP 200 response code.
- Infrastructure monitoring was a bit limited due to the age of the components, and the fact that Mercury Tours was running on Windows. The Windows agent was just used to pick up operating system metrics (CPU, memory, disk, etc.) with no monitoring for Nginx (unsupported on Windows), Apache (unsupported version), JServ (unsupported), and MS Access (unsupported, although query times would still have been visible via APM). Some AWS CloudWatch metrics were also collected using the New Relic Amazon EC2 monitoring integration.
- Logfile monitoring was set up to check the error logs from Nginx, Apache and JServ. Having logfiles in the same platform as all the other metrics makes problem-solving much simpler – especially when it comes to seeing what log entries correspond with an HTTP 500 error.
- Using the New Relic APM Java agent would have given great visibility into where time was spent in the code but, unsurprisingly, it doesn’t support a JVM version from 1999.
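The multi-step synthetic check in the first bullet can be sketched independently of any monitoring product. This is not the New Relic script: the paths and expected page text are hypothetical, each step is simplified to fetching a page (a real login would submit a form), and the HTTP client is passed in as a function so the logic can be tried without a live server.

```python
from typing import Callable, List, Tuple

# (step name, URL path, text expected in the response body): all hypothetical
STEPS = [
    ("front page", "/", "Mercury Tours"),
    ("log in", "/login", "Welcome"),
    ("search flights", "/search", "Select Flight"),
]


def run_check(fetch: Callable[[str], Tuple[int, str]],
              steps=STEPS) -> List[str]:
    """Run the steps in order; return failure messages (empty means healthy)."""
    failures = []
    for name, path, expected in steps:
        status, body = fetch(path)
        if status != 200:
            failures.append(f"{name}: HTTP {status}")
            break  # later steps depend on this one, so stop here
        if expected not in body:
            failures.append(f"{name}: expected text {expected!r} not found")
            break
    return failures


def fake_fetch(path):
    """Stand-in for a real HTTP client: the search page is broken."""
    pages = {"/": (200, "Welcome to Mercury Tours"),
             "/login": (200, "Welcome back"),
             "/search": (500, "Internal Server Error")}
    return pages[path]


print(run_check(fake_fetch))
```

A check that only loaded the front page would report this site as healthy, which is why the scripted journey is the more meaningful test.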
- Replace the Microsoft Access database with SQLite. It is still file-based, but can be queried through a native JDBC driver instead of the JDBC-ODBC bridge, which removes the requirement of running Mercury Tours on Windows.
- Migrate from Windows to Linux. With the dependency on ODBC gone, there is no longer any need to run Mercury Tours on Windows. Running on Linux allows better monitoring options and makes the deployment more scriptable. A fully-scripted deployment would remove the requirement for manual patching when new operating system updates are released. Deployment scripts could just be run on a new instance launched from a current base image (which is kept up to date by AWS).
- Replace JServ with a modern servlet container (Tomcat? Jetty?) and upgrade to a current-generation JVM. The biggest benefit of doing this would be to enable code-level performance timings with New Relic APM for better system observability.
- Replace the Apache web server with Nginx and set up Nginx monitoring. There is no need for Apache when Nginx can serve static content better and can efficiently pass servlet requests to the app server using the nginx_ajp_module (which doesn’t work on Windows).
- Update source code to remove SQL injection vulnerabilities. Even though there is no data that anyone cares about in the Mercury Tours database, it still seems like poor form to allow SQL injection attacks when they are so obvious.
- The artificial performance bottlenecks are due to servlet code, but it would still be useful to move more of the work to Nginx and implement some Nginx performance optimisations like enabling HTTP/2, tweaking caching headers, enabling compression and setting up a cache for the reverse proxy.
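Two of the items above (moving to SQLite and removing the SQL injection vulnerabilities) can be illustrated together. The sketch below uses Python’s built-in sqlite3 module rather than the Java servlets, and the table, columns and data are invented for the example. In Java the equivalent fix would be to use a PreparedStatement instead of building the query with string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")


def login_unsafe(username, password):
    # Vulnerable: user input is concatenated straight into the SQL text.
    query = ("SELECT COUNT(*) FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchone()[0] > 0


def login_safe(username, password):
    # Parameterised: input is bound as data and never parsed as SQL.
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0


attack = "' OR '1'='1"
print(login_unsafe("alice", attack))   # True: the injected OR clause matches every row
print(login_safe("alice", attack))     # False: treated as a (wrong) password
print(login_safe("alice", "secret"))   # True: the real password still works
```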