Synopsis: A scan of major Web sites reveals a disappointing level of attention
to end user security. We will look at various relatively straightforward
techniques to significantly improve security for users and
dissuade a variety of attack vectors.
Most Web sites are shockingly insecure from the point of view of their users because they are vulnerable to malicious attacks such as clickjacking, framing, hijacking and various cross-site scripting attacks. There are relatively straightforward methods to dissuade them all.
This article focuses on the security extended to your end users. We are not discussing other important security topics such as authentication (including TFA), server hardening, and cross-origin resource sharing. If there is enough interest I'll cover those at a later date.
We will cover the following major topics here, and the large topic
of the Content Security Policy in the next article.
Many of the security techniques explained here, with a few notable exceptions, are implemented in HTTP headers. It is assumed that you know how to add the headers, using for instance the ServletResponse in JSP/Servlet, Lambda for AWS, or modifying the responses in PHP, Node.js and similar platforms.
The general point to understand is that the way you add headers needs to be secure. If an attacker were able to defeat your technique for adding headers, or to modify them in transit, all your work is for nothing.
Let's start with something basic: always use HTTPS. Always. It used to be important to carefully determine which parts of your site need encryption and which do not to save processing power on both client and server for the encryption computations.
With modern hardware this is simply not a concern any more. The only time I use HTTP is occasionally on Webhook callbacks (though I stay away from that) or in development when I’m using a localhost source without a key. There are also some rare embedded device scenarios with very low power hardware. Other than specialised situations such as those, always use HTTPS.
The HTTP Header you need to require HTTPS all the time is Strict-Transport-Security. This informs all supporting clients that you only wish to communicate with HTTPS, not HTTP. You should set it with maximum age (the age I use is 2 years), include your subdomains, and you may have to include your preload intention. The header can look like what is below.
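A typical value matching that description (a two-year max-age of 63072000 seconds, subdomains included, and the preload flag) looks like this:

```http
Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
```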
Preloading is a process of asking for your domain to be included in a list of HTTPS-only sites hard-coded into browsers such as Chrome and Firefox. This ensures that even if an attacker is able to remove this header, browsers will still only communicate with you over HTTPS. For more information, see hstspreload.org.
To augment this for older browsers, you should include an HTTP to HTTPS redirect on your sites. For instance, for J2EE sites, you would include a javax.servlet.Filter that examines the protocol, rewrites the URL from http: to https:, then does a sendRedirect() to force the change to HTTPS.
A note about load balancers: if you are using a load balancer (such as Elastic Load Balancing on AWS), the most common configuration is that external traffic is HTTPS, just as you want, and the hop from the load balancer to your server is HTTP. In that case, if you use the normal interfaces to inspect the protocol, you will see HTTP even though the client connection is really HTTPS. If you do a redirect in this case, you will end up in a never-ending redirect loop.
If you are behind a load balancer, to see the protocol the client used, look at the incoming HTTP header X-Forwarded-Proto.
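As a sketch of the redirect logic just described, here is a Node.js-flavoured version (Node.js being one of the platforms mentioned earlier); the request shape and the helper name httpsRedirectTarget are illustrative, not tied to any particular framework:

```javascript
// Decide whether a request needs an HTTP -> HTTPS redirect.
// Behind a load balancer, X-Forwarded-Proto reflects what the client
// actually used, so it takes precedence over the socket's own protocol.
function httpsRedirectTarget(req) {
  const proto = req.headers['x-forwarded-proto'] || req.protocol;
  if (proto === 'https') {
    return null; // already secure: no redirect, and no redirect loop
  }
  return 'https://' + req.headers['host'] + req.url;
}
```

Returning null whenever X-Forwarded-Proto says https is exactly what avoids the never-ending redirect loop described above.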
Using both Strict-Transport-Security and HTTP to HTTPS redirects ensures always encrypted communication.
One of the most common forms of malicious attack is to frame your site within a malicious site. It looks and acts like your site, but the attacker can overlay things and otherwise intercept what users are doing – and even change what they are doing.
Framing is not used for many legitimate purposes these days, so unless you have a really good reason, you should disallow your site from being framed. There are fundamentally 3 ways of doing this, and I suggest you implement all 3.
Legitimate uses of framing: Sometimes you want your site to be framed.
This generally means you are producing bits of content you want hosted in an iframe on other sites.
Ads provided through the Amazon Affiliate network are a common example of this (the Amazon ad you likely see on this page is in an iframe - so clearly Amazon Ads allows framing).
In that case, you should allow framing for only those portions, and ideally through an entirely separate URL to prevent mistakes.
As an example, if I wanted to start serving ads, instead of www.ajmusgrove.com, I might serve them from ads.ajmusgrove.com.
The first way to prevent framing is the X-Frame-Options HTTP header.
You can set this to DENY, which will disallow all framing on browsers that support this header.
You can set this to SAMEORIGIN, meaning pages may only be framed by other pages from the same origin.
Finally, if you have certain trusted sites, you can use ALLOW-FROM to specify another origin that is allowed to frame your site.
Use ALLOW-FROM carefully and ensure you really do trust the other site. In practice, I almost always use DENY.
An example frame options setting is as below.
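Matching my usual preference for DENY, the header is simply:

```http
X-Frame-Options: DENY
```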
If you are using a Content-Security-Policy (later), you should also set the
frame-ancestors rule. This will be described in the article on Content-Security-Policy.
Not all browsers will support the required HTTP header, and even fewer support Content-Security-Policy.
For those, the workaround is to embed code directly into your page that will break framing attempts.
There are lots of implementations, but the basic idea of a frame killer is your page will refuse to display if it is in a frame and will attempt to break out of the frame.
Below is the frame killing code I use and you can use (normal disclaimers: as-is, no warranty, your own risk, etc).
if (top !== self) top.location = self.location; // break out only if actually framed
I recommend using a frame killer on every page you do not want framed, as a backup to the HTTP headers in case of older devices or in case the headers are compromised.
If you are using Content-Security-Policy, for this to work your style-src must allow 'unsafe-inline', and for script-src you must either allow 'unsafe-inline' (discouraged) or, if you are using 'strict-dynamic' (encouraged), with a static page you must provide the SHA hash, and with a dynamic page either the SHA hash or (my preference) a nonce.
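As a hedged sketch of the nonce approach (the nonce value below is a placeholder; in practice it must be freshly generated for every response and echoed in both places):

```http
Content-Security-Policy: script-src 'strict-dynamic' 'nonce-R4nd0mV4lu3'
```

The matching inline script then carries the same value: <script nonce="R4nd0mV4lu3">...</script>.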
See the next article on Content Security Policy.
If you link to other sites - and it's incredibly likely that you do - when your users click on those links your site can effectively provide all sorts of information to the site your user has been referred to.
User Generated Content:
You may trust every site you link to.
But you also probably have user generated content. This can take many forms, but even the comment section at the bottom of this article is user generated content.
If there is a link embedded in that content, and your user clicks on it, the policy in this section determines what information will be sent to this site that you may know nothing about.
The HTTP Header that controls the information sent to a site users go to from your site is Referrer-Policy.
There are a number of options for this policy.
unsafe-url - Never use this. Not for any reason.
strict-origin-when-cross-origin - This is the policy I prefer in practically every case.
The full URL is sent with every request within your own site; when going to a different site, only the origin (i.e. the domain name) is sent, and only as long as the destination's security is at least as high as yours (which should be HTTPS); if a downgrade to HTTP happens, no referrer information is sent at all. For example, following a link from https://www.example.com/private/page to another HTTPS site sends only https://www.example.com/; following a link to an HTTP site sends nothing.
strict-origin - only send the domain name, even if the new page is on your own site, and only if security is at least as high.
The use cases for this are rare, but they are generally when even parts of your own site might not be trusted, if for example they might run user-generated code.
origin-when-cross-origin and origin - these are the same as their 'strict-' variations above, without the requirement to stay on HTTPS in order to send referrer information.
Given my expressed belief the entire Web should be encrypted these days,
unless you have a specific use case for this, do not use it.
no-referrer - Send no information. The use of this is rare, but if you are running, let's say, a financial services site like customer bank-account access, you might not want third-party sites to even know the last place the user was on your site.
no-referrer-when-downgrade - This is the default. The full URL is sent as the referrer unless a downgrade in security (meaning HTTPS to HTTP) occurred.
This is better than nothing. But not by much.
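Putting the preferred option above into a header:

```http
Referrer-Policy: strict-origin-when-cross-origin
```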
One type of traditional attack was to convince a browser to "sniff" innocuous-looking non-executable content and treat it as executable content.
There is a simple header to prevent this, X-Content-Type-Options, which you should use unless you have a good reason not to and really know what you are doing.
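The header that disables content-type sniffing is X-Content-Type-Options, with its single meaningful value:

```http
X-Content-Type-Options: nosniff
```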
Cross-site scripting (XSS) is a method of attack that injects script into your pages, often by reflecting attacker-supplied input back to the user.
We will not cover this topic in much detail and there honestly is not
that much to cover. The modern and more effective way of preventing XSS attacks is the Content Security Policy (next article).
It is still a good idea to implement the X-XSS-Protection header to account for older browsers and in case you have not yet implemented CSP.
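A common setting enables the browser's filter and tells it to block the page entirely rather than attempt to sanitise it:

```http
X-XSS-Protection: 1; mode=block
```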
It was a fairly safe assumption that either both the server pages and the styles and scripts would be compromised at the same time, or neither would be. Because of this reasonable assumption, separate security was unnecessary.
Now it is far more common for styles and scripts to be provided not by the server itself, but by another federated element, or more likely by a Content Delivery Network (such as CloudFront, which I use, or CloudFlare), or a third party. More on third parties later.
Your Own Scripts and Stylesheets
Even if your server is secure, if your CDN or other hosting site is compromised, the scripts and stylesheets provided may be different than what you expect and could be malicious in an infinite number of ways.
This could also happen, and be far more difficult to detect, if the caches of the CDN are compromised or if a man-in-the-middle attack is successful in the transit from the CDN.
How do you know the script you tested and deployed is the one that the browser actually loads?
The answer is subresource integrity (SRI).
You generate a cryptographically secure hash of the script and send it along with the script load instruction in the HTML.
When the browser loads the script or style from the CDN, it will attempt to compute the same hash, compare the results, and refuse to load the script or style if they do not match.
First, let's look at an example, then explain it.
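Here is the shape of such a load (the URL and hash below are placeholders, not a real script and digest):

```html
<script src="https://cdn.example.com/js/app.js"
        integrity="sha384-BASE64HASHOFTHESCRIPTGOESHERE"
        crossorigin="anonymous"></script>
```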
What is going on here?
The first part of the load you recognize: it is the standard script source.
The next integrity attribute might be new to you.
That is the cryptographic hash. The first part is the algorithm used, in this case sha384, then a dash, then the Base64 encoding of the raw binary hash. To produce the hash above, the following command is what I use.
shasum -b -a 384 filename.js | awk '{ print $1 }' | xxd -r -p | base64
SHA is the only cryptographic hash family supported by the specification.
The standard requires support for 256, 384 and 512 bit sizes.
As with all cryptographic hashes, the tradeoff is simple: smaller hashes are faster but less secure, where larger hashes are slower but more secure.
As you can see I prefer 384 bit SHA hashes. At the current state of computing power I feel it is the appropriate tradeoff.
It will not be long before I start preferring 512 – Moore's Law and all that.
You may notice the crossorigin attribute. Setting it to "anonymous" makes the browser fetch the script in CORS mode without sending credentials to the CDN, which is also what allows the integrity check to work cross-origin. Unless your CDN really needs to know a lot, I suggest always including it.
In your Content Security Policy, you can require Subresource Integrity always for styles, scripts or both.
This is done with the require-sri-for directive. In practice, it is unlikely this will be of use to you because of use of third parties (next section).
External Content Sources
In the incredibly unlikely event you have complete control over all styles and scripts loaded on your site you can skip this section. In practice, everyone needs to understand the limitations of SRI as they apply to third-party sources.
Although SRI is a great capability, since it is relatively new, it is still not widely supported.
A very few third-party sources support SRI, and if they do you have no reason not to use it, even if you are not using it with your own content.
One that comes to mind is the fabulous and widely used
The link they provide has the SRI built in.
Most sources you will bring in do not support SRI.
For a few, you can download and locally host to do SRI yourself.
Most third-party libraries, especially the plugin variety, simply do not support SRI.
There is little you can do about it and will just have to live with it.
These include Facebook (if you use the Facebook login and/or Share functions), Twitter (tweeting from your page or including feeds), almost anything from Google (including Analytics, AdSense, and Tags), and GoSquared.
Automatic Upgrades: An argument against SRI is the old argument about automatic upgrades.
The idea is you source in a third-party providing a capability,
and as they upgrade it you get automatic benefit of the upgrades without having to do anything.
That does not work with SRI because you are requiring a very specific version of the script or style because that is the version you have computed the hash on.
There is debate on this, but I'm not a big fan of the 'automatic
upgrade' concept because of my many years in the corporate IT world - I don't want code I've not tested with suddenly appearing in my system.
Over time, I'd expect more and more sources to use SRI, and I'd advise you to take advantage of it.
I like polyfills and use polyfill.io myself, but not only do they not work with SRI, it is logically impossible that they ever could. You cannot pre-compute a hash of dynamically created code.
Summary on SRI
SRI is a relatively new but very effective security capability.
It prevents surreptitious changes of code either sourced from you or supporting third parties.
It may seem complex to implement, but if you use automations in your build process and page return process, as I do,
then it becomes something you do not have to think about and provides a very powerful and almost impossible to circumvent security mechanism for the type of attack it addresses.
Public Key Pinning is a new and somewhat complex capability to understand, but it is useful for preventing certain types of attacks, and if you do not implement it you will be exposed to a certain type of attack.
First, let's set the background. In order to secure the connection between the end-user and your server, the data is encrypted.
This is accomplished, at a basic level, by you handing a certificate to the end-user that certifies who you are and gives the starting point for establishing the encrypted channel. So I might hand a certificate (public key) to a browser that claims that I am www.ajmusgrove.com.
But how does your browser trust that I am who I claim to be?
I could be a malicious site merely masquerading as someone I am not, so the first time I hand you my certificate - unless we are exchanging in person, which does happen in some highly secure environments - the certificate I pass to you is signed by someone you already trust. Who do you already trust? Dozens of "Root Certificates" that come shipped in your web browser.
The important thing to remember is security depends not only on the security of your encryption certificates,
but the security of all of those dozens of root certificates distributed with
every browser. If any one of those is compromised, a malicious actor could convince the entire world they are you by signing a new, different certificate that contains your public identity.
How can you prevent that scenario?
In the past the answer was certificate revocation lists, where if a certificate becomes compromised a central database was updated
and every piece of software subscribing to that database knew that a certain certificate (such as a signing Root CA) was no longer trusted.
It is not an altogether terrible solution, but among the many other problems, you have to know a certificate has been compromised.
And what if someone tricked a legitimate signing authority into signing a certificate certifying that I am you? At that point the Root CA was not compromised, but the effect on you is still the same.
You can tell your users that, for your site, only trust a particular certificate (normally a signing certificate) for a particular amount of time.
For instance, if you are reading this on www.ajmusgrove.com right now,
then your connection is secured with HTTPS with a certificate issued to me by Amazon,
meaning it is signed by their root CA. I've told your browser that,
for the next 2 years, if you get any certificate not signed by Amazon Root CA1 (or CA2 as the backup in case CA1 gets compromised), then that is not me.
The HTTP Header for this is Public-Key-Pins. The one I sent you is below.
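A representative header (the pin values below are placeholders for the Base64-encoded SHA-256 hashes of the pinned public keys, not my real pins):

```http
Public-Key-Pins: pin-sha256="PRIMARYPINBASE64HASHGOESHERE="; pin-sha256="BACKUPPINBASE64HASHGOESHERE="; max-age=63072000; includeSubDomains
```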
You can make the max age longer or shorter depending on your needs.
Notice the 'includeSubDomains'.
That ensures if someone wants to impersonate me after hacking into my DNS by, let's say, registering www2.ajmusgrove.com, they won't be able to unless they do it using an Amazon signed certificate.
It should be obvious to you that anyone can still impersonate me by getting a certificate signed by Amazon claiming they are me.
That is not as easy as it seems, at least with Amazon.
Not only will some level of identity information have to be provided
(their credit card details, etc) they will have to prove to Amazon they own the domain.
To do that, they'd have to put in the special DNS entries required to prove ownership,
which means they'd have to hack my DNS (which I keep specially secured partially for this reason).
And if they do all that, and start to attempt to prove to Amazon that they are the rightful owners of the ajmusgrove.com domain through my compromised DNS,
since I have already proven who I am then their action would surely be flagged as suspicious by Amazon
and kicked for manual verification.
And if all that happened successfully, once I discover it and report it to Amazon,
whatever certificates the hacker acquired would be revoked
(those Certificate Revocation Lists) and removed from Amazon's systems
(most likely their ELB or CloudFront platforms), but all of my pinned certificates are still valid because the Amazon root CA itself was always secure.
Held up for Ransom
HPKP is great, but you might not want to implement it for whatever reason - perhaps complexity, or perhaps you don't want to be bound to a particular Root CA for an extended period of time.
Sadly, you do not have much of a reasonable choice. I've observed that many sites - and I've scanned a number of major ones - are not using HPKP. They are putting themselves at major risk.
Right now, let's assume your private key used for encryption is compromised. Your reaction would be to generate a new one and stop using the old one, right?
Well, if your system is compromised and the hacker is able to both get your private key and insert a HPKP header using your private key rather than a Root CA and then get out undetected, you've got a problem.
They sit back and wait a few months as your key becomes pinned on a multiyear basis to all of your users.
Then they call you up, and demand an anonymous Bitcoin ransom or they will sell your private key, meaning your traffic is now all compromised and subject to attacks like Man-In-The-Middle.
You do not have the option of revoking and generating a new key, because that key is now pinned in your users devices and that could lock them out for years.
If you want to keep flexibility while defending against this attack, at the very least pin your Root CA using HPKP with a short duration of, say, 30 days, and manually check the HPKP header every couple of weeks to ensure it has not been changed (which would indicate someone is attempting this attack against you).
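A 30-day version of the header (again with placeholder pin values) looks like this; note the max-age of 2592000 seconds:

```http
Public-Key-Pins: pin-sha256="ROOTCAPINBASE64GOESHERE="; pin-sha256="BACKUPPINBASE64GOESHERE="; max-age=2592000; includeSubDomains
```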
HTTP Public Key Pinning can ward off a variety of attacks, but a major weakness is that if you do not implement it, you are vulnerable to a particular type of ransom attack.
You should at least implement a short term HPKP header pinned to your Root CA.
Content Security Policy (CSP)
The Content Security Policy is the detailed security policy describing treatment of various types of content. It is a very rich, useful and detailed policy that should always be implemented.
It is also a big topic and worthy of an article of its own! The next article you see from me will be on the CSP.
It is disappointing how poor end-user security is across the Web, and that is subjecting users to a variety of attacks. Some of the concepts are difficult, but I hope this article has demystified some of it to the point you can secure your own sites. As always, if you have any questions you can always get in touch.
I am reachable on firstname.lastname@example.org.
All views expressed in this article are my own and do not represent the
opinions of any other entity whatsoever with which I have been, am now
or will ever be affiliated. No assurance of accuracy is given and
any use of any information provided is entirely at your own risk. The
author assumes no responsibility or liability for any errors or
omissions in the content of this article. No information provided is intended
to be a source of investment advice or credit analysis with respect to
any material presented or otherwise. Nothing contained in this article
is intended to defame or harm any person, business or other entity.
The author retains sole and exclusive
ownership of all material herein.