
Using AI to fight hand-crafted Business Email Compromise

Younghoo Lee is a Senior Data Scientist at Sophos. Together with Joshua Saxe, Sophos Chief Scientist, he recently presented these findings at DEFCON 28 AI Village.

Business Email Compromise (BEC) is a form of targeted phishing in which attackers disguise themselves as senior executives to dupe employees into doing something they absolutely shouldn’t, like wiring money.

It started out as an evolution of fraudulent international money transfer scams, and the messages were often riddled with poor punctuation and grammar, misspelt names and other telltale mistakes that made them relatively easy to identify. Yet they still made money.

When more sophisticated cybercriminals realized this, the quality of the emails quickly ramped up.

The most convincing BEC scams are based on detailed research about a business, its key senior executives and the employees responsible for financial activity. This enables the attackers to hand-craft messages that look authentic and convincing.

The challenge of detection

BEC emails can be difficult to detect using security solutions because no malware is involved. There are no embedded links to questionable URLs or booby-trapped attachments. They often abuse the company’s own email addresses.

Detecting hand-crafted BEC emails is even harder because each one is unique, and skilled attackers can be very good at mimicking the style and tone of their targets.

We set out to see if we could catch the attackers out by using natural language models to spot this sort of fraud.

The complexity of language

Detecting phishing and BEC is hard because language is hard.

Language is made up of complex components. Our brains can process these naturally because they have vast reserves of context and experience to draw on and awesome computational power.

These components include:

  • Syntax – what does the structure and punctuation of a phrase tell us about the intended meaning?
  • Semantics – when used in a particular context, what do the words mean and what is the relationship between them?
  • Sentiment – the tone; what is being conveyed in terms of emotion, is it anger, irony, sarcasm?

Creating a computational model that can process all of this quickly and accurately enough to differentiate between a benign and a convincing but fraudulent email is a significant challenge for cybersecurity.

Introducing CATBERT

We started by adapting a deep learning model developed by Google and used in natural language processing (NLP).

NLP is about building computer programs that can process and analyze natural language data. Long term applications of this technology could include enhanced social robotics and human-computer interaction; real-time translations; and being able to ask computers complex open-ended questions.

The model we used is called BERT (short for Bidirectional Encoder Representations from Transformers). BERT has shown impressive results in NLP tasks like story comprehension and identifying emotions. However, applying a full-size BERT model to real-time security systems is difficult because it is computationally expensive, and this makes it quite slow.

We decided to compress and fine-tune BERT, along with a lighter version rather aptly named DistilBERT, to create our very own cybersecurity-ready model.

The cybersecurity-ready model needed some important additional capabilities. First, it had to be as light (fast) as possible for real-time predictions and response.

Second, it needed to be able to capture the valuable clues hiding in the email’s context information, such as sender and header details.

We called our model Context-Aware Tiny BERT, or CATBERT.

Transforming language into a mathematical model

BERT is based on something called a Transformer. Introduced in 2017, Transformers are blocks of intelligent computing power that can take natural language and then numerically encode and decode it to classify content and make predictions about meaning.

Transformers can be extensively pre-trained to understand word context and relationships and work with partial words (to allow for misspellings).
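As a toy illustration of the partial-word idea, here is a simplified, WordPiece-style greedy tokenizer. The tiny vocabulary and example words are invented for this sketch; they are not CATBERT’s actual vocabulary:

```python
# Toy greedy longest-match subword tokenizer, in the spirit of the
# WordPiece-style splitting used by BERT-family models.
# The vocabulary below is invented purely for illustration.
VOCAB = {"trans", "##fer", "wire", "pay", "##ment", "urgent"}

def subword_tokenize(word, vocab=VOCAB):
    """Split a word into the longest known pieces, left to right.
    Continuation pieces carry a '##' prefix, as in WordPiece."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no known piece at all: unknown token
        tokens.append(piece)
        start = end
    return tokens

print(subword_tokenize("payment"))   # ['pay', '##ment']
print(subword_tokenize("transfer"))  # ['trans', '##fer']
```

Because unfamiliar words are broken into familiar fragments, a scammer’s creative spelling still maps onto pieces the model has seen before, rather than defeating it outright.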

To predict both accurately and quickly whether an email was potentially malicious, we architected CATBERT with as few Transformer layers as possible.

We also enabled the model to extract and process contextual information like the email header and sender details as well as the main body copy.
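As a sketch of the kind of contextual signal involved, here is how header fields can be pulled out of a raw message with Python’s standard-library email parser. The sample message and the domain-mismatch check are our own invented illustration, not CATBERT’s actual feature extraction:

```python
# Sketch: extracting contextual fields (sender, reply-to, subject) from a
# raw email alongside the body text, using the stdlib email parser.
# The sample message is invented for illustration.
from email import message_from_string

RAW = """\
From: ceo@example.com
Reply-To: attacker@freemail.example
Subject: Urgent wire transfer

Please process this payment today and keep it confidential.
"""

msg = message_from_string(RAW)
context = {
    "sender": msg["From"],
    "reply_to": msg["Reply-To"],
    "subject": msg["Subject"],
}
body = msg.get_payload()

# One simple contextual red flag: a Reply-To pointing at a different
# domain from the apparent sender.
mismatch = context["reply_to"].split("@")[-1] != context["sender"].split("@")[-1]
print(mismatch)  # True: the reply address doesn't match the sender's domain
```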

Putting CATBERT to the test

We set up a series of tests for CATBERT, as well as for DistilBERT and other models. The tests were run using a dataset of over four million emails and metadata from a threat intelligence feed and the email system at Sophos.

Of these, 70% of samples were used for training and 30% were used for validation. The malicious samples included both BEC and phishing emails.
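A 70/30 split of that sort can be sketched in a few lines; the sample names below are placeholders, not our actual data:

```python
# Sketch: shuffling a dataset and splitting it 70/30 into training and
# validation sets, with a fixed seed for reproducibility.
import random

def train_val_split(samples, train_frac=0.7, seed=42):
    """Shuffle a copy of the dataset and split it at train_frac."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

emails = [f"email_{i}" for i in range(100)]  # placeholder sample names
train, val = train_val_split(emails)
print(len(train), len(val))  # 70 30
```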

Most of the emails were in English, but a quarter were in other languages to reflect what is seen in the real world.

The tests showed that the CATBERT model can detect malicious phishing and BEC emails with a high degree of accuracy while also being 30% smaller and twice as fast as the lightest existing model, DistilBERT (which is itself 40% lighter than the original BERT).

The ‘Context-Aware’ architecture that takes the content features from the email text and the contextual elements from header fields further improves the model’s detection performance.

Last but not least, CATBERT was also able to pick up on subtleties in terms of email topic, tone and style, which allowed it to accurately detect new, targeted phishing attacks.

Overall, CATBERT achieved an 82% and 90% detection rate on phishing and BEC attacks respectively, with a 0.1% false positive rate.
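For readers who want the metrics spelled out, here is a small sketch of how detection rate and false positive rate are computed from labels and predictions. The toy counts are chosen to mirror the figures above; they are not taken from our actual test set:

```python
# Sketch: detection rate (recall on malicious mail) and false positive
# rate (benign mail wrongly flagged), computed from labels and predictions.
def rates(labels, preds):
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg  # (detection rate, false positive rate)

labels = [1] * 10 + [0] * 1000             # 10 malicious, 1000 benign
preds = [1] * 9 + [0] + [1] + [0] * 999    # catch 9 of 10, one false alarm
det, fpr = rates(labels, preds)
print(f"detection {det:.0%}, FPR {fpr:.1%}")  # detection 90%, FPR 0.1%
```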

Two emails successfully detected as malicious by CATBERT:

What to do?

As this research shows, advanced machine learning technology can help in the battle against BEC.

Nevertheless, BEC is more about social engineering and human manipulation than it is about technology and hacking.

So when it comes to protecting your organization against BEC, the most important thing you can do is to educate and continue to educate employees on what BEC is all about, the red flags to look out for and what to do if they receive a potentially suspicious message.

It’s vital to create a culture where employees feel able to question and verify a request from a colleague, however senior. These attacks deliberately exploit the fact that many employees will be too afraid to do so.

Naked Security has published a number of articles with information and advice on dealing with BEC scams, including:

You might also like to watch the recent Would you spot it video:

[embedded content]

Further information

If you’d like to learn more about our model, watch our DEFCON presentation. Further technical information can also be found in our Sophos AI blog.

Sophos Artificial Intelligence was formed in 2017 to produce breakthrough technologies in data science and machine learning for information security.

We’re currently focused on machine learning, large scale scientific computing architecture, human-AI interaction, and information visualization.

Further information on current projects, team, conference talks and publications can be found at ai.sophos.com

US liquor giant hit by ransomware – what the rest of us can do to help

US hard liquor giant Brown-Forman is the latest high-profile victim of ransomware criminals.

Even if the company’s name doesn’t ring a bell, some of its products are well-known to spirits drinkers world-wide: Brown-Forman is a multi-billion dollar business that owns Jack Daniel’s whiskey, Finlandia vodka and other global brands.

The company is headquartered in Louisville, Kentucky – a US state that’s famous for American whiskey, better known as bourbon – and you can see why today’s big-money ransomware crooks might go after a business of that size and sort.

According to business media site Bloomberg, which claims to have received an anonymous tip-off from the crooks behind the attacks, the ransomware crooks involved are the infamous REvil or Sodinokibi gang.

The REvil crew are part of what you might call a “new wave” of ransomware operators who practise three-stage attacks that end in double-barrelled blackmail:

  • First, they break into a victim’s network and scope it out. During this reconnaissance the crooks will typically work their way up to sysadmin level access, map out all the clients and servers on the network, search out where online backups are kept, locate or introduce powerful system administration tools they can use later to assist in the attack, and reconfigure (or turn off) system security settings to give them the broadest reach possible. Sometimes, they’ll even launch mini-attacks with trial samples of malware as a way to probe your defences and to find which attack techniques are most likely to succeed.
  • Second, they exfiltrate – which is a fancy word for steal – as much corporate data as they can get their hands on. In the Brown-Forman attack, in which the attackers claimed to have purloined 1 terabyte of data as part of the attack, Bloomberg says that it received links to a website where the crooks revealed “proof” of the data breach by listing sample files going back more than 10 years.
  • Third, they encrypt as many files on the network as possible, using a scrambling algorithm for which they alone have the key. The crooks typically copy the malware program across the network first, so that when they kick off the encryption process, it runs in parallel on all your devices, thus bringing maximum disruption in minimum time.

    How these stages evolved

    As you probably know, the first two stages above are fairly recent developments in ransomware criminality.

    When ransomware crooks started out – back in 2013 when the infamous CryptoLocker gang were the kings of the ransomware scene – it was all about stage 3: scrambling files and then using the decryption key as a blackmail tool: “Send us $300 or your files are gone forever”.

    The crooks generally didn’t target networks back then; instead, they went after millions of victims in parallel, with each infected computer ransomed independently.

    The criminals “targeted” everyone – from home users who probably didn’t have backups of any sort and might be willing to spend $300 to get their wedding photos or the videos of their children back – to big companies where 100 users might fall for the latest ransomware spam campaign and the business would need to spend 100 × $300 to get the unique decryption key for each now-useless computer.

    Stage 1 arrived on the ransomware scene when criminals realised that by going after entire networks one-at-a-time, they could cut their “losses” early in the case of a network that they didn’t have much success with, and focus on networks where they could cause disruption that was both sudden and total.

    Instead of pursuing thousands of individual computer users for hundreds of dollars each, the crooks could blackmail a single company at a time for tens of thousands of dollars a time.

    Indeed, the early adopters of the “all-at-once” ransomware approach often took the cynical approach of offering two prices: a per-PC decryption fee, and an “all you can eat” buffet price for a master key that would unscramble as many computers as you wanted – almost as if the crooks were doing you a favour.

    The crooks behind the SamSam malware – four Iranians have been identified and formally charged by the US, but are unlikely ever to stand trial – even offered a staged payment “service” whereby you could pay half the ransom to receive half of the decryption keys (chosen randomly by the criminals).

    If you were lucky, you might just end up with enough computers running again to save your business for just 50% of the usual price…

    …but if not, you could pay the rest of the ransom, presumably now with considerable confidence that the crooks would deliver the decryption tools as promised.

    You could even take a chance on paying the per-PC fee for your most critical computers – typically $8000 a time – to tide you over, and “top up” later, once you were “confident” in the criminals, to the master-key price, which was typically set by the SamSam crooks just below $50,000.

    Whether they chose just under $50,000 as a guess, or because they found it represented a common accounting department limit in the US below which it was much easier for the IT manager to get the payment approved, we never found out.

    As you can imagine, the exposure of the alleged perpetrators by US law enforcement pretty much drove the SamSam crooks out of business, albeit not before they had extorted millions of dollars from victims around the world, but ultimately didn’t make much of a dent in ransomware attacks in general.

    Price inflation

    Sadly, the SamSam gang’s fee of $50,000 a network turns out to be small by current standards.

    A recent ransomware attack that took US GPS and fitness tracker giant Garmin offline for several days was apparently “resolved” when the company coughed up a multi-million dollar payment, supposedly negotiated downwards from $10,000,000.

    That incident attracted controversy because the ransomware involved was alleged to have been the work of a Russian cybercrime outfit known as Evil Corp, and transactions with that group are prohibited by US sanctions imposed in December 2019.

    And US travel company CWT is said to have coughed up $4,500,000 recently – again, down from an opening demand of an alleged $10 million for unscrambling what the crooks claimed were 30,000 ransomed computers.

    If true, $10,000,000 for 30,000 devices comes out at $333 each, a fascinating full-circle back to the $300 price point of the 2013 CryptoLocker ransomware, which was itself an intriguing echo of the first ever ransomware attack, way back in 1989, where the criminal behind the malware demanded $378. (With no prepaid credit cards, online gift cards or cryptocurrencies to use as a vehicle for pseudoanonymous payments, this early attempt at ransomware, known as the AIDS Information Trojan, was a financial failure. Indeed, it wasn’t until the early 2010s that cyberextortion based on locking up computers or files worked out at all for the cyberunderworld.)

    The biggest tactical change

    But the biggest tactical change in ransomware is stage 2 above.

    By perpetrating data breaches up front, before unleashing the file scrambling component – in Brown-Forman’s case, the breach allegedly includes 1 terabyte; in CWT’s attack, the criminals claimed that 2 terabytes were thieved up front – the crooks now have a double-barrelled weapon of criminal demand.

    You’re not only being extorted to pay for the crooks to do something, namely to send you a set of decryption keys, but also being blackmailed into bribing them not to do something, namely not to go public with your data.

    Early ransomware had more in common with kidnapping, though with jobs at stake rather than the victim’s life: the theory was that if you paid up and the crooks released a working decryption tool, you not only got your data back but also quite clearly ended the power that the criminals had over you.

    For the crooks to ransom your data again (sadly, this happens), they’d need to break into your network again and essentially start from scratch, assuming that you worked out how they got in before and closed the holes they used last time.

    But today’s ransomware is turning into old-school, out-and-out blackmail: the crooks promise to delete the data they already stole, and thereby to “prevent” your ransomware incident turning into a publicly visible data breach, but you have no way of knowing whether they will keep their promise.

    Even worse, you have no way of knowing whether the crooks can keep their promise, even if they intend to.

    For all you know, the data they took illegally could already have been stolen from them – remember that many of the cybercrime busts written about on Naked Security, including ransomware arrests, happened because of cybersecurity blunders made by the perpetrators that allowed their evil secrets to be probed, uncovered and ultimately proved in a court of law.

    Or the criminals themselves may have been victims of “insider crime”, where one of their own decided to go rogue – after all, we’ve also written about crooks getting busted not through operational blunders but through a falling-out among thieves, where one of the gang has ratted out the others or otherwise co-operated with the authorities to save themselves.

    What does this new-look ransomware mean?

    Technically, or at least from a regulatory point of view, all ransomware attacks are data breaches, even if all they do is scramble your files in place.

    After all, if an outsider is able to modify files they weren’t supposed to access at all, that clearly amounts both to unauthorised access (a crime in most jurisdictions) and to unauthorised modification (a yet more serious crime) – and even though this makes you a victim of crime, it also means you’ve failed in at least some way at protecting information you were supposed to protect.

    And ransomware crooks who steal your data before scrambling it are really in the pound seats when it comes to blackmail.

    Even if you prevent the final stage of the attack and the file scrambling failed, or if you have reliable and comprehensive offline backups that allow you to repair and reimage all your computers without relying on the crooks for decryption keys, the crooks are going to squeeze you anyway, by threatening to make a bad thing (a provable data breach) much worse: a data breach that can actively be used against you, by other crooks, by unscrupulous competitors, by activists, by regulators, by anyone who is determined to make you look bad for any reason they choose.

    The good news, in the case of the Brown-Forman attack, is that current reports suggest two important things:

    1. Brown-Forman prevented the file scrambling part (stage 3) of the attack. That’s great news, because it means that the company is unlikely to go offline like Garmin had to, which reduces the impact on the people that do business with the company, including suppliers, creditors, partners, distributors, retailers, and more.
    2. Brown-Forman has supposedly told the criminals to stick their blackmail demands where the sun doesn’t shine. In other words, they’re not planning to pay up and thereby to encourage – indeed, to help to fund – the next attack.

    All we can say to that is, “Well done, and thanks for standing firm.”

    Grubman Shire Meiselas & Sacks, a law firm that represents numerous high-profile celebrities, recently faced a demand similar to Brown-Forman’s, where the ransomware criminals menaced company founder Allen Grubman in broken English with threats to auction off celebrity data in the cyberunderworld:

    We have so many value files, and the lucky ones who buy these data will be satisfied for a very long time. Show business is not concerts and love of fans only — also it is big money and social manipulation, mud lurking behind the scenes and sexual scandals, drugs and treachery. […] Mr. Grubman, you have a chance to stop that, and you know what to do.

    The company famously likened the blackmailers to terrorists and refused to pay up. (The threatened auctions haven’t yet happened – though no one knows whether that’s because the crooks felt they couldn’t trust their own or because the data stolen simply wasn’t up to what the crooks claimed.)

    To reward companies that are willing to say, “We won’t pay,” and who help to break the feedback that keeps the ransomware cycle turning, we suggest that you repay them by making sure that if their data does get dumped by crooks…

    …that you simply do not look.

    No matter how useful it might seem; no matter which items you feel are now both “in the public domain” and in the public interest; no matter how much you might argue that companies like Brown-Forman were themselves remiss in the first place for not protecting data that they ought to have, don’t look.

    We urge you, “Just say no.”

    Brown-Forman’s breach is now a matter of public record and we assume it will be carefully investigated by law enforcement and the relevant regulators, so let’s leave them to it.

    As Sophos Cybersecurity Educator Sally Adam put it:

    “There is no ‘end justifies the means’ discussion to be had here because this is nothing like the cases of whistleblowers like Edward Snowden or Chelsea Manning, where – no matter what you think of their ultimate actions – an insider identified something they perceived to be wrong. This is purely about extortion.”

    What to do?

    Clearly, prevention is way better than cure.

    It’s important to have protection in place to stop stage 3 above (after all, not all ransomware attacks follow this three-stage process, and one-off scrambling attacks are still an ever-present risk).

    We’ve got plenty of advice on how to do just that, including our popular report:

    But the earlier you block or spot the crooks, the better for everyone, including yourself.

    So we recommend you review the following handy resources too, to keep ransomware crooks out right from the very start:

Monday review – catch up on our latest articles and videos

Read the latest articles:

Watch the latest Naked Security Live videos:

[embedded content]

(Watch directly on YouTube if the video won’t play here.)


[embedded content]

(Watch directly on YouTube if the video won’t play here.)

Subscribe to our newsletter:

For a regular reminder of the articles we write on the day we write them, why not sign up for our newsletter to make sure you don’t miss anything?

You can easily unsubscribe if you decide you no longer want it.


Tor and anonymous browsing – just how safe is it?

An article published on the open-to-allcomers blogging site Medium earlier this week has made for some scary headlines.

Written as an independent research piece by an author going only by nusenu, the story is headlined:

How Malicious Tor Relays are Exploiting Users in 2020 (Part I)

[More than] 23% of the Tor network’s exit capacity has been attacking Tor users

Loosely speaking, that strapline implies that if you visit a website using Tor, typically in the hope of remaining anonymous and keeping away from unwanted surveillance, censorship or even just plain old web tracking for marketing purposes…

…then one in four of those visits (perhaps more!) will be subject to the purposeful scrutiny of cybercriminals.

That sounds more than just worrying – it makes it sound as though using Tor could be making you even less secure than you already are, and therefore that going back to a regular browser for everything might be an important step.

So let’s look quickly at how Tor works, how crooks (and countries with strict rules about censorship and surveillance) might abuse it, and just how scary the abovementioned headline really is.

The Tor network (Tor is short for the onion router, for reasons that will be obvious in a moment if you imagine an onion coming apart as you peel it), which was originally designed by the US Navy, aims:

  1. To disguise your true location on the network while you browse, so servers don’t know where you are.
  2. To make it difficult for anyone to “join the dots” by tracing your web browsing requests back to your computer.

At this point, you might be thinking, “But that’s exactly what a VPN does, and not just for my browsing but for everything I do online.”

But it’s not.

A VPN (virtual private network) encrypts all your network traffic and relays it in scrambled form to a VPN server run by your VPN provider, where it’s unscrambled and “injected” onto the internet as if it originated from that VPN server.

Any network replies are therefore received by your VPN provider on your behalf, and delivered back to you in encrypted form.

The encrypted connection between your computer and the VPN server is dubbed a VPN tunnel, and is, in theory, invisible to, or at least unsnoopable by, other people online.

So, as you can see, a VPN deals with the first issue listed above: disguising your true location on the network.

But a VPN doesn’t deal with the second issue, namely making it difficult for anyone to “join the dots”.

Sure, a VPN makes it difficult for most people to join the dots, but it doesn’t prevent everyone from doing so, for the simple reason that the VPN provider always knows where your requests come from, where they’re going, and what data you ultimately send and receive.

Your VPN provider therefore essentially becomes your new ISP, with the same degree of visibility into your online life that a regular ISP has.

Why not use two VPNs?

At this point, you’re probably thinking, “Why not use two VPNs in sequence? In jargon terms, why not build a tunnel-inside-a-tunnel?”

You’d encrypt your network traffic for VPN2 to decrypt, then encrypt it again for VPN1 to decrypt, and send it off to VPN1.

So VPN1 would know where your traffic came from and VPN2 would know where it was going, but unless the two providers colluded they’d each know only half the story.

In theory, you’d have fulfilled both of the aims above by a sort of divide-and-conquer approach, because anyone who wanted to track you back would first need to get decrypted traffic logs from VPN2, and then to get username details from VPN1, before they could start to “join the dots”.

Why stop at two?

As you can imagine, even using two VPNs, you’re not totally home and dry.

Firstly, by using the same two VPNs every time, there is an obvious pattern to your connections, and therefore a consistency in the trail that an investigator (or a crook) could follow to try to trace you back.

For all that your traffic follows a complicated route, it nevertheless takes the same route every time, so it might be worth the time and effort for a criminal (or a cop) to work backwards through both layers of VPN, even if that means double the amount of hacking or twice as many warrants.

Secondly, there’s always a possibility that the two VPN providers you choose might ultimately be owned or operated by the same company.

In other words, the technical, physical and legal separation between the two VPNs might not be as significant as you might expect – to the point that they might not even need to collude at all to track you back.

So why not use three VPNs, with one in middle that knows neither who you are nor where you’re ultimately going?

And why not chop and change those VPNs on a regular basis, to add yet more mix-and-mystery into the equation?

Well, very greatly simplified, that’s pretty much how Tor works.

A pool of computers, offered up by volunteers around the world, act as anonymising relays to provide what is essentially a randomised, multi-tunnel “mix-and-mystery” VPN for people who browse via the Tor network.

For most of the past year the total number of relays available to the Tor network has wavered between about 6000 and 7000, with every Tor circuit that’s set up using three relays, largely at random, to form a sort-of three-tunnel VPN.

Your computer chooses which relays to use, not the network itself, so there is indeed a lot of ever-changing mix-and-mystery involved in bouncing your traffic through the Tor network and back.

Your computer fetches the public encryption keys for each of the relays in the circuit that it’s setting up, and then scrambles the data you’re sending using three onion-like layers of encryption, so that at each hop in the circuit, the current relay can only strip off the outermost layer of encryption before handing over the data to the next.

Relay 1 knows who you are, but not where you are going or what you want to say.

Relay 3 knows where you are going but not who you are.

Relay 2 keeps the other two relays apart without knowing either who you are or where you are going, making it much harder for relays 1 and 3 to collude even if they are minded to do so.
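The layer-peeling idea can be sketched like this. Note that the base64 “wrapping” below merely stands in for encryption so the sketch stays self-contained; real Tor uses proper public-key and symmetric cryptography at each hop:

```python
# Toy illustration of onion layering: the client wraps its request in
# three layers; each relay peels off exactly one before passing it on.
# base64 here is a stand-in for real encryption, for illustration only.
import base64

def wrap(message, layers=3):
    """Client side: apply one 'layer' per relay in the circuit."""
    data = message.encode()
    for _ in range(layers):
        data = base64.b64encode(data)
    return data

def peel(data):
    """What a single relay does: strip the outermost layer only."""
    return base64.b64decode(data)

packet = wrap("GET /index.html")
for relay in ("guard", "middle", "exit"):
    packet = peel(packet)  # each relay sees one layer, never the rest

print(packet.decode())  # only after the exit's peel is the request readable
```

The middle relay in this sketch handles only still-wrapped data, which is exactly why it learns neither who you are nor where you are going.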

It’s not quite that random

In the chart above, you’ll notice that the green line in the middle denotes special Tor relays known as guards, or entry guards in full, which are the subset of working relays deemed suitable for the first hop in a 3-relay circuit.

(For technical reasons, Tor actually uses the same entry guard for all your connections for about two months at a time, which reduces the randomness in your Tor circuits somewhat, but we shall ignore that detail here.)

Similarly, the orange line at the bottom denotes exits, or exit nodes in full, which are relays that are deemed reliable enough to be selected for the last hop in a circuit.

Note that there are only about 1000 exit nodes active at any time, from the 6000 to 7000 relays available overall.

You can probably see where this is going.

Although Tor’s exit nodes can’t tell where you are, thanks to the anonymising effects of the entry guard and middle relay (which changes frequently), they do get to see your final, decrypted traffic and its ultimate destination, because it’s the exit node that strips off Tor’s final layer of mix-and-mystery encryption.

(When you browse to regular websites via Tor, the network has no choice but to emit your raw, original, decrypted data for its final hop on the internet, or else the site you were visiting wouldn’t be able to make any sense of it.)

In other words, if you use Tor to browse to a non-HTTPS (unencrypted) web page, then the Tor exit node that handles your traffic can not only snoop on and modify your outgoing web requests but also mess with any replies that come back.

And with just 1000 exit nodes available on average, a crook who wants to acquire control of a sizeable percentage of exits doesn’t need to set up thousands or tens of thousands of servers – a few hundred will do.

And this sort of intervention is what nusenu claims to have detected in the Tor network, on a scale that may sometimes have involved up to a quarter of the available exit nodes.

More specifically, nusenu claims that, at times during 2020, hundreds of Tor relays in the “exit node” list were set up by criminally minded volunteers with ulterior motives:

The full extent of their operations is unknown, but one motivation appears to be plain and simple: profit. They perform person-in-the-middle attacks on Tor users by manipulating traffic as it flows through their exit relays. […] It appears that they are primarily after cryptocurrency related websites — namely multiple bitcoin mixer services. They replaced bitcoin addresses in HTTP traffic to redirect transactions to their wallets instead of the user provided bitcoin address. Bitcoin address rewriting attacks are not new, but the scale of their operations is. It is not possible to determine if they engage in other types of attacks.

Simply put, nusenu alleges that these crooks are waiting to prey upon cryptocurrency users who think that Tor on its own is enough to secure both their anonymity and their data, and who therefore browse via Tor but don’t take care to put https:// at the start of new URLs that they type in.
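From the detection side, address-rewriting of this sort can be illustrated by scanning traffic for Bitcoin-address-like strings. The regular expression below is a common heuristic for legacy Base58 addresses (it is not how the attackers or the Tor Project actually work), and the page text is invented:

```python
# Sketch: spotting Bitcoin-address-like strings in an HTTP response body,
# the kind of content a malicious exit node might rewrite in transit.
# The regex matches legacy Base58 addresses: '1' or '3', then 25-34
# characters drawn from the Base58 alphabet (no 0, O, I or l).
import re

BTC_ADDR = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")

body = "Send payment to 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2 within 24 hours."
matches = BTC_ADDR.findall(body)
print(matches)  # ['1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2']
```

A defender comparing the addresses served over a suspect circuit with those served over a trusted connection could use exactly this sort of scan to spot a swap.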

HTTP considered harmful

For better or worse, a lot of the time you can ignore the https:// when you type URLs into your browser, and you’ll still end up on an HTTPS site, encrypted and padlock-protected.

Often, the server at the other end will react to an HTTP request with a reply that says, “From now on, please don’t use plain old HTTP any more,” and your browser will remember this and automatically upgrade all future connections to that site so they use HTTPS.

Those “never use HTTP again” replies implement what is known as HSTS, short for HTTP Strict Transport Security, and they are supposed to keep you secure from snooping and traffic manipulation even if you never stop to think about it.
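Concretely, the HSTS policy travels in a Strict-Transport-Security response header, defined in RFC 6797. Here is a deliberately simplified sketch of how a browser might read it; real browsers do rather more validation than this:

```python
# Sketch: parsing a Strict-Transport-Security (HSTS) header value into a
# policy, as a browser might. Directive names per RFC 6797; the parsing
# is deliberately simplified for illustration.
def parse_hsts(header_value):
    policy = {"max_age": 0, "include_subdomains": False}
    for directive in header_value.split(";"):
        directive = directive.strip()
        if directive.lower().startswith("max-age="):
            policy["max_age"] = int(directive.split("=", 1)[1])
        elif directive.lower() == "includesubdomains":
            policy["include_subdomains"] = True
    return policy

# A typical header: remember "HTTPS only" for a year, subdomains included.
header = "max-age=31536000; includeSubDomains"
print(parse_hsts(header))  # {'max_age': 31536000, 'include_subdomains': True}
```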

But there’s a chicken-and-egg problem, namely that if the crooks intercept your very first plain HTTP connection to a website that you really ought to be accessing via HTTPS only, before the “no more HTTP” message gets across, they may be able to:

  • Keep you talking HTTP to their booby-trapped exit node while talking HTTPS onwards to the final destination. This makes the final site think you’re communicating securely, but prevents you from realising that the destination site wants you to talk HTTPS.
  • Rewrite any replies from the final destination to replace any HTTPS links with HTTP. This prevents your browser from upgrading to HTTPS later on in the transaction, thus keeping you stuck with plain old HTTP.
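That second trick, popularised years ago by the sslstrip tool, boils down to rewriting links in transit. A deliberately simplified illustration (real tools also rewrite Location: redirect headers, cookies and more):

```python
def strip_https_links(html: str) -> str:
    """Downgrade every HTTPS link in a page that the victim fetched
    over plain HTTP, so their browser never gets the chance to
    upgrade the connection later in the session."""
    return html.replace("https://", "http://")

page = '<a href="https://example.com/login">Log in</a>'
print(strip_https_links(page))
# <a href="http://example.com/login">Log in</a>
```

As long as the victim stays on HTTP, every page they fetch can be read and modified by the malicious exit node.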

What to do?

Here are some tips:

  • Don’t forget to type https:// at the start of the URL! For historical reasons, browsers still default to HTTP until they know better, so the sooner you visit a site by explicitly typing https:// at the start of the URL, the sooner you protect yourself by making your intentions obvious.
  • If you run a website, always use HSTS to tell your visitors not to use HTTP next time.
  • If you run a website where privacy and security are non-negotiable, consider applying to add your site to the HSTS Preload list. This is a list of websites for which all major browsers will always use HTTPS, whatever the URL says.
  • If your browser supports it, or has a plugin to enforce it, consider turning off HTTP support completely. For example, Firefox now has a non-default configuration feature called dom.security.https_only_mode. Some older sites might not work properly with this setting turned on, but if you are serious about security, give it a try!

      (Sadly, Firefox’s “never use HTTP” option isn’t yet available in the Tor browser, which is still using an Extended Support Release version where this feature hasn’t yet appeared.)


Facial recognition – another setback for law enforcement

So far this year, the use of facial recognition by law enforcement has been successfully challenged by courts and legislatures on both sides of the Atlantic.

In the US, for example, Washington State Senate Bill 6280 appeared in January 2020, and proposed curbing the use of facial recognition in the state, though not entirely.

The bill admitted that:

[S]tate and local government agencies may use facial recognition services in a variety of beneficial ways, such as locating missing or incapacitated persons, identifying victims of crime, and keeping the public safe.

But it also insisted that:

Unconstrained use of facial recognition services by state and local government agencies poses broad social ramifications that should be considered and addressed. Accordingly, legislation is required to establish safeguards that will allow state and local government agencies to use facial recognition services in a manner that benefits society while prohibiting uses that threaten our democratic freedoms and put our civil liberties at risk.

And in June 2020, Boston followed San Francisco to become the second-largest metropolis in the US – indeed, in the world – to prohibit the use of facial recognition.

Even Boston’s Police Department Commissioner, William Gross, was against it, despite its obvious benefits for finding wanted persons or fugitive convicts who might otherwise easily hide in plain sight.

Gross, it seems, just doesn’t think it’s accurate enough to be useful, and was additionally concerned that facial recognition software, loosely put, may work less accurately as your skin tone gets darker:

Until this technology is 100%, I’m not interested in it. I didn’t forget that I’m African American and I can be misidentified as well.

Across the Atlantic, similar objections have been brewing.

Edward Bridges, a civil rights campaigner in South Wales, UK, has just received a judgement from Britain’s Court of Appeal that establishes judicial concerns along similar lines to those aired in Washington and Boston.

In 2017, 2018 and 2019, the South Wales Police (Heddlu De Cymru) had been trialling a system known as AFR Locate (AFR is short for automatic facial recognition), with the aim of using overt cameras – mounted on police vans – to look for the sort of people who are often described as “persons of interest”.

In its recent press summary, the court described those people as: “persons wanted on warrants, persons who had escaped from custody, persons suspected of having committed crimes, persons who may be in need of protection, vulnerable persons, persons of possible interest […] for intelligence purposes, and persons whose presence at a particular event causes particular concern.”

Bridges originally brought a case against the authorities back in 2019, on two main grounds.

Firstly, Bridges argued that even though AFR Locate would reject (and automatically delete) the vast majority of images it captured while monitoring passers-by, it was nevertheless a violation of the right to, and the expectation of, what the law refers to as “a private life”.

AFR Locate wasn’t using the much-maligned technology known as Clearview AI, based on a database of billions of already-published facial images scraped from public sites such as social networks and then indexed against names in order to produce a global-scale mugshot-to-name “reverse image search” engine. AFR Locate matches up to 50 captured images a second from a video feed against a modest list of mugshots already assembled, supposedly with good cause, by the police. The system trialled was apparently limited to a maximum mugshot database of 2000 faces, with South Wales Police typically looking for matches against just 400 to 800 at a time.

Secondly, Bridges argued that the system breached what are known as Public Sector Equality Duty (PSED) provisions because of possible gender and race based inaccuracies in the technology itself – simply put, that unless AFR Locate were known to be free from any sort of potentially sexist or racist inaccuracies, however inadvertent, it shouldn’t be used.

In 2019, a hearing at Divisional Court level found against Bridges, arguing that the use of AFR Locate was proportionate – presumably on the grounds that it wasn’t actually trying to identify everyone it saw, but would essentially ignore any faces that didn’t seem to match a modestly-sized watchlist.

The Divisional Court also dismissed Bridges’ claim that the software might essentially be discriminatory by saying that there was no evidence, at the time the system was being trialled, that it was prone to that sort of error.

Bridges went to the Court of Appeal, which overturned the earlier decision somewhat, but not entirely.

There were five points in the appeal, of which three were accepted by the court and two rejected:

  • The court decided that there was insufficient guidance on how AFR Locate was to be deployed, notably in respect of deciding where it was OK to use it, and who would be put on the watchlist. The court found that its trial amounted to “too broad a discretion to afford to […] police officers.”
  • The court decided that the South Wales Police had not conducted an adequate assessment of the impact of the system on data protection.
  • The court decided that, even though there was “no clear evidence” that AFR Locate had any gender or race-related bias, the South Wales Police had essentially assumed as much rather than taking reasonable steps to establish this as a fact.

(The court rejected one of Bridges’ five points on the basis that it was legally irrelevant, relying on law enacted more recently than the events in the case.)

Interestingly, the court rejected what you might think of as the core of Bridges’ objections – which are the “gut feeling” objections that many people have against facial recognition in general – namely that AFR Locate interfered with the right to privacy, no matter how objectively it might be programmed.

The court argued that “[t]he benefits were potentially great, and the impact on Mr Bridges was minor, and so the use of [automatic facial recognition] was proportionate.”

In other words, the technology itself hasn’t been banned, and the court seems to think it has great potential, just not in the way it’s been trialled so far.

And there you have it.

The full judgement runs to 59 very busy pages, but is worth looking at nevertheless, for a sense of how much complexity cases of this sort seem to create.

The bottom line right now, at least where the UK judiciary stands on this, seems to be that:

  1. Facial recognition is OK in principle and may have significant benefits in detecting criminals at large and identifying vulnerable people.
  2. More care is needed in working out how we use it to make sure that we benefit from point (1) without throwing privacy in general to the winds.
  3. Absence of evidence of potential discriminatory biases in facial recognition software is not enough on its own, and what we really need is evidence of absence of bias instead.

In short, “Something needs to be done,” which leads to the open question…

…what do you think that should be? Let us know in the comments!

