Serious Security: Rowhammer returns to gaslight your computer

You’re probably familiar with the word gaslighting, used to refer to people with the odious habit of lying not merely to cover up their own wrongdoing, but also to make it look as though someone else is at fault, even to the point of getting the other person to doubt their own memory, decency and sanity.

You might not know, however, that the term comes from a 1930s psychological thriller play called Gas Light (spoiler alert) in which a manipulative and murderous husband pretends to spend his evenings out on the town with his friends, abandoning his long-suffering wife at home in misery.

In fact, he’s secretly sneaking into the apartment upstairs, where he previously murdered the occupant to steal her jewels.

Although he got away with the killing, he came away empty-handed at the time, so he keeps returning to the scene of the crime to search ever more desperately through the murdered woman’s apartment for the valuables.

The giveaway to his criminality is that, in his nightly visits, he not only makes noises that can be heard downstairs, but also needs to turn on the gas lights to see what he’s doing.

Because the entire building is connected to the same gas supply (the play is set in 1880s London, before household electricity replaced gas for lighting), opening and igniting a gas burner in any room causes a temporary pressure drop in the whole system, so that the murderer’s wife notices a brief but telltale dimming of her own lights every time he’s upstairs.

This unavoidable side-effect, namely that using the lights in one part of the house produces a detectable disturbance elsewhere, ultimately leads to the husband being collared by the police.

In case you’re wondering, the verbal metaphor to gaslight in its modern sense comes from the fact that the criminal in the play brashly explains away both the dimming lights and the mysterious noises as evidence that his wife is going mad. His evil plan is both to divert suspicion from his original crime and to have her declared insane, in order to get rid of her once he finds the riches he’s after. When the police come after him, she turns the tables by pretending to help him escape, only to ensure that he’s captured in the end. As she points out, given that he’s gone to such trouble to “prove” all along that she’s insane, no one will now believe or even suspect that she betrayed him to the hangman’s noose entirely on purpose…

Return of Rowhammer

We know what you’re thinking: What’s the connection between gas lights, and their fickle behaviour under load, and the cybersecurity challenge known as rowhammering?

Well, rowhammering is an electronics problem that’s caused by unwanted inside-the-system interactions, just like the flickering gas light in the eponymous play.

In the early days of computers, data was stored using a variety of schemes to represent a series of binary digits, or bits, including: audio pulses passed through long tubes of mercury; magnetic fields stored in a grid of tiny ferrite rings known as cores, from which we get the modern-day jargon term core dump when saving RAM after a program crash; and electrostatic charges stored as blobs of light on an oscilloscope screen.

Modern DRAM chips (dynamic random access memory), in contrast, rely on a very tightly squashed-together grid of nanoscopic capacitors, each of which can either store an electrical charge (which we’ll take to be a binary 1), or not (for a 0-bit).

To read cell C3 above, apply power along row wire 3, discharging the capacitors A3, B3, C3 and D3 down column wires A, B, C and D, allowing their values to be determined. Bits without any charge will read out as 0; bits that were storing a charge as 1. You have to access and discharge four bits to read any one of them.

Surprisingly, perhaps, DRAM chips have more in common with the mercury delay line storage of the 1940s and 1950s than you might think, namely that:

  • You can only read out a full line of data at a time. To read out the 112th bit in a 1024-bit mercury delay line means reading out all 1024 bits (they travel through the mercury in sequence at just over 5000 km/hr, making delay line access times surprisingly fast). DRAM chips use a similar system of discharging one line of capacitors in their grid in one go, to avoid having individual control circuitry for every nanocapacitor in the array.
  • Reading out the data wipes out the memory. In delay lines, the audio pulses can’t be allowed to bounce back along the tube or the echoes would ruin the bits currently circulating. So, the data gets read out at one end and then written back, optionally modified, at the other end of the delay-line tube. Similarly, reading out the capacitors in DRAM discharges any that were currently storing 1-bits, thus effectively zeroing out that line of data, so any read must be followed by a rewrite.
  • The data fades away if it’s not rewritten regularly. Delay lines are unidirectional, because echoes aren’t allowed, so you need to read out and write back the bits in a continuous, regular cycle to keep the data alive, or else it vanishes after one transit through the mercury tube. DRAM capacitors also suffer unavoidable data dissipation, because they can typically retain a charge reliably for no more than a tenth of a second before the charge leaks away. So, each line of capacitors in the chip gets automatically read-and-rewritten every 64 milliseconds (about 1/16th of a second) to keep the data alive indefinitely (see the sketch just after this list).
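
If you want to play with the idea, here’s a minimal C sketch of our own (a toy model, emphatically not real DRAM control logic) of that destructive-read-and-rewrite cycle: reading a row hands back the bits but drains the capacitors, so the controller must immediately write the data back, and a “refresh” is simply a read-and-rewrite performed on a timer:

/* Toy model of one DRAM row (our own illustration, not real
   hardware logic): reading is destructive, so every read must
   be followed by a write-back, and "refresh" is just a
   read-and-rewrite performed on a timer. */

#include <stdio.h>
#include <string.h>

#define ROWBITS 8

/* Destructive read: draining the capacitors wipes the row... */
static void read_row(unsigned char *row, unsigned char *out)
{
   memcpy(out,row,ROWBITS);
   memset(row,0,ROWBITS);      /* charges drained by the read */
}

/* ...so the controller immediately writes the data back. */
static void refresh_row(unsigned char *row)
{
   unsigned char tmp[ROWBITS];
   read_row(row,tmp);
   memcpy(row,tmp,ROWBITS);    /* recharge the 1-bit capacitors */
}

int main(void)
{
   unsigned char row[ROWBITS] = {1,0,1,1,0,0,1,0};
   refresh_row(row);           /* real DRAM does this every 64ms or so */
   for (int i = 0; i < ROWBITS; i++) printf("%d",row[i]);
   printf("\n");
   return 0;
}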

Writing to read-only memory

So, where does so-called rowhammering come in?

Every time you write to a line of capacitors in a DRAM chip’s memory grid, there’s a very tiny chance that the electrical activity in that line might accidentally affect one or more of the capacitors in the lines next to it, in the same sort of way that turning on a gas light in one room causes a telltale flicker in the other rooms.

The more frequently you write to a single line of capacitors (or, more cunningly, if you can figure out the right memory addresses to use, to the two lines of capacitors either side of your target capacitors for greater bit-blasting energy), the more likely you are to provoke some sort of semi-random bit flip.

And the bad news here is that, because reading from DRAM forces the hardware to write the data back to the same memory cells right away, you only need read access to a particular bunch of memory cells in order to trigger low-level electronic rewrites of those cells.

(There’s an analogy in the problem of “gaslighting” from the play, namely that you don’t actually have to illuminate a lamp for nearby lights to give you away; just opening and closing the gas tap momentarily without actually lighting a flame is enough to trigger the light-dimming effect.)

Simply put, merely by reading from the same block of DRAM memory over and over in a tight loop, you automatically cause it to be rewritten at the same rate, thus greatly increasing the chance that you’ll deliberately, if unpredictably, induce one or more bit flips in nearby memory cells.

Using this sort of treachery to provoke memory errors on purpose is what’s known in the jargon by the self-descriptive name rowhammering.
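
For the curious, the innermost loop of classic rowhammer proof-of-concept code really is that simple. Here’s a minimal sketch (our own, for x86-64 compilers with SSE2 intrinsics; we’re assuming, hypothetically, that addr1 and addr2 already map to DRAM rows adjacent to the victim row, which is the genuinely difficult part and isn’t shown):

#include <emmintrin.h>   /* _mm_clflush() */
#include <stdint.h>

/* Hammer two addresses in a tight loop, flushing them out of the
   CPU cache each time so that every read really reaches the DRAM
   chip (more on why that matters below). */
static void hammer(volatile uint64_t *addr1, volatile uint64_t *addr2, long reps)
{
   for (long i = 0; i < reps; i++) {
      (void)*addr1;                       /* activate one DRAM row...   */
      (void)*addr2;                       /* ...and a second row nearby */
      _mm_clflush((const void *)addr1);   /* evict from cache so the    */
      _mm_clflush((const void *)addr2);   /* next read hits DRAM again  */
   }
}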

Rowhammer as an attack technique

Numerous cybersecurity attacks have been proposed based on rowhammering, even though the side-effects are hard to predict.

Some of these attacks are tricky to pull off, because they require the attacker to have precise control over memory layout, the processor setup, and the operating system configuration.

For example, most processor chips (CPUs) and operating systems no longer allow unprivileged programs to flush the processor’s on-board memory cache, which is temporary, fast RAM storage inside the CPU itself that’s used for frequently-accessed data.

As you can imagine, CPU memory caches exist primarily to improve performance, but they also serve the handy purpose of preventing a tight program loop from literally reading the same DRAM capacitors over and over again, by supplying the needed data without accessing any DRAM chips at all.

Also, some motherboards allow the so-called DRAM refresh rate to be boosted so it’s faster than the traditional value of once every 64 milliseconds that we mentioned above.

This reduces system performance (programs get briefly paused if they try to read data out of DRAM while it’s being refreshed by the hardware), but decreases the likelihood of rowhammering by “topping up” the charges in all the capacitors on the chip more regularly than is strictly needed.

Freshly rewritten capacitors are much more likely to be sitting at a voltage level that denotes unambiguously whether they’re fully charged (a 1-bit) or fully discharged (a 0-bit), rather than drifting uncertainly somewhere between the two.

This means that individual capacitors are less likely to be affected by interference from writes into nearby memory cells.

And many modern DRAM chips have extra smarts built into their memory refresh hardware these days, including a mitigation called TRR (target row refresh).

This system deliberately and automatically rewrites the storage capacitors in any memory lines that are close to memory locations that are being accessed repeatedly.

TRR therefore serves the same electrical “top up the capacitors” purpose as increasing the overall refresh rate, but without imposing a performance impact on the entire chip.
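
Conceptually, and greatly simplified (real TRR implementations are proprietary and vary by vendor), the mitigation behaves something like the sketch below, in which dram_refresh_row() is a hypothetical stand-in for the chip’s internal refresh circuitry:

/* Conceptual sketch of TRR (not any vendor's actual design):
   count row activations, and when any row gets hammered past a
   threshold, top up the charge in its two neighbours. */

#define NROWS         65536
#define TRR_THRESHOLD 50000   /* hypothetical per-refresh-window limit */

extern void dram_refresh_row(unsigned row);  /* hypothetical hardware hook */

static unsigned activations[NROWS];

void on_row_activate(unsigned row)
{
   if (++activations[row] >= TRR_THRESHOLD) {
      if (row > 0)        dram_refresh_row(row-1);   /* row above */
      if (row < NROWS-1)  dram_refresh_row(row+1);   /* row below */
      activations[row] = 0;
   }
}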

Rowhammering as a supercookie

Intriguingly, a paper recently published by researchers at the University of California, Davis (UCD) investigates the use of rowhammering not for the purpose of breaking into a computer by modifying memory in an exploitable way and thereby opening up a code execution security hole…

…but instead simply for “fingerprinting” the computer so they can recognise it again later on.

Greatly simplified, they found that DRAM chips from different vendors tended to have distinguishably different patterns of bit-flipping misbehaviour when they were subjected to rowhammering attacks.

As you can imagine, this means that just by rowhammering, you may be able to discern hardware details about a victim’s computer that could be combined with other characteristics (such as operating system version, patch level, browser version, browser cookies set, and so on) to help you tell it apart from other computers on the internet.

In four words: sneaky tracking and surveillance!

More dramatically, the researchers found that even externally identical DRAM chips from the same manufacturer typically showed their own distinct and detectable patterns of bit flips, to the point that individual chips could be recognised later on simply by rowhammering them once again.

In other words, the way that a specific DRAM memory module behaves when rowhammered acts as a kind of “supercookie” that identifies, albeit imperfectly, the computer it’s plugged into.
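
The researchers’ technique is much more rigorous than this, but as a loose conceptual sketch (our own illustration, not the actual Centauri code), you can imagine filling a stretch of memory with 1-bits, hammering it, and then folding the positions and values of any flipped bits into a hash that serves as the “supercookie”:

#include <stdint.h>
#include <string.h>

/* Hash the pattern of rowhammer-induced bit flips in a memory
   region (hammer_region is an assumed helper that performs the
   actual hammering; FNV-1a is used for the mixing). */
uint64_t fingerprint(uint8_t *region, size_t len,
                     void (*hammer_region)(uint8_t *, size_t))
{
   uint64_t hash = 14695981039346656037ULL;   /* FNV-1a offset basis */
   memset(region,0xFF,len);                   /* fill rows with 1-bits */
   hammer_region(region,len);                 /* provoke the bit flips */
   for (size_t i = 0; i < len; i++) {
      if (region[i] != 0xFF) {                /* something flipped here */
         hash ^= (uint64_t)i;                 /* mix in the position... */
         hash *= 1099511628211ULL;            /* ...FNV-1a prime        */
         hash ^= region[i];                   /* ...and the flip value  */
         hash *= 1099511628211ULL;
      }
   }
   return hash;
}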

Desktop users rarely change or upgrade their memory, and many laptop users can’t, because the DRAM modules are soldered directly to the motherboard and therefore can’t be swapped out.

Therefore the researchers warn that rowhammering isn’t just a sneaky-but-unreliable way of breaking into a computer, but also a possible way of tracking and identifying your device, even in the absence of other giveaway data such as serial numbers, browser cookies, filesystem metadata and so on.

Protective maintenance makes things worse

Fascinatingly, the researchers claim that when they tried to ensure like-for-like in their work by deliberately removing and carefully replacing (re-seating) the memory modules in their motherboards between tests…

…detecting memory module matches actually became easier.

Apparently, leaving removable memory modules well alone makes it more likely that their rowhammering fingerprints will change over time.

We’re guessing that’s due to factors such as heat creep, humidity changes and other environmental variations causing conductivity changes in the metal contacts on the memory stick, and thus subtly altering the way that current flows into, and inside, the chip.

Ironically, a memory module that gets worse over time at resisting the bit-flip side-effects of rowhammering will, in theory at least, become more and more vulnerable to code execution exploits.

That’s because ongoing attacks will gradually trigger more and more bit-flips, and thus probably open up more and more exploitable memory corruption opportunities.

But that same memory module will, ipso facto, become ever more resistant to identification-based rowhammer attacks, because those depend on the misbehaviour of the chip remaining consistent over time to produce results with sufficient “fidelity” (if that is the right word) to identify the chip reliably.

Interestingly, the researchers state that they couldn’t get their fingerprinting technique to work at all on one particular vendor’s memory modules, but they declined to name the maker because they’re not sure why.

From what we can see, the observed immunity of those chips to electronic identification might be down to chance, based on easily-changed behaviour in the code the researchers used to do the rowhammering.

The apparent resilience of that brand of memory might therefore not be down to any specific technical superiority in the product concerned, which would make it unfair to everyone else to name the manufacturer.

What to do?

Should you be worried?

There’s not an awful lot you can do right now to avoid rowhammering, given that it’s a fundamental electrical charge leakage problem that stems from the incredibly small size and close proximity of the capacitors in modern DRAM chips.

Nevertheless, we don’t think you should be terribly concerned.

After all, to extract these DRAM “supercookies”, the researchers need to convince you to run a carefully-coded application of their choice.

They can’t rely on browsers and browser-based JavaScript for tricks of this sort, not least because the code used in this research, dubbed Centauri, needs lower-level system access than most, if not all, contemporary browsers will allow.

Firstly, the Centauri code needs the privilege to flush the CPU memory cache on demand, so that every memory read really does trigger direct electrical access to a DRAM chip.

Without this, the acceleration provided by the cache won’t let enough actual DRAM rewrites through to produce a statistically significant number of bit flips.

Secondly, the Centauri code relies on having sufficient system-level access to force the operating system into allocating memory in contiguous 2MB chunks (known in the jargon as large pages), rather than as a bunch of 4KB memory pages, as both Windows and Linux do by default.

As shown below, you need to make special system function calls to activate large-page memory allocation rights for a program; your user account needs authority to activate that privilege in the first place; and no Windows user accounts have that privilege by default. Loosely speaking, at least on a corporate network, you will probably need sysadmin powers up front to assign yourself the right to activate the large-page allocation privilege that’s required to get the Centauri code working.

To fingerprint your computer, the researchers would need to trick you into running malware, and probably also trick you into logging in with at least local administrator rights in the first place.

Of course, if they can do that, then there are many other more reliable and definitive ways that they can probe or manipulate your device to extract strong system identifiers.

These include: taking a full hardware inventory, complete with device identifiers; retrieving hard disk serial numbers; searching for unique filenames and timestamps; examining system configuration settings; downloading a list of installed applications; and much more.

Lastly, because the Centauri code aims not to attack and exploit your computer directly (in which case, risking a crash along the way might be well worth it), there’s a worrying risk that collecting the rowhammering data needed to fingerprint your computer would crash it dramatically, and thus attract your undivided cybersecurity attention.

Rowhammering for the purposes of remote code execution is the kind of thing that crooks can try out comparatively briefly and gently, on the grounds that when it works, they’re in, but if it doesn’t, they’ve lost nothing.

But Centauri explicitly relies on provoking sufficiently many bit-flip errors to construct a statistically significant fingerprint, without which it can’t function as a “supercookie” identification technique.

When it comes to unknown software that you’re invited to run “because you know you want to”, please remember: If in doubt, leave it out!


ENABLING LARGE-PAGE ALLOCATIONS IN WINDOWS

To compile and play with this program for yourself, you can use a full-blown development kit such as Clang for Windows (free, open source) or Visual Studio Community (free for personal and open-source use), or just download our port of Fabrice Bellard’s awesome Tiny C Compiler for 64-bit Windows. (Under 500KB, including basic headers, ready-to-use binary files and full source code if you want to see how it works!)


Source code you can copy-and-paste:

#include <windows.h>
#include <stdio.h>

int main(void)
{
   SIZE_T ps;
   void*  ptr;
   HANDLE token;
   BOOL   ok;
   TOKEN_PRIVILEGES tp;
   LUID   luid;
   DWORD  err;

   ps = GetLargePageMinimum();
   printf("Large pages start at: %lld bytes\n",ps);

   ok = OpenProcessToken(GetCurrentProcess(),TOKEN_ALL_ACCESS,&token);
   printf("OPT result: %d, Token: %016llX\n",ok,token);
   if (!ok) { return 1; }

   ok = LookupPrivilegeValueA(0,"SeLockMemoryPrivilege",&luid);
   printf("LPV result: %d, Luid: %ld:%u\n",ok,luid.HighPart,luid.LowPart);
   if (!ok) { return 2; }

   // Note that account must have underlying "Lock pages in memory"
   // as a policy setting. Logout and log back on to activate this
   // access after authorising the account in GPEDIT. Admin needed.

   tp.PrivilegeCount = 1;
   tp.Privileges[0].Luid = luid;
   tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
   ok = AdjustTokenPrivileges(token,0,&tp,sizeof(tp),0,0);
   if (!ok) { return 3; }

   // Note that AdjustPrivs() will return TRUE if the request
   // is well-formed, but that doesn't mean it worked. Because
   // you can ask for multiple privileges at once, you need to
   // check for error 1300 (ERROR_NOT_ALL_ASSIGNED) to see if
   // any of them (even if there was only one) was disallowed.

   err = GetLastError();
   printf("ATP result: %d, error: %u\n",ok,err);

   ptr = VirtualAlloc(NULL,ps,
                      MEM_LARGE_PAGES|MEM_RESERVE|MEM_COMMIT,
                      PAGE_READWRITE);
   err = GetLastError();
   printf("VA error: %u, Pointer: %016llX\n",err,ptr);

   return 0;
}

Build and run with a command as shown below.

At my first attempt, I got error 1300 (ERROR_NOT_ALL_ASSIGNED) because my account wasn’t pre-authorised to request the Lock pages in memory privilege in the first place, and error 1314 (ERROR_PRIVILEGE_NOT_HELD) plus a NULL (zero) pointer back from VirtualAlloc() as a knock-on effect of that:

C:\Users\duck\PAGES> petcc64 -v -stdinc -stdlib p1.c -ladvapi32
Tiny C Compiler - Copyright (C) 2001-2023 Fabrice Bellard
Stripped down by Paul Ducklin for use as a learning tool
Version petcc64-0.9.27 [0006] - Generates 64-bit PEs only
-> p1.c
-------------------------------
virt   file   size   section
1000    200    318   .text
2000    600    35c   .data
3000    a00     18   .pdata
-------------------------------
<- p1.exe (3072 bytes)

C:\Users\duck\PAGES> p1
Large pages start at: 2097152 bytes
OPT result: 1, Token: 00000000000000C4
LPV result: 1, Luid: 0:4
ATP result: 1, error: 1300
VA error: 1314, Pointer: 0000000000000000

To authorise myself to request the relevant privilege (Windows always allocates large pages locked into physical RAM, so you can’t acquire them without that special Lock pages in memory setting turned on), I used the GPEDIT.MSC utility to assign myself the right locally.

Go to Local Computer Policy > Computer Configuration > Windows Settings > Security Settings > Local Policies > User Rights Assignment and add your own username to the Lock pages in memory option.

Don’t do this on a work computer without asking first, and avoid doing it on your regular home computer (use a spare PC or a virtual machine instead):


After assigning myself the necessary right, then signing out and logging on again to acquire it, my request to grab 2MB of virtual RAM allocated as a single block of physical RAM succeeded as shown:

C:\Users\duck\PAGES>p1
Large pages start at: 2097152 bytes
OPT result: 1, Token: 00000000000000AC
LPV result: 1, Luid: 0:4
ATP result: 1, error: 0
VA error: 0, Pointer: 0000000001600000

Diagram of DRAM cells reworked from Wikimedia under CC BY-SA-3.0.


S3 Ep142: Putting the X in X-Ops


First there was DevOps, then SecOps, then DevSecOps. Or should that be SecDevOps?

Paul Ducklin talks to Sophos X-Ops insider Matt Holdcroft about how to get all your corporate “Ops” teams working together, with cybersecurity correctness as a guiding light.


With Paul Ducklin and Matt Holdcroft. Intro and outro music by Edith Mudge.

You can listen to us on Soundcloud, Apple Podcasts, Google Podcasts, Spotify and anywhere that good podcasts are found. Or just drop the URL of our RSS feed into your favourite podcatcher.


READ THE TRANSCRIPT

DUCK.  Hello, everybody.

Welcome to the Naked Security podcast.

As you can hear, I am not Doug, I am Duck.

Doug is on vacation this week, so I am joined for this episode by my long-term friend and cybersecurity colleague, Matt Holdcroft.

Matt, you and I go back to the early days of Sophos…

…and the field you work in now is the cybersecurity part of what’s known as “DevSecOps”.

When it comes to X-Ops, you’ve been there for all possible values of X, you might say.

Tell us something about how you got to where you are now, because it’s a fascinating story.


MATT.  My first job at Sophos was Lotus Notes Admin and Developer, and I worked in the then Production Room, so I was responsible for duplicating floppy disks.

These were REAL floppy disks, that you could actually flop!


DUCK.  [LOUD LAUGHTER] Yes, the 5.25″ sort…


MATT.  Yes!

Back then, it was easy.

We had physical security; you could see the network; you knew a computer was networked because it had a bit of cable coming out of the back.

(Though it probably wasn’t networked because someone had lost the terminator off the end [of the cable].)

So, we had nice, simple rules about who could go to where, and who could stick what in what, and life was fairly simple.


DUCK.  These days, it’s almost the other way round, isn’t it?

If a computer is not on the network, then it can’t do much in terms of helping the company achieve its goals, and it’s almost considered impossible to manage.

Because it needs to be able to reach the cloud to do anything useful, and you need to be able to reach out to it, as a security operations person, via the cloud, to make sure it’s up to scratch.

It’s almost a Catch-22 situation, isn’t it?


MATT.  Yes.

It’s completely flipped.

Yes, a computer that’s not connected is secure… but it’s also useless, because it’s not fulfilling its purpose.

It’s better to be continually online so it can continually get the latest updates, and you can keep an eye on it, and you can get real-life telemetry from it, rather than having something that you might check on every other day.


DUCK.  As you say, it is an irony that going online is profoundly risky, but it’s also the only way to manage that risk, particularly in an environment where people don’t show up at the office every day.


MATT.  Yes, the idea of Bring Your Own Device [BYOD] wouldn’t fly back in the day, would it?

But we did have Build Your Own Device when I joined Sophos.

You were expected to order the parts and construct your first PC.

That was a rite of passage!


DUCK.  It was quite nice…

…you could choose, within reason, couldn’t you?


MATT.  [LAUGHTER] Yes!


DUCK.  Should I go for a little bit less disk space, and then maybe I can have [DRAMATIC VOICE] EIGHT MEGABYTES OF RAM!!?!


MATT.  It was the era of 486es, floppies and faxes, when we started, wasn’t it?

I remember the first Pentiums came into the company, and it was, “Wow! Look at it!”


DUCK.  What are your three Top Tips for today’s cybersecurity operators?

Because they’re very different from the old, “Oooh, let’s just watch out for malware and then, when we find it, we’ll go and clean it up.”


MATT.  One of the things that’s changed so much since then, Paul, is that, back in the day, you had an infected machine, and everyone was desperate to get the machine disinfected.

An executable virus would infect *all* the executables on the computer, and getting it back into a “good” state was really haphazard, because if you missed any infection (assuming you could disinfect), you’d be back to square one as soon as that file was invoked.

And we didn’t have, as we have now, digital signatures and manifests and so on where you could get back to a known state.


DUCK.  It’s as though the malware was the key part of the problem, because people expected you to clean it up, and basically remove the fly from the ointment, and then hand the jar of ointment back and say, “It’s safe to use now, folks.”


MATT.  The motivation has changed, because back then the virus writers wanted to infect as many files as possible, generally, and they were often just doing it “for fun”.

Whereas these days, they want to capture a system.

So they’re not interested in infecting every executable.

They just want control of that computer, for whatever purpose.


DUCK.  In fact, there might not even be any infected files during the attack.

They could break in because they’ve bought a password from somebody, and then, when they get in, instead of saying, “Hey, let’s let a virus loose that will set off all sorts of alarms”…

…they’ll say, “Let’s just find what cunning sysadmin tools are already there that we can use in ways that a real sysadmin never would.”


MATT.  In many ways, it wasn’t really malicious until…

…I remember being horrified when I read the description of a particular virus called “Ripper”.

Instead of just infecting files, it would go around and twiddle bits on your system silently.

So, over time, any file or any sector on your disk could become subtly corrupt.

Six months down the line, you might suddenly find that your system was unusable, and you’d have no idea what changes had been made.

I remember that was quite shocking to me, because, before then, viruses had been annoying; some had political motives; and some were just people experimenting and “having fun”.

The first viruses were written as an intellectual exercise.

And I remember, back in the day, that we couldn’t really see any way to monetise infections, even though they were annoying, because you had that problem of, “Pay it into this bank account”, or “Leave the money under this rock in the local park”…

…which was always susceptible to being picked up by the authorities.

Then, of course, Bitcoin came along. [LAUGHTER]

That made the whole malware thing commercially viable, which until then it wasn’t.


DUCK.  So let’s get back to those Top Tips, Matt!

What do you advise as the three things that cybersecurity operators can do that give them, if you like, the biggest bang for the buck?


MATT.  OK.

Everyone’s heard this before: Patching.

You’ve got to patch, and you’ve got to patch often.

The longer you leave patching… it’s like not going to the dentist: the longer you leave it, the worse it’s going to be.

You’re more likely to hit a breaking change.

But if you’re patching often, even if you do hit a problem, you can probably cope with that, and over time you will make your applications better anyway.


DUCK.  Indeed, it’s much, much easier to upgrade from, say, OpenSSL 3.0 to 3.1 than it is to upgrade from OpenSSL 1.0.2 to OpenSSL 3.1.


MATT.  And if someone’s probing your environment and they can see that you’re not keeping up-to-date on your patching… it’s, well, “What else is there that we can exploit? It’s worth another look!”

Whereas someone who’s fully patched… they’re probably more on top of things.

It’s like the old Hitchhiker’s Guide to the Galaxy: as long as you’ve got your towel, they assume you’ve got everything else.

So, if you’re fully patched, you’re probably on top of everything else.


DUCK.  So, we’re patching.

What’s the second thing we need to do?


MATT.  You can only patch what you know about.

So the second thing is: Monitoring.

You’ve got to know your estate.

As far as knowing what’s running on your machines, there’s been a lot of effort put in recently with SBOMs, the Software Bill of Materials.

Because people have understood that it’s the whole chain…


DUCK.  Exactly!


MATT.  It’s no good getting an alert that says, “There’s a vulnerability in such-and-such a library,” and your response is, “OK, what do I do with that knowledge?”

Knowing what machines are running, and what’s running on those machines…

…and, bringing it back to patching, “Have they actually installed the patches?”


DUCK.  Or has a crook snuck in and gone, “Aha! They think they’re patched, so if they’re not double-checking that they’ve stayed patched, maybe I can downgrade one of these systems and open up myself a backdoor for ever more, because they think they’ve got the problem sorted.”

So I guess the cliche there is, “Always measure, never assume.”

Now I think I know what your third tip is, and I suspect it’s going to be the hardest/most controversial.

So let me see if I am right… what is it?


MATT.  I would say it is: Kill. (Or Cull.)

Over time, systems accrete… they’re designed, and built, and people move on.


DUCK.  [LAUGHTER] Accrete! [LOUDER LAUGHTER]

Sort of like calcification…


MATT.  Or barnacles…


DUCK.  Yes! [LAUGHTER]


MATT.  Barnacles on the great ship of your company.

They may be doing useful work, but they may be doing it with technology that was in vogue five years ago or ten years ago when the system was designed.

We all know how developers love a new toolset or a new language.

When you’re monitoring, you need to keep an eye on these things, and if that system is getting long in the tooth, you’ve got to take the hard decision and kill it off.

And again, the same as with patching, the longer you leave it, the more likely you are to turn around and say, “What does that system even do?”

It’s very important always to think about lifecycle when you implement a new system.

Think about, “OK, this is my version 1, but how am I going to kill it? When is it going to die?”

Put some expectations out there for the business, for your internal customers, and the same goes for external customers as well.


DUCK.  So, Matt, what’s your advice for what I’m aware can be a very difficult job for someone who’s in the security team (typically this gets harder as the company gets larger) to help them sell the idea?

For example, “You are no longer allowed to code with OpenSSL 1. You have to move to version 3. I don’t care how hard it is!”

How do you get that message across when everyone else at the company is pushing back at you?


MATT.  First of all… you can’t dictate.

You need to give clear standards and those need to be explained.

That sale you got because we shipped early without fixing a problem?

It’ll be overshadowed by the bad publicity that we had a vulnerability or that we shipped with a vulnerability.

It’s always better to prevent than to fix.


DUCK.  Absolutely!


MATT.  I understand, from both sides, that it is difficult.

But the longer you leave it, the harder it is to change.

Setting these things out with, “I’m going to use this version and then I’m going to set-and-forget”?

No!

You have to look at your codebase, and to know what’s in your codebase, and say, “I’m relying on these libraries; I’m relying on these utilities,” and so on.

And you have to say, “You need to be aware that all of those things are subject to change, and face up to it.”


DUCK.  So it sounds as though you’re saying that whether the law starts to tell software vendors that they must provide a Software Bill of Materials (an SBOM, as you mentioned earlier), or not…

…you really need to maintain such a thing inside your organisation anyway, just so you can measure where you stand on a cybersecurity footing.


MATT.  You can’t be reactive about those things.

It’s no good saying, “That vulnerability that was splashed all over the press a month ago? We have now concluded that we are safe.”

[LAUGHTER] That’s no good! [MORE LAUGHTER]

The reality is that everyone’s going to be hit with these mad scrambles to fix vulnerabilities.

There are some big ones on the horizon, potentially, with things like encryption.

Some day, NIST might announce, “We no longer trust anything to do with RSA.”

And everybody’s going to be in the same boat; everyone’s going to have to scramble to implement new, quantum-safe cryptography.

At that point, it’s going to be, “How quickly can you get your fix out?”

Everyone’s going to be doing the same thing.

If you’re prepared for it; if you know what to do; if you’ve got a good understanding of your infrastructure and your code…

…if you can get out there at the head of the pack and say, “We did it in days rather than weeks”?

That’s a commercial advantage, as well as being the right thing to do.


DUCK.  So, let me summarise your three Top Tips into what I think have become four, and see if I’ve got them right.

Tip 1 is good old Patch early; patch often.

Waiting two months, like people did back in the WannaCry days… that wasn’t satisfactory six years ago, and it’s certainly far, far too long in 2023.

Even two weeks is too long; you need to think, “If I need to do this in two days, how could I do it?”

Tip 2 is Monitor, or in my cliche-words, “Always measure, never assume.”

That way you can make sure that the patches that are supposed to be there really are, and so that you can actually find out about those “servers in the cupboard under the stairs” that somebody forgot about.

Tip 3 is Kill/Cull, meaning that you build a culture in which you are able to dispose of products that are no longer fit for purpose.

And a sort-of auxiliary Tip 4 is Be nimble, so that when that Kill/Cull moment comes along, you can actually do it faster than everybody else.

Because that’s good for your customers, and it also puts you (as you said) at a commercial advantage.

Have I got that right?


MATT.  Sounds like it!


DUCK.  [TRIUMPHANT] Four simple things to do this afternoon. [LAUGHTER]


MATT.  Yes! [MORE LAUGHTER]


DUCK.  Like cybersecurity in general, they are journeys, are they not, rather than destinations?


MATT.  Yes!

And don’t let “best” be the enemy of “better”. (Or “good”.)

So…

Patch.

Monitor.

Kill. (Or Cull.)

And: Be nimble… be ready for change.


DUCK.  Matt, that’s a great way to finish.

Thank you so much for stepping up to the microphone at short notice.

As always, for our listeners, if you have any comments you can leave them on the Naked Security site, or contact us on social: @nakedsecurity.

It now remains only for me to say, as usual: Until next time…


BOTH.  Stay secure!

[MUSICAL MODEM]


Firefox 115 is out, says farewell to older Windows and Mac users

Firefox’s latest monthly update just came out, bumping the primary version of the popular alternative browser to 115.0.

OK, it’s technically a once-every-four-weeks update, so that there will sometimes be two major updates in a single calendar month, just as you sometimes get two full moons in a month, but this month there’s only one.

(At the end of next month, August 2023, there will coincidentally be both a blue moon, which is the term used for the second full moon in a single month, and what we’ll refer to by analogy as a Blue Firefox, with Firefox 116 arriving on 01 August 2023 and Firefox 117 following up four weeks later on 29 August 2023.)

Early warning for users of old OSes

Mozilla’s own headline news for version 115 is that:

In January 2023, Microsoft ended support for Windows 7 and Windows 8. As a consequence, this is the last version of Firefox that users on those operating systems will receive. […]

Similarly, this is the last major version of Firefox that will support Apple macOS 10.12, 10.13, and 10.14.

From next month, if you’re stuck with computers that can only run older, unsupported versions of Windows and macOS, you’ll automatically be switched over to the Firefox ESR version.

ESR is short for Extended Support Release, a special Firefox flavour that gets security updates but not feature updates.

Unfortunately, every so often the ESR absorbs all the feature updates that have been deferred since the last time the ESR “caught up”, after which it spends a year or so quietly getting just security updates once again.

In other words, ESR versions last for just over a year before they are “re-based” on a recent major version, complete with all the new features from the interim period added in, and all the now-expunged features taken out.

By the end of 2023, for example, the ESR release will be at 115.6, which means that it will match this month’s version feature-wise, plus all the security patches that come out between now and then.

But September 2024 will see the last ESR version release based on major version 115, namely ESR 115.15…

…after which the oldest supported ESR release will be based on the code of next month’s major version 116, which won’t run on your older Windows and Mac devices any more.

In short, Windows 7, Windows 8 and macOS-before-Catalina (10.15) won’t get Firefox updates at all after September 2024, because even the ESR version will no longer support those platforms.

(If you can’t update your computer by then, we strongly suggest switching to an alternative operating system that is supported on your hardware, such as Linux, so you can not only get system upgrades but also run an up-to-date browser.)

Patches this month

Fortunately, none of this month’s security patches are listed as zero-days, meaning that all the fixes included are for bugs that were either responsibly disclosed by outside researchers, or discovered by Mozilla’s own security and development teams.

There are four CVE-numbered bug fixes rated High, namely:

  • CVE-2023-37201: Use-after-free in WebRTC certificate generation. Ironically, this means a potential remote code execution bug (where an attacker gets to implant code on your computer without warning) could be triggered during the very part of an audio or video call that’s supposed to set up a secure, end-to-end encrypted channel over HTTPS.
  • CVE-2023-37202: Potential use-after-free from compartment mismatch in SpiderMonkey. SpiderMonkey is the Mozilla software component responsible for handling JavaScript code. Running externally supplied JavaScript is supposed to be “mostly harmless”, because browser JavaScript engines deliberately limit the damage that remote JavaScript code can do. Unless, of course, the JavaScript engine itself contains an exploitable bug, allowing what’s known in the jargon as a security escape or a sandbox escape.
  • CVE-2023-37211: Memory safety bugs fixed in Firefox 115, Firefox ESR 102.13, and Thunderbird 102.13. As usual, Mozilla is candid enough to admit, even for bugs found automatically that might ultimately turn out not to be dangerous, “We presume that with enough effort some of these could have been exploited to run arbitrary code.”
  • CVE-2023-37212: Memory safety bugs fixed in Firefox 115. This is a further set of possible security bugs patched only in the latest major version, but not in the current ESR 102.13 release, presumably because these bugs were introduced via new features added since version 102 came out last year. The concern that “new features mean new bugs” is what leads some users to stick to ESR releases in the first place. (Note that you can add the two numbers in the ESR version together to tell you how far along you are in security update terms: for example, ESR 102.13 corresponds to 102+13 = 115, meaning that it has received the same security fixes as this month’s version 115, just without the feature changes in between.)

There are numerous other Moderate and Low severity bugs, of which three stand out as interesting, at least in our opinion:

  • CVE-2023-37204: Fullscreen notification obscured via option element. Apparently, a rogue web page can switch Firefox into fullscreen mode while simultaneously kicking off a background calculation to use up so much processing power that you won’t see the browser’s warning about taking over the entire screen. Note that a rogue website can paint pixels anywhere on the display in fullscreen mode, including popping up realistic but fake operating system dialogs, or displaying a bogus address bar with a fake URL in it. As a result, warnings before you enter fullscreen mode can be considered vital.
  • CVE-2023-37207: Fullscreen notification obscured. This bug is similar to the previous one, though it is triggered not by chewing up processor time, but by referencing a type of URL (for example a mailto:// link) that gets handled by an external program instead of by the browser itself.
  • CVE-2023-37205: URL spoofing in address bar using Right-to-Left characters. We don’t know exactly how this bug works or how it might be exploited, but the description suggests that by mixing Arabic characters in a URL with Latin ones that specify the server name part, an attacker could get a malicious domain name in Latin script to get written out “backwards”. Thus a site that showed up as, say, moc.elpmaxe could actually refer to the server at example.com. With a carefully-chosen server name, an unknown and untrusted domain could be disguised to look like a well-known brand name.

What to do?

Open the Help > About Firefox window (or Firefox > About Firefox on macOS) to see what version you currently have, and to get the latest version if you’re out of date.

Note that if you’re months out of date, you may not get the latest version in one go, so go back into the About Firefox dialog again to check that there aren’t any additional update “jumps” you need to complete.

If Firefox is supplied by your Linux or BSD distro, check back with the distro itself for the latest version.


Ghostscript bug could allow rogue documents to run system commands

Even if you haven’t heard of the venerable Ghostscript project, you may very well have used it without knowing.

Alternatively, you may have it baked into a cloud service that you offer, or have it preinstalled and ready to go if you use a package-based software service such as a BSD or Linux distro, Homebrew on a Mac, or Chocolatey on Windows.

Ghostscript is a free and open-source implementation of Adobe’s widely-used PostScript document composition system and its even-more-widely-used PDF file format, short for Portable Document Format. (Internally, PDF files rely on PostScript code to define how to compose a document.)

For example, the popular open-source graphics program Inkscape uses Ghostscript behind the scenes to import EPS (Embedded PostScript) vector graphics files, such as you might download from an image library or receive from a design company.

Loosely put, Ghostscript reads in PostScript (or EPS, or PDF) program code, which describes how to construct the pages in a document, and converts it, or renders it (to use the jargon word), into a format more suitable for displaying or printing, such as raw pixel data or a PNG graphics file.
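
For example, rendering a PDF into one PNG image per page from the command line looks something like this (these are long-standing Ghostscript options; the filenames are our own):

 $ gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r150 -sOutputFile=page-%03d.png input.pdf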

Unfortunately, until the latest release of Ghostscript, now at version 10.01.2, the product had a bug, dubbed CVE-2023-36664, that could allow rogue documents not only to create pages of text and graphics, but also to send system commands into the Ghostscript rendering engine and trick the software into running them.

Pipes and pipelines

The problem came about because Ghostscript’s handling of filenames for output made it possible to send the output into what’s known in the jargon as a pipe rather than a regular file.

Pipes, as you will know if you’ve ever done any programming or script writing, are system objects that pretend to be files, in that you can write to them as you would to disk, or read data in from them, using regular system functions such as read() and write() on Unix-type systems, or ReadFile() and WriteFile() on Windows…

…but the data doesn’t actually end up on disk at all.

Instead, the “write” end of a pipe simply shovels the output data into a temporary block of memory, and the “read” end of it sucks in any data that’s already sitting in the memory pipeline, as though it had come from a permanent file on disk.

This is super-useful for sending data from one program to another.

When you want to take the output from program ONE.EXE and use it as the input for TWO.EXE, you don’t need to save the output to a temporary file first, and then read it back in using the > and < characters for file redirection, like this:

 C:\Users\duck> ONE.EXE > TEMP.DAT
 C:\Users\duck> TWO.EXE < TEMP.DAT

There are several hassles with this approach, including these:

  • You have to wait for the first command to finish and close off the TEMP.DAT file before the second command can start reading it in.
  • You could end up with a huge intermediate file that eats up more disk space than you want.
  • You could get messed around if someone else fiddles with the temporary file between the first program terminating and the second one launching.
  • You have to ensure that the temporary filename doesn’t clash with an existing file you want to keep.
  • You are left with a temporary file to clean up later that could leak data if it’s forgotten.

With a memory-based intermediate “pseudofile” in the form of a pipe, you can condense this sort of process chain into:

 C:\Users\duck> ONE.EXE | TWO.EXE

You can see from this notation where the names pipe and pipeline come from, and also why the vertical bar symbol (|) chosen to represent the pipeline (in both Unix and Windows) is more commonly known in the IT world as the pipe character.

Because files-that-are-actually-pipes-at-the-operating-system-level are almost always used for communicating between two processes, that magic pipe character is generally followed not by a filename to write into for later use, but by the name of a command that will consume the output right away.

In other words, if you allow remotely-supplied content to specify a filename to be used for output, then you need to be careful if you allow that filename to have a special form that says, “Don’t write to a file; start a pipeline instead, using the filename to specify a command to run.”
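
If you’ve only ever met pipes at the command prompt, here’s a minimal C sketch of the programmer’s view (Unix-flavoured; on Windows the function is spelled _popen()), showing how a “file” that’s really a pipe hands your output straight to another program:

#include <stdio.h>

int main(void)
{
   /* The "write end" of a pipe feeding the sort command: data
      written here never touches the disk at all. */
   FILE *p = popen("sort","w");
   if (!p) { return 1; }
   fputs("zymurgy\naardvark\n",p);
   pclose(p);   /* sort prints: aardvark, then zymurgy */
   return 0;
}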

When features turn into bugs

Apparently, Ghostscript did have such a “feature”, whereby you could say you wanted to send output to a specially-formatted filename starting with %pipe% or simply |, thereby giving you a chance of sneakily launching a command of your choice on the victim’s computer.

(We haven’t tried this, but we’re guessing that you can also add command-line options as well as a command name to execute, thus giving you even finer control over what sort of rogue behaviour to provoke at the other end.)

Amusingly, if that is the right word, the “sometimes patches need patches” problem popped up again in the process of fixing this bug.

In yesterday’s article about a WordPress plugin flaw, we described how the makers of the buggy plugin (Ultimate Member) have recently and rapidly gone through four patches trying to squash a privilege escalation bug:

We’ve also recently written about file-sharing software MOVEit pushing out three patches in quick succession to deal with a command injection vulnerability that first showed up as a zero-day in the hands of ransomware crooks:

In this case, the Ghostscript team first added a check like this, to detect the presence of the dangerous text %pipe... at the start of a filename:

/* "%pipe%" do not follow the normal rules for path definitions, so we don't "reduce" them to avoid unexpected results */ if (len > 5 && memcmp(path, "%pipe", 5) != 0) { . . . 

Then the programmers realised that their own code would accept a plain | character as well as the prefix %pipe%, so the code was updated to deal with both cases.

Here, instead of checking that the variable path doesn’t start with %pipe... to detect that the filename is “safe”, the code declares the filename unsafe if it starts with either a pipe character (|) or the dreaded text %pipe...:

/* "%pipe%" do not follow the normal rules for path definitions, so we don't "reduce" them to avoid unexpected results */ if (path[0] == '|' || (len > 5 && memcmp(path, "%pipe", 5) == 0)) { . . .
Above, you’ll see that if memcmp() returns zero, it means that the comparison was TRUE, because the two memory blocks you’re comparing match exactly, even though zero in C programs is conventionally used to represent FALSE. This annoying inconsistency arises from the fact that memcmp() actually tells you the order of the two memory blocks. If the first block would sort alphanumerically before the second, you get a negative number back, so that you can tell that aardvark precedes zymurgy1. If they’re the other way around, you get a positive number, which leaves zero to denote that they’re identical. Like this:

#include <string.h>
#include <stdio.h>

int main(void)
{
   printf("%d\n",memcmp("aardvark","zymurgy1",8));
   printf("%d\n",memcmp("aardvark","00NOTES1",8));
   printf("%d\n",memcmp("aardvark","aardvark",8));
   return 0;
}
---output---
-1
1
0

What to do?

  • If you have a standalone Ghostscript package that’s managed by your Unix or Linux distro (or by a similar package manager such as the abovementioned Homebrew on macOS), make sure you’ve got the latest version.
  • If you have software that comes with a bundled version of Ghostscript, check with the provider for details on upgrading the Ghostscript component.
  • If you are a programmer, don’t accept any immediately-obvious bugfix as the beginning and end of your vulnerability-squashing work. Ask yourself, as the Ghostscript team did, “Where else could a similar sort of coding blunder have happened, and what other tricks could be used to trigger the bug we already know about?”

WordPress plugin lets users become admins – Patch early, patch often!

If you run a WordPress site with the Ultimate Member plugin installed, make sure you’ve updated it to the latest version.

Over the weekend, the plugin’s creator published version 2.6.7, which is supposed to patch a serious security hole, described by user @softwaregeek on the WordPress support site as follows:

A critical vulnerability in the plugin (CVE-2023-3460) allows an unauthenticated attacker to register as an administrator and take full control of the website. The problem occurs with the plugin registration form. In this form it appears possible to change certain values for the account to be registered. This includes the wp_capabilities value, which determines the user’s role on the website.

The plugin doesn’t allow users to enter this value, but this filter turns out to be easy to bypass, making it possible to edit wp_capabilities and become an admin.

In other words, when creating or managing their accounts online, the client-side web form presented to users doesn’t officially allow them to set themselves up with superpowers.

But the back-end software doesn’t reliably detect and block rogue users who deliberately submit improper requests.
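
Greatly simplified, and sketched in C rather than in the PHP that WordPress plugins are actually written in, the server-side principle that the plugin got wrong looks like this: accept only field names from an explicit allowlist, so that a rogue client can’t smuggle in privileged settings such as wp_capabilities:

#include <stdio.h>
#include <string.h>

/* Hypothetical allowlist: any field not named here gets rejected,
   no matter what the client-side form claims is permissible. */
static const char *allowed[] = { "username", "email", "display_name" };

static int field_is_allowed(const char *name)
{
   for (size_t i = 0; i < sizeof(allowed)/sizeof(allowed[0]); i++) {
      if (strcmp(name,allowed[i]) == 0) { return 1; }
   }
   return 0;   /* "wp_capabilities" ends up here, and gets refused */
}

int main(void)
{
   printf("%d\n",field_is_allowed("email"));            /* 1: allowed  */
   printf("%d\n",field_is_allowed("wp_capabilities"));  /* 0: rejected */
   return 0;
}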

Plugin promises “absolute ease”

The Ultimate Member software is meant to help WordPress sites to offer various levels of user access, listing itself as the “best user profile and membership plugin for WordPress”, and talking itself up in its advertising blurb as:

The #1 user profile & membership plugin for WordPress. The plugin makes it a breeze for users to sign-up and become members of your website. The plugin allows you to add beautiful user profiles to your site and is perfect for creating advanced online communities and membership sites. Lightweight and highly extendible, Ultimate Member will enable you to create almost any type of site where users can join and become members with absolute ease.

Unfortunately, the programmers don’t seem terribly confident in their own ability to match the “absolute ease” of the plugin’s use with strong security.

In an official response to the above security report from @softwaregeek, the company described its bug-fixing process like this [quoted text sic]:

We are working on the fixes related to this vulnerability since 2.6.3 version when we get a report from one of our customer. Versions 2.6.4, 2.6.5, 2.6.6 partially close this vulnerability but we are still working together with WPScan team for getting the best result. We also get their report with all necessary details.

All previous versions are vulnerable so we highly recommend to upgrade your websites to 2.6.6 and keep updates in the future for getting the recent security and feature enhancements.

We are currently working on fixing a remaining issue and will release a further update as soon as possible.

Bugs in many places

If you were on cybersecurity duty during the infamous Log4Shell vulnerability over the Christmas vacation season at the end of 2021, you’ll know that some types of programming bug end up needing patches that need patches, and so on.



For example, if you have a buffer overflow at a single point in your code where you inadvertently reserved 28 bytes of memory but meant to type in 128 all along, fixing that erroneous number would be enough to patch the bug in one go.

Now, however, imagine that the bug wasn’t down to a typing mistake at just one point in the code, but that it was caused by an assumption that 28 bytes was the right buffer size at all times and in all places.

You and your coding team might have repeated the bug at other places in your software, so that you need to settle in for an extended session of bug-hunting.

That way, you can promptly and proactively push out further patches if you find other bugs caused by the same, or a similar, mistake. (Bugs are generally easier to find once you know what to look for in the first place.)
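
As a contrived C illustration of the difference (our own example, not code from any product mentioned here), compare a one-off typo, which needs exactly one fix, with a wrong assumption baked into a shared constant, which needs hunting down everywhere it’s used:

#include <stdio.h>

#define NAME_LEN 28   /* the wrong assumption, used all over the code */

int main(void)
{
   char one_off[28];        /* case 1: a typo for 128; one fix suffices */
   char baked_in[NAME_LEN]; /* case 2: every such buffer is suspect     */

   /* An unchecked copy of a 100-byte input would overflow either
      buffer; snprintf() truncates safely for this demo. */
   snprintf(one_off,sizeof(one_off),"%s","demo text one");
   snprintf(baked_in,sizeof(baked_in),"%s","demo text two");
   printf("%s / %s\n",one_off,baked_in);
   return 0;
}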

In the Log4J case, attackers also set about scouring the software, hoping to find related coding mistakes elsewhere in the code before the Log4J programmers did.

Fortunately, the Log4J programming team not only reviewed their own code to fix related bugs proactively, but also kept their eyes out for new proof-of-concept exploits.

Some new vulnerabilities were publicly revealed by excitable bug-hunters who apparently preferred instant internet fame to the more sober form of delayed recognition they would get from disclosing the bug responsibly to the Log4J coders.

We saw a similar situation in the recent MOVEit command injection vulnerability, where associates of the Clop ransomware gang found and exploited a zero-day bug in MOVEit’s web-based front end, allowing the crooks to steal sensitive company data and then try to blackmail the victims into paying “hush money”.

Progress Software, makers of MOVEit, quickly patched the zero-day, then published a second patch after finding related bugs in a bug-hunting session of their own, only to publish a third patch shortly afterwards, when a self-styled threat hunter found yet another hole that Progress had missed.

Sadly, that “researcher” decided to claim credit for finding the vulnerability by publishing it for anyone and everyone to see, rather than giving Progress a day or two to deal with it first.

This forced Progress to declare it to be yet another zero-day, and forced Progress customers to turn the buggy part of the software off entirely for about 24 hours while a patch was created and tested.



In this Ultimate Member bug situation, the makers of the plugin weren’t as thoughtful as the makers of MOVEit, who explicitly advised their customers to stop using the software while that new and exploitable hole was patched.

Ultimate Member merely advised its users to keep their eyes out for ongoing updates, of which the recently published 2.6.7 is the fourth in a chain of bug fixes for a problem first noticed in the middle of June 2023, when 2.6.3 was the current version number.

What to do?

  • If you are an Ultimate Member user, patch urgently. Given the piecemeal way that the plugin’s coding team seem to be addressing this issue, make sure you look out for future updates and apply them as soon as you can, too.
  • If you’re a server-side programmer, always assume the worst. Never rely on client-side code that you can’t control, such as HTML or JavaScript that runs in the user’s browser, to ensure that submitted input data is safe. Validate thine inputs, as we like to say on Naked Security. Always measure, never assume.
  • If you’re a programmer, search broadly for related issues when any bug is reported. Coding errors made in one place by one programmer may have been duplicated elsewhere, either by the same coder working on other parts of the project, or by other coders “learning” bad habits or trustingly following incorrect design assumptions.
