Skimming the CREAM – recursive withdrawals loot $13M in cryptocash

You must have had that happy feeling (happiest of all when it’s still a day or two to payday and you know that your balance is paper-thin) when you’re withdrawing money from a cash machine and, even though you’re still nervously watching the ATM screen telling you that your request is being processed, you hear the motors in the cash dispensing machinery start to spin up.

That means, even before any banknotes get counted out or the display tells you the final verdict, that [a] you’ve got enough funds, [b] the transaction has been approved, [c] the machine is working properly, and [d] you’re about to get the money.

Well, imagine that if you hit the [Cancel] button at exactly the right moment between the mechanism firing up and the money being counted out…

…and if your timing was spot on, then your card would stay in the machine, your account wouldn’t get debited, and you’d be asked if you wanted to try again, BUT YOU’D GET THE CASH FROM THE CANCELLED TRANSACTION ANYWAY!?!!?

And imagine that, as long as you kept pressing that magic button at just the right moment, you could loop back on yourself and layer ghost withdrawal on ghost withdrawal…

…until the machine finally ran out of money, or hit some internal software limit on recursive withdrawals, or you decided to quit while you were ahead and get clear of the ATM before an alarm went off.

They thought of that

Cashpoint machines aren’t infinitely wealthy, of course, because cash is bulkier than you might think, so if you didn’t get your card back and could only work this trick at one ATM, you’d briefly be rich, but you wouldn’t be an instant millionaire.

You might get away with somewhere between $10,000 and $100,000 (less in the UK, where cash machines generally only contain £10 and £20 notes), depending on the maximum capacity of the machine, the age of the banknotes (used money doesn’t lie quite as flat as crisp, uncirculated bills), how full the bank typically stacks that particular machine, and the time of day.

But in real life, it’s never (or almost never, given that “never” is a treacherous word in cybersecurity) going to happen that way, because your bank isn’t crazy: withdrawals follow a software engineering principle known in the jargon as ACID, which stands for atomic, consistent, isolated and durable.

Which is a fancy way of saying that you won’t get the money if the debit hasn’t been recorded against your account, and your account won’t get debited if the money can’t be dispensed: it’s always both or neither, never just one or the other.
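In practice, that means wrapping the debit and the dispensing step in a single transaction that either commits as a whole or rolls back as a whole. Here's a minimal Python sketch of the idea using SQLite's built-in transactions (the table, the names and the numbers are invented for illustration, and no real bank works quite like this):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO accounts VALUES ('you', 100)")
db.commit()

def dispense_cash(amount):
    raise RuntimeError("cash mechanism jammed")   # simulate a hardware failure

def withdraw(amount):
    with db:   # one atomic transaction: commit on success, roll back on error
        db.execute("UPDATE accounts SET balance = balance - ?"
                   " WHERE id = 'you'", (amount,))
        dispense_cash(amount)   # if this step fails, the debit above is undone

try:
    withdraw(50)
except RuntimeError:
    pass

print(db.execute("SELECT balance FROM accounts").fetchone())   # (100,): no cash, no debit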

(And you can bet your boots that if ever there were a glitch, it would almost certainly favour the bank, and you’d have to report the problem in person to get the machine checked out to confirm that the money didn’t actually emerge correctly.)

In boolean logic, where A stands for “the account was debited” and B for “the cash was dispensed”, you could describe this situation as (A AND B) OR ((NOT A) AND (NOT B)), or XNOR (the negation of exclusive OR) for short.
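If you want to convince yourself, here's the truth table checked exhaustively in a few lines of Python (XNOR of two booleans is simply equality):

for debited in (False, True):
    for dispensed in (False, True):
        xnor = (debited and dispensed) or (not debited and not dispensed)
        assert xnor == (debited == dispensed)   # XNOR is just plain equality
        print(debited, dispensed, "->", "fine" if xnor else "ATM bug!")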

Note. In the unlikely event that you ever receive more money from an ATM than you were expecting, for example if you withdraw £100 and get a wedge of ten £20s instead of £10s, or if your post-withdrawal balance shows up undiminished even after you’ve just pocketed $500 from the machine, don’t assume it’s tough luck for the bank and thus that the money is a free gift. If it was a genuine mistake, and you jolly well knew it, you’ll almost certainly be found liable for the amount, given that you did, after all, receive it and keep it.

DeFi not quite so careful

DeFi, short for decentralised finance, is all the rage at the moment, especially in the form of unregulated cryptocurrencies and so-called “smart contracts”, which are essentially short programs – software code in which you express a sequence of trading commands – that operate automatically to shift your cryptocurrency holdings around in the ether.

The decentralised part, which is also the deregulated (or, more precisely, the unregulated) part, means that there are no “clearing houses” or traditional procedures that would be applied if you were operating through centralised banks.

In old-school banking, transactions are sluggish, and may require human approvals along the way, but can (at least sometimes) be reversed entirely if some part of the process goes wrong or is successfully contested.

Simply put, DeFi aims to avoid centralised control, to bypass the vested interests of existing financial institutions, and thus to speed up and liberalise online trading…

…while simultaneously removing a lot of the regulatory protection (and potentially ignoring centuries’ worth of operational wisdom) that you enjoy in the traditional, slow-coach banking world that DeFi fans aim to break away from.

You’d be unlikely to accept the fast-talking, modern coding motto of move fast and break things if you were relying on internet-enabled software to drive your car, design a bridge you’d have to use every day, or subject you to potentially dangerous medical intervention…

…yet in the finance sector, you’d be forgiven for thinking that this motto is the rule, rather than the exception.

Hundreds of millions missing in action

More than two weeks ago, for example, we described how a software design blunder led to the Chinese cryptocurrency exchange Poly Network suffering a cyber-robbery of more than half a billion dollars ($610 million in total, apparently), until the hacker behind the heist somewhat reluctantly decided to hand back the funds, a process that apparently took until earlier today to complete.

On 2021-08-20, we wrote about a Japanese outfit called Liquid that apparently lost more than $100 million in an electronic smash-and-grab of its own.

That company – whose goal of keeping your cryptocurrencies liquid and tradeable as quickly as possible turned out to leave the company itself in a dangerously illiquid state – is only gradually getting going again, after assuring customers that they won’t end up out of pocket themselves.

Apparently, the company has rushed out a brand new security system for its cryptocurrency storage, and is now telling customers to “rest assured, […] our state-of-the-art [multi-party computation] technology ensures assets remain secure at all times. […] Your assets are safe with us and will always be.”

Another smart contract system bites the cyberdust

This week, it’s the turn of Taiwan-based cryptofinance company C.R.E.A.M. to suffer the shortcomings of smart contract software sloppiness, with a cyberthief allegedly making off with some $13 million in the cryptocoins AMP and ETH.

The company’s own notification on Twitter just says that the exploit happened…

“by way of re-entrancy on the AMP token contract”.

Re-entrancy, or recursion if you want to call it that, is a digital problem that’s very much like the unlikely cash machine withdrawal “trick” that we speculated about at the start of this article.

For example, imagine if you have smart contract code (greatly simplified below) that allows the other party to check that they have at least $X in their account; then to call smart contract code from their side of the deal to process $X; then to deduct that $X from their account.

Don’t worry if you aren’t a programmer, because the overall misbehaviour should be clear: you’re accepting function calls to a smart contract called company.withdraw() where customers can specify an account to withdraw from, an amount to withdraw, and smart contract code of their own to be called to process the withdrawal of the specified amount.

After you’ve verified their balance can cover the funds, and permitted them to transact the approved amount, you then debit their account to reflect the money they just spent.

Like this, in pseudocode:

function company.withdraw(account, amount, contractcode) {
   // Check that there is 'amount' left in 'account'
   call company.verifybalance(account, amount);

   // Call 'contractcode' function with approved 'amount'.
   call contractcode(amount);

   // And then take 'amount' out of 'account'
   call company.reducebalance(account, amount);
}

But this opens a hole in which the smart contract code provided in the user’s request can re-enter your own code, calling it recursively (i.e. without waiting for the previous call to complete), like this:

function customer.contract(amount) {
   // Customer's smart contract code

   // First, spend the 'amount' approved by the company
   [...disburse the amount somehow...]

   // Then, re-enter the withdraw() function above, recursively,
   // specifying this very function as the smart contract once again,
   // which will itself be called again, which will in turn
   // recursively call withdraw() again...
   call company.withdraw(account, amount, customer.contract);

   // And the line above in the withdraw() function where reducebalance()
   // is supposed to be called will never be reached, because this code
   // will keep jumping back into the withdraw() function right at the
   // top, which will come back here, etc. etc. :-(
}

If you trace the program flow with your finger, you will see that if the customer correctly authenticates their account, and has at least 1000 units of credit available to pass the initial balance check, then triggering a transaction by issuing the call company.withdraw(account,1000,customer.contract) will make the flow of code go like this:

0001: call company.withdraw(account,1000,customer.contract); // Start a transaction
0002: call company.verifybalance(account,1000); // Succeeds because account has >= 1000 in it
0003: call customer.contract(1000) // So there is 1000 to spend
0004: [...disburse the amount somehow...] // And thus we spend 1000 that we actually have
0005: call company.withdraw(account,1000,customer.contract); // Sneakily re-enter withdraw() at the top
0006: call company.verifybalance(account,1000); // Succeeds again because account wasn't debited yet
0007: call customer.contract(1000) // So there is apparently still 1000 to spend
0008: [...disburse the amount somehow...] // And we get to spend our first 1000 again
0009: call company.withdraw(account,1000,customer.contract); // Again re-enter withdraw() at the top
000A: call company.verifybalance(account,1000); // Succeeds yet again because account still not debited
000B: call customer.contract(1000) // And another 1000 gets approved
000C: [...disburse the amount somehow...] // And we spend our first 1000 for a third time
000D: [...and so on recursively forever...] // !!!KA-CHING! RE-ENTRANCY JACKPOT SITUATION!!!
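If you'd like to watch that jackpot happen, here's a minimal Python simulation of the same flaw (all the names and numbers are invented for illustration; this isn't C.R.E.A.M.'s actual code, and we cap the looting at five rounds rather than spinning forever):

class Company:
    def __init__(self):
        self.balances = {"account": 1000}   # the customer's real funds
        self.dispensed = 0                  # value actually paid out

    def withdraw(self, account, amount, contractcode):
        assert self.balances[account] >= amount   # 1. check the balance...
        contractcode(amount)                      # 2. ...run the customer's code...
        self.balances[account] -= amount          # 3. ...then debit. Too late!

company = Company()
rounds = 0

def customer_contract(amount):
    global rounds
    company.dispensed += amount        # "spend" the freshly approved amount
    rounds += 1
    if rounds < 5:                     # a real attacker loops until funds run dry
        company.withdraw("account", amount, customer_contract)

company.withdraw("account", 1000, customer_contract)
print(company.dispensed)               # 5000 paid out...
print(company.balances["account"])     # ...from an account that held 1000 (now -4000)

The standard defence, often called the checks-effects-interactions pattern, is simply to swap steps 2 and 3, debiting the account before calling any code supplied by the other party, so that the nested balance check at step 1 fails immediately.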

C.R.E.A.M. (which really is an abbreviation, as the dots imply, standing for Crypto Rules Everything Around Me) has said simply that it has stopped the exploit “by pausing supply and borrow on AMP”, where AMP is the cryptocurrency system in which the company’s bug was abused, and advised: “Post-mortem to come”.

What to do?

What can we say, except the same as we said the time before the time before?

  • Don’t bet more than you can afford to lose. DeFi is neither regulated by the authorities nor mature in its cybersecurity abilities, so there’s not only a higher-than-usual risk of your funds going missing, but also a lower-than-usual likelihood of getting your money back if they do.
  • Don’t keep all your funds in a “hot” state. Store as much as you can offline and encrypted in so-called cold wallets, so your cryptocoins aren’t all instantly available online for crooks to grab hold of if someone else makes a security blunder.

Big bad decryption bug in OpenSSL – but no cause for alarm

The well-known and widely-used encryption library OpenSSL released a security patch earlier this week.

Annoyingly for those who like lean, modern, sans serif typefaces, the new version is OpenSSL 1.1.1l, which is tricky to interpret if you use a font in which upper case EYE, lower case ELL and the digit ONE look at all similar.

To spell it out phonetically, you’re after OpenSSL version ONE dot ONE dot ONE LIMA.

(At the time of writing, Naked Security’s official typeface is Flama, a Bauhaus-inspired font family derived from DIN 1451, which itself arose out of early 20th century German railway and road lettering styles. Our lower case ELLs have a neat looking rightwards curl at the bottom to improve their legibility, and ONEs get a classically European look with a crossbar at the bottom and a little leftward flick at the top. But not all typefaces are made that way.)
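One way to sidestep the typography problem entirely is to get your software to print the version string out for you. Python, for instance, will report which OpenSSL build its ssl module is linked against (a quick sketch; the output shown is merely what we'd expect on a patched system, not a guarantee):

import ssl
print(ssl.OPENSSL_VERSION)               # e.g. 'OpenSSL 1.1.1l  24 Aug 2021'
print(hex(ssl.OPENSSL_VERSION_NUMBER))   # the same version packed into one integer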

The bugs

OpenSSL, as its name suggests, is mainly used by network software that uses the TLS protocol (transport layer security), formerly known as SSL (secure sockets layer), to protect data in transit.

Although TLS has now replaced SSL, removing a huge number of cryptographic flaws along the way, many of the popular open source programming libraries that support it, such as OpenSSL, LibreSSL and BoringSSL, have kept old-school product names for the sake of familiarity.

Despite having TLS support as its primary aim, OpenSSL also lets you access the lower-level functions on which TLS itself depends, so you can use the libcrypto part of OpenSSL to do standalone encryption, compute file hashes, verify digital signatures and even do arithmetic with numbers that are thousands of digits long.

There are two bugs patched in the new version:

  • CVE-2021-3711: SM2 decryption buffer overflow.
  • CVE-2021-3712: Read buffer overruns processing ASN.1 strings.

Strings, long and short

The second of these bugs, CVE-2021-3712, is the less dangerous of the two, and ironically relates to how OpenSSL handles encoded cryptographic keys and certificates.

The raw data inside TLS key and certificate files is packaged up in a format called DER, short for Distinguished Encoding Rules, which is a form of ASN.1, short for Abstract Syntax Notation One, a structured way of representing binary data.

(Note that if you’ve ever looked at TLS keys or certificates, you’ve probably seen something like this:

-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE+LXZfjSOTE0cigDmC3Vlbm0VABgl
Zkmp1zbZsiN9ILxqSQy5Krrza94c/eVZORK03gteh9txboKKQOh6LyAftg==
-----END PUBLIC KEY-----

That’s just a topped-and-tailed, base64 encoded version of the raw DER data, used to make the file easier to recognise and less likely to get mangled in transit than a pure binary file.)
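You can verify that claim for yourself: strip off the BEGIN/END lines, base64-decode what's left, and you're looking at raw DER. A quick Python sketch, using the public key shown above:

import base64

pem = """-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE+LXZfjSOTE0cigDmC3Vlbm0VABgl
Zkmp1zbZsiN9ILxqSQy5Krrza94c/eVZORK03gteh9txboKKQOh6LyAftg==
-----END PUBLIC KEY-----"""

b64 = "".join(line for line in pem.splitlines() if not line.startswith("-----"))
der = base64.b64decode(b64)
print(der[:2].hex())   # '3059': an ASN.1 SEQUENCE (tag 0x30) of length 0x59 (89 bytes)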

Those names are quite a mouthful, but the important part is not the jargon: text strings in ASN.1 are stored in much the same way as in programming languages like Pascal, namely with a length field followed by exactly that much data.

In C, however, strings are stored without any length field: you just get the raw text data, ended with a zero (NUL) byte.

That makes C strings much simpler to use, but it can be annoying for three reasons.

Firstly, it means you can never be sure how long a string is until you traverse the whole thing to find out where the NUL byte is; secondly, you can’t have a NUL byte in the middle of a string, even if you want to; and thirdly, if the final NUL byte gets left out, then copying or printing out a string could go on and on for ages, and include way more information than you intended, assuming that the unterminated string is followed by a large block of non-zero data.

So, ASN.1 gives you structure and control, while C gives you simplicity and speed.
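Here's the difference in a nutshell, simulated in Python with bytes standing in for raw memory (the buffer contents are invented):

# Pascal/ASN.1 style: a length field, then exactly that much data.
buf = bytes([5]) + b"hello" + b"SECRETKEY"   # length byte, string, adjacent data
print(buf[1:1 + buf[0]])                     # b'hello': we can never read too far

# C style: no length field, just scan for the NUL terminator.
cstr = b"hello\x00" + b"SECRETKEY"
print(cstr[:cstr.index(0)])                  # b'hello': fine, the NUL is there

# But with the NUL missing, a C-style reader just keeps going into
# whatever happens to be stored next door in memory...
unterminated = b"hello" + b"SECRETKEY"       # no NUL terminator anywhere
print(unterminated)                          # b'helloSECRETKEY': a data leak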

To give you the best of both worlds, OpenSSL always adds a NUL byte to its ASN.1 strings, even though this is not necessary.

This means you can access those strings via OpenSSL’s special ASN.1 functions, as usual, but also safely read them out directly from C, for example if you want to print out a message that contains the relevant string.

Actually (and this is the problem), that’s not quite true.

You can safely treat OpenSSL’s ASN.1 strings as C strings, but only if they were generated by OpenSSL’s special “always add the NUL byte” functions; otherwise you could end up with an unterminated string, and all the problems that can cause.

If you construct the ASN.1 data yourself, maliciously or otherwise, you don’t need to include those pesky NUL bytes, and the resulting strings will not be safe to use directly from C – you will need to use OpenSSL’s special access functions all the time.

Unfortunately, a few of OpenSSL’s own functions were found to be taking shortcuts and relying on directly accessing ASN.1 strings from C, even when they couldn’t be sure that the original data had been created with those all-important NUL bytes tacked on the end.

As a result, with clever shenanigans, it might be possible for an attacker to trick OpenSSL into printing out data that goes beyond the end of the memory buffer.

This sort of read buffer overflow could lead to data leaks, where private data accidentally gets revealed in output such as a log file or a system message.

Alternatively, read overflows could lead to memory access errors, where OpenSSL reads so much extra data beyond the missing NUL that an access violation occurs, leading to a software crash.

A data breach due to a read overflow is exactly what happened in the infamous Heartbleed bug, where code that was supposed to send a reply containing just a few bytes from memory inadvertently copied up to 64Kbytes every time instead.

This leaked out whatever else was adjacent in memory at the time, possibly including passwords or decryption keys that just happened to be nearby.

Decrypting too much

The more serious bug of the two, CVE-2021-3711, also involves a buffer overflow, but this time it’s a write overflow, making it much more dangerous.

If you can write past the end of an officially allocated block of memory and modify data that controls some other part of a program, then you may be able to manipulate the behaviour of the program in the future.

For example, you may be able to trick it into thinking that something succeeded (or failed) when it didn’t, or even to take over the flow of program execution entirely.

Using booby trapped data to take over a running program is known as RCE, short for remote code execution, which means exactly what it says: someone else, perhaps even on the other side of the world, gets to control your computer without so much as a popup dialog or a warning.

The CVE-2021-3711 bug relies on a common programming idiom used in software code that generates output data.

That idiom involves using the same data output function twice in succession: first, you run the function but say “don’t actually generate the data, just tell me how much there will be when I do it for real”; then, after setting aside a buffer of the right size, you run the function again to produce the actual data.

That way, in theory, you can reliably avoid buffer overflows by making sure that you have enough memory space before you start.
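In skeleton form, the idiom looks something like this (a Python sketch with invented names, standing in for OpenSSL's C functions):

def encode(data, out=None):
    needed = 2 * len(data)        # pretend our encoding doubles the size
    if out is None:
        return needed             # pass 1: just report the size, write nothing
    assert len(out) >= needed, "buffer too small"
    out[:needed] = data * 2       # pass 2: actually produce the output
    return needed

size = encode(b"attack at dawn")      # "how much room will I need?"
buf = bytearray(size)                 # set aside exactly that much memory
encode(b"attack at dawn", buf)        # now do it for real: a perfect fit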

Except that in one specific case in OpenSSL, namely when using the Chinese government’s cryptographic algorithms known as ShangMi (SM), the software may end up telling you to set aside a buffer that’s up to 62 bytes smaller than the amount actually needed.

That means that booby-trapped encrypted data sent in for decryption could trigger a significant write buffer overflow inside OpenSSL.

In particular, it looks as though the buggy code could end up getting called if OpenSSL is asked to set up a TLS connection using the ShangMi ciphersuite and then to validate the web certificate presented by the other end during the TLS handshake.

So far, we’re not aware of any working exploits using this vulnerability.

With a bit of luck

The good news here is that official TLS support for ShangMi was only introduced in RFC 8998, dated March 2021, so it’s a newcomer to the world’s cryptographic stable.

So, although OpenSSL includes implementations of the SM algorithms (SM2 for key agreement and digital signatures, SM3 for hashing, and SM4 for block encryption)…

…it doesn’t yet include the code needed to allow you to choose these algorithms as a ciphersuite for use in TLS connections.

You can’t ask your TLS client code to request a ShangMi connection to someone else’s server, as far as we can see; and you can’t get your TLS server code to accept a ShangMi connection from someone else’s client.

So the bug is in there, down in the low-level OpenSSL libcrypto code, but if you use OpenSSL at the TLS level to make or accept secure connections, we don’t think you can open up a session in which the buggy code could be triggered.

In our opinion, that greatly reduces the likelihood of criminals abusing this flaw to implant malware on your laptop, for example by luring you to a booby-trapped website and presenting you with a rogue certificate during connection setup.

What to do?

  • Upgrade to OpenSSL 1.1.1l if you can. Although most software on Windows, Mac, iOS and Android will not be using OpenSSL, because those platforms have their own alternative TLS implementations, some software may include an OpenSSL build of its own and will need updating independently. If in doubt, consult your vendor. Most Linux distros will have a system-wide version of OpenSSL, so check with your distro for an update. (Note: Firefox doesn’t use OpenSSL on any platforms.)
  • Consider rebuilding OpenSSL without ShangMi support if you can’t upgrade. Passing the options no-sm2 no-sm3 no-sm4 to the OpenSSL config script before rebuilding will do the trick. Given that ShangMi can’t yet be selected for use in TLS connections, you should find this to be a non-disruptive change.
  • If you’re a programmer, always assume the worst about data. Never assume that data you are called upon to deconstruct was created by the matching construction functions that you carefully coded into your own library. Crafting booby-trapped data packets that don’t look the way you expect (and weren’t covered in your own testing) is what a lot of cybersecurity researchers do for a living. Unfortunately, not all of them work for the Good Guys.

S3 Ep47: Daylight robbery, spaghetti trouble, and mousetastic superpowers [Podcast]

[02’00”] More money troubles in cryptotown.
[10’28”] Trouble with plastic spaghetti.
[21’10”] The mouse that conquered Windows.
[31’38”] Oh! No! When you report yourself for phishing.

With Paul Ducklin and Doug Aamoth.

Intro and outro music by Edith Mudge.



WHERE TO FIND THE PODCAST ONLINE

You can listen to us on Soundcloud, Apple Podcasts, Google Podcasts, Spotify, Stitcher, Overcast and anywhere that good podcasts are found.

Or just drop the URL of our RSS feed into your favourite podcatcher software.

If you have any questions that you’d like us to answer on the podcast, you can contact us at tips@sophos.com, or simply leave us a comment below.


How a gaming mouse can get you Windows superpowers!

We all know a sysadmin or two (or three, or four) who are seriously into gaming, and have the cool hardware to prove it…

…perhaps including a special chair, dedicated headphones, an ultra-hackable mouse, and an indestructible, mechanically triggered, 6-key-rollover, touch-typist’s keyboard (with multicoloured blank keycaps, configured in COLEMAK format, rather than QWERTY or even DVORAK, natch).

But what if you want to go the other way around?

What if you’re a gamer who wants to be a sysadmin? On someone else’s computer?

Well, apparently, until last week at least, gamer-centric mice and keyboards from popular vendor Razer could help you to do just that.

The problem, it seems, was a Helpful Feature that turned out also to be a bug, as noted on Twitter by a researcher going by @j0nh4t, or jonhat in full.

At first viewing, the video in jonhat’s tweet might seem a bit overwhelming, if not actually confusing, but the bug goes something like this:

  • You plug in a Razer gaming mouse for the first time.
  • Windows detects that this device type has special software and drivers that will make it work Even Better than a regular mouse.
  • Windows finds Razer’s official addons in the Windows Update cloud.
  • Windows downloads and launches the official addons so you don’t have to.
  • The Razer app helpfully ends with a clickable directory name, showing you what ended up where in the installation process.

As risky as this sounds at first, you can argue that it’s better than leaving you to your own devices, literally and figuratively, as early versions of Windows did, casting around on the internet for a download that looks like a driver that might work…

…and then downloading it from a site that might be the real one, or might not…

…and then running it with Admin powers yourself, only to find it’s the wrong driver, and your device now doesn’t work at all, and that you can’t figure out how to revert the installation…

…and then wondering why your computer is slowing to a crawl and uploading hundreds of megabytes of data to a weirdly named server in [REDACTED] while at the same time sending thousands of weird emails to people throughout [REDACTED].

With that in mind, automatically getting and running an official copy of the official drivers and app from an official Microsoft server sounds not only much more convenient but also much less likely to end badly.

When a feature turns into a bug

The problem in this case is the point at which Razer’s app helpfully displays the name of the software installation directory at the end, even though it doesn’t need to.

That’s an active link in Razer’s app, so you can right-click on it and view the directory in File Explorer.

Then, once you’re in Explorer, you can do a Shift-and-right-click and use the handy option Open PowerShell window here, giving you a command-line alternative to the existing Explorer window.

But that PowerShell prompt was spawned from the Explorer process, which was spawned from Razer’s installer, which was spawned by the automatic device installer process in Windows itself…

…which was running under the all-powerful NT AUTHORITY\SYSTEM account, usually referred to as NTSYSTEM or just System for short.

So the PowerShell window is now running as System too, which means you have almost complete control over the files, memory, processes, devices, services, kernel drivers and configuration of the computer.

In other words, if you’re a penetration tester given access to unlocked company laptops to see how long it takes you to promote yourself from a regular user’s account to Admin superpowers, and if you have a Razer mouse with you, the answer is probably, “Not very long”.

Technically, you don’t need to take a wired mouse with you – if you have a Razer wireless mouse, plugging in just the dongle ought to be enough, whether the mouse is present or not, because the dongle is what interacts with the USB subsystem on Windows and identifies the device.

And if you have a hacker-friendly tool such as a Raspberry Pi Zero, which has two-way USB hardware configurability, you can either set it up as a USB host and plug other devices into it, or you can set it up as a USB device and plug it into other computers.

When you pretend via software to be a USB device such as a mouse or a keyboard (Razer makes keyboards, too, that come with added drivers to make them perform Even Better for avid gamers), you can pretend you are almost any sort of device made by almost any known vendor.

We tried tricking Windows 11 into thinking our Pi Zero was a Razer keyboard, a subterfuge that Windows accepted, happily taking fake keystrokes from our bogus device. However, we ran out of time to create a configuration with enough verisimilitude to convince Windows that it needed to fetch additional drivers to support the device fully. We assume that with more time, knowledge, or both, it could be done easily.
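For the curious, here's roughly how the impersonation side works on a Pi Zero, using the Linux USB “gadget” configfs interface. Treat this as a hedged sketch of our setup rather than a polished tool: the product ID and name strings are placeholders (though 0x1532 is, as far as we know, Razer's USB vendor ID), and it needs to run as root on a Pi Zero with the dwc2 USB controller overlay enabled:

import os

G = "/sys/kernel/config/usb_gadget/fakekbd"

def put(path, data):
    mode = "wb" if isinstance(data, bytes) else "w"
    with open(os.path.join(G, path), mode) as f:
        f.write(data)

for d in ("strings/0x409", "configs/c.1/strings/0x409", "functions/hid.usb0"):
    os.makedirs(os.path.join(G, d))

put("idVendor", "0x1532")    # Razer's USB vendor ID, as far as we know
put("idProduct", "0x0000")   # placeholder: copy the product ID of a real device
put("strings/0x409/manufacturer", "Razer")
put("strings/0x409/product", "Not Really A Keyboard")
put("configs/c.1/strings/0x409/configuration", "Config 1")

put("functions/hid.usb0/protocol", "1")        # 1 = keyboard
put("functions/hid.usb0/subclass", "1")        # boot interface subclass
put("functions/hid.usb0/report_length", "8")   # standard 8-byte keyboard report
# The standard boot keyboard report descriptor from the USB HID spec:
put("functions/hid.usb0/report_desc", bytes.fromhex(
    "05 01 09 06 a1 01 05 07 19 e0 29 e7 15 00 25 01"
    " 75 01 95 08 81 02 95 01 75 08 81 01 95 05 75 01"
    " 05 08 19 01 29 05 91 02 95 01 75 03 91 01 95 06"
    " 75 08 15 00 25 65 05 07 19 00 29 65 81 00 c0"))

os.symlink(os.path.join(G, "functions/hid.usb0"),
           os.path.join(G, "configs/c.1/hid.usb0"))
put("UDC", os.listdir("/sys/class/udc")[0])    # bind to the USB controller: go live

Once the gadget is bound, writing 8-byte HID reports to /dev/hidg0 sends “keystrokes” to the host, which is what we used to feed Windows our fake typing.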

What to do?

If you’re a home user who’s an Admin on your own computer already, you don’t have anything to worry about.

For self-admins, the Elevation of Privilege (EoP) trick described here is just an expensive and roundabout way of using the regular “Run as administrator” Windows option that would give you a PowerShell or a command prompt with superpowers anyway.

But if you are looking after a computer for another family member who doesn’t have Admin powers, or you are running a corporate network where users aren’t supposed to fiddle with the underlying operating systems on their laptops, you probably want a way to stop this sort of trick.

To lock down an individual computer (we tested this on both Windows 10 and Windows 11), go to Settings > System > About, and click on the Advanced system settings option.

This should bring up the System Properties window.

Click on Hardware > Device Installation Settings and choose No for the setting Do you want to automatically download manufacturers’ apps and custom icons available for your devices?

Without additional drivers, many new devices will work anyway (albeit without added features such as extra buttons on a gaming mouse, or special zoom-and-pan features on a webcam); others, however, may not work at all if Windows doesn’t already have a generic driver that supports the core functionality of the device.

To get the device to work fully, the user will need to ask a designated Admin to enable the installation of the new drivers and software for them.
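If you'd rather audit that setting from a script, it appears to live in the registry. Note that the key and value name below are our assumption, based on the systems we checked (the Settings dialog seems to flip this value when you choose No), so treat this as a sketch:

import winreg

# Assumption: the Device Installation Settings toggle maps to this value.
KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Device Metadata"
with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as key:
    value, _ = winreg.QueryValueEx(key, "PreventDeviceMetadataFromNetwork")

print("Automatic manufacturer app/icon download is",
      "disabled" if value == 1 else "enabled")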

Further information

On a corporate network, you may also want to look at taking broader control over who’s allowed to add what sort of device, and where.

For Windows networks in general, you might want to review Microsoft’s own advice for regulating device installation with Group Policy.

If you are a Sophos customer, please take a look at Sophos Central Peripheral Control, for additional security and control.


What’s *THAT* on my 3D printer? Cloud bug lets anyone print to everyone

Are you part of the Maker scene?

If so, you probably have your very own 3D printer (or, depending on how keen you are, several 3D printers) stashed in your garage, shed, basement, attic or local makerspace.

Unlike an old-school 2D plotter that can move its printing mechanism side-to-side and top-to-bottom in order to skim across a horizontal surface, a 3D printer can move its print head vertically as well.

To print on a surface, a 2D plotter usually uses some sort of pen that releases ink as the print head moves in the (X,Y) plane.

A 3D printer, however, can be instructed to emit a stream of liquid filament from its print head as it moves in (X,Y,Z) space.

In hobbyist printers, this filament is usually a spool of fine polymer cord that’s melted by a heating element as it passes through the head, so that it emerges like gloopy plastic dental floss.

If emitted close enough to a part of the output that’s already been printed, the melted floss gloms onto the existing plastic, hardens, and ultimately forms a complete model, like this (but a lot more slowly):

Video by RepRapPro on Wikimedia.
Creative Commons Attribution 2.5 Generic.

As you can imagine, there’s a lot that can go wrong when printing a model in this way, notably if the fine stream of molten gloop doesn’t emerge near an existing surface onto which it can stick and solidify.

If the model becomes poorly balanced and falls over; if the print head gets out of alignment; if the polymer is not quite hot enough to stick, or is too hot to harden in time; if there’s even a tiny mistake in any of the (X,Y,Z) co-ordinates in the print job; if an already-printed part of the model buckles out of shape or warps slightly; if the print nozzle suffers a temporary blockage…

…then you can end up with the print head spewing out a detached swirl of unattached plastic thread, like a giant toothpaste tube that’s been squeezed, and squeezed, and squeezed.

And once your 3D printer has got itself into the squeeze-and-squeeze-the-toothpaste-tube state, it will almost certainly keep on squishing out disconnected strands of plastic floss, with nothing to adhere to, until the filament runs out, the printer overheats, or you spot the problem and hit the [Cancel] button.

This produces what makerpeople refer to as a spaghetti monster, as this Reddit poster reveals in a plea for help entitled What makes spaghetti happen?, complete with a picture of one that got away.


The Spaghetti Detective

The problem with most 3D print jobs is that they don’t take minutes, they take hours, perhaps even days, so it’s difficult to keep an eye on them all the time.

Many hobbyists rig up webcams that they can connect to remotely, so that they can intermittently check up on running print jobs while they’re out and about running other jobs such as shopping and going to work, which gives them a chance to shut down a failed job without using up a whole spool of filament first.

But even with remote access enabled, you can’t keep watch all the time, especially if you’re sleeping while an overnight job completes.

Enter The Spaghetti Detective (TSD), an open source toolkit that uses automated image recognition techniques to detect the appearance of “spaghetti” in or around a running print job so that it can warn you or shut down the job automatically.

Alternatively, if you don’t want the hassle of setting up a working TSD server of your own (there’s quite a lot of work involved, and you’ll probably need a spare computer) then the creator of TSD, Kenneth Jiang, offers a cloud-based version that’s free for occasional use, or $48 a year if you want 50 hours of online webcam monitoring a month that you can use to detect spaghettified jobs automatically.

Jiang himself says that he identifies as a hacker, not a coder, and admits that this means he favours “getting features built fast”, as well as being “sloppy about coding styles” and “terrible at algorithm questions”.

Well, those comments came back to bite him late last week when he made some modifications to the TSD cloud code and inadvertently opened up printers on private networks, such as a home Wi-Fi setup, to the internet at large.

As one Reddit user dramatically claimed (the original post has since been deleted for undisclosed reasons): [Woke] up this morning and [saw] this on my 3D printer, with a picture allegedly showing a job kicked off by someone they didn’t know, from a location they couldn’t determine.


What happened?

The good news is that Jiang has now fixed the problem he mistakenly created, written up a full mea culpa article to describe what happened, and thereby retained the goodwill of many, if not most, of the makerpeople that find his service useful:

I made a stupid mistake last night when I re-configured TSD cloud to make it more efficient and run faster. My mistake created a security vulnerability for about 8 hours. The users who happened to be linking a printer at that time were able to see each other’s printer through auto-discovery, and were able to link to them too! We were notified of a case in which a user started a print on someone else’s printer. […] My sincere apologies to our community for this horrible mistake.

(If you’re looking for lessons to learn from this response, take note that Jiang didn’t start with the dreaded words, “I take your security seriously”; he didn’t excuse himself by saying, “At least credit card numbers weren’t affected”; and he didn’t downplay the bug because it only lasted eight hours and apparently affected fewer than 100 people.)

The bad news is that although the immediate bug is fixed, the underlying system for deciding what devices are supposed to be able to discover which printers is still fundamentally flawed.

Jiang, it transpires, was permitting two devices to “discover” each other automatically based on whether they showed up on the internet with the same IP number, as they typically would if they were on the same private network behind the same home router.

That’s because most home routers, and many business firewalls, too, implement a feature called NAT, short for Network Address Translation, whereby outbound traffic from any internal device is rewritten so that it appears to have come directly from the router.

The replies, therefore, officially terminate at the router, which then rewrites the incoming traffic for the true recipient, and forwards it inwards to the originator.

This process is necessary (and, indeed, has been used since the 1990s) because there are fewer than 4 billion regular (IPv4) network numbers to go around, but far more than 4 billion devices that want to get online these days.

NAT allows entire networks, whether they consist of 5, 555 or 5555 different devices, to get by with just one internet-facing network number, and permits ISPs to reallocate network numbers on demand, instead of allocating them permanently to individual customers, where they might not be needed or even used.

Network numbers aren’t authenticators

The bug that opened up Jiang’s TSD cloud so that anyone could discover everyone was caused by the fact that he accidentally started supplying the IP number of one of his own servers, a load balancer through which he passed all incoming traffic, as the “source IP address” of every incoming connection.

Loosely speaking, he turned the load balancer into a second layer of NAT, so that everyone seemed to be connected to the same public network, thus making all the connected devices seem to belong to the same person.

Unfortunately, reverting the misconfiguration that caused this bug has only papered over the problem, for the simple reason that IP numbers aren’t suitable for identification and authentication.

Firstly, two devices with different IP numbers may very well be on the same physical network, as all devices were in the early days of the internet, back before NAT became necessary.

Secondly, two devices with the same IP number may very well be on different networks, for example if an ISP applies a second level of NAT in order to group different customers together and therefore to reduce the quantity of public IP numbers they need.

Likewise, if several companies in a shared building decide to pool their funds and share a firewall and high-speed internet connection, thus effectively letting the building act as an ISP, they may end up with the same public IP number, even though the individual devices are on independent networks operated by different businesses.

What to do?

  • If you’re a TSD user, the immediate risk of waking up to unsolicited output on your 3D printer is now largely under control. You don’t need to update anything because the bug was fixed on the server side. The TSD system is now no less secure than it was before.
  • If you’re a programmer, don’t take authentication shortcuts. Authentication needs strong cryptography, and identifiers such as IP numbers, MAC addresses, and blobs of data that are assumed to be random but might not be, are unsuitable as cryptographic material. (See the sketch below for the sort of approach to aim for.)
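For example (a minimal sketch, with all the names invented), you could issue each printer a random secret when it’s first linked, and then require every subsequent command to prove knowledge of that secret with an HMAC, rather than trusting whatever IP number the request appears to come from:

import hmac, hashlib, secrets

printer_secret = secrets.token_bytes(32)   # issued once, when the printer is linked

def sign(command: bytes) -> str:
    return hmac.new(printer_secret, command, hashlib.sha256).hexdigest()

def verify(command: bytes, tag: str) -> bool:
    return hmac.compare_digest(sign(command), tag)   # constant-time comparison

tag = sign(b"start-print job=benchy.gcode")
print(verify(b"start-print job=benchy.gcode", tag))   # True: genuine owner
print(verify(b"start-print job=evil.gcode", tag))     # False: forged command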

Jiang, in the meantime, says he’s looking to replace the current TSD auto-discover system with one that’s more precise and presumably also more secure, so if you’re a TSD user, keep an eye on his website to see how that project is getting along.

