Radar Trends to Watch: December 2023
Developments in AI, Security, Web, and More
We’re continuing to push AI content into other areas, as appropriate. AI is influencing everything, including biology. Perhaps the biggest new trend, though, is the interest that security researchers are taking in AI. Language models present a whole new class of vulnerabilities, and we don’t yet know how to defend against most of them. We’ve known about prompt injection for some time, but SneakyPrompt is a way of tricking language models by composing nonsense words from fragments that are still meaningful to the model. And cross-site prompt injection means putting a hostile prompt into a document and then sharing that document with a victim who is using an AI-augmented editor; the hostile prompt is executed when the victim opens the document. Both vulnerabilities have already been fixed, but if I know anything about security, this is only the beginning.
- We have seen several automated testing tools for evaluating and testing AI systems, including Giskard and Talc.
- Amazon has announced Q, an AI chatbot that is designed for business. They claim that it can use information in your company’s private data, suggesting that it is using the RAG pattern to supplement the model itself.
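The RAG (retrieval-augmented generation) pattern is simple at its core: retrieve documents relevant to the question and stuff them into the prompt. Here’s a minimal sketch; the toy corpus, the word-overlap scoring, and the prompt wording are all invented for illustration (production systems use embedding-based vector search), and nothing here reflects Amazon’s implementation.

```python
import re

# Minimal sketch of retrieval-augmented generation (RAG).
# The corpus and the word-overlap scoring are toy stand-ins for a
# real embedding-based vector search.

def words(text):
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, corpus, k=2):
    """Return the k documents sharing the most words with the query."""
    return sorted(corpus, key=lambda doc: len(words(query) & words(doc)),
                  reverse=True)[:k]

def build_prompt(query, corpus):
    """Stuff the retrieved context into the prompt sent to the model."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Our refund policy allows returns within 30 days.",
    "The cafeteria closes at 3 PM on Fridays.",
    "Refunds are issued to the original payment method.",
]

print(build_prompt("What is the refund policy?", corpus))
```

The model then answers from the retrieved context rather than from its parameters alone, which is what would let a chatbot like Q draw on a company’s private data without retraining.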
- Let the context wars begin. Anthropic has announced a 200K context window for Claude 2.1, along with a 50% reduction in the rate of false statements (hallucinations). Unlike most AI systems, Claude 2.1 is able to say “I don’t know” when it doesn’t have the answer to a question.
- There’s a tool for integrating generative art AI with the Krita open source drawing tool. It preserves a human-centered artist’s workflow while integrating AI. It uses Stable Diffusion and can run locally, with sufficient processing power; it might be capable of using other models.
- Simon Willison has published an excellent exploration of OpenAI’s GPTs. They’re more than they seem: not just a simple way of storing useful prompts.
- Google has announced some new models for AI-generated music. One model can provide an orchestration for a simple melody line, and represents an interesting connection between human creativity and AI. Audio output is watermarked with SynthID.
- Warner Bros. is using AI to simulate the voice and image of Édith Piaf for an upcoming biopic. Unlike the Beatles’ “Now and Then,” which used AI to restore John Lennon’s voice from earlier tapes, AI will synthesize Piaf’s voice and image to use in narration and video.
- An AI system from Google DeepMind has been shown to outperform traditional weather forecasting. This is the first time AI has outperformed conventional forecasting methods.
- A researcher has proposed a method for detecting and filtering unsafe and hateful images that are generated by AI.
- AI-generated facial images of White people can now appear “more real” than actual photographs. The same is not true of images of racial or ethnic minorities. What are the consequences of White faces being perceived as “more realistic”?
- Chain of Density is a relatively new prompting technique. You ask a language model to summarize something. The initial response will probably be verbose. Then you ask it to improve the summary by adding new facts without increasing the summary’s length.
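The Chain of Density loop is easy to sketch. In the code below, `chat()` is a placeholder that just echoes (so the sketch runs standalone); in practice it would call whatever model you’re using, and the prompt wording is a paraphrase of the technique, not the paper’s exact prompt.

```python
# Sketch of Chain of Density prompting. chat() is a placeholder for any
# LLM call; here it just echoes so the loop structure runs standalone.

def chat(prompt):
    # Placeholder: a real implementation would call a language model here.
    return f"[model response to: {prompt[:40]}...]"

def chain_of_density(article, rounds=3):
    """Ask for a summary, then repeatedly densify it without lengthening it."""
    summary = chat(f"Summarize the following article:\n{article}")
    for _ in range(rounds):
        summary = chat(
            "Rewrite this summary. Add 1-3 specific facts from the article "
            "that are missing, but do not increase the summary's length.\n"
            f"Article:\n{article}\n\nCurrent summary:\n{summary}"
        )
    return summary

print(chain_of_density("Example article text about a product launch."))
```

Each round trades filler words for facts, which is why the final summary ends up denser at the same length.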
- The Zephyr-7B model, a fine-tuned descendant of Mistral-7B, outperforms other 7B models on benchmarks. It was trained using a technique called knowledge distillation. It has not been trained to reject hate speech and other inappropriate output.
- Can a large language model be the operating system of the future? And if so, what would that look like?
- Quantization is a technique for reducing the size of large language models by storing parameters in as few as 4 bits. GPTQ is an open source tool for quantizing models. AutoGPTQ is another implementation that’s compatible with the Hugging Face Transformers library.
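The basic storage trick behind quantization is easy to show, though GPTQ itself is considerably smarter, compensating for quantization error as it processes each layer. This naive round-to-nearest sketch just maps each float to one of 16 signed levels plus a shared scale:

```python
# Naive round-to-nearest 4-bit quantization of a weight vector.
# GPTQ does this more cleverly (compensating quantization error across
# a layer), but the storage idea is the same: map each float to one of
# 16 levels plus a shared scale factor.

def quantize4(weights):
    """Map floats to integers in [-8, 7] with a shared scale factor."""
    scale = max(abs(w) for w in weights) / 7  # 4-bit signed range
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize4(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.91, -0.07, 0.33]
q, scale = quantize4(weights)
restored = dequantize4(q, scale)
print(q)         # 16 possible values per weight instead of a 32-bit float
print(restored)  # close to, but not exactly, the originals
```

Even this crude version shrinks storage from 32 bits per weight to 4 (plus one shared scale), at the cost of a rounding error bounded by half the scale.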
- Researchers use machine learning to enable users to create objects in virtual reality without touching a keyboard or a mouse. Gestural interfaces haven’t worked well in the past. Is this their time?
- Google’s PaLI-3 is a vision model with 5 billion parameters that consistently outperforms much larger models.
- Hem is an open source model for measuring generative AI hallucinations. It’s an interesting idea, though a first glance at the leaderboard suggests it is overly generous.
- OpenAI has announced the GPT Store, an app store that is essentially a mechanism for sharing prompts. They also announced a no-code development platform for GPT “agents,” lower pricing for GPT-4, and indemnification against copyright lawsuits for users of GPT products.
- LangSmith looks like a good platform for developing and debugging LangChain-based AI agents.
- Tim Bray explains Leica’s use of C2PA to watermark photographs. C2PA is a standard that uses public key cryptography to trace image provenance. Photoshop implements C2PA, allowing both the image’s creator and anyone who edits it in Photoshop to be traced.
- An important new group of attacks against Bluetooth, called BLUFFS, allows attackers to impersonate others’ devices and to execute man-in-the-middle attacks. All Bluetooth devices since roughly 2014 are vulnerable.
- If you aren’t already careful about what you plug in to your USB ports, you should be. LitterDrifter is a worm that propagates via USB drives. It is oriented towards data collection (i.e., espionage), and was developed by a group with close ties to the Russian state.
- The AlphV ransomware group wins the irony award. They reported one of their victims to the SEC for not disclosing the attack. Other groups are following the same strategy. The law requiring disclosure is not yet in effect, so aside from PR damage, consequences will be minor.
- SneakyPrompt is a new technique for creating hostile prompts that can “jailbreak” image generators, causing them to generate images that violate policies. It works by substituting tokens from words that aren’t allowed with tokens from other words that are semantically similar, creating a “word” that is nonsensical to humans but still meaningful to the model.
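The substitution idea can be illustrated with a toy. Here the blocklist, the candidate nonsense tokens, and their similarity scores are all invented, with “dragon” standing in for an actually filtered word; the real attack searches the text encoder’s embedding space (using reinforcement learning) rather than a hand-written table.

```python
# Toy illustration of the SneakyPrompt idea: replace a filtered word
# with a nonsense token that a (toy) model still maps close to the
# original meaning. The candidate table and similarity scores below
# are invented for illustration only.

BLOCKLIST = {"dragon"}  # stand-in for a policy-filtered word

# Invented: candidate nonsense "words" and how similar a toy model's
# encoder considers them to the blocked word (0.0 - 1.0).
CANDIDATES = {
    "dragoxen": 0.91,
    "drakkonib": 0.84,
    "table": 0.02,
}

def passes_filter(prompt):
    return not any(word in BLOCKLIST for word in prompt.split())

def sneaky_substitute(prompt):
    """Replace blocked words with the most model-similar unblocked token."""
    out = []
    for word in prompt.split():
        if word in BLOCKLIST:
            word = max(CANDIDATES, key=CANDIDATES.get)
        out.append(word)
    return " ".join(out)

prompt = "a watercolor dragon over the city"
adversarial = sneaky_substitute(prompt)
print(adversarial)                  # a watercolor dragoxen over the city
print(passes_filter(adversarial))   # True
```

The filter sees a harmless nonsense word; the model’s tokenizer and encoder still map it near the forbidden concept.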
- Security researchers showed that Google’s Bard was vulnerable to prompt injection via Gmail, Google Docs, and other documents that were shared with unsuspecting victims. The hostile prompt was executed when the user opened the document. The vulnerability was promptly fixed, but it shows what will happen as language models become part of our lives.
- Researchers have demonstrated that an error during signature generation can expose private SSH keys to attack. Open source SSH implementations have countermeasures that protect them from this attack, but some proprietary implementations don’t.
- If you’re concerned about privacy, worry about the data broker industry, not Google and Facebook. A report shows that it’s easy to obtain information (including net worth and home ownership) about US military service members with minimal vetting.
- Proposed EU legislation called eIDAS 2.0 (Electronic Identification, Authentication and Trust Services) gives European governments the ability to conduct man-in-the-middle attacks against secured web communications (TLS and HTTPS). It would be illegal for browser makers to reject certificates compromised by governments.
- Developer backlash against the Shift-Left approach to security isn’t unexpected, but Shift-Left may also be reaching its limits in another way: attackers are focusing less on vulnerabilities in code and more on flaws in business logic, in addition to targeting users themselves.
- History is important. Gene Spafford has posted an excellent 35th anniversary essay about the Morris Worm, and lessons drawn from it that are still applicable today.
- In a simulated financial system, a trading bot based on GPT-4 not only used information that was declared as “insider information”; it stated that it had not used any insider information. The benefit of using the information outweighed the risk of being discovered. (Or perhaps it was behaving the same way as human traders.)
- If you write shell scripts, you will find this useful: ShellCheck, a program to find bugs in shell scripts.
- India has been experimenting successfully with digital public goods—publishing open source software with open standards and data—for creating a digital commons. Such a commons might be a practical alternative to blockchains.
- The Python Software Foundation has hired a security developer, with the intention of improving Python’s security features.
- Collaboration without CRDTs: CRDTs are important—but for many kinds of applications, it’s possible to build collaborative software without them.
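One CRDT-free approach is to let a central server assign a total order to operations; every client applies them in that order, so all replicas converge. A minimal sketch (the class names and append-only operations are illustrative, not taken from the linked article):

```python
# One CRDT-free approach to collaboration: a central server assigns a
# total order to operations, and every client applies them in that
# order, so all replicas converge. Illustrative sketch only.

class Server:
    def __init__(self):
        self.log = []          # authoritative, totally ordered op log

    def submit(self, op):
        self.log.append(op)    # order of arrival is the canonical order
        return len(self.log) - 1

class Client:
    def __init__(self):
        self.doc = ""
        self.applied = 0

    def sync(self, server):
        # Apply any ops we haven't seen yet, in the server's order.
        for op in server.log[self.applied:]:
            kind, arg = op
            if kind == "append":
                self.doc += arg
        self.applied = len(server.log)

server = Server()
a, b = Client(), Client()
server.submit(("append", "hello "))
server.submit(("append", "world"))
a.sync(server)
b.sync(server)
print(a.doc == b.doc)  # both clients converge to the same document
```

The tradeoff versus CRDTs is the need for a coordinating server and round trips through it, which is acceptable for many applications.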
- ShadowTraffic is a service for simulating traffic to backend systems. It is packaged as a Docker container, so it can easily run locally or in a cloud. It can currently simulate traffic for Kafka, Postgres, and webhooks, but its developer plans to expand to other backends quickly.
- The Rust + Wasm stack is a good choice for running Llama 2 models efficiently on an M2 MacBook. Memory requirements, disk requirements, and performance are much better than with Python.
- GitHub’s Copilot for Docs lets users ask questions that are answered by a chatbot trained on documentation in GitHub’s repositories. They plan to integrate other documentation, along with other GitHub content.
- OpenInterpreter sends prompts to a language model, and then runs the code generated by those prompts locally. You can inspect the code before it runs. It defaults to GPT-4, but can use other models, including models running locally. Automatically executing generated code is a bad idea, but it’s a step towards automating everything.
- Microsoft’s Radius is a cloud native application platform that provides a unified model for developing and deploying applications on all the major cloud providers.
- Knowing how to use the terminal is a superpower. But terminals make one thing difficult: recording terminal sessions. Asciinema is an open source project that solves the problem.
- Bug triage: You can’t fix all the bugs. But you can prioritize what to fix, and when.
- Bjarne Stroustrup proposes memory safety for C++.
- We don’t know why you’d want to run Windows 98 in the browser, but you can. There’s no hint about how this is implemented; we assume it is some sort of Wasm wizardry.
- Opt for enhancement over replacement: that’s the argument for using HTML Web Components rather than React components.
- tldraw is a simple application that lets you draw a wireframe for a website on a screen, specify the components you want to implement it, and send it to GPT-4, which generates code for a mockup. The mockup can then be edited, and the code regenerated.
- Google is suing two people who have “weaponized” the DMCA by issuing false takedown notices against websites selling products (apparently T-shirts) that compete with theirs.
- WebRTC was designed to support videoconferencing. It has since been used for many other real-time applications, but better alternatives are needed. Replacing it will take years, but that’s the goal of the Media over QUIC project.
- The UK has approved a CRISPR-based genetic therapy for sickle cell anemia and beta thalassemia.
- A European startup named Cradle has created a generative AI model to design new proteins.
- In a small test involving patients with a genetic predisposition to high cholesterol, a CRISPR treatment that modified a gene in the liver appeared to reduce cholesterol levels permanently. Larger and more comprehensive testing will follow.
- Open source drug discovery might be an approach for developing antivirals for many common diseases for which there are no treatments, including diseases as common as measles and West Nile virus.
- AI is coming to the Internet of Things. ARM’s latest CPU design, the Cortex-M52, is a processor designed for AI in low-power, low-cost devices.
- Microsoft has developed its own AI chip, Maia, which will be available on Azure in 2024.
- H100 GPUs are yesterday’s technology. NVIDIA has announced the H200, with more and faster memory. NVIDIA claims almost double the performance of the H100 in LLM inference, and up to 100X performance for “data science” applications.