Four Short Links

Nat Torkington's eclectic collection of curated links.

Four short links: 25 December 2018

Hardware Testing is Hard, Biological Keygen, Christmas Robots, and Open Data

  1. Maxclave (Bunnie Huang) -- you thought software testing was hard? Welcome to the world of hardware testing.
  2. Biological One-Way Functions for Secure Key Generation -- It is demonstrated that the spatiotemporal dynamics of an ensemble of living organisms such as T cells can be used for maximum entropy, high‐density, and high‐speed key generation.
  3. Christmas Robot Roundup (IEEE) -- selection of holiday greetings from various robots and robotics companies. I for one welcome our new tinsel-and-holly-clad industrial apparatus overlords.
  4. Congress Votes to Make Open Government Data the Default in the United States -- The Open, Public, Electronic, and Necessary Government Data Act (AKA the OPEN Government Data Act) is about to become law [...]. This codifies two canonical principles for democracy in the 21st century: 1. public information should be open by default to the public in a machine-readable format, where such publication doesn’t harm privacy or security. 2. federal agencies should use evidence when they make public policy. Merry Christmas, democracy; here's a small present in a bad year.

Four short links: 24 December 2018

Learning Prolog, Data Race, Animating Photos, and Easy Flashing

  1. Solving Murder with Prolog -- if THIS was the motivating example for Prolog, I'd have taken to it a lot sooner! I love those logic puzzle books.
  2. The Machine Learning Race is Really a Data Race (MIT Sloan Review) -- Organizations that hope to make AI a differentiator need to draw from alternative data sets—ones they may have to create themselves.
  3. Photo Wakeup: 3-D Character Animation from a Single Photo -- this is incredible work. Watch the video if nothing else.
  4. Etcher -- Flash OS images to SD cards and USB drives, safely and easily. Open source.

Four short links: 21 December 2018

Tech in China, Wisdom of Small Groups, iOS VPN, and Gameboy Supercomputer

  1. MIT TR: The China Issue -- from AI to landscaping, it's the state of big tech in China.
  2. Aggregated Knowledge From a Small Number of Debates Outperforms the Wisdom of Large Crowds -- what it says on the box. This is why I like the World Cafe Method of facilitating discussions.
  3. Wireguard for iOS -- a port of Wireguard VPN to the Apple mobile ecosystem.
  4. A Gameboy Supercomputer -- At a total of slightly over one billion frames per second, it is arguably the fastest 8-bit game console cluster in the world.

Four short links: 20 December 2018

Misinformation Research, AI UI, Facebook's Value, and Python Governance

  1. Common-Knowledge Attacks on Democracy -- We argue that scaling up computer security arguments to the level of the state, so that the entire polity is treated as an information system with associated attack surfaces and threat models, provides the best immediate way to understand these attacks and how to mitigate them. We demonstrate systematic differences between how autocracies and democracies work as information systems, because they rely on different mixes of common and contested political knowledge. Released 17 November; Bruce Schneier is co-author.
  2. Can Users Control and Understand a UI Driven by Machine Learning? -- In this article, we examine some of the challenges users encounter when interacting with machine learning algorithms on Facebook, Instagram, Google News, Netflix, and Uber Driver.
  3. Estimating the Value of Facebook by Paying Users to Stop Using It -- across all three samples, the mean bid to deactivate Facebook for a year exceeded $1,000.
  4. Python Gets a New Governance Model -- The council is imbued with "broad authority to make decisions about the project," but the goal is that it uses that authority rarely; it is meant to delegate its authority broadly. The PEP says the council should seek consensus, rather than dictate, and that it should define a standard PEP decision-making process that will (hopefully) rarely need council votes to resolve. It is, however, the "court of final appeal" for decisions affecting the language. But the council cannot change the governance PEP; that can only happen via a two-thirds vote of the core team. Python gets a constitution (aka PEP 8016).

Four short links: 19 December 2018

Observable Notebooks, Disinformation Report, Chained Blocking, and Trivia from 2018

  1. Observable Notebooks -- JavaScript notebooks. (via Observable Notebooks and iNaturalist)
  2. Disinformation Report -- selective amplification (or pre-consumption filtering) remains one of the most interesting open challenges in infotech, and this report gives context and urgency to it. The IRA shifted a majority of its activity to Instagram in 2017; this was perhaps in response to increased scrutiny on other platforms, including media coverage of its Twitter operation. Instagram engagement outperformed Facebook. New Knowledge notes that the Russian misinformation agency was run like a digital marketing shop [...] They built their content using digital marketing best practices, even evolving page logos and typography over time. (via Renee DiResta)
  3. Twitter Block Chain -- a Chrome extension that blocks followers of the jerk, not just the jerk themselves. The power of the open web is that we can write the tools the platforms don't yet provide, however clunky. (via Hadyn Green)
  4. 52 Things I Learned in 2018 -- each comes with attribution. Three sample facts, sans attribution: (*) 35% of Rwanda’s national blood supply outside the capital city is now delivered by drone. (*) [Unicode] includes a group of ‘ghost characters’ (妛挧暃椦槞蟐袮閠駲墸壥彁) which have no known meaning. It’s believed they are errors introduced by folds and wrinkles during a paper-based 1978 Japanese government project to standardize the alphabet, but are now locked into the standard forever. (*) Cassidy Williams had a dream about a Scrabble-themed mechanical keyboard. When she woke up, she started cold-calling Hasbro to ask for permission to make it real. Eventually, she made it happen.

Four short links: 18 December 2018

Singing AI, Content Signing, Data Rights, and Query Processing

  1. AI Voices -- marketing copy, but I can't find technical detail. The demos are worth checking out. The sprint to automated pop music generation has begun. Not just limited to Japanese, as it is also capable of producing convincing Mandarin and even English voices for songs such as Adele’s "Rolling in the Deep" and Britney Spears’ "Everytime" on their official website.
  2. Notary -- publishers can sign their content offline using keys kept highly secure. Once the publisher is ready to make the content available, they can push their signed trusted collection to a Notary server. Consumers, having acquired the publisher's public key through a secure channel, can then communicate with any Notary server or (insecure) mirror, relying only on the publisher's key to determine the validity and integrity of the received content.
  3. It's Time for a Bill of Data Rights (MIT TR) -- this essay argues that “data ownership” is a flawed, counterproductive way of thinking about data. It not only does not fix existing problems, it creates new ones. Instead, we need a framework that gives people rights to stipulate how their data is used without requiring them to take ownership of it themselves. (via Cory Doctorow)
  4. Trill -- a single-node query processor for temporal or streaming data: open source from Microsoft. Described in this blog post.

Four short links: 17 December 2018

Open Source Licensing, Computer History, Serverless, and Wicked Problems

  1. Open Source Confronts Its Midlife Crisis (Bryan Cantrill) -- To be clear, the underlying problem is not the licensing, it’s that these companies don’t know how to make money—they want open source to be its own business model, and seeing that the cloud service providers have an entirely viable business model, they want a piece of the action. Also see Bryan's followup: A EULA in FOSS Clothing: You will notice that this looks nothing like any traditional source-based license—but it is exactly the kind of boilerplate that you find on EULAs, terms-of-service agreements, and other contracts that are being rammed down your throat.
  2. A Computer of One's Own -- fantastic precis of the work of significant women in computing history.
  3. Serverlessness (Tim Bray) -- Tim works in AWS's Serverless group and has been collecting what he's learned in his years building serverless infrastructure.
  4. Why We Suck at Solving Wicked Problems -- this rings true with my experience.

Four short links: 14 December 2018

Satellite LoRaWAN, Bret Victor, State of AI, and Immutable Documentation

  1. Fleet -- launched satellites as backhaul for LoRaWAN base station traffic.
  2. Computing is Everywhere -- podcast episode with Bret Victor. Lots of interesting history and context to what he's up to at Dynamicland. (via Paul Ford)
  3. AI Index 2018 Report (Stanford) -- think of it as the Mary Meeker report for AI.
  4. Etsy's Experiment with Immutable Documentation -- In trying to overcome the problem of staleness, the crucial observation is that how-docs typically change faster than why-docs do. Therefore the more how-docs are mixed in with why-docs in a doc page, the more likely the page is to go stale. We’ve leveraged this observation by creating an entirely separate system to hold our how-docs.

Four short links: 13 December 2018

CS Ethics, Insect IoT, Glitch Showcase, and SQL Repos

  1. Embedded Ethics -- Harvard project that integrates ethics modules into courses across the standard computer science curriculum. Those modules are straightforward, online, and open access.
  2. Living IOT: A Flying Wireless Platform on Live Insects -- We develop and deploy our platform on bumblebees which includes backscatter communication, low-power self-localization hardware, sensors, and a power source. We show that our platform is capable of sensing, backscattering data at 1 kbps when the insects are back at the hive, and localizing itself up to distances of 80 m from the access points, all within a total weight budget of 102 mg. (via BoingBoing)
  3. Looky What We Made -- showcase of Glitch apps.
  4. Git Your SQL Together -- why I recommend tracking SQL queries in git: 1. You will *always* need that query again. 2. Queries are living artifacts that change over time. 3. If it’s useful to you, it’s useful to others (and vice versa)

Four short links: 12 December 2018

Render as Comic, Notebook to Production, Population Visualization, and Location Privacy

  1. Comixify -- render video as comics.
  2. How to Grow Neat Software Architecture out of Jupyter Notebooks -- everyone's coding in notebooks as a sweet step up from the basic one-command REPL loop. Here's some good advice on how to grow these projects without creating a spaghetti monster.
  3. City 3D -- This project wields data from the Global Human Settlement Layer, which uses “satellite imagery, census data, and volunteered geographic information” to create population density maps. Best visualization I've seen in a very long time.
  4. Your Apps Know Where You Were Last Night, and They're Not Keeping It Secret (NY Times) -- At least 75 companies receive anonymous, precise location data from apps whose users enable location services to get local news and weather or other information. They claim 200M mobile devices, with updates as often as every six seconds. These companies sell, use, or analyze the data to cater to advertisers, retail outlets, and even hedge funds seeking insights into consumer behavior. [...] An app may tell users that granting access to their location will help them get traffic information, but not mention that the data will be shared and sold. That disclosure is often buried in a vague privacy policy.

Four short links: 11 December 2018

Can We Stop?, Everything Breaks, Edge Cloud, and Molly Guard

  1. The Seductive Diversion of Solving Bias in Artificial Intelligence -- provocative title, but the point is that the preoccupation with narrow computational puzzles distracts us from the far more important issue of the colossal asymmetry between societal cost and private gain in the rollout of automated systems. It also denies us the possibility of asking: should we be building these systems at all? The expected value of pursuing this line of thinking is pretty low because there's a vanishingly small probability that we can coordinate activity globally to prevent something bad from happening. Exhibit A: climate change.
  2. Everything Breaks (Michael Lopp) -- Humans will greatly benefit from a clear explanation of the rules of the game. The rules need to evolve in unexpected ways to account for the arrival of more humans. The only way to effectively learn what is going to break is to keep playing...and learning. See also lessons learned from scaling Stripe's engineering team.
  3. Terrarium (Fastly) -- an interesting glimpse at a possible future for web apps, where your CDN (which you need to have anyway if you're publishing anything remotely contentious or interesting) blurs with your hosting infrastructure provider. Terrarium is a multi-language deployment platform based on WebAssembly. Think of it as a playground for experimenting with edge-side WebAssembly. Being one of the first Fastly Labs projects, you can also think of it as our way of publicly experimenting with what the future of real highly performant edge computing could look like.
  4. molly-guard -- protects machines from accidental shutdowns/reboots. Etymology of the name: originally a Plexiglas cover improvised for the Big Red Switch on an IBM 4341 mainframe after a programmer's toddler daughter (named Molly) tripped it twice in one day. Later generalized to covers over stop/reset switches on disk drives and networking equipment. (via Mike Forbes)
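
A minimal sketch of the idea behind a software molly-guard: make the operator name the machine before a destructive action runs. This illustrates the pattern only and is not molly-guard's own code; `confirmed` and `guarded` are hypothetical names.

```python
import socket


def confirmed(typed_name, actual_name=None):
    """True only if the operator typed the name of the machine they are on."""
    if actual_name is None:
        actual_name = socket.gethostname()
    return typed_name.strip().lower() == actual_name.strip().lower()


def guarded(action, typed_name, actual_name=None):
    """Run the destructive action only after the hostname check passes."""
    if not confirmed(typed_name, actual_name):
        return "refused"
    action()
    return "done"


calls = []
# Operator thinks they are on web-01 but the shell is really on db-01.
wrong_box = guarded(lambda: calls.append("reboot"), "web-01", "db-01")
# Operator names the right machine, so the action goes ahead.
right_box = guarded(lambda: calls.append("reboot"), "db-01", "db-01")
```

The check is deliberately dumb. Its value is the pause it forces when you have six SSH sessions open and are about to reboot the wrong one.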

Four short links: 10 December 2018

Language Zoo, VS AI, Advertising Plus, and Minecraft Scripting

  1. The Programming Languages Zoo -- a collection of miniature programming languages that demonstrates various concepts and techniques used in programming language design and implementation.
  2. AI in Visual Studio Code -- good to see IDEs getting AI-powered features to augment coders. In some small way, Doug Engelbart would be proud.
  3. Outgrowing Advertising: Multimodal Business Models as a Product Strategy -- business models from Chinese companies that are augmenting advertising with other revenue streams.
  4. Minecraft Scripting API in Public Beta -- The Minecraft Script Engine uses the JavaScript language. Scripts can be written and bundled with Behaviour Packs to listen and respond to game events, get (and modify) data in components that entities have, and affect different parts of the game.

Four short links: 7 December 2018

Broken Feedback, Fake AI, Teaching with Jupyter, and Multiplayer Code UI

  1. Why Ratings and Feedback Forms Don't Work (The Atlantic) -- Negative feedback is actually good feedback because it yields greater efficiency and performance. [...] Positive feedback, by contrast, causes the system to keep going, unchecked. Like a thermostat that registers the room as too warm and cranks up the furnace, it’s generally meant to be avoided. But today’s understanding of feedback has reversed those terms.
  2. How to Recognize Fake AI-Generated Images -- worth remembering that researchers are in a war with these kinds of heuristics because if "straight hair looks like paint," then a researcher can get a paper out of fixing that.
  3. Teaching and Learning with Jupyter -- open book about Jupyter and its use in teaching and learning.
  4. repl.it Multiplayer -- code with friends in the same editor, execute programs in the same interpreter, interact with the same terminal, chat in the IDE, edit files and share the same system resources, and ship applications from the same interface.
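
The control-theory sense of feedback in the Atlantic piece (item 1) fits in a few lines. This is a toy model under my own assumptions: one `gain` number stands in for the whole thermostat-and-furnace loop, and `simulate` is a hypothetical helper, not anything from the article.

```python
def simulate(gain, start=25.0, setpoint=20.0, steps=20):
    """Apply a correction proportional to the error at every step.

    A negative gain pushes against the error (negative feedback).
    A positive gain pushes with the error (positive feedback).
    """
    temp = start
    for _ in range(steps):
        error = temp - setpoint
        temp += gain * error
    return temp


# Too warm, so heat less: the error halves each step and the room settles.
settled = simulate(gain=-0.5)

# Too warm, so heat more: the error grows 1.5x each step and runs away.
runaway = simulate(gain=0.5)
```

Same loop, opposite sign: the first converges on the setpoint and the second diverges, which is the article's point about the engineering meaning of "negative."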

Four short links: 6 December 2018

Public Domain, Optimistic Sci-Fi, C64 Defrag, and Quantum Computing

  1. Re-Opening of the Public Domain (Creative Commons) -- after years of legal extension of copyright terms, 2019 will be the first year in which new materials fall into the American public domain, and Creative Commons is throwing a bash at the Internet Archive.
  2. Better Worlds (The Verge) -- starting on January 14th, we’ll be publishing Better Worlds: 10 original fiction stories, five animated adaptations, and five audio adaptations by a diverse roster of science fiction authors who take a more optimistic view of what lies ahead in ways both large and small, fantastical and everyday. Necessary! I heard a great interview with Tyler Cowen where he said, "you cannot live with pessimism, right? There’s also a notion that more optimism is a partially self-fulfilling prophecy. Believing pessimistic views might make them more likely to come about." It is a fallacy to conflate optimism with naivete.
  3. A Disk Defragmenter for the Commodore 64 -- I don't know what's more insane: watching a great 40x25 homage to the classic Windows defrag progress screen or reading the bonkers BASIC code behind it.
  4. Quantum Computing Progress and Prospects -- provides an introduction to the field, including the unique characteristics and constraints of the technology, and assesses the feasibility and implications of creating a functional quantum computer capable of addressing real-world problems. This report considers hardware and software requirements, quantum algorithms, drivers of advances in quantum computing and quantum devices, benchmarks associated with relevant use cases, the time and resources required, and how to assess the probability of success. Separate the hype from the reality and develop a sense of the probability of different possible evolutionary paths for the technology.

Four short links: 5 December 2018

NLP for Code, Monolith vs. Modular, Automatic Gender Recognition, and Budget Simulator

  1. code2vec -- a dedicated website for demonstrating the principles shown in the paper code2vec: Learning Distributed Representations of Code. An interesting start to using a productive NLP technique on code.
  2. Monolithic or Modular -- When monolithic adherents look at a modular project, they may think that it’s low quality or abandoned simply because commit count is low and rare, new features are not being added, and the project has no funding or community events. Interestingly, these same properties are what modular adherents will perceive as a good thing, likely to indicate that the module is complete. Monolithic adherents don’t believe a project could ever be “complete.”
  3. The Misgendering Machines: Trans/HCI Implications of Automatic Gender Recognition -- I show that AGR consistently operationalizes gender in a trans-exclusive way, and consequently carries disproportionate risk for trans people subject to it. In addition, I use the dearth of discussion of this in HCI papers that apply AGR to discuss how HCI operationalizes gender and the implications that this has for the field’s research. I conclude with recommendations for alternatives to AGR and some ideas for how HCI can work toward a more effective and trans-inclusive treatment of gender. (via Alvaro Videla)
  4. Occult Defence Agency Budgeting Simulator -- a hilarious exercise whose point is about what happens the year after you cut the budget, with parallels to UK fiscal policy left as exercise for the (pixie-ravaged) reader. I've long held that simulations are a fantastic way to make a point. (via David Stark)

Four short links: 4 December 2018

Voice Technology, AI Summaries, Time Tracker, and Homomorphic Encryption

  1. Fifteen Unconventional Uses of Voice Technology (Nicole He) -- Students had half a semester to learn tools like the Web Speech API, Dialogflow, and Actions on Google, and then were tasked with making something...interesting. The in-class code examples we used are on GitHub. Here are 15 funny, subversive, and impressively weird final projects from the class.
  2. Summary of 2018's Most Important AI Papers -- To help you catch up, we’ve summarized 10 important AI research papers from 2018 to give you a broad overview of machine learning advancements this year. There are many more breakthrough papers worth reading as well, but we think this is a good list for you to start with.
  3. arbtt -- a time tracker that sits in the background. You write rules that tell it how to categorize your activity.
  4. Microsoft Simple Encrypted Arithmetic Library -- an easy-to-use but powerful homomorphic encryption library written in C++. It supports both the BFV and the CKKS encryption schemes. (via Microsoft Research Blog)
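
For a feel of what "homomorphic" means in item 4, here is a toy sketch of the Paillier cryptosystem, which lets you add two numbers while they stay encrypted. To be clear about the assumptions: this is not SEAL's API and not the BFV or CKKS schemes it implements, the primes are absurdly small, and the function names are mine. It only needs the standard library (Python 3.8+ for the modular inverse via `pow`).

```python
import math
import random


def keygen(p, q):
    """Toy Paillier keys from two small primes (insecure sizes, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # this shortcut is valid because we fix g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """c = g^m * r^n mod n^2, with g = n + 1 and random r coprime to n."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the plaintexts underneath."""
    return (c1 * c2) % (n * n)


n, priv = keygen(61, 53)
c_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
total = decrypt(n, priv, c_sum)  # computed without ever decrypting 20 or 22
```

The party doing the addition never sees 20 or 22, only ciphertexts. Libraries like SEAL exist because doing this securely and fast, with multiplication as well as addition, is much harder than the toy suggests.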

Four short links: 3 December 2018

Amazon and OSS, Audio to Keystrokes, The New OS, and Software Sprawl

  1. Amazon is Competing with Its Customers -- What's more, Kreps said, Amazon has not contributed a single line of code to the Apache Kafka open source software and is not reselling Confluent's cloud tool. Sometimes Amazon contributes back, but increasingly often it seems like its software MO is exploitation not co-creation. This is what prompted the creation of various "open except if you resell it as a cloud service"-source licenses, like the Commons Clause.
  2. kbd-audio -- tools for capturing and analyzing keyboard input paired with microphone capture.
  3. Kubernetes is the OS That Matters (Matt Asay) -- provocative clickbait title, but the point is important: if single-machine apps are the exception, then the lowest layer of critical shared software is no longer the OS but instead the cluster manager.
  4. Software Sprawl, The Golden Path, and Scaling Teams with Agency (Charity Majors) -- good talk on how to recover from "we're using too many shiny tools, and it's hard to make progress because there's no common set of tools, so everyone's reinventing the wheel, and omg fire."

Four short links: 30 November 2018

Advents are Coming, Open Source, Restricted Exports, and Misinformation Operations

  1. QEMU Advent Calendar -- An amazing QEMU disk image every day! It's that time of year again! See also Advent of Code.
  2. De Facto Closed Source -- You want to download thousands of lines of useful, but random, code from the internet, for free, run it in a production web server, or worse, your user’s machine, trust it with your paying users’ data and reap that sweet dough. We all do. But then you can’t be bothered to check the license, understand the software you are running, and still want to blame the people who make your business a possibility when mistakes happen, while giving them nothing for it? This is both incompetence and entitlement.
  3. U.S. Government Wonders What to Limit Exports Of -- The representative general categories of technology for which Commerce currently seeks to determine whether there are specific emerging technologies that are essential to the national security of the United States include: (1) Biotechnology, such as: (i) Nanobiology; (ii) Synthetic biology; (iv) Genomic and genetic engineering; or (v) Neurotech. (2) Artificial intelligence (AI) and machine learning technology, such as: (i) Neural networks and deep learning (e.g., brain modeling, time series prediction, classification); (ii) Evolution and genetic computation (e.g., genetic algorithms, genetic programming); (iii) Reinforcement learning; (iv) Computer vision (e.g., object recognition, image understanding); (v) Expert systems (e.g., decision support systems, teaching systems); (vi) Speech and audio processing (e.g., speech recognition and production); (vii) Natural language processing (e.g., machine translation); (viii) Planning (e.g., scheduling, game playing); (ix) Audio and video manipulation technologies (e.g., voice cloning, deepfakes); (x) AI cloud technologies; or (xi) AI chipsets. (3) Position, Navigation, and Timing (PNT) technology. (4) Microprocessor technology, such as: (i) Systems-on-Chip (SoC); or (ii) Stacked Memory on Chip. (5) Advanced computing technology, such as: (i) Memory-centric logic. (6) Data analytics technology, such as: (i) Visualization; (ii) Automated analysis algorithms; or (iii) Context-aware computing. (7) Quantum information and sensing technology, such as (i) Quantum computing; (ii) Quantum encryption; or (iii) Quantum sensing. (8) Logistics technology, such as: (i) Mobile electric power; (ii) Modeling and simulation; (iii) Total asset visibility; or (iv) Distribution-based Logistics Systems (DBLS). (9) Additive manufacturing (e.g., 3D printing); (10) Robotics such as: (i) Micro-drone and micro-robotic systems; (ii) Swarming technology; (iii) Self-assembling robots; (iv) Molecular robotics; (v) Robot compliers; or (vi) Smart Dust. (11) Brain-computer interfaces, such as (i) Neural-controlled interfaces; (ii) Mind-machine interfaces; (iii) Direct neural interfaces; or (iv) Brain-machine interfaces. (12) Hypersonics, such as: (i) Flight control algorithms; (ii) Propulsion technologies; (iii) Thermal protection systems; or (iv) Specialized materials (for structures, sensors, etc.). (13) Advanced Materials, such as: (i) Adaptive camouflage; (ii) Functional textiles (e.g., advanced fiber and fabric technology); or (iii) Biomaterials. (14) Advanced surveillance technologies, such as: Faceprint and voiceprint technologies. It's a great list of what's in the next Gartner Hype Cycle report.
  4. The Digital Maginot Line (Renee DiResta) -- We know this is coming, and yet we’re doing very little to get ahead of it. No one is responsible for getting ahead of it. [...] platforms aren’t incentivized to engage in the profoundly complex arms race against the worst actors when they can simply point to transparency reports showing that they caught a fair number of the mediocre actors. [...] The regulators, meanwhile, have to avoid the temptation of quick wins on meaningless tactical bills (like the Bot Law) and wrestle instead with the longer-term problems of incentivizing the platforms to take on the worst offenders (oversight), and of developing a modern-day information operations doctrine.

Four short links: 29 November 2018

Security Sci-Fi, AWS Toys, Quantum Ledger, and Insecurity in Software in Hardware

  1. The Cliff Nest -- sci-fi story with computer security challenges built in.
  2. Amazon Textract -- OCR in the cloud, extracting not just text but also structured tables. Part of a big feature dump Amazon's done today, including recommendations, AWS on-prem, and a fully managed time series database.
  3. Quantum Ledger Database -- a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. Amazon QLDB tracks each and every application data change and maintains a complete and verifiable history of changes over time. Many of the advantages of a blockchain ledger without the distributed pains. Quantum in the sense of "minimum chunk of something," not "uses quantum computing."
  4. Sennheiser Headset Software Enabled MITM Attacks -- When users have been installing Sennheiser's HeadSetup software, little did they know the software was also installing a root certificate into the Trusted Root CA Certificate store. To make matters worse, the software was also installing an encrypted version of the certificate's private key that was not as secure as the developers may have thought. This is the price of using software to improve hardware.
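
The "immutable and cryptographically verifiable" log in item 3 is easier to picture with a hash chain in hand. This is a generic sketch of that technique, assuming SHA-256 and a simple list of dicts. It is not a description of how Amazon QLDB is built, and `append`, `verify`, and `entry_hash` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(log, payload):
    """Append a change record that commits to everything before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev_hash,
                "hash": entry_hash(prev_hash, payload)})


def verify(log):
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "alice", "balance": 10})
append(ledger, {"account": "alice", "balance": 7})
intact = verify(ledger)

ledger[0]["payload"]["balance"] = 1000  # quietly rewrite history
tamper_detected = not verify(ledger)
```

With one trusted owner there is no consensus protocol to run, which is the "without the distributed pains" part: verification is just recomputing hashes.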

Four short links: 28 November 2018

FaaS, Space as a Service, Bot Yourself, and Facebook's RL Platform

  1. Firecracker -- Amazon's open source virtualization technology that is purpose-built for creating and managing secure, multitenant containers and functions-based services. Docker but for FaaS platforms. Best explanation is on lobste.rs: Firecracker is solving the problem of multitenant container density while maintaining the security boundary of a VM. If you’re entirely running first-party trusted workloads and are satisfied with them all sharing a single kernel and using Linux security features like cgroups, selinux, and seccomp, then Firecracker may not be the best answer. If you’re running workloads from customers similar to Lambda, desire stronger isolation than those technologies provide, or want defense in depth, then Firecracker makes a lot of sense. It can also make sense if you need to run a mix of different Linux kernel versions for your containers and don’t want to spend a whole bare-metal host on each one.
  2. Amazon Ground Station: Ingest and Process Data from Orbiting Satellites -- a sign that space is becoming more mainstream. Also interesting because they're doing a bunch of processing in EC2 rather than at the basestation. General-purpose computers often beat specialized ones.
  3. Me Bot -- A simple tool to make a bot that speaks like you, simply learning from your WhatsApp Chats. (via Hacker News)
  4. Horizon -- FB open sources reinforcement learning platform for large-scale products and services, built on PyTorch.