Four Short Links

Nat Torkington's eclectic collection of curated links.

Four short links: 22 October 2018

Perl in the Browser, Pharo Programming, Program Synthesis, and Raster Vision

  1. WebPerl -- run Perl in the browser, via WebAssembly and Emscripten. PerlMonks discussion. (via Hacker News)
  2. Pharo -- a pure object-oriented programming language and a powerful environment focused on simplicity and immediate feedback (think IDE and OS rolled into one). Smalltalk's ideas are ready for a comeback!
  3. Type-Driven Program Synthesis -- The talk will present two applications of type-driven synthesis. The first one is a tool called Synquid, which creates recursive functional programs from scratch given a refinement type as input. Synquid is the first synthesizer powerful enough to automatically discover programs that manipulate complex data structures, such as balanced trees and propositional formulas. The second application is a language called Lifty, which uses type-driven synthesis to repair information flow leaks. In Lifty, the programmer specifies expressive information flow policies by annotating the sources of sensitive data with refinement types, and the compiler automatically inserts access checks necessary to enforce these policies across the code.
  4. Raster Vision -- open source framework for deep learning on satellite and aerial imagery.

Four short links: 19 October 2018

PDF to Data Frame, Clever Story, Conceptual Art, and Automatic Patch Synthesis

  1. Camelot -- Python library that extracts tables of data from PDF documents, returning them as Pandas frames.
  2. STET -- short story told via footnotes, editorial markup, and more. Magnificent! (via Cory Doctorow)
  3. Solving Sol -- interpreting a conceptual artist's art as instructions, reframed as an AI problem. Clever!
  4. Human-Competitive Patches with Repairnator -- Repairnator is a bot. It constantly monitors software bugs discovered during continuous integration of open source software and tries to fix them automatically. If it succeeds to synthesize a valid patch, Repairnator proposes the patch to the human developers, disguised under a fake human identity. To date, Repairnator has been able to produce five patches that were accepted by the human developers and permanently merged in the code base.

Four short links: 18 October 2018

Git Playbook, Lessons Learned, Neural NLP, and Landscape Generation

  1. Flight Rules for Git -- the hard-earned body of knowledge recorded in manuals that list, step-by-step, what to do if X occurs and why. Essentially, they are extremely detailed, scenario-specific standard operating procedures. What to do after you shoot yourself in the foot in interesting ways with Git.
  2. Lessons Learned from Creating a Rich-Text Editor with Real-Time Collaboration -- This article describes how we approached the problem and what challenges we had to overcome in order to provide real-time collaborative editing capable of handling rich text. Check it out if you are interested in: learning what problems you may face when implementing real-time collaborative editing, building a rich-text editor with support for real-time collaboration, and how we approached collaborative editing in CKEditor 5.
  3. A Review of the Recent History of Natural Language Processing -- This post will discuss major recent advances in NLP focusing on neural network-based methods.
  4. Landscape -- software that builds the Cloud-Native Computing Foundation's landscape of products.

Four short links: 17 October 2018

Reservoir Computing, ProxyJump, SID Sequencer, and 2KB AI

  1. MEMS Neuromorphic Computing -- the construction of the first reservoir computing device built with a microelectromechanical system (MEMS). [...] [T]he neural network exploits the nonlinear dynamics of a microscale silicon beam to perform its calculations. The group's work looks to create devices that can act simultaneously as a sensor and a computer using a fraction of the energy a normal computer would use. Early-stage research but an interesting direction for the future of hardware.
  2. SSH ProxyJump -- it’s somewhat common to have what’s known as a “jump host” serve as an SSH gateway to a remote network. You use SSH to log into the jump host (or “jump server”) and from there use SSH to log into an internal host that’s not directly accessible from the internet. This useful utility makes it a one-step action.
  3. Booting defMON -- an introduction to an absolutely wild low-level sequencer for the C64 SID chips.
  4. Machine Learning on 2KB of RAM -- This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices—such as those based on the Arduino Uno board having an 8-bit ATmega328P microcontroller operating at 16 MHz with no native floating point support, 2KB RAM, and 32KB read-only flash. (jaws drop)
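
The ProxyJump item above is easy to try from a script. A minimal sketch, assuming an OpenSSH client new enough to have the `-J` option (7.3 or later); the host names and the `proxyjump_command` helper are hypothetical, made up for illustration.

```python
import shlex


def proxyjump_command(target, jump_hosts, remote_command=None):
    """Build an ssh argv that reaches `target` through one or more jump hosts.

    `-J` takes a comma-separated list of jump hosts, visited in order.
    """
    if not jump_hosts:
        raise ValueError("need at least one jump host")
    argv = ["ssh", "-J", ",".join(jump_hosts), target]
    if remote_command:
        argv.append(remote_command)
    return argv


# Hypothetical hosts: one bastion in front of an internal machine.
cmd = proxyjump_command("admin@internal.example.lan", ["me@jump.example.com"])
print(shlex.join(cmd))
```

Pass `cmd` to `subprocess.run` to actually connect. The same effect is available as the `ProxyJump` directive in `~/.ssh/config` if you'd rather not type the flag.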

Four short links: 16 October 2018

Common Sense, Photorealistic Rendering, Logic Game, and the Grey-hat Patcher

  1. Teaching Machines Common Sense Reasoning (DARPA) -- To focus this new effort, MCS will pursue two approaches for developing and evaluating different machine common sense services. The first approach will create computational models that learn from experience and mimic the core domains of cognition as defined by developmental psychology. [...] The second MCS approach will construct a common sense knowledge repository capable of answering natural language and image-based queries about common sense phenomena by reading from the web.
  2. Physically Based Rendering -- a textbook that describes both the mathematical theory behind a modern photorealistic rendering system as well as its practical implementation.
  3. QED -- a short interactive text in propositional logic arranged in the format of a computer game.
  4. A Mysterious Grey-Hat is Patching MikroTik Routers -- "I added firewall rules that blocked access to the router from outside the local network," Alexey said. "In the comments, I wrote information about the vulnerability and left the address of the @router_os Telegram channel, where it was possible for them to ask questions." More helpful than some corporate IT departments...

Four short links: 15 October 2018

Robots, Cryptocurrencies, Bayes, and Brains

  1. What People See in a Robot (YouTube) -- In a study using 24 robots selected from this three-dimensional appearance space, I then show that the different dimensions separately predict inferences people make about the robot’s affective, social-moral, and physical capacities. (via RoboHub)
  2. Crypto is the Mother of All Scams and (Now Busted) Bubbles While Blockchain Is The Most Over-Hyped Technology Ever, No Better than a Spreadsheet/Database (Nouriel Roubini) -- Roubini's testimony to the Hearing of the U.S. Senate Committee on Banking, Housing, and Urban Affairs on Blockchains. It is clear by now that Bitcoin and other cryptocurrencies represent the mother of all bubbles, which explains why literally every human being I met between Thanksgiving and Christmas of 2017 asked me first if they should buy them. [...] A chart of Bitcoin prices compared to other famous historical bubbles and scams—like Tulip-mania, the Mississippi Bubble, the South Sea Bubble—shows that the price increase of Bitcoin and other crypto junkcoins was 2X or 3X bigger than previous bubbles, and the ensuing collapse and bust as fast and furious and deeper. [...] Actually calling this useless vaporware garbage a “shitcoin” is a grave insult to manure that is a most useful, precious, and productive good as a fertilizer in agriculture. It's all quotable. Read it.
  3. Bayes' Theorem in the 21st Century -- I recently completed my term as editor of an applied statistics journal. Maybe a quarter of the papers used Bayes’ theorem. Almost all of these were based on uninformative priors, reflecting the fact that most cutting-edge science does not enjoy Five-Thirty-Eight-level background information. Are we in for another Bayesian bust?
  4. Numenta's New Theory -- research paper, talk, NYT story. Will be interesting to see how this fares in peer review.
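
To see why the choice of prior in the Bayes item matters, here is a worked sketch. The Beta-Binomial model, the prior parameters, and the data (7 successes in 10 trials) are all my own assumptions for illustration, not taken from the article.

```python
from fractions import Fraction


def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior.

    Bayes' theorem with a Beta prior and binomial data gives a
    Beta(prior_a + successes, prior_b + failures) posterior.
    """
    a = prior_a + successes
    b = prior_b + (trials - successes)
    return Fraction(a, a + b)


# Uninformative (uniform) prior: the data dominates.
flat = posterior_mean(1, 1, successes=7, trials=10)

# Hypothetical informative prior worth 100 earlier trials at a 20% rate.
informed = posterior_mean(20, 80, successes=7, trials=10)

print(float(flat), float(informed))
```

With the flat prior the estimate lands close to the raw 70% rate; with strong background information the same ten observations barely move it.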

Four short links: 12 October 2018

Activity Alert, JavaScript Visualizations, OT vs. CRDT, and Senior Engineering

  1. Publicly Available Tools Seen in Cyber Incidents Worldwide (US-CERT) -- The tools detailed in this activity alert fall into five categories: remote access trojans (RATs), webshells, credential stealers, lateral movement frameworks, and command and control (C2) obfuscators. This activity alert provides an overview of the threat posed by each tool, along with insight into where and when it has been deployed by threat actors. Measures to aid detection and limit the effectiveness of each tool are also described. The activity alert concludes with general advice for improving network defense practices.
  2. Muze -- Tableau-like visualizations in JavaScript. Open source (MIT).
  3. Real Differences between OT and CRDT for Co-Editors -- key CRDT design issues include designing CRDT-special data structures and schemes for representing and manipulating object sequences, searching and executing identifier-based operations in the object sequence, and conversions between internal identifier-based operations and external position-based operations, which collectively deal with both application-specific and concurrency issues in co-editing. This approach has induced a myriad of CRDT-specific challenges and puzzles, such as the correctness of key CRDT data structures and functional components, tombstone overhead, variable and lengthy identifiers, inconsistent-position-integer-ordering and infinite loop flaws, position-order-violation puzzles, and concurrent-insert-interleaving puzzles.
  4. What's a Senior Engineer's Job? (Julia Evans) -- I want to talk here about the work that a senior engineer does.
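
The CRDT vocabulary in item 3 (identifier-based operations, position-based operations, tombstones) is easier to follow with a toy. A minimal sketch loosely modeled on RGA-style sequence CRDTs, assuming operations are delivered in causal order; the `Replica` class and its method names are mine, not from the paper.

```python
class Replica:
    """One site's copy of a shared text (a sketch, not production code)."""

    def __init__(self, site):
        self.site = site
        self.clock = 0   # Lamport clock, so later edits get larger ids
        self.items = []  # [identifier, char, deleted] in document order

    def text(self):
        return "".join(ch for _, ch, dead in self.items if not dead)

    def _visible(self):
        return [item for item in self.items if not item[2]]

    # Position-based operations (what the editor sees) are converted to
    # identifier-based operations (what is sent to other replicas).
    def local_insert(self, pos, ch):
        ref = self._visible()[pos - 1][0] if pos > 0 else None
        op = ("ins", ref, (self.clock + 1, self.site), ch)
        self.apply(op)
        return op

    def local_delete(self, pos):
        op = ("del", self._visible()[pos][0])
        self.apply(op)
        return op

    def apply(self, op):
        if op[0] == "del":
            for item in self.items:
                if item[0] == op[1]:
                    item[2] = True  # tombstone: hidden but never removed
            return
        _, ref, ident, ch = op
        self.clock = max(self.clock, ident[0])
        i = 0
        if ref is not None:
            i = next(n for n, it in enumerate(self.items) if it[0] == ref) + 1
        # Concurrent inserts after the same reference: larger ids go first.
        while i < len(self.items) and self.items[i][0] > ident:
            i += 1
        self.items.insert(i, [ident, ch, False])


a, b = Replica("A"), Replica("B")
for op in (a.local_insert(0, "H"), a.local_insert(1, "i")):
    b.apply(op)
op_a = a.local_insert(2, "!")  # two sites edit the same position at once
op_b = b.local_insert(2, "?")
a.apply(op_b)
b.apply(op_a)
b.apply(a.local_delete(0))
print(a.text(), b.text(), len(a.items))
```

Both replicas end up with the same text, and the deleted character still occupies a slot in `items`: that is the tombstone overhead the paper complains about.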

Four short links: 11 October 2018

Decentralized Applications, Global Startups, Better Shuffling, and Prolog Text

  1. Decentralized Applications (MIT) -- interesting course to be taught by Robert T Morris. The goal of 6.S974 is to understand recent efforts in decentralized applications, to learn what the main design trade-offs are, and to identify areas for new research. My spidey-sense is tingling. This has all the hallmarks of one of those courses whose graduates build the next wave of companies and research areas.
  2. America Is Losing Its Startup Edge -- ignore the use of percentages and Decline of Roman^W American Empire alarmism, it's the rise of the rest of the world that's fascinating here. While it is true that venture-capital investment in the U.S. continues to rise, having reached more than $90 billion in 2017, such investment is growing even faster in other parts of the world, expanding by nearly 375%—more than twice the 160% increase here. China saw the largest jump, its share expanding from 4% of global venture investment in 2005 to nearly a quarter of it by 2017.
  3. Playlist Shuffle -- This paper proposes a novel approach at shuffling a looping sequence that minimizes caveats of naive solutions, keeps computation low, and offers a high degree of variance. [...] The problem is how to repeatedly shuffle a cyclic list and avoid too close and too far duplicates.
  4. Art of Prolog, 2E -- this 1994 classic is now an open access title, free PDF download. Prolog is rational AI magic, while deep learning is intuitive AI magic.
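
The playlist shuffle problem in item 3 can be made concrete. This is a naive sketch of the problem statement, not the paper's algorithm: reshuffle once per cycle and reject any order that replays a track too soon after its last play. The function name and `min_gap` parameter are my own, and tracks are assumed to be unique.

```python
import random


def next_cycle(previous, min_gap, rng=random):
    """Return a reshuffled copy of `previous` in which every track is at
    least `min_gap` plays away from where it sat in the previous cycle."""
    n = len(previous)
    if not 0 <= min_gap <= n // 2:
        raise ValueError("min_gap must be between 0 and half the list length")
    last_seen = {track: j for j, track in enumerate(previous)}
    while True:
        candidate = list(previous)
        rng.shuffle(candidate)
        # Distance between plays: rest of the old cycle plus new index.
        if all((n - last_seen[t]) + i >= min_gap for i, t in enumerate(candidate)):
            return candidate


rng = random.Random(0)
first = list("abcdefgh")
second = next_cycle(first, min_gap=3, rng=rng)
print(first, second)
```

Because each cycle is a full permutation, a track can never be more than `2 * n - 1` plays from its previous appearance, which bounds the "too far" case as well. Rejection sampling is wasteful for large `min_gap`; it is here only because it is easy to check by eye.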

Four short links: 10 October 2018

Better Education, Do You Need Blockchain?, Visualization Book, and Hiring Coders

  1. Generation of Greatness (Edwin Land) -- eye-wateringly sexist on the surface but (if you replace "boys" with "children" and "men" with "people") an astonishingly forward-thinking piece on education. I'd want to hire graduates of this approach. (via Javier Candero)
  2. Do You Need Blockchain? Flowchart -- from page 42 of the Blockchain Technology Overview report from NIST.
  3. Visualization Analysis and Design (Amazon) -- Tamara Munzner's systematic, comprehensive framework for thinking about visualization in terms of principles and design choices. The book features a unified approach, encompassing information visualization techniques for abstract data, scientific visualization techniques for spatial data, and visual analytics techniques for interweaving data transformation and analysis with interactive visual exploration. It emphasizes the careful validation of effectiveness and the consideration of function before form. (via review)
  4. Assessing Software Engineering Candidates (Bryan Cantrill) -- Joyent's guidance, originally published as a company RFD. While we advocate (and indeed, insist upon) interviews, they should come relatively late in the process; as much assessment as possible should be done by allowing the candidate to show themselves as software engineers truly work: on their own, in writing.

Four short links: 9 October 2018

Lost Lessons, Metaphors to Monads, Future of Work, and Awesome Starts at The Top

  1. Neither Paper Nor Digital Does Reading Well -- Develop a familiarity with, for example, Alan Kay’s or Douglas Engelbart’s visions for the future of computing and you are guaranteed to become thoroughly dissatisfied with the limitations of every modern OS. Reading up hypertext theory and research, especially on hypertext as a medium, is a recipe for becoming annoyed at The Web. Catching up on usability research throughout the years makes you want to smash your laptop against the wall in anger. And trying to fill out forms online makes you scream "it doesn’t have to be this way!" at the top of your lungs. That software development doesn’t deal with research or attempts to get at hard facts is endemic to the industry. (via Daniel Siegel)
  2. The Unreasonable Effectiveness of Metaphor (YouTube) -- Julie Moronuki, author of Haskell Programming from First Principles, sneaks up on the idea of monads by starting with how linguists and cognitive scientists understand metaphors. (via @somegoob)
  3. World Development Report 2019: The Changing Nature of Work -- In countries with the lowest human capital investments today, our analysis suggests that the workforce of the future will only be one-third to one-half as productive as it could be if people enjoyed full health and received a high-quality education.
  4. Chairman of Nokia Learned Deep Learning -- I realized that as a long-time CEO and chairman, I had fallen into the trap of being defined by my role: I had grown accustomed to having things explained to me. Instead of trying to figure out the nuts and bolts of a seemingly complicated technology, I had gotten used to someone else doing the heavy lifting. [...] After a quick internet search, I found Andrew Ng’s courses on Coursera, an online learning platform. Andrew turned out to be a great teacher who genuinely wants people to learn. I had a lot of fun getting reacquainted with programming after a break of nearly 20 years. Once I completed the first course on machine learning, I continued with two specialized follow-up courses on deep learning and another course focusing on convolutional neural networks, which are most commonly applied to analyzing visual imagery. Yow.

Four short links: 8 October 2018

Stripe Stats, Worker Ethics, FPGA Futures, and Internet Archive Stats

  1. The Story of Stripe (Wired UK) -- Over the past year, 65% of UK internet users and 80% of U.S. users have bought something from a Stripe-powered business.
  2. Tech Workers Want to Know: What Are We Building This For? (NYT) -- about time. I see plenty of places mandating their young kids are taught coding. Who's mandating their coders take ethics classes so they have an ability to think critically about the applications of what they develop?
  3. Inference: The Future of FPGA (Next Platform) -- Inference, which is almost exclusively run on Xeon servers in the data center these days, therefore represents maybe 1% of the workload in the server installed base and has driven a little less than 1% of the server spending, by our math. [...] But as organizations figure out how to use machine learning frameworks to build neural networks and then algorithms that they embed into their applications, there will be a lot more inference going on, and this will become a representative workload driving a lot of chip revenues.
  4. Internet Archive Stats -- 22PB of data, growing by 4PB/year, including four million books, 200 million hours of broadcast news, and 300,000 playable classic video games; 1.5 billion pages crawled per week; 200 staffers.

Four short links: 5 October 2018

Supply Chain Security, ML in FB Marketplace, Datasette Ideas, and Scraper DSL

  1. Motherboard Supply Chain Compromise (Bloomberg) -- fascinating story of Chinese compromise of Supermicro motherboards, causing headaches for AWS, Apple, and the U.S. military, among many others. See also tech for spotting these things and some sanity checking on the article's claims.
  2. How Facebook Marketplace Uses Machine Learning -- nice. It's increasingly clear there's not much that's user-facing that can't benefit from machine learning to prompt, augment, and check user input.
  3. Interesting Ideas in Datasette (Simon Willison) -- solid technical reflection on non-obvious approaches and techniques in his project.
  4. Ferret -- interesting approach: a DSL for writing web scrapers.

Four short links: 4 October 2018

Autonomy and UI, Replicating ML Research, FPGA Dev, and Standard Notes

  1. UI for Self-Driving Cars -- I'd never thought about it, but Ford has: how does a self-driving car signal its intentions to humans (and/or other autonomous vehicles around)? Through our testing, we believe these signals have the chance to become an accepted visual language that helps address an important societal issue in how self-driving vehicles interact with humans.
  2. Reproducing Machine Learning Research -- there's good news—reproducibility breaks down in three main places: the code, the data, and the environment. I’ve put together this guide to help you narrow down where your reproducibility problems are, so you can focus on fixing them.
  3. Open Source FPGA Dev Guide -- in case you've been curious about kicking the tires. (Yes, I know FPGAs don't have tires, please don't write in.)
  4. Standard Notes -- what to use if you're nervous about entrusting your data to someone else's product roadmap (Evernote or OneNote or Keep). Free, open source, and completely encrypted. Ticks all the boxes: 2FA, automated backups to cloud storage, versioning, cross-platform (Mac, Windows, iOS, Android, Linux), offline access...
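
The three failure points named in the reproducibility item (code, data, environment) suggest recording a fingerprint with every experiment. A minimal sketch using only the standard library; the `run_fingerprint` helper and its field names are hypothetical, and the code version is passed in as a plain string rather than read from any version control system.

```python
import hashlib
import platform
import sys


def run_fingerprint(data_bytes, seed, code_version):
    """Record what would be needed to rerun an experiment.

    code_version is whatever identifies the code (for example a commit
    hash you looked up yourself); nothing here inspects a repository.
    """
    return {
        "code": code_version,
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "seed": seed,
    }


original = run_fingerprint(b"x,y\n1,2\n", seed=42, code_version="v0.1")
changed = run_fingerprint(b"x,y\n1,3\n", seed=42, code_version="v0.1")
print(original["data_sha256"] == changed["data_sha256"])
```

Comparing two fingerprints field by field shows which of the three places a mismatch came from.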

Four short links: 3 October 2018

Positive Chatbot, Inside Serverless, TimBL's Next Project, and Voting Machines

  1. Ixy -- chat with a bot that helps you not descend into irate internet madness. Nifty idea! (via Evan Prodromou)
  2. Peeking Behind the Curtains of Serverless Platforms -- interesting implementation details. We characterize performance in terms of scalability, coldstart latency, and resource efficiency, with highlights including that AWS Lambda adopts a bin-packing-like strategy to maximize VM memory utilization, that severe contention between functions can arise in AWS and Azure, and that Google had bugs that allowed customers to use resources for free.
  3. Solid -- Tim Berners-Lee's new open source project (and startup), building apps from linked data.
  4. DEFCON Voting Machines Report -- tl;dr: online voting is a disaster-in-waiting, a calamity of vulnerabilities that shabby-suited shysters would be afraid to peddle but which our local and central governments have embraced. Those who are willing to trade the integrity of their democracy for the false promise of increased voter turnout deserve neither. It is noteworthy that this year the defenses of the virtual election office were fortified using Israeli military defense software, while attack tools were limited to what is available with Kali Linux.
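
The serverless paper in item 2 says AWS Lambda uses a "bin-packing-like strategy" for VM memory. The paper excerpt does not give the actual placement algorithm, so the following is only the textbook first-fit heuristic, with made-up instance sizes, to show what packing function instances onto VMs looks like.

```python
def first_fit(instance_sizes_mb, vm_capacity_mb):
    """Place each function instance on the first VM with enough free memory,
    starting a new VM only when none of the existing ones has room."""
    vms = []   # memory sizes placed on each VM
    free = []  # remaining memory on each VM
    for size in instance_sizes_mb:
        if size > vm_capacity_mb:
            raise ValueError("instance larger than a whole VM")
        for idx, room in enumerate(free):
            if size <= room:
                vms[idx].append(size)
                free[idx] -= size
                break
        else:
            vms.append([size])
            free.append(vm_capacity_mb - size)
    return vms


# Hypothetical workload: six function instances, 1024MB VMs.
placement = first_fit([128, 512, 256, 1024, 128, 512], vm_capacity_mb=1024)
print(placement)
```

Packing tightly keeps memory utilization high, and it is also how unrelated functions end up sharing a VM, which is where the contention the authors measured can come from.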

Four short links: 2 October 2018

Apple MDM, Source Explorer, Verification-Aware Programming, and Superstar Economics

  1. MicroMDM -- open source mobile device management system (IT department lingo for "rootkit") for Apple devices.
  2. Sourcegraph Open Sourced -- Code search and intelligence, self-hosted and scalable.
  3. Dafny -- a verification-aware programming language. Verification (proving software correct) is a critical research area for the future of software, imho.
  4. The Economics of Superstars -- The key difference between this technology and public goods is that property rights are legally assigned to the seller: there are no issues of free riding due to nonexclusion; customers are excluded if they are unwilling to pay the appropriate admission fee. The implied scale economy of joint consumption allows relatively few sellers to service the entire market. And fewer are needed to serve it the more capable they are. When the joint consumption technology and imperfect substitution features of preferences are combined, the possibility for talented persons to command both very large markets and very large incomes is apparent. (via Hacker News)

Four short links: 1 October 2018

DARPA History, Probabilistic Programming, Superstar Macroeconomics, and Interactive Narrative

  1. 60 Years of Challenges and Breakthroughs (DARPA) -- a short interesting history video about the internet, TCP/IP, Licklider, and more.
  2. Introduction to Probabilistic Programming -- a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages. Probabilistic methods are a way of automating inference, and of use as we try to make software smarter.
  3. The Macroeconomics of Superstars (PDF download) -- We describe superstars as arising from digital innovations, which replace a fraction of the tasks in production with information technology that requires a fixed cost but can be reproduced at zero marginal cost. This generates a form of increasing returns to scale. To the extent that the digital innovations are excludable, it also provides the innovator with market power. Our paper studies the implications of superstar technologies for factor shares, for inequality, and for the efficiency properties of the superstar economy. (via Hacker News)
  4. Inform: Past, Present, Future (Emily Short) -- Graham Nelson's talk about how Inform came to be what it is, and where it's going. Inform is the amazing compiler that lets you write Infocom adventures...but is so much more than that. Anyone interested in programming language design, literate programming, or AR/VR interactive fiction should read this.
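
"Automating inference" in the probabilistic programming item means you write the generative model and a generic engine answers conditional questions about it. A toy sketch using rejection sampling, the simplest such engine; the two-coin model and the `infer` function are my own illustration, not taken from the book.

```python
import random


def model(rng):
    """Generative model: flip two fair coins, True means heads."""
    return rng.random() < 0.5, rng.random() < 0.5


def infer(model, condition, query, samples=20000, seed=0):
    """Estimate P(query | condition) by running the model many times and
    keeping only the runs where the condition holds."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(samples):
        outcome = model(rng)
        if condition(outcome):
            kept += 1
            hits += bool(query(outcome))
    if kept == 0:
        raise ValueError("condition never satisfied")
    return hits / kept


# Given at least one head, how likely is it that both are heads?
estimate = infer(
    model,
    condition=lambda coins: coins[0] or coins[1],
    query=lambda coins: coins[0] and coins[1],
)
print(estimate)
```

The exact answer is 1/3, and the estimate should land near it. Real systems swap rejection sampling for far more efficient inference, but the division of labor between model and engine is the same.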

Four short links: 28 September 2018

Observing Kubernetes, Ada Lovelace, Screen Time, and 6502 C

  1. kubespy -- Tools for observing Kubernetes resources in real time.
  2. Ada Lovelace's Note G -- a very readable explanation of what she did and why it's notable and remarkable, complete with loops and versions of her program in C and Pascal. (via Chris Palmer)
  3. Limiting Children’s Screen Time to Less Than Two Hours a Day Linked to Better Cognition (Neuroscience News) -- a summary of a paper in Lancet, the leading British medical journal. Taken individually, limited screen time and improved sleep were associated with the strongest links to improved cognition, while physical activity may be more important for physical health. However, only one in 20 U.S. children aged between 8-11 years meet the three recommendations advised by the Canadian 24-hour Movement Guidelines to ensure good cognitive development—9-11 hours of sleep, less than two hours of recreational screen time, and at least an hour of physical activity every day.
  4. cc65 -- a complete cross development package for 65(C)02 systems, including a powerful macro assembler, a C compiler, linker, librarian, and several other tools. cc65 has C and runtime library support for many of the old 6502 machines. That's right, you can print "Hello, World" on your C64 (and Atari 2600 and Apple ][+ and NES and ...).
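
Note G's program computes Bernoulli numbers, so here is a Python version to sit beside the article's C and Pascal. This is a sketch using the standard modern recurrence, not a transcription of Lovelace's table of operations, and her numbering of the values differs from the modern indexing used below.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return B_0 .. B_(count-1) as exact fractions.

    Uses the recurrence sum_{k=0}^{m} C(m+1, k) * B_k = 0 for m >= 1,
    solved for B_m, which gives the B_1 = -1/2 convention.
    """
    numbers = [Fraction(1)]
    for m in range(1, count):
        total = sum(comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-total / (m + 1))
    return numbers


for index, value in enumerate(bernoulli_numbers(9)):
    print(index, value)
```

Exact fractions matter here: the values alternate in sign and shrink before they grow, so floating point would lose them quickly.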

Four short links: 27 September 2018

Calendar Fallacies, Data Lineage, Firefox Monitor, and Glitch Handbook

  1. Your Calendrical Fallacy is... -- odds are high that if a programmer is sobbing into their keyboard, it's because of these pesky realities.
  2. Smoke: Fine-Grained Lineage at Interactive Speed -- lineage queries over the workflow: backward queries return the subset of input records that contributed to a given subset of output records while forward queries return the subset of output records that depend on a given subset of input records. (via Morning Paper)
  3. Introducing Firefox Monitor -- proactive alerting of your presence on HaveIBeenPwned. Introduced here.
  4. Glitch Employee Handbook -- fascinating to see how openly they operate. (via their very nicely done "come work for us" site)
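
The backward and forward lineage queries described in the Smoke item can be illustrated with a group-by that remembers where each output came from. This sketch shows only the definitions; it says nothing about how Smoke itself captures lineage efficiently, and the function and sample rows are made up.

```python
from collections import defaultdict


def group_sum_with_lineage(rows, key, value):
    """Sum `value` per `key`, recording lineage in both directions."""
    totals = defaultdict(int)
    backward = defaultdict(list)  # output group -> contributing input row ids
    forward = {}                  # input row id -> output group it feeds
    for row_id, row in enumerate(rows):
        group = row[key]
        totals[group] += row[value]
        backward[group].append(row_id)
        forward[row_id] = group
    return dict(totals), dict(backward), forward


sales = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 7},
]
totals, backward, forward = group_sum_with_lineage(sales, "region", "amount")

print(backward["north"])  # backward query: which rows produced this total?
print(forward[1])         # forward query: which total does row 1 affect?
```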

Four short links: 26 September 2018

Walmart's Blockchain, Machine Learning and Text Adventures, Algorithmic Decision-Making, and Networked Brains

  1. Walmart Requires Lettuce, Spinach Suppliers to Join Blockchain (WSJ Blog) -- built on Hyperledger, by way of IBM. I read IBM's brief but still can't figure out the benefits over, say, Walmart running their own APIed database app, but I suspect it has to do with "this way, EVERY blockchain participant has to buy a big app from IBM, instead of just Walmart buying something to run for others to contribute to." (via Dan Hon)
  2. Inform 7 and Machine Learning (Emily Short) -- TextWorld’s authors feel we’re not yet ready to train a machine agent to solve a hand-authored IF game like Zork—and they’ve documented the challenges here much more extensively than my rewording above. What they have done instead is to build a sandbox environment that does a more predictable subset of text adventure behavior. TextWorld is able to automatically generate games containing a lot of the standard puzzles.
  3. Litigating Algorithms: Challenging Government Use of Algorithmic Decision Systems -- session notes from a day-long workshop the EFF ran with the Center on Race, Inequality, and the Law.
  4. BrainNet: A Multi-Person Brain-to-Brain Interface for Direct Collaboration Between Brains -- Five groups of three subjects successfully used BrainNet to perform the Tetris task, with an average accuracy of 0.813. Furthermore, by varying the information reliability of the senders by artificially injecting noise into one sender's signal, we found that receivers are able to learn which sender is more reliable based solely on the information transmitted to their brains. Our results raise the possibility of future brain-to-brain interfaces that enable cooperative problem solving by humans using a "social network" of connected brains.

Four short links: 25 September 2018

Software Engineering, ML Hardware Trends, Time Series, and Eng Team Playbooks

  1. Notes to Myself on Software Engineering -- Code isn’t just meant to be executed. Code is also a means of communication across a team, a way to describe to others the solution to a problem. Readable code is not a nice-to-have; it is a fundamental part of what writing code is about. A solid list of advice/lessons learned.
  2. Machine Learning Shifts More Work To FPGAs, SoCs -- compute power used for AI/ML is doubling every 3.5 months. FPGAs and ASICs are already predicted to be 25% of the market for machine learning accelerators in 2018. Why? FPGAs and ASICs use far less power than GPUs, CPUs, or even the 75 watts per hour Google’s TPU burns under heavy load. [...] They can also deliver a performance boost in specific functions chosen by customers that can be changed along with a change in programming.
  3. Time Series Forecasting -- one of those "three surprising things" articles. The three surprising things: You need to retrain your model every time you want to generate a new prediction; sometimes you have to do away with train/test splits; and the uncertainty of the forecast is just as important as, or even more important than, the forecast itself.
  4. Health Monitor -- Atlassian's measures of whether your team is doing well. Their whole set of playbooks is great reading for engineering managers.
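
The three surprises in the time series item fit in a few lines. A minimal sketch, assuming a deliberately naive model (forecast equals the mean of the history) and a normal-approximation interval with z = 1.96; the function name and the sample series are invented for illustration.

```python
from statistics import mean, stdev


def walk_forward(series, min_train=3, z=1.96):
    """Refit on all data seen so far before each one-step forecast,
    and report an interval alongside the point forecast."""
    if min_train < 2:
        raise ValueError("need at least two points to estimate spread")
    results = []
    for t in range(min_train, len(series)):
        history = series[:t]          # no fixed train/test split
        center = mean(history)        # "retraining" the naive model
        spread = z * stdev(history)   # uncertainty of the forecast
        results.append({
            "t": t,
            "forecast": center,
            "low": center - spread,
            "high": center + spread,
            "actual": series[t],
        })
    return results


for step in walk_forward([10, 12, 11, 13, 12, 14]):
    print(step)
```

Swap the mean for a real model and the loop stays the same: every forecast is scored against the next observation, which replaces the usual held-out test set.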