Four Short Links

Nat Torkington's eclectic collection of curated links.

Four short links: 22 June 2018

To-Do Lists, Startup Numbers, Learning Projects, and Data Structure Synthesis

  1. A Supervised Approach To The Interpretation Of Imperative To-Do Lists -- While there has been work in the area of personal assistants for to-do tasks, no work has focused on classifying user intention and information extraction as we do. We show that our methods perform well across two corpora that span sub-domains, one of which we released. It's rare to find a data type that hasn't had a lot of NLP work done on it.
  2. The U.S. Startup is Disappearing -- While companies that were less than two years old made up about 13% of all companies in 1985, they only accounted for 8% in 2014. Ruh roh, says Schumpeter Doo.
  3. Deep Learning Project Reports and Posters -- nifty selection of projects from these Stanford undergrads. I'm struck by how diverse and interesting the projects are, even though they come from a 200-level course.
  4. Generalized Data Structure Synthesis (Adrian Colyer) -- Many systems have a few key data structures at their heart. Finding correct and efficient implementations for these data structures is not always easy. Today’s paper introduces Cozy (https://cozy.uwplse.org), which can handle this task for you, given a high-level specification of the state, queries, and update operations that need to be supported. I'm all about software writing software, or at least making life easier for those people who write software.

Four short links: 21 June 2018

Chinese Internet, Booting Linux, Pull Requests, and Commercialized Commons

  1. Beijing Wants to Rewrite the Rules of the Internet -- China’s cyber governance plan appears to have three objectives. One is a legitimate desire to address substantial cybersecurity challenges, like defending against cyber attacks and keeping stolen personal data off the black market. A second is the impulse to support domestic industry, in order to wean the government off its dependence on foreign technology components for certain IT products deemed essential to economic and national security. (In effect, these requirements exclude foreign participation, or make foreign participation only possible on Beijing’s terms.) The third goal is to expand Beijing’s power to surveil and control the dissemination of economic, social, and political information online.
  2. How Modern Linux Systems Boot -- "Sometimes the reasons for failure are obscure and annoying" could appear in every man page.
  3. The Art of Humanizing Pull Requests -- What are PR’s, how to effectively create a PR, how to give feedback on PR’s, and how to respond to feedback. For the junior dev in your life.
  4. How Markets Co-opted Free Software's Most Powerful Weapon (YouTube) -- Benjamin Mako Hill's LibrePlanet 2018 keynote. new proprietary, firm-controlled, and money-based models are increasingly replacing, displacing, outcompeting, and potentially reducing what’s available in the commons.[...] In the talk, I talk about how this happened and what I think it means for folks who are committed to working in commons. I talk a little bit about what the free culture and free software should do now that mass collaboration, these communities’ most powerful weapon, is being used against them. (via copyrighteous)

Four short links: 20 June 2018

Open Source, Milspec Origami, Machine Says No, and Golang

  1. A Bitter Guide to Open Source -- not really that bitter. A great list of all the stuff that big and successful projects need these days.
  2. Laser-Engraved Milspec Origami (IEEE Spectrum) -- The Army is interested in origami inductors because the technology could give deployed soldiers the ability to make replacement parts rather than rely on what could be a risky and expensive delivery. But never mind its practical purpose; the process is just absolutely mesmerizing to watch.
  3. When a Machine Fired Me -- the downsides of fully-automated business processes.
  4. Gotchas and Mistakes in Golang -- traps, gotchas, and common mistakes for new Golang devs.

Four short links: 19 June 2018

Product Feedback, Medical AI, DensePose, and Automating Debugging

  1. Developing a Continuous Feedback Loop -- short preso on how to get and manage a lot of feedback from customers.
  2. Google's Medical AI -- some details of studies and ambitions in the space. This quote is provocative: "They’ve finally found a new application for AI that has commercial promise."
  3. DensePose -- Facebook open sourced its real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body. See discussion.
  4. Debugging with Intelligence via Probabilistic Inference (Paper a Day) -- Xu et al., have built an automated debugger that can take a single failing test execution, and with minimal interaction from a human, pinpoint the root cause of the failure. What I find really exciting about it is that instead of brute force, there’s a certain encoded intelligence in the way the analysis is undertaken that feels very natural. The first IDE / editor to integrate a tool like this wins!

Four short links: 18 June 2018

Innovation Stack, Fundraising, Diversity and Fans, and APIs to MySQL Data

  1. The Innovation Stack (Steve Blank) -- a must-read for anyone whose company needs to "get more of that innovation thing happening here."
  2. Both Sides of the Table -- great advice for fundraising from VCs.
  3. Superfan! (Sacha Judd) -- on teams, life, and some ways in which they all go horribly wrong. Her most excellent talk from Velocity this year.
  4. xmysql -- One command to generate REST APIs for any MySQL Database.

Four short links: 15 June 2018

Pose Estimation, Data Ethics, Interactive Explanation, and Serverless Tool

  1. Through-Wall Human Pose Estimation Using Radio Signals -- RF-Pose provides accurate human pose estimation through walls and occlusions. It leverages the fact that wireless signals in the WiFi frequencies traverse walls and reflect off the human body. It uses a deep neural network approach that parses such radio signals to estimate 2D poses.
  2. Data Ethics Framework -- the UK shared their principles, the explanation of each principle, and the workbook for figuring out how to apply them.
  3. Predator and Prey (Mike Bostock) -- a really nice demo of the idea: "what if we didn't publish static text and images, but instead you could interact with the explanation?" Inspired by Bret Victor, obvs.
  4. AWS SAM CLI -- a CLI tool for local development and testing of Serverless applications.

Four short links: 14 June 2018

Historic Handwriting Recognition, Proving Security, Formal Methods, and Dank Memes

  1. Back to the Future of Handwriting Recognition -- a very readable explanation of how RAND's GRAIL system could read handwriting in 1966. (via Avi Bryant)
  2. The Surprising Security Benefits of End-to-End Formal Proofs -- talking up formal methods in software engineering, whereby you can prove your system's correctness.
  3. Software Foundations -- book series that is a broad introduction to the mathematical underpinnings of reliable software.
  4. Dank Learning: Generating Memes Using Deep Neural Networks -- both models generalize relatively well to unseen images. The average meme produced from both is difficult to differentiate from a real meme and both variants scored close to the same hilarity rating as real memes, though this is a fairly subjective metric. I wish "hilarity" were a metric that more things were judged by.

Four short links: 13 June 2018

Security Papers, Science AI, Deliberation, and Doing Science

  1. Influential Security Papers -- a ranking of top-cited papers from the area of computer security. The ranking is automatically created based on citations of papers published at top security conferences.
  2. Aristo -- Allen Institute app that reads, learns, and reasons about science.
  3. Automated Planning and Acting -- book and slides. This book is about methods and techniques that a computational agent can use for deliberative planning and acting, that is, for deciding both which actions to perform and how to perform them, to achieve some objective.
  4. Notes on Everything is F*cked -- Sanjay Srivastava posted a syllabus for a course called Everything is Fucked. The course itself is intended as a joke, but the reading list seemed interesting. These notes on the reading list papers are a great romp through the reproducibility crisis, p-hacking, and the multiplicity of ways your science can be wrong.

Four short links: 12 June 2018

Text2Binary, GraphQL, USB, and Debugging

  1. t2b -- a wicked-powerful text macro language for building binary files.
  2. GraphQL Guide -- new book by John Resig and Loren Sands-Ramshaw. (Blog post)
  3. USB Type C is Still a Mess -- yes, yes it is.
  4. Sonar -- a platform for debugging mobile apps on iOS and Android. Visualize, inspect, and control your apps from a simple desktop interface. Use Sonar as is or extend it using the plugin API. (via Facebook)

Four short links: 11 June 2018

Tiny Machine Learning, Deep Video, Software 2.0, and Smart Camera

  1. Why the Future of Machine Learning is Tiny (Pete Warden) -- If you accept all of the points above, then it’s obvious there’s a massive untapped market waiting to be unlocked with the right technology. We need something that works on cheap microcontrollers, that uses very little energy, that relies on compute not radio, and that can turn all our wasted sensor data into something useful. This is the gap that machine learning, and specifically deep learning, fills.
  2. Deep Video Portraits (YouTube) -- excellent video faking advance, hot from SIGGRAPH.
  3. Building the Software 2.0 Stack (Andrej Karpathy) -- 1.0 is pipelines and stacks, 2.0 is machine-optimized structure and parameters for code. The talk is really good.
  4. Jevois Smart Machine Vision Camera -- video sensor + quad-core CPU + USB video + serial port, all in a tiny, self-contained package (28 cc or 1.7 cubic inches, 17 grams or 0.6 oz). Insert a microSD card loaded with the provided open source computer vision algorithms (including OpenCV 3.4 and many others), connect to your desktop, laptop, and/or Arduino, and give your projects the sense of sight immediately.

Four short links: 8 June 2018

Representative Recognition, Cyberwar, Data Science Projects, and Conversational Failure

  1. NYT Uses Software to Identify Members of Congress -- In addition to confirming the identity of a member, Who The Hill has helped The Times tell some stories we couldn’t have reported otherwise. Most recently, Rachel Shorey found members of Congress at an event hosted by a SuperPAC by trawling through images found on social media and finding matches.
  2. Cyberwar Map -- both a visualization of state-sponsored cyberattacks and an index of Cyber Vault documents related to each topic (represented as nodes on the map).
  3. Cookie-Cutter Data Science -- a standard directory structure and set of conventions for a data science project, with a tool that creates a new one.
  4. Conversations Gone Awry: Detecting Early Signs of Conversational Failure -- To this end, we develop a framework for capturing pragmatic devices—such as politeness strategies and rhetorical prompts—used to start a conversation, and analyze their relation to its future trajectory. Applying this framework in a controlled setting, we demonstrate the feasibility of detecting early warning signs of antisocial behavior in online discussions.

Four short links: 7 June 2018

Algorithmic Accountability, Killing Project Maven, AI Scares Past, and Submarine Data Center

  1. Algorithmic Accountability -- an interesting refutation of calls for transparency and explainability. Instead, a governance framework for algorithmic accountability is based on the principle that an algorithmic system should employ a variety of controls to ensure operators can: verify it works in accordance with the operator’s intentions, and identify and rectify harmful outcomes. Algorithmic accountability promotes desirable outcomes, protects against harmful ones, and ensures algorithmic decisions are subject to the same requirements as human decisions.
  2. Tech Workers vs. The Pentagon (Jacobin) -- interesting insider's account of how Google workers organized against the Project Maven work for the Pentagon. Also interesting: it revealed that Project Maven was actually a pilot project for future collaborations between Google and the military. In particular, Project Maven was part of Google’s push to win the Joint Enterprise Defense Infrastructure (JEDI) contract. JEDI is the military’s next-generation cloud that will network American forces all over the world and integrate them with AI. It’s basically Skynet. And all the big cloud providers want to win the contract because it’s worth $10 billion. Google's workers just took a $10 billion ethical position. Someone's renegotiating their sales targets down right now...
  3. Past Visions of Artificial Futures: One Hundred and Fifty Years under the Spectre of Evolving Machines -- We show that discussion of these topics arose in the 1860s, within a decade of the publication of Darwin’s "The Origin of Species," and attracted increasing interest from scientists, novelists, and the general public in the early 1900s. After introducing the relevant work from this period, we categorize the various visions presented by these authors of the future implications of evolving machines for humanity. We suggest that current debates on the co-evolution of society and technology can be enriched by a proper appreciation of the long history of the ideas involved.
  4. Microsoft's Submarine Data Center (Motherboard) -- shipping container sized, 864 servers, powered by tidal and wind energy, natural cooling, hardware that dies down there will not be replaced. Sysops of the Caribbean, Davy Jones' Bit Locker, Yo Ho Ho and a Container of Rum...the Dad jokes just write themselves.

Four short links: 6 June 2018

Future Analytics, Personal Data, Wireless Power, and Counterintuitive Probability

  1. DAWN: Data Analytics for What's Next -- Stanford project working on an end-to-end suite of tools. Their breakdown of opportunities for improvement is an interesting read. The project homepage has more.
  2. Singapore Government Discussion Paper on AI and Personal Data -- The objective is to put forward a proposed accountability-based framework and provide common definitions and a common structure to facilitate constructive and systemic discussions on ethical, governance, and consumer protection issues relating to the commercial deployment of AI. (via ZDNet)
  3. Wireless Power in the Body (MIT) -- The implants are powered by radio frequency waves, which can safely pass through human tissues. In tests in animals, the researchers showed that the waves can power devices located 10 centimeters deep in tissue, from a distance of 1 meter.
  4. Counterintuitive Probability -- these will make your head ache. In the good way.

Four short links: 5 June 2018

Reinforcement Learning Notebooks, Music Translation, Service Fabric, and Nat Friedman

  1. Reinforcement Learning Notebooks -- there's also a good selection of other Jupyter AI notebooks in the Hacker News comments.
  2. Universal Music Translation Network -- We present a method for translating music across musical instruments, genres, and styles. This method is based on a multi-domain wavenet autoencoder, with a shared encoder and a disentangled latent space that is trained end-to-end on waveforms. (via NextWeb)
  3. ServiceFabric: a Distributed Platform for Building Microservices in the Cloud -- application lifecycle management of scalable and reliable applications composed of microservices running at very high density on a shared pool of machines, from development to deployment to management. (via Paper a Day)
  4. Hello, GitHub -- GitHub sold to Microsoft for $7.5 billion, and its new CEO will be the most excellent Nat Friedman (of Xamarin fame).

Four short links: 4 June 2018

Infinite Walking, Security Class, Collaborative Data Structures, and Brain Class

  1. Infinite Walking in VR -- It's the nature of the human eye to scan a scene by moving rapidly between points of fixation. We realized that if we rotate the virtual camera just slightly during saccades, we can redirect a user's walking direction to simulate a larger walking space.
  2. Defense Against the Dark Arts -- Tufts's online summer school intro to security class.
  3. Data Laced with History: Causal Trees and Operational CRDTs -- in the words of the author: I wanted to research more elegant ways to enable document sync and collaboration in my apps sometime last year, and ended up discovering a new class of data structure that made it possible to build collaboration into documents right on the data level, completely separate from the network architecture. (via Hacker News)
  4. 9.11 The Human Brain -- all the lectures from a fascinating and enjoyable MIT course. The lecturer is an interesting human, not a dull monotone.
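
To make the CRDT idea in item 3 a bit more concrete, here is a minimal sketch. It is not the causal tree from the linked article; it swaps in a grow-only counter, about the simplest state-based CRDT there is, to show how merging can live in the data structure itself rather than in the network layer. The class name `GCounter` and the replica ids are made up for illustration.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by per-slot max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> total increments seen from it

    def increment(self, amount=1):
        # Each replica only ever bumps its own slot, and only upward.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def merge(self, other):
        # Taking the max per slot is commutative, associative, and
        # idempotent, so replicas can sync in any order, any number of times.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self):
        return sum(self.counts.values())


# Two hypothetical devices edit offline, then sync in both directions.
laptop = GCounter("laptop")
phone = GCounter("phone")
laptop.increment(2)
phone.increment(3)

laptop.merge(phone)
phone.merge(laptop)
laptop.merge(phone)  # a repeated merge changes nothing

print(laptop.value(), phone.value())
```

Both replicas end up with the same total no matter how the merges are ordered or repeated, which is the property that lets sync sit apart from the transport.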

Four short links: 1 June 2018

AI Touch, Drone Delivery, WTF, and JavaScript Robotics

  1. Artificial Sense of Touch -- This rudimentary artificial nerve circuit integrates three previously described components. The first is a touch sensor that can detect even minuscule forces. This sensor sends signals through the second component -- a flexible electronic neuron. The touch sensor and electronic neuron are improved versions of inventions previously reported by the Bao lab. Sensory signals from these components stimulate the third component, an artificial synaptic transistor modeled after human synapses.
  2. Drone Delivery Coming to Vanuatu -- the nation is opening a tender for vaccine delivery services between islands. UNICEF and the government of Vanuatu expect that a few drone companies will become the long-term solution to the many logistical challenges of “last-mile delivery” of vaccines on the small islands.
  3. wtf -- a personal terminal-based dashboard utility, designed for displaying infrequently-needed, but very important, daily data.
  4. Johnny-Five -- an Open Source, Firmata Protocol based, IoT and Robotics programming framework, developed at Bocoup. Johnny-Five programs can be written for Arduino (all models), Electric Imp, Beagle Bone, Intel Galileo & Edison, Linino One, Pinoccio, pcDuino3, Raspberry Pi, Particle/Spark Core & Photon, Tessel 2, TI Launchpad and more.

Four short links: 31 May 2018

Internet Trends, Deep Learning, Governing Commons, and Invisible Asymptotes

  1. Mary Meeker Internet Trends Report 2018 -- growth in the number of internet-connected devices and users has slowed, but usage is still growing. And check out that exponential growth in the number of Wi-Fi networks globally. Her preso has become a whole lot less focused as she scrambles for things that might still indicate that the tech boom isn't over.
  2. Deep Learning's Value (Hacker News) -- If you think Deep (Reinforcement) Learning is going to solve AGI, you are out of luck. If you however think it's useless and won't bring us anywhere, you are guaranteed to be wrong. Frankly, if you are daily working with Deep Learning, you are probably not seeing the big picture (i.e. how horrible methods used in real-life are and how you can easily get very economical 5% benefit of just plugging in Deep Learning somewhere in the pipeline; this might seem little but managers would kill for 5% of extra profit).
  3. Governing the Commons: The Evolution of Institutions for Collective Action (Amazon) -- Dr Ostrom uses institutional analysis to explore different ways - both successful and unsuccessful - of governing the commons. In contrast to the proposition of the 'tragedy of the commons' argument, common pool problems sometimes are solved by voluntary organizations rather than by a coercive state. Among the cases considered are communal tenure in meadows and forests, irrigation communities and other water rights, and fisheries.
  4. Invisible Asymptotes -- interesting stories from inside Amazon, then the idea of invisible asymptotes: the things that will stop your growth but you don't know what they are (the "shoulders in the S-curve"). People hate paying for shipping. They despise it. It may sound banal, even self-evident, but understanding that was, I'm convinced, so critical to much of how we unlocked growth at Amazon over the years. People don't just hate paying for shipping, they hate it to literally an irrational degree.

Four short links: 30 May 2018

Rapidly Learning Games, Geo Toolbox, Philosophy and CS, and Moravec's Paradox

  1. Playing Hard Exploration Games by Watching YouTube -- This method of one-shot imitation allows our agent to convincingly exceed human-level performance on the infamously hard exploration games Montezuma's Revenge, Pitfall! and Private Eye for the first time, even if the agent is not presented with any environment rewards. (via @hardmaru)
  2. Kepler.gl: An Open-Source Geospatial Toolbox -- Uber's React-built geo toolkit. No word on whether there's a function for faking randomly circling cars near your location.
  3. Why Philosophers Should Care About Computer Science (Scott Aaronson) -- computational complexity theory—the field that studies the resources (such as time, space, and randomness) needed to solve computational problems—leads to new perspectives on the nature of mathematical knowledge, the strong AI debate, computationalism, the problem of logical omniscience, Hume’s problem of induction, Goodman’s grue riddle, the foundations of quantum mechanics, economic rationality, closed timelike curves, and several other topics of philosophical interest.
  4. Moravec's Paradox -- the discovery by artificial intelligence and robotics researchers that, contrary to traditional assumptions, high-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources.

Four short links: 29 May 2018

Data Beats Algorithms, Copyright Futures, Data Privacy, and Cryptocurrency Attacks

  1. You Need to Improve Your Training Data (Pete Warden) -- without changing the model or test data at all, the top-one accuracy increased by over 4%, from 85.4% to 89.7%. Written up in an Arxiv paper.
  2. Future Not Made -- potential products that won't exist if the EU passes a database copyright law. In the words of Cory Doctorow: The feature all these devices share is that they rely on databases of user-supplied assets -- annotations, recorded sensations, shapefiles -- of the sort that the EU is about to make legally impossible. (via BoingBoing)
  3. California Eyes Data Privacy Measures -- Mactaggart says the proposed law would not prevent Facebook, Google or a local newspaper from collecting users' data and using it to target ads to them. But users will have a right to stop companies from sharing or selling their data. And businesses would be required to disclose the categories of information they have on users — including home addresses, employment information and characteristics such as race and gender.
  4. Cost of a 51% Attack on Popular Cryptocurrencies -- the commentary on Hacker News is also interesting. As one notes: These attacks are only possible for coins where the last column is > 100%. That's still a distressingly large total market cap in minor coins, even if it doesn't include the big players.
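
The accuracy figures quoted in item 1 (85.4% to 89.7%) can be read more than one way, so here is a quick sketch of the arithmetic. The function name `accuracy_gain` is made up for illustration; the only inputs are the two numbers from the quote.

```python
def accuracy_gain(before_pct, after_pct):
    """Return (percentage points gained, relative gain %, error reduction %)."""
    points = after_pct - before_pct
    relative_gain = points / before_pct * 100
    error_before = 100 - before_pct
    error_after = 100 - after_pct
    error_reduction = (error_before - error_after) / error_before * 100
    return round(points, 1), round(relative_gain, 1), round(error_reduction, 1)


print(accuracy_gain(85.4, 89.7))  # prints (4.3, 5.0, 29.5)
```

So "over 4%" is 4.3 percentage points, which works out to roughly a 5% relative improvement in accuracy, or close to 30% fewer errors.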

Four short links: 28 May 2018

Hypergrowth, Metaphor-Oriented Programming, Zombie Data, and Science Robotics Challenges

  1. Productivity in the Age of Hypergrowth -- good tips and perspective on scaling engineering teams as companies ramp up hiring.
  2. Homespring Programming Language Reference -- Homespring uses the paradigm of a river to create its astoundingly user-friendly semantics. Each program is a river system which flows into the watershed (the terminal output). Information is carried by salmon (which represent string values), which swim upstream trying to find their home river. Terminal input causes a new salmon to be spawned at the river mouth; when a salmon leaves the river system for the ocean, its value is output to the terminal. In this way, terminal I/O is neatly and elegantly represented within the system metaphor. It's a (joke) metaphor-oriented programming language that makes my eyes water.
  3. Engauge Digitizer -- Extracts data points from images of graphs. This creates zombie data (the data were dead and interred in a graph, now they're almost live again). Beware ...
  4. Grand Challenges of Science Robotics -- (i) New materials and fabrication schemes; (ii) Biohybrid and bioinspired robots; (iii) New power sources, battery technologies, and energy-harvesting schemes; (iv) Robot swarms; (v) Navigation and exploration in extreme environments; (vi) Fundamental aspects of artificial intelligence; (vii) Brain-computer interfaces (BCIs); (viii) Social interaction; (ix) Medical robotics with increasing levels of autonomy; (x) Ethics and security for responsible innovation in robotics.