Four Short Links

Nat Torkington's eclectic collection of curated links.

Four short links: 17 October 2017

Complex Strategy, Equity Compensation, Software Complexity, and Visual Coding

  1. The New Dynamics of Strategy -- Paper a Day's summary of a paper on the Cynefin framework for looking at situations as either complex, complicated, chaotic, or obvious.
  2. Open Guide to Equity Compensation -- easy to understand, useful when you're confronted with the "do I equity or do I salary, or how much of both?" conversation.
  3. Why Software Projects Spiral Out of Control -- The complexity isn’t the problem, though. The problem is the way we choose to uncover it.
  4. Seymour: Live Coding for the Classroom -- very nice implementation of a bunch of Bret Victor's ideas. See values as you code, watch loops unfold, etc.

Four short links: 16 October 2017

Exploding WiFi, No More Jailbreaking, Tech Ethics, and AI Strategy

  1. Krack Attack -- force WPA2 to reuse a key, making your secure network roll over and expose its soft underbelly. The bug is in the protocol, not any particular implementation. As a friend pointed out, many wireless ISPs use WPA2-PSK to auth their subscriber terminals. The blood will flow from far more than your home WiFi network.
  2. How Apple Killed iOS Jailbreaking -- First, they force their opponent to find four vulnerabilities; fixing any one of which breaks the jailbreak and forces the attacker to find a new flaw that serves the same purpose. Second, and perhaps more critically, Apple ensures that at least one of those flaws must be in the boot sequence. This is a huge advantage because, unlike most programs, boot loaders are typically relatively small (hundreds or thousands of lines of code, not millions) and they don’t need a lot of new features added over time. Thus, attackers can’t count on the bootloaders introducing new flaws. This creates a "narrow pass," and, as Sun Tzu advised ("With regard to narrow passes, if you can occupy them first, let them be strongly garrisoned and await the advent of the enemy."), Apple has fortified it.
  3. The Ethical Minefields of Technology (Scientific American) -- “Society keeps up because the technology needs to be able to land somewhere,” says Duncan. “The same is not true of our governments, and to fix it will require effort and thoughtfulness that is not currently on display.”
  4. How AI Will Change Strategy (HBR) -- Most shoppers have noticed Amazon’s recommendation engine while they shop—it offers suggestions of items that their AI predicts you will want to buy. [...] Now, imagine the AI uses that data to improve its predictions.[...] At some point, as they turn the knob, the AI’s prediction accuracy crosses a threshold, such that it becomes in Amazon’s interest to change its business model. The prediction becomes sufficiently accurate that it becomes more profitable for Amazon to ship you the goods that it predicts you will want rather than wait for you to order them.

Four short links: 13 October 2017

Deep Learned Faces, $900 Selfie Camera, MSFT+AMZN AI, Rich Text Editor

  1. Generating Faces with Deconvolution Networks -- scroll down and watch the videos even if you don't read the text on how it was done. The "illegal" face section is hypnotic, particularly the randomly changing parameters ... it's like the subject is struggling to be a face. (via Matt Webb)
  2. Casio Is Selling $900 Selfie Cameras in China -- “We think that we can still be competitive in the digital camera market if we can make the purpose of the camera clear,” Niida says. “We don’t sell cameras with interchangeable lenses, so we haven’t been trying to sell cameras to people who are really into taking photos as a hobby. Rather than that, we want to focus on people who want to take beautiful photos more easily and who want to keep their memories.” Buried in there: The TR series is the result of aggressive focus on Casio’s part. It’s aimed at a particular demographic: young women in the Chinese-speaking world, and as such the facial recognition software is tuned specifically toward them. Yes, white guys, this isn't for you.
  3. Gluon -- Microsoft and Amazon join forces on an AI library, hoping to unseat (or at least rival) Google's tools.
  4. Trix -- A Rich Text Editor for Everyday Writing.

Four short links: 12 October 2017

Competitive Self-Play, Edge Machine Learning, Design Thinking, and Graph Analysis & Visualization

  1. Competitive Self-Play (OpenAI) -- amazing videos of wrestling techniques the software developed.
  2. Microsoft EdgeML -- machine learning algorithms for edge devices. (slides and papers 1 and 2 about this)
  3. A Virtual Crash Course in Design Thinking (Stanford) -- Using the video, handouts, and facilitation tips below, we will take you step by step through the process of hosting or participating in a 90-minute design challenge.
  4. Cytoscape.js -- Graph theory / network library for analysis and visualization.

Four short links: 11 October 2017

Out-of-Print Scanning, Developing ARSudoku, Testing Chatbots, Remote Work Culture

  1. Books From 1923 to 1941 Are Now Liberated (Internet Archive) -- a little known, and perhaps never used, provision of U.S. copyright law, Section 108h, allows libraries to scan and make available materials published 1923 to 1941 if they are not being actively sold. My favorite so far: Your Life: The Popular Guide to Desirable Living, which is about being liked, tips from millionaires, and health ("Laxatives: The Great Illusion! The golden rule for constipation victims is: Treat Your Colon Kindly"), even sex: "The Frigid Wives of Reno: An eminent scientist offers counsel and comfort to one wife out of every four in America." (wince)
  2. How We Built the ARKit Sudoku Solver -- if you haven't seen it, it's magic: you look at a Sudoku, and it's solved. Highlights: the digit recognition was trained on 600,000 example Sudoku puzzles...and then got confused when pointed at a screen; the author did the math and used his own server instead of AWS.
  3. Playbook for Testing Chatbots -- a nice structure for testing something quite free-form.
  4. Company Culture Makes or Breaks Remote Work -- Lots of people I speak to are worried about remoters not working, which is evidence of a culture that doesn’t trust its people. But honestly, it’s easy to tell if remote people aren’t working because tasks don’t get completed. THIS!

Four short links: 10 October 2017

CSV Security, Streaming Analytics, Database Readings, and Apple II+ on a Chip

  1. Dangers of CSV Injection -- oh my gosh. Fields that start with =, even if quoted strings, are formulae.
  2. AthenaX -- Uber's SQL-based streaming analytics platform. (Blog post)
  3. Readings in Database Systems, 5e -- for all your bedtime reading needs.
  4. Apple2fpga -- As a Christmas present to myself in 2007, I implemented a 1980s-era Apple II+ in VHDL to run on an Altera DE2 FPGA board. The point, aside from entertainment, was to illustrate the power (or rather, low power) of modern FPGAs. Put another way, what made Steve Jobs his first million can now be a class project for my 4840 embedded systems class.
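
The CSV injection item above turns on one fact: a cell whose text starts with = is treated as a formula by spreadsheet software, even when the CSV quotes it as a string. A minimal Python sketch of one common mitigation, prefixing risky cells with a single quote before export. The helper names (neutralize_cell, write_safe_csv) and the trigger set beyond = are assumptions for illustration, not taken from the linked article.

```python
import csv
import io

# Leading characters that spreadsheet programs commonly read as a formula start.
# Only "=" comes from the item above; the others are an assumption for illustration.
FORMULA_TRIGGERS = ("=", "+", "-", "@")

def neutralize_cell(value: str) -> str:
    """Prefix a cell with a single quote if it would be read as a formula."""
    if value.startswith(FORMULA_TRIGGERS):
        return "'" + value
    return value

def write_safe_csv(rows) -> str:
    """Return CSV text with every cell passed through neutralize_cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([neutralize_cell(str(cell)) for cell in row])
    return buffer.getvalue()

if __name__ == "__main__":
    print(write_safe_csv([["name", "note"], ["alice", "=HYPERLINK(\"http://example.com\")"]]))
```

Note that this blunt rule also prefixes legitimate values such as negative numbers, so a real exporter would need to decide which columns hold untrusted text.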

Four short links: 9 October 2017

Larry Wall, Real-Time Editing, Testing Security Keys, and AI Prediction

  1. Inside the Head of Larry Wall -- great to see another Larry talk, filled with his usual mind-stretching approaches to language design and community. (via Slashdot)
  2. A Simple Approach to Building a Real-Time Collaborative Editor -- very readable guide to a surprisingly difficult problem.
  3. Testing Security Keys (Adam Langley) -- tl;dr: buy a YubiKey, don't cheap out on AliExpress.
  4. The Seven Deadly Sins of AI Prediction (Rodney Brooks) -- This is a problem we all have with imagined future technology. If it is far enough away from the technology we have and understand today, then we do not know its limitations. And if it becomes indistinguishable from magic, anything one says about it is no longer falsifiable.

Four short links: 6 October 2017

Sniff HTTP Requests, Self-Driving Simulator, AI Security, and Teaching Computational Thinking Through AI

  1. pcap2curl -- Read a packet capture, extract HTTP requests and turn them into cURL commands for replay.
  2. EuroPilot -- A toolkit for controlling Euro Truck Simulator 2 with python to develop self-driving algorithms.
  3. Awesome AI Security -- collection of links.
  4. Engineering Courses on Computational Thinking Through Solving Problems in Artificial Intelligence -- Problems in machine intelligence systems intrinsically connect students to algorithmic-oriented computing and essential mathematical foundations. Beyond knowledge representation, AI fosters a gentle introduction to data structures and algorithms. Focused on engaging mental tools, a computer is never a necessity. Neither coding nor programming is ever required. Instead, students enjoy constructivist classrooms designed to always be active, flexible, and highly dynamic. Learning to learn and reflecting on cognitive experiences, they rigorously construct knowledge from collectively solving exciting puzzles, competing in strategic games, and participating in intellectual discussions.
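
The pcap2curl item above describes turning captured HTTP requests into cURL commands for replay. A rough Python sketch of just the second half of that idea, assuming the request text has already been extracted from the capture. The function request_to_curl is hypothetical and is not pcap2curl's own code; it handles only a simple request with CRLF line endings.

```python
import shlex

def request_to_curl(raw_request: str, scheme: str = "http") -> str:
    """Convert the text of one HTTP request into an equivalent curl command."""
    # Split the header section from the (possibly empty) body.
    head, _, body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)

    headers = []
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        value = value.strip()
        if name.lower() == "host":
            host = value  # the Host header becomes part of the URL
        else:
            headers.append((name, value))

    parts = ["curl", "-X", method]
    for name, value in headers:
        parts += ["-H", f"{name}: {value}"]
    if body:
        parts += ["--data-binary", body]
    parts.append(f"{scheme}://{host}{path}")
    # Quote each argument so the command is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in parts)

if __name__ == "__main__":
    raw = "GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
    print(request_to_curl(raw))
```

Reading the packet capture itself (reassembling TCP streams, picking out the HTTP payloads) is the harder part and is left out here.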

Four short links: 5 October 2017

Patent Shenanigans, FX History, Storied Career, and Pixel Buds

  1. Allergan Gave Patents to the Mohawk Tribe (Ars Technica) -- Lawyers for Allergan are hoping that the principle of sovereign immunity, in which Native American tribes are treated as sovereign nations in certain ways, will protect their patents from government review. (via Slashdot)
  2. Oral History of the Tech of Terminator 2 -- I was like, ‘Awesome.’ When someone says, ‘Yeah, we’re not sure how to do this,’ you can’t do worse.
  3. A 35-Year Old BASIC Program and Its Author -- this guy's career is impressive, and a great example of "be in the right place at the right time with the right skills". At Wal-Mart, his Telxon project led to the first large-scale deployment of barcode technology in retail and to him writing the first two FM wireless digital data transfer protocols (P-TAP and R-TAP). Oh, and he wrote DarkStar BBS.
  4. Google Pixel Buds -- in-ear buds that do real-time translation. Brilliant competitive move, using G's home-field advantage of AI to make Apple's earbuds look primitive.

Four short links: 4 October 2017

Interactive Data Cleaning, Nematoduino, Hacking WebUSB, and Web History

  1. ActiveClean (Paper-A-Day) -- if you build a model, try it out, look at what it misclassifies, discover bad data, clean, and repeat...you're going to be bitten on the ass by Simpson's Paradox.
  2. Nematoduino -- an Arduino UNO-compatible robotic simulation of the C. elegans nematode. At the core of the simulation is a spiking neural network incorporating 300 neuron cells of the biological worm's connectome, along with associated muscle cells.
  3. Hacking WebUSB -- WebUSB is a JavaScript API to allow web sites access to connected USB devices. It is aimed at scientific and industrial USB devices and does not support common devices like webcams, HIDs, or mass storage devices. However, many other USB devices can be accessed using the WebUSB API, and users may not realize the level of access gained whenever they grant permission to a web site.
  4. History of the Browser User-Agent String -- everything in software is like this, but cruftier.

Four short links: 3 October 2017

Reality is Real, TensorFlow in Prod, Dashboard, and Event Detection from Wikipedia

  1. We Are Not in a Simulation (Cosmos Magazine) -- Ringel and Kovrizhin showed that attempts to use quantum Monte Carlo to model systems exhibiting anomalies, such as the quantum Hall effect, will always become unworkable. They discovered that the complexity of the simulation increased exponentially with the number of particles being simulated. If the complexity grew linearly with the number of particles being simulated, then doubling the number of particles would mean doubling the computing power required. If, however, the complexity grows on an exponential scale—where the amount of computing power has to double every time a single particle is added—then the task quickly becomes impossible. Whew, I can finally sleep at night. (via Slashdot)
  2. TFX: A TensorFlow-based Production-Scale Machine Learning Platform -- best description is from The Morning Paper. The new baseline: so far, you’ve embraced automated testing, continuous integration, continuous delivery, perhaps continuous deployment, and you have the sophistication to rollout new changes in a gradual manner, monitor behaviour, and stop or rollback when a problem is detected. On top of this, you’ve put in place a sophisticated metrics system and a continuous experimentation platform. Due to the increasing complexity of systems, you might also need to extend this to a general purpose black-box optimization platform. But you’re still not done yet! All those machine learning models you’ve been optimizing need to be trained, validated, and served somehow. You need a machine learning platform. That’s the topic of today’s paper choice, which describes the machine learning platform inside Google, TFX.
  3. redash -- GPLv3 dashboard, connects to RedShift, ElasticSearch, BigQuery, MongoDB, MySQL, PostgreSQL.
  4. Wikipedia Graph Mining: Dynamic Structure of Collective Memory -- they use the changing popularity of pages to identify significant events, even separating predictable events like tournaments from unpredictable ones like tragedies.
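
The linear-versus-exponential argument in the simulation item above is easy to check with arithmetic. A small Python sketch, assuming a made-up unit cost of 1 per particle; the function names are hypothetical and the numbers illustrate growth rates only, not the actual cost of quantum Monte Carlo.

```python
def linear_cost(particles: int, cost_per_particle: int = 1) -> int:
    """Cost that grows in proportion to the number of particles."""
    return particles * cost_per_particle

def exponential_cost(particles: int, base_cost: int = 1) -> int:
    """Cost that doubles every time a single particle is added."""
    return base_cost * 2 ** particles

if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, linear_cost(n), exponential_cost(n))
```

Going from 10 to 20 particles doubles the linear cost but multiplies the exponential cost by 2 ** 10, which is 1024.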

Four short links: 2 October 2017

Mimic Robot, How Companies Develop, Tim on The Future, and Hardware Startups

  1. Bipedal Oriented Whole Body Master-Slave Robot (YouTube) -- the master/slave lingo is dated, but the video is pretty cool. I hope the paper will show up on the conference page soon.
  2. The Evolution of Continuous Experimentation in Software Product Development -- At first, [companies] inherit the Agile principles within the development part of the organization and expand them to other departments. Next, companies focus on various lean concepts such as eliminating waste, removing constraints in the development pipeline, and advancing toward continuous integration and continuous deployment. ... Continuous deployment is characterized by a bidirectional channel that enables companies not only to send data to their customers to rapidly prototype with them, but also to receive feedback data from products in the field. (via Adrian Colyer)
  3. Don't Fear Technology, Robots, or the Future -- interview with Tim O'Reilly. So much sense.
  4. Hardware Startups, Failure, and Success (CB Insights) -- We’ve tracked a tech startup’s chances of success after raising an initial seed round, including both hardware and software companies. Only 46% of them will succeed in raising even just one additional round of funding. We also found that 70% of them will die or become “zombies”—i.e., self-sustaining. Imagine a mindset where being self-sufficient makes you a zombie.

Four short links: 29 September 2017

Self-Funding, Floating Point, Tech Careers, and Directed Graphs

  1. Things Learned While Running Your Own Self-Funded Startup -- 7. If you have a product that a very large company really wants, they'll still do everything they can to delay purchasing it for the market price. They'll try to hire you or your partner(s) away individually, or they'll wait as long as possible to see if you encounter hard financial times and go under. They won't come and just offer to license your product or buy you out until they've exhausted all other possibilities. Amen! And double-amen to this: Our long-term roadmap is based off what customers are actually doing with our software right now.
  2. Floating Point Visually Explained -- a very understandable explanation of how floating point numbers are represented in binary. As the Hacker News commenters pointed out, A much easier and better way to understand floating point is to just do it in base 10. But I think you still need an explanation like this to get to "just do it in base 10."
  3. Three Paths in the Tech Industry: Founder, Executive, and Employee -- talks about pros and cons, successful strategies for each. This Founder downside is so on the money: Incredibly stressful. Even success hurts.
  4. dgsh -- the directed graph shell.
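
For the floating point item above, it can help to pull a number apart yourself. A minimal Python sketch that splits a value into the sign, exponent, and mantissa fields of the standard IEEE 754 single-precision layout and then rebuilds it. The function names are hypothetical, and rebuild covers normal numbers only (not zero, subnormals, infinities, or NaN).

```python
import struct

def float32_fields(value: float):
    """Return the (sign, exponent, mantissa) bit fields of value as a 32-bit float."""
    # Pack as a big-endian single, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits, the fraction after the implicit 1
    return sign, exponent, mantissa

def rebuild(sign: int, exponent: int, mantissa: int) -> float:
    """Recombine normal-number fields: (-1)^sign * 2^(exponent-127) * (1 + mantissa/2^23)."""
    return (-1) ** sign * 2.0 ** (exponent - 127) * (1 + mantissa / 2 ** 23)

if __name__ == "__main__":
    for x in (1.0, -2.0, 0.15625, 6.5):
        fields = float32_fields(x)
        print(x, fields, rebuild(*fields))
```

The exponent picks a power-of-two range and the mantissa says how far into that range the value sits, which is the same idea as scientific notation in base 10.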

Four short links: 28 September 2017

Deep Learning, Knowledge Base, Algorithm Transparency, and Formal Methods

  1. New Theory Cracks Open the Black Box of Deep Learning (Quanta) -- Talk (on YouTube), and paper (on arXiv) are interesting, but the article itself has lots of layman-accessible morsels. Tishby and Shwartz-Ziv also made the intriguing discovery that deep learning proceeds in two phases: a short “fitting” phase, during which the network learns to label its training data, and a much longer “compression” phase, during which it becomes good at generalization, as measured by its performance at labeling new test data.
  2. YAGO -- YAGO is a large semantic knowledge base, derived from Wikipedia, WordNet, WikiData, GeoNames, and other data sources.
  3. ProPublica Seeks Source Code for New York City’s Disputed DNA Software -- good to see more places legally testing opaque algorithms.
  4. New Ways of Coding (The Atlantic) -- from testing to formal methods, a readable and accurate survey of discontent about modern software development. The real problem in getting people to use TLA+, he said, was convincing them it wouldn’t be a waste of their time. Programmers, as a species, are relentlessly pragmatic. Tools like TLA+ reek of the ivory tower. When programmers encounter “formal methods” (so called because they involve mathematical, “formally” precise descriptions of programs), their deep-seated instinct is to recoil. And yet, they're useful.

Four short links: 27 September 2017

Vespa Server, Open Source Chip, Tinder's Data, and New SDR Hardware

  1. Vespa.ai -- Yahoo's big data processing and serving engine. Used by Yahoo (now Oath) properties to serve ads, do searches, etc. By releasing Vespa, we are making it easy for anyone to build applications that can compute responses to user requests, over large data sets, at real time, and at internet scale—capabilities that up until now, have been within reach of only a few large companies. See also their blog post.
  2. BOOM: An Open Source Out-of-Order RISC-V Core -- BOOM is an open source processor that implements the RV64G RISC-V Instruction Set Architecture (ISA). Like most contemporary high-performance cores, BOOM is superscalar (able to execute multiple instructions per cycle) and out-of-order (able to execute instructions as their dependencies are resolved and not restricted to their program order). BOOM is implemented as a parameterizable generator written using the Chisel hardware construction language that can be used to generate synthesizable implementations targeting both FPGAs and ASICs.
  3. Tinder Knows a Lot -- Some 800 pages came back containing information such as my Facebook “likes”, my photos from Instagram (even after I deleted the associated account), my education, the age rank of men I was interested in, how many times I connected, when and where every online conversation with every single one of my matches happened...
  4. LimeSDR Mini -- An open, full-duplex, USB stick radio for femtocells and more. (via Hack A Day)

Four short links: 26 September 2017

Metric Pitfalls, Big Data, Data Analysis, and Startup Advice

  1. Twelve Common Metric Interpretation Pitfalls (Paper a Day) -- from Microsoft experiences. This is solid gold for anyone doing online A/B testing. If you see an unexpected metric movement, positive or negative, it normally means there is an issue. For example, a too-good-to-be-true jump in number of clicks turned out to be because users were confused and clicking around trying to figure things out!
  2. Using Big Data to Solve Economic and Social Problems -- This introductory course, taught by Raj Chetty, shows how "big data" can be used to understand and solve some of the most important social and economic problems of our time. The course gives students an introduction to frontier research in applied economics and social science that does not require prior coursework in Economics or Statistics. Topics include equality of opportunity, education, health, the environment, and criminal justice. In the context of these topics, the course provides an introduction to basic statistical methods and data analysis techniques, including regression analysis, causal inference, quasi-experimental methods, and machine learning.
  3. Apache Arrow -- columnar in-memory data analysis; think R or Pandas, but as a C++ library with bindings to other languages.
  4. YC's Essential Startup Advice -- Your job as a founder will often seem to be continuously righting a capsized ship. This is normal.

Four short links: 25 September 2017

Group Theory Coloring Book, Architecture Diagrams, Cloud Landscape, and Internet of Radios

  1. Illustrated Group Theory -- a coloring book.
  2. Documenting Your Architecture -- clever use of Wireshark (nee Ethereal) and PlantUML, with a REPL, to map the interactions between components on a web system. What a clever hack.
  3. Cloud Native Landscape Project -- what's what in the world of cloud ops: Public Cloud, Provisioning, Runtime, Orchestration & Management, App Definition & Development, Platforms, Observability & Analysis. Mighty useful!
  4. WebSDR -- an Internet of radios connected to the Internet, which you can tune to your heart's content. (via Hacker News)

Four short links: 22 September 2017

Molecular Robots, Distributed Deep Nets, SQL Notebook, and Super-Accurate GPS

  1. Scientists Create World’s First ‘Molecular Robot’ Capable Of Building Molecules -- Each individual robot is capable of manipulating a single molecule and is made up of just 150 carbon, hydrogen, oxygen and nitrogen atoms. To put that size into context, a billion billion of these robots piled on top of each other would still only be the same size as a single grain of salt. The robots operate by carrying out chemical reactions in special solutions which can then be controlled and programmed by scientists to perform the basic tasks. (via Slashdot)
  2. Distributed Deep Neural Networks -- in Adrian Colyer's words: DDNNs partition networks between mobile/embedded devices, cloud (and edge), although the partitioning is static. What’s new and very interesting here though is the ability to aggregate inputs from multiple devices (e.g., with local sensors) in a single model, and the ability to short-circuit classification at lower levels in the model (closer to the end devices) if confidence in the classification has already passed a certain threshold. It looks like both teams worked independently and in parallel on their solutions. Overall, DDNNs are shown to give lower latency decisions with higher accuracy than either cloud or devices working in isolation, as well as fault tolerance in the sense that classification accuracy remains high even if individual devices fail. (via Morning Paper)
  3. Franchise -- an open-source notebook for sql.
  4. Super-Accurate GPS Chips Coming to Smartphones in 2018 (IEEE Spectrum) -- 30cm accuracy (today: 5m), will help with the reflections you get in cities, and with 50% energy savings.
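
The short-circuit idea in the DDNN item above (stop at a lower level once confidence passes a threshold) can be sketched in a few lines of Python. Everything here is a stand-in: the two toy "models", the function classify_with_early_exit, and the 0.9 threshold are assumptions for illustration, not the paper's architecture.

```python
def classify_with_early_exit(sample, local_model, cloud_model, threshold=0.9):
    """Return (label, where). The cloud model runs only if the device is unsure."""
    label, confidence = local_model(sample)
    if confidence >= threshold:
        return label, "device"   # confident enough: exit early, no network hop
    label, _ = cloud_model(sample)
    return label, "cloud"

# Toy stand-ins: each model maps a sample to (label, confidence).
def toy_local_model(sample):
    # Pretend the small on-device model is confident only about large readings.
    return ("hot", 0.95) if sample > 50 else ("cold", 0.6)

def toy_cloud_model(sample):
    # Pretend the larger cloud model uses a better decision boundary.
    return ("hot", 0.99) if sample > 30 else ("cold", 0.99)

if __name__ == "__main__":
    for reading in (80, 40, 10):
        print(reading, classify_with_early_exit(reading, toy_local_model, toy_cloud_model))
```

The latency win comes from the first branch: easy samples never leave the device, and only uncertain ones pay for the round trip.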

Four short links: 21 September 2017

Synthetic Muscles, Smarter SSH, Kickstarter Post-Mortem, and Computational Drawing

  1. Additive Synthetic Muscles -- electrically-actuated high stress, high strain, low density, 3D-printable muscles.
  2. teleport -- modern SSH that groks bastion hosts, certificates, and more.
  3. Anatomy of a Kickstarter -- It is possible to outsource much of the Kickstarter process, including copywriting, fulfilment, customer support and marketing. I treated the whole process as a learning experience and set aside 50% of my time for three months to appreciate its nuances from start to finish, with a hard-stop due to other commitments. Post-Kickstarter I committed another three months over the following year to deliver experiences such as the expedition to Afghanistan and stretch goals. BackerKit was the obvious candidate to outsource operations to, but was rejected for violating the no-asshole rule: they were tone-deaf, evasive on responding to cost estimates, and nagging in a way that only organisations that live and die by CRM systems can be.
  4. rune.js -- a JavaScript library for programming graphic design systems with SVG in both the browser and node.js.

Four short links: 20 September 2017

AI Needs Ethics, Automotive-Grade Linux, Drawing Clocks, and Facial Recognition

  1. AI Research Needs an Ethical Watchdog (Wired) -- Right now, if government-funded scientists want to research humans for a study, the law requires them to get the approval of an ethics committee known as an institutional review board, or IRB. Stanford’s review board approved Kosinski and Wang’s study. But these boards use rules developed 40 years ago for protecting people during real-life interactions, such as drawing blood or conducting interviews. “The regulations were designed for a very specific type of research harm and a specific set of research methods that simply don’t hold for data science,” says Metcalf.
  2. Automotive-Grade Linux Debuts On The 2018 Toyota Camry -- you heard it here first: 2018 is the year of the Linux hatchback. You heard it here first!
  3. Clocks for Software Engineers -- The first and perhaps most difficult part of learning hardware design is to learn that all hardware design is parallel design. Things don’t take place serially, as in one instruction after another ... like they do in a computer. Instead, everything happens at once.
  4. Facial Recognition is Here to Stay -- I have to admit that when I saw facial recognition improving, and realised it'd be useful in a few years, I never imagined the use case would be "so the cashier at Chick-fil-A would know your name."