Four Short Links

Nat Torkington's eclectic collection of curated links.

Four short links: 17 November 2017

Interactive Marginalia, In-Person Interactions, Welcoming Groups, and Systems Challenges

  1. Interactive Marginalia (Liza Daly) -- wonderfully thoughtful piece about web annotations.
  2. In-Person Interactions -- Casual human interaction gives you lots of serendipitous opportunities to figure out that the problem you thought you were solving is not the most important problem, and that you should be thinking about something else. Computers aren't so good at that. So true! (via Daniel Bachhuber)
  3. Pacman Rule -- When standing as a group of people, always leave room for 1 person to join your group. (via Simon Willison)
  4. Berkeley View of Systems Challenges for AI -- In this paper, we propose several open research directions in systems, architectures, and security that can address these challenges and help unlock AI’s potential to improve lives and society.

Four short links: 16 November 2017

Regulate IoT, Visualize CRISPR, Distract Strategically, and Code Together

  1. It's Time to Regulate IoT To Improve Security -- Bruce Schneier puts it nicely: internet security is now becoming "everything" security.
  2. Real-Space and Real-Time Dynamics of CRISPR-Cas9 (Nature) -- great visuals, written up for laypeople in The Atlantic. (via Hacker News)
  3. How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument -- research paper. Application to American media left as exercise to the reader.
  4. Coding Together in Real Time with Teletype for Atom -- what it says on the box.

Four short links: 15 November 2017

Paywalled Research, Reproducing AI Research, Spy Teardown, and Peer-to-Peer Misinformation

  1. 65 of the 100 Most-Cited Papers Are Paywalled -- The weighted average of all the paywalls is: $32.33 [...] [T]he open access articles in this list are, on average, cited more than the paywalled ones.
  2. AI Reproducibility -- Participants have been tasked with reproducing papers submitted to the 2018 International Conference on Learning Representations, one of AI’s biggest gatherings. The papers are anonymously published months in advance of the conference. The publishing system allows for comments to be made on those submitted papers, so students and others can add their findings below each paper. [...] Proprietary data and information used by large technology companies in their research, but withheld from papers, is holding the field back.
  3. Inside a Low-Budget Consumer Hardware Espionage Implant -- The S8 data line locator is a GSM listening and location device hidden inside the plug of a standard USB data/charging cable. Has a microphone but no GPS, remotely triggered via SMS messages, uses data to report cell tower location to a dodgy server...and is hidden in a USB cable.
  4. She Warned of ‘Peer-to-Peer Misinformation.’ Congress Listened (NY Times) -- Renee's work on anti-vaccine groups (and her college thesis on propaganda in the 2004 Russian elections) led naturally to her becoming an expert on Russian propaganda in the 2016 elections.

Four short links: 14 November 2017

AI Microscope, Android Geriatrics, Doxing Research, and Anti-Goals

  1. AI-Powered Microscope Counts Malaria Parasites in Blood Samples (IEEE Spectrum) -- The EasyScan GO microscope under development would combine bright-field microscope technology with a laptop computer running deep learning software that can automatically identify parasites that cause malaria. Human lab workers would mostly focus on preparing the slides of blood samples to view under the microscope and verifying the results. Currently 20 minutes per slide (same as a human), but they want to cut it to 10 minutes per slide.
  2. A Billion Outdated Android Devices in Use -- never ask why security researchers drink more than the rest of society.
  3. Datasette (Simon Willison) -- instantly create and publish an API for your SQLite databases.
  4. Fifteen Minutes of Unwanted Fame: Detecting and Characterizing Doxing -- This work analyzes over 1.7 million text files posted to pastebin.com, 4chan.org, and 8ch.net, sites frequently used to share doxes online, over a combined period of approximately 13 weeks. Notable findings in this work include that approximately 0.3% of shared files are doxes, that online social networking accounts mentioned in these dox files are more likely to close than typical accounts, that justice and revenge are the most often cited motivations for doxing, and that dox files target males more frequently than females.
  5. The Power of Anti-Goals (Andrew Wilkinson) -- instead of exhausting aspirations, focus on avoiding the things that deplete your life. (via Daniel Bachhuber)

Four short links: 13 November 2017

Software 2.0, Watson Walkback, Robot Fish, and Smartphone Data

  1. Software 2.0 (Andrej Karpathy) -- A large number of programmers of tomorrow do not maintain complex software repositories, write intricate programs, or analyze their running times. They collect, clean, manipulate, label, analyze, and visualize data that feeds neural networks. Supported by Pete Warden: I know this will all sound like more deep learning hype, and if I wasn’t in the position of seeing the process happening every day, I’d find it hard to swallow too, but this is real. Bill Gates is supposed to have said "Most people overestimate what they can do in one year and underestimate what they can do in 10 years," and this is how I feel about the replacement of traditional software with deep learning. There will be a long ramp-up as knowledge diffuses through the developer community, but in 10 years, I predict most software jobs won’t involve programming. As Andrej memorably puts it, “[deep learning] is better than you”!
  2. IBM Watson Not Even Close -- The interviews suggest that IBM, in its rush to bolster flagging revenue, unleashed a product without fully assessing the challenges of deploying it in hospitals globally. While it has emphatically marketed Watson for cancer care, IBM hasn’t published any scientific papers demonstrating how the technology affects physicians and patients. As a result, its flaws are getting exposed on the front lines of care by doctors and researchers who say that the system, while promising in some respects, remains undeveloped. AI has been drastically overhyped, and there will be more disappointments to come.
  3. Robot Spy Fish -- “The fish accepted the robot into their schools without any problem,” says Bonnet. “And the robot was also able to mimic the fish’s behavior, prompting them to change direction or swim from one room to another.”
  4. Politics Gets Personal: Effects of Political Partisanship and Advertising on Family Ties -- Using smartphone-tracking data and precinct-level voting, we show that politically divided families shortened Thanksgiving dinners by 20-30 minutes following the divisive 2016 election. [...] we estimate 27 million person-hours of cross-partisan Thanksgiving discourse were lost in 2016 to ad-fueled partisan effects. Smartphone data is useful data. (via Marginal Revolution)

Four short links: 10 November 2017

Syntactic Sugar, Surprise Camera, AI Models, and Git Recovery

  1. Ten Features From Modern Programming Languages -- interesting collection of different flavors of syntactic sugar.
  2. Access Both iPhone Cameras Any Time Your App is Running -- Once you grant an app access to your camera, it can: access both the front and the back camera; record you at any time the app is in the foreground; take pictures and videos without telling you; upload the pictures/videos it takes immediately; run real-time face recognition to detect facial features or expressions.
  3. Deep Learning Models with Demos -- portable and searchable compilation of pre-trained deep learning models. With demos and code. Pre-trained models are deep learning model weights that you can download and use without training. Note that computation is not done in the browser.
  4. Git flight rules -- Flight rules are the hard-earned body of knowledge recorded in manuals that list, step-by-step, what to do if X occurs, and why. Essentially, they are extremely detailed, scenario-specific standard operating procedures. [...]

Four short links: 9 November 2017

Culture, Identifying Bots, Attention Economy, and Machine Bias

  1. Culture is the Behaviour You Reward and Punish -- When all the “successful” people behave in the same way, culture is made.
  2. Identifying Viral Bots and Cyborgs in Social Media -- it is readily possible to identify social bots and cyborgs on both Twitter and Facebook using information entropy and then to find groups of successful bots using network analysis and community detection.
  3. An Economy Based on Attention is Easily Gamed (The Economist) -- Americans touch their smartphones on average more than 2,600 times a day (the heaviest users easily double that). The population of America farts about 3m times a minute. It likes things on Facebook about 4m times a minute.
  4. Frankenstein's Legacy: Four conversations about Artificial Intelligence, Machine Learning, and the Modern World (CMU) -- A machine isn’t a human. It’s not going to necessarily incorporate bias even from biased training data in the same way that a human would. Machine learning isn’t necessarily going to adopt—for lack of a better word—a clearly racist bias. It’s likely to have some kind of much more nuanced bias that is far more difficult to predict. It may, say, come up with very specific instances of people it doesn’t want to hire that may not even be related to human bias.
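
The entropy idea in the bots-and-cyborgs link above can be sketched in a few lines. This is only an illustration, not the paper's method: the choice of feature (gaps between consecutive posts, binned to the minute), the function names, and the timestamps are all assumptions made up for the example. The intuition is that a scheduler posting at fixed intervals produces a very predictable, low-entropy pattern, while a person's posting gaps are more varied.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def interval_entropy(timestamps, bin_seconds=60):
    """Entropy of the gaps between consecutive posts, binned to `bin_seconds`.

    Hypothetical feature chosen for illustration; the linked paper may use
    different signals.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return shannon_entropy([int(gap // bin_seconds) for gap in gaps])


# Made-up posting times in seconds: a scheduler posting every 10 minutes
# versus an account with irregular gaps.
bot_times = [i * 600 for i in range(9)]
human_times = [0, 45, 400, 2000, 2100, 9000, 9500, 20000, 20100]

print("bot:", interval_entropy(bot_times))
print("human:", interval_entropy(human_times))
```

A real detector would need far more data per account and a threshold tuned on labeled examples; the sketch only shows why regularity shows up as low entropy.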

Four short links: 8 November 2017

Shadow Profiles, Theories of Learning, Feature Visualization, and Time to Reflect Reality

  1. How Facebook Figures Out Everyone You've Ever Met (Gizmodo) -- Behind the Facebook profile you’ve built for yourself is another one, a shadow profile, built from the inboxes and smartphones of other Facebook users. Contact information you’ve never given the network gets associated with your account, making it easier for Facebook to more completely map your social connections. (via Slashdot)
  2. Theories of Deep Learning (STATS 385) -- Stanford class. Lecture videos are posted after the lectures are given.
  3. Feature Visualization (Distill) -- How neural networks build up their understanding of images. Wonderfully visual.
  4. Mapping's Intelligent Agents -- Industry players are developing dynamic HD maps, accurate within inches, that would afford the car’s sensors some geographic foresight, allowing it to calculate its precise position relative to fixed landmarks. [...] Yet, achieving real-time “truth” throughout the network requires overcoming limitations in data infrastructure. The rate of data collection, processing, transmission, and actuation is limited by cellular bandwidth as well as on-board computing power. Mobileye is attempting to speed things up by compressing new map information into a “Road Segment Data” capsule that can be pushed between the master map in the Cloud and cars in the field. If nothing else, the system has given us a memorable new term, “Time to Reflect Reality,” which is the metric of lag time between the world as it is and the world as it is known to machines.

Four short links: 7 November 2017

Disturbing YouTube, Sketchy Presentation Tool, Yammer UI, and Dance Your Ph.D. Winners

  1. Something is Wrong on the Internet (James Bridle) -- This is a deeply dark time, in which the structures we have built to sustain ourselves are being used against us — all of us — in systematic and automated ways. It is hard to keep faith with the network when it produces horrors such as these. While it is tempting to dismiss the wilder examples as trolling, of which a significant number certainly are, that fails to account for the sheer volume of content weighted in a particularly grotesque direction. This is another reason why propping your kids in front of YouTube is unsafe and unwise.
  2. ChalkTalk -- a digital presentation and communication language in development at New York University's Future Reality Lab. Using a blackboard-like interface, it allows a presenter to create and interact with animated digital sketches in order to demonstrate ideas and concepts in the context of a live presentation or conversation.
  3. YamUI -- Microsoft open-sourced the reusable component framework that they built for Yammer. [B]uilt with React on top of Office UI Fabric components.
  4. Dance Your Ph.D. Finalists -- look at the finalists on this site, read about the winners on Smithsonian.

Four short links: 6 November 2017

IoT Standard, Probabilistic Programming, Go Scripting, and Front-End Checklist

  1. A Firmware Update Architecture for Internet of Things Devices -- draft submitted to IETF. It has a long way to go before it's a standard, but gosh it'd be nice to have this stuff without everyone reinventing it from scratch. (via Bleeping Computer)
  2. Pyro -- a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the back end. Pyro enables flexible and expressive deep probabilistic modeling, unifying the best of modern deep learning and Bayesian modeling.
  3. Neugram -- scripting language integrated with Go. Overview of the language.
  4. Front-End Checklist -- an exhaustive list of all elements you need to have / to test before launching your site / HTML page to production. (website)

Four short links: 3 November 2017

End of Startups, Company Strategy, Complex Futures, and Bitcoin Energy

  1. Ask Not For Whom The Deadpool Tolls -- We live in a new world now, and it favors the big, not the small. The pendulum has already begun to swing back. Big businesses and executives, rather than startups and entrepreneurs, will own the next decade; today’s graduates are much more likely to work for Mark Zuckerberg than follow in his footsteps.
  2. Notes on Developing a Strategy and Designing a Company -- These notes provide a sequence of steps for creating or evaluating a strategy and associated company design, drawing clear lines to quantitative and evidence-based evaluation of enterprise performance and to financial valuation. The notes are intended for practical use by managers or instructors of MBAs and executive MBAs.
  3. Designing Our Complex Future with Machines (Joi Ito) -- We should learn from our history of applying over-reductionist science to society and try to, as Wiener says, “cease to kiss the whip that lashes us.” While it is one of the key drivers of science—to elegantly explain the complex and reduce confusion to understanding—we must also remember what Albert Einstein said: “Everything should be made as simple as possible, but no simpler.” We need to embrace the unknowability—the irreducibility—of the real world that artists, biologists, and those who work in the messy world of liberal arts and humanities are familiar with.
  4. Bitcoin Energy Consumption -- 7.51 U.S. households powered for a day by one transaction; $1B of energy used in a year to mine; Bitcoin has the same energy consumption as all of Nigeria. "Bitcoin" is how homo economicus pronounces "externality."

Four short links: 2 November 2017

Capsule Neural Networks, Adversarial Objects, Deep Learning Language, and Crowdsourced Pop Star

  1. Dynamic Routing Between Capsules -- new paper from one of the deep learning luminaries, Geoff Hinton. Hacker Noon explains: In this paper the authors project that human brains have modules called “capsules.” These capsules are particularly good at handling different types of visual stimulus and encoding things like pose (position, size, orientation), deformation, velocity, albedo, hue, texture, etc. The brain must have a mechanism for “routing” low-level visual information to what it believes is the best capsule for handling it.
  2. Adversarial Objects -- Here is a 3D-printed turtle that is classified at every viewpoint as a “rifle” by Google’s InceptionV3 image classifier, whereas the unperturbed turtle is consistently classified as “turtle.”
  3. DeepNLP 2017 -- Oxford University applied course focussing on recent advances in analyzing and generating speech and text using recurrent neural networks.
  4. Virtual Singer Becomes Japanese Mega-Star (Bloomberg) -- CG-rendered pop star, singing crowdsourced songs. Crucial to Miku’s success is the ability for devotees to purchase the Yamaha-powered Vocaloid software and write their own songs for the star to sing right back at them. Fans then can upload songs to the web and vie for the honor of having her perform them at “live” gigs, in which the computer-animated Miku takes center stage, surrounded by human guitarists, drummers and pianists. This is fantastic. (via Slashdot)

Four short links: 1 November 2017

Crypto Docs, Ultrasound, Anti-Innovation Investors, and IoT Security

  1. Airborn OS -- attempt to do an open source Google Docs with crypto.
  2. ButterflyIQ -- ultrasound on a chip. IEEE covers it: announced FDA clearance for 13 clinical applications, including cardiac scans, fetal and obstetric exams, and musculoskeletal checks. Rather than using a dedicated piece of hardware for the controls and image display, the iQ works with the user’s iPhone. The company says it will start shipping units in 2018 at an initial price of about $2,000. See also adding orientation to ultrasound to turn 2D into 3D.
  3. Innovation vs. Activist Investors (Steve Blank) -- "activist investor" is all about financial games to transfer cash from banks to the investors, by loading the company with debt. The bad news is that, once they take control of a company, activist investors’ goal is not long-term investment. They often kill any long-term strategic initiatives. Often, the short-term cuts directly affect employee salaries, jobs, and long-term investment in R&D. The first things to go are R&D centers and innovation initiatives. They don't want genuine growth; they want fake growth that leaves the company weaker.
  4. Security, Privacy, and the Internet of Things (Matt Webb) -- if I meet a startup that has spent ages on its security, pre getting some real customer traction, I am going to be nervous that they have over-engineered the product and won't be able to iterate. The product will be too brittle or too rigid to wiggle and iterate and achieve fit. So, it's a balance.

Four short links: 31 October 2017

AI for Databases, One-Pixel Attacks, Adtech Uncanny Valley, and Mindreading Video

  1. Inference and Regeneration of Programs that Manipulate Relational Databases -- We present a new technique that infers models of programs that manipulate relational databases. This technique generates test databases and input commands, runs the program, then observes the resulting outputs and updated databases to infer the model. Because the technique works only with the externally observable inputs, outputs, and databases, it can infer the behavior of programs written in arbitrary languages using arbitrary coding styles and patterns.
  2. One-Pixel Attack for Fooling Deep Neural Networks -- The results show that 73.8% of the test images can be crafted to adversarial images with modification just on one pixel with 98.7% confidence on average.
  3. Facebook Is Not Listening To You -- but we are deep in the adtech uncanny valley.
  4. Recovering Video from fMRI -- the video and stills are impressive. (Still a blurry black-and-white picture and a set of guessed possible labels.)

Four short links: 30 October 2017

README Maturity Model, Open Source Project Maturity Model, Walmart Robots, and Sparse Array Database

  1. README Maturity Model -- from bare minimum to purpose.
  2. Apache's Open Source Project Maturity Model -- It does not describe all the details of how our projects operate, but aims to capture the invariants of Apache projects and point to additional information where needed.
  3. Walmart is Getting Robots -- The retailer has been testing the robots in a small number of stores in Arkansas and California. It is now expanding the program and will have robots in 50 stores by the end of January.
  4. TileDB -- manages massive dense and sparse multi-dimensional array data that frequently arise in important scientific applications.

Four short links: 27 October 2017

Gentle PR, Readable Arxiv, Sentiment Bias, and AI Coding from Sketches

  1. Tick Tock List (Matt Webb) -- simple and good advice for building working relationships with journalists.
  2. Arxiv Vanity -- renders papers from Arxiv as responsive web pages so you don't have to squint at a PDF.
  3. Sentiment Analysis Bias -- By classifying the sentiment of words using GloVe, the researchers "found every linguistic bias documented in psychology that we have looked for." Unsurprising, since the biases are present in the people who generate the text from which these systems are trained.
  4. AI Turns Sketched Interfaces into Prototype Code -- We built an initial prototype using about a dozen hand-drawn components as training data, open source machine learning algorithms, and a small amount of intermediary code to render components from our design system into the browser. We were pleasantly surprised with the result.

Four short links: 26 October 2017

License Plates, Speech Recognition, Social Proof, and Engineering Growth

  1. FMTYEWTK About Home-Made License Plate Readers -- this chap, horrified by an $86M government project, built a prototype in 57 lines of code. He talks here about the shortcomings of the prototype, and along the way you learn a lot about ALPR, Automated License Plate Recognition. So if $1 million gets you to 80% accuracy, and maybe $10 million gets you to 90% accuracy—when do you stop spending?
  2. Speech Recognition Is Not Solved -- The recent improvements on conversational speech are astounding. But, the claims about human-level performance are too broad. Below are a few of the areas that still need improvement.
  3. Social Proof -- five principles of social proof: 1. Avoid negative social proof; 2. Combine social proof with authority; 3. Combine social proof with scarcity; 4. Social proof works best with similar people; 5. Boost social proof with user-generated content.
  4. Engineering Growth Frameworks -- Documentation for Medium’s professional growth framework. Super useful for engineering organizations that don't yet have their own.

Four short links: 25 October 2017

Simpson's Paradox, Attention Economics, Dynamic Programming, and Retro Unit Testing

  1. Simpson's Paradox in Behavioral Data -- current behavioral data is highly heterogeneous: it is collected from subgroups that vary widely in size and behavior. Heterogeneity is evident in practically all social data sets and can be easily recognized by its hallmark, the long-tailed distribution. The prevalence of some trait in these systems, whether the number of followers in an online social network, or the number of words used in an email, can vary by many orders of magnitude, making it difficult to compare users with small values of the trait to those with large values. As shown in this paper, heterogeneity can dramatically distort conclusions of analysis.
  2. The Economics of Attention Markets -- Based on conservative estimates, in 2016 a typical American adult spent about 4.9 hours of a day focused mainly on consuming content from these media properties. That amounted to about 437 billion hours for all adults. Advertisers paid roughly $199 billion that year to media businesses to deliver messages to those consumers during those hours. That is the market for attention. Consumers supply time—their attention—to the market in return for content that entertains or informs them. Advertisers demand attention so they can deliver messages that will increase their sales and profits. Attention platforms—ad-supported media businesses—broker the connections between consumers and advertisers. This paper provides a primer on the economics of this market.
  3. Dynamic Programming from First Principles -- a readable introduction to a subject we covered in my third-year CS analysis of algorithms class.
  4. ZX-Spec -- a unit testing framework for Sinclair ZX Spectrum assembly. I boggle.
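
A small worked example may make the Simpson's paradox item above concrete. The counts below are invented for illustration and do not come from the paper: variant A does better than variant B within each subgroup, yet B looks better once the subgroups are pooled, because the subgroups differ so much in size and baseline behavior.

```python
# Made-up counts of (successes, trials) per subgroup and variant.
data = {
    "light users": {"A": (90, 100), "B": (800, 1000)},
    "heavy users": {"A": (300, 1000), "B": (20, 100)},
}


def subgroup_rate(group, variant):
    """Success rate of one variant inside one subgroup."""
    successes, trials = data[group][variant]
    return successes / trials


def pooled_rate(variant):
    """Success rate of one variant with all subgroups lumped together."""
    successes = sum(group[variant][0] for group in data.values())
    trials = sum(group[variant][1] for group in data.values())
    return successes / trials


for group in data:
    print(group, {v: round(subgroup_rate(group, v), 2) for v in ("A", "B")})
print("pooled", {v: round(pooled_rate(v), 2) for v in ("A", "B")})
```

The reversal happens because A was mostly tried on heavy users, who succeed less often whichever variant they see, so the pooled comparison mostly measures who was in each bucket.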

Four short links: 24 October 2017

If You Build It, Bias not Behavior, Cognitive Biases, and Future of SaaS Businesses

  1. Your First Ten Customers (Stripe) -- If you build it, they will do absolutely nothing.
  2. Using Sensors to Show that Men and Women are Treated Differently At Work (HBR) -- We collected email communication and meeting schedule data for 500 employees in one office, across all five levels of seniority, over the course of four months. We then gave 100 of these individuals sociometric badges, which allowed us to track in-person behavior. These badges, which look like large ID badges and are worn by all employees, record communication patterns using sensors that measure movement, proximity to other badges, and speech (volume and tone of voice, but not content). They can tell us who talks with whom, where people communicate, and who dominates conversations. [...] Our analysis suggests that the difference in promotion rates between men and women in this company was due not to their behavior but to how they were treated. This indicates that arguments about changing women’s behavior—to “lean-in,” for example—might miss the bigger picture: gender inequality is due to bias, not differences in behavior.
  3. Cognitive Biases in Programming -- the planning fallacy remains, for me, the defining characteristic of programmers. When you stumble blinking back into the light and realize you've spent 18 hours trying to make bug-free a process that saves you five minutes every two months, you have programmerbrain.
  4. Rising Table Stakes in SaaS (Tomasz Tunguz) -- software has eaten the world, the low-hanging fruit has been picked, and your competition is smarter/faster/cheaper versions of yourself and not bloated dinosaur client-server incumbents. What's next? (No, really, tell me what's next. I'm @gnat on Twitter.)

Four short links: 23 October 2017

Web Performance, Fact Checks, DIY Computer, and Performance Reviews

  1. Real-world Web Performance Budgets (Alex Russell) -- JavaScript is the single most expensive part of any page in ways that are a function of both network capacity and device speed. For developers and decision-makers with fast phones on fast networks, this is a double-whammy of hidden costs. [...] 45% of mobile connections occur over 2G worldwide. 75% of connections occur on either 2G or 3G.
  2. Fact Checks -- If you have a web page that reviews a claim made by others, you can include a ClaimReview structured data element on your web page. This element enables Google Search results to show a summarized version of your fact check when your page appears in search results for that claim.
  3. Reform -- a portable personal computer that you can: repair by yourself with parts from the hardware store or 3D printing; thoroughly understand on any level; take apart, modify, and upgrade without regret; adapt to your tastes and use cases, staying with you for many years.
  4. GE's New Performance Reviews -- The company got rid of formal, forced ranking around 10 years ago. But now, GE’s in the middle of a far bigger shift. It’s abandoning formal annual reviews and its legacy performance management system for its 300,000-strong workforce over the next couple of years, instead opting for a less regimented system of more frequent feedback via an app. For some employees, in smaller experimental groups, there won’t be any numerical rankings whatsoever. [...] There’s an emphasis on coaching throughout, and the tone is unrelentingly positive. The app forces users to categorize feedback in one of two forms: to continue doing something, or to consider changing something. (via Next:Economy newsletter)
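
For the Fact Checks item above, here is a rough sketch of what a ClaimReview element might look like when generated from Python. The property names follow the schema.org ClaimReview and Rating types; the URL, claim text, organization name, and rating values are placeholders invented for the example, and Google's documentation should be checked for which fields it actually requires.

```python
import json

# Placeholder values throughout; only the property names come from schema.org.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example.com/fact-checks/moon-cheese",
    "claimReviewed": "The moon is made of cheese",
    "author": {"@type": "Organization", "name": "Example Fact Checkers"},
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "bestRating": 5,
        "worstRating": 1,
        "alternateName": "False",
    },
}


def as_script_tag(data):
    """Wrap structured data in the JSON-LD script tag that goes in the page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(as_script_tag(claim_review))
```

The printed script element would be placed in the HTML of the page that reviews the claim.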