Four Short Links

Nat Torkington's eclectic collection of curated links.

Four short links: 23 July 2018

State Sponsored Trolling, Public Standards, Explorable Explanations, and iOS Network Debugging

  1. State Sponsored Trolling (Institute For The Future) -- authoritarians around the world have mastered social media. Bloomberg did some great follow-up work on the IFTF report. (via Cory Doctorow)
  2. Public Resource Wins Right to Publish Standards Used in Law -- The question in this case is whether private organizations whose standards have been incorporated by reference can invoke copyright and trademark law to prevent the unauthorized copying and distribution of their works. [...] Because the district court erred in its application of both fair use doctrines, we reverse and remand, leaving for another day the far thornier question of whether standards retain their copyright after they are incorporated by reference into law.
  3. Explorable Explanations -- explanations and simulators for things to help you learn them. Regular readers will know I'm a huge fan of simulations as learning tools.
  4. Wormholy -- debug network traffic in iOS apps from within the app: Add it to your project, and that's all! Shake your device or your simulator and Wormholy will appear. In case, for whatever reason, the Charles proxy doesn't do it for you.

Four short links: 20 July 2018

Convolutional Architectures, GPU Language, Acoustic Scenes, and Cybersecurity Numbers

  1. DARTS: Differentiable Architecture Search -- our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. And runs on a single GPU. Open source.
  2. The Spiral Language -- a functional language designed for GPUs by emphasizing inlining (GPUs don't have great stacks, so compilers have to handle subroutines carefully and differently than traditional architectures). Inlining is a trade-off that expresses the exchange of memory for computation. It should be the default instead of heap allocating.
  3. DCASE: Detection and Classification of Acoustic Scenes and Events -- workshops and a community for the researchers working on making sense of audio.
  4. Cybersecurity: Data, Statistics, and Glossaries (FAS) -- This report describes data and statistics from government, industry, and information technology (IT) security firms regarding the current state of cybersecurity threats in the United States and internationally. These include incident estimates and costs, and annual reports on data security breaches, identity thefts, cybercrimes, malware, and network securities.

Four short links: 19 July 2018

Microrobotics, Adaptive Chips, ACM Ethics, and Data Journalism

  1. DARPA's Insect-Scale Robot Olympics (IEEE) -- Yesterday, DARPA announced a new program called SHRIMP: SHort-Range Independent Microrobotic Platforms. The goal is “to develop and demonstrate multi-functional micro-to-milli robotic platforms for use in natural and critical disaster scenarios.”
  2. DARPA Changing How Electronics Are Made (IEEE) -- Step two, to be kicked off at the summit, is something we call “software-defined hardware.” That’s where the hardware is smart enough to reconfigure itself to be the type of hardware you want, based on an analysis of the data type that you’re working on. In that case, the very hard thing is to figure out how to do that data introspection, how to reconfigure the chip on a microsecond or millisecond timescale to be what you need it to be. And more importantly, it has to monitor whether you’re right or not, so that you can iterate and be constantly evolving toward the ideal solution.
  3. ACM Updates Ethics Code -- ACM revised their code of ethics to include references to emerging technology, discrimination, and data policy. They're also releasing case studies and an Ask An Ethicist advice column to help people understand how to apply the principles.
  4. Data Journalism Workshop Notes -- Harkanwal Singh gave a workshop on data journalism, which yielded these excellent notes via Liza Bolton.

Four short links: 18 July 2018

Program Synthesis, Climate Change, Remote Teams, and Go Memory Management

  1. Program Synthesis in 2018 -- this is a readable and deeply informative guide to the state of the art in program synthesis (generating programs from specifications). I'm highly interested in this field, as it's a possible future of programming, and when advances are made in useful areas, it will be highly disruptive.
  2. Lights Out: Climate Change Risk to Internet Infrastructure -- We align the data formats and assess risks in terms of the amount and type of infrastructure that will be under water in different time intervals over the next 100 years. We find that 4,067 miles of fiber conduit will be under water and 1,101 nodes (e.g., points of presence and colocation centers) will be surrounded by water in the next 15 years. We further quantify the risks of sea level rise by defining a metric that considers the combination of geographic scope and internet infrastructure density. We use this metric to examine different regions and find that the New York, Miami, and Seattle metropolitan areas are at highest risk.
  3. Managing Your Remote Developer Team if You're Non-Technical -- I feel like this applies to technical remote managers, too.
  4. Getting to Go: Memory Management and Garbage Collection -- The Go language features, goals, and use cases have forced us to rethink the entire garbage collection stack and have led us to a surprising place. The journey has been exhilarating. This talk describes our journey. Detailed and for a technical audience.

Four short links: 17 July 2018

Sizing Teams, Publishing Incentives, Serverless Experience, and Configuration Languages

  1. Sizing Engineering Teams -- Teams should be six to eight during steady state. To create a new team, grow an existing team to eight to 10, and then bud into two teams of four or five. Never create empty teams. Never leave managers supporting more than eight folks.
  2. Cockygate -- in which a somewhat amusing lawsuit between Kindle Unlimited authors is pulled apart, and the incentives that drove weird behaviour are laid bare.
  3. AWS Kinesis with Lambdas: Lessons Learned -- These are our learnings from building a fully reactive serverless pipeline on AWS. See also the Hacker News comments with some other thoughtful heavy users sharing their cautionary tales.
  4. Dhall -- A configuration language guaranteed to terminate—useful for specifying Kubernetes, etc., configurations.

Four short links: 16 July 2018

Automate Everything, TDD Retention, Automating Programming, and System Design

  1. Unfollowing Everybody (Anil Dash) -- Anil has a good way of dealing with overload, but that's not the only reason I list it. Note how his method requires automation. A system that can't be automated is a prison.
  2. A Longitudinal Cohort Study on the Retainment of Test-Driven Development -- The use of TDD has a statistically significant effect neither on the external quality of software products nor on the developers’ productivity. However, we observed that participants using TDD produced significantly more tests than those applying a non-TDD development process, and that the retainment of TDD is particularly noticeable in the number of tests written.
  3. What ML Means for Software Development (Lorica, Loukides) -- a subject dear to my heart. I can't wait for software development to be improved. Good software developers have always sought to automate tedious, repetitive tasks; that’s what computers are for. It should be no surprise that software development itself will increasingly be automated.
  4. Learn How to Design Large-Scale Systems -- This repo is an organized collection of resources to help you learn how to build systems at scale. It even has Anki flashcards to help you prep for the exam.

Four short links: 13 July 2018

Technology Change, Rebuild Warnings, Google Cloud Platform, and Vale Guido

  1. Five Things We Need to Know About Technological Change (Neil Postman) -- a 1998 talk that just nailed it. (1) culture always pays a price for technology; (2) the advantages and disadvantages of new technologies are never distributed evenly among the population; (3) every technology has a philosophy that is given expression in how the technology makes people use their minds, in what it makes us do with our bodies, in how it codifies the world, in which of our senses it amplifies, in which of our emotional and intellectual tendencies it disregards; (4) A new medium does not add something; it changes everything; (5) media tends to become mythic. (via Daniel G. Siegel)
  2. Five Red Flags Signaling Your Rebuild Will Fail -- No clear executive vision for the value of a rebuild; You’re going for the big cutover rewrite; The rebuild has slower feature velocity than the legacy system; You aren’t working with people who were experts in the old system; You’re planning to remove features because they’re hard.
  3. Good, Bad, and Ugly of Google Cloud Platform -- informative, and well-written—e.g., While GCP services exhibit strong consistency, I can’t always say the same thing for the documentation.
  4. Guido Takes "Permanent Vacation" as Python's BDFL -- prompted in part by a particularly contentious language change proposal. I don't ever want to have to fight so hard for a PEP and find that so many people despise my decisions. I would like to remove myself entirely from the decision process. [...] I'll still be here, but I'm trying to let you all figure something out for yourselves. I'm tired, and need a very long break. Thanks for your years of service, Guido.

Four short links: 12 July 2018

Debugging, Just Code, Causal Inference, and Infosec

  1. Why Isn't Debugging Treated as a First-Class Activity? (Robert O'Callahan) -- Another of my theories is that many developers have abandoned interactive debuggers because they're a very poor fit for many debugging problems (e.g., multiprocess, time-sensitive, and remote workloads—especially cloud and mobile applications). Debugging isn't really taught at schools, either. It's an odd forensic science. What are your favourite debugging tutorials, papers, or books? Let me know: @gnat.
  2. Just Code Challenge -- I'm a little late, but it's still a good idea. The idea is for you to make one program (or app) a week throughout the summer. These apps don’t have to do anything fancy, although they should do something that is at least a little bit useful or fun. Any type of app counts—desktop, iOS, or web.
  3. Causal Inference Book -- The book is divided in three parts of increasing difficulty: causal inference without models, causal inference with models, and causal inference from complex longitudinal data.
  4. -- A collection of information security essays and links to help growing teams manage risks.

Four short links: 11 July 2018

Metadata, AI Strategies, Program Synthesis, and Text-Based Browser

  1. You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information -- We demonstrate that through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates, we increase the accuracy of the model to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%. (via Wired UK)
  2. Overview of National AI Strategies -- where each country is at, what their goals are, etc.
  3. Building a Program Synthesizer -- Build a program synthesis tool, to generate programs from specifications, in 20 lines of code using Rosette. I'm interested in work people are doing to automatically create software. Like this example, most packages are still in a math-like larval stage. It's going to be interesting once they cross from "looks like a 1980s AI course" to "looks like Gmail".
  4. Browsh -- a text-based browser that uses the Firefox engine underneath (but rendering to text).
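
The metadata paper in item 1 quotes two figures: accuracy when only the single best guess counts, and accuracy when the true user may be anywhere in the 10 most likely candidates. A minimal sketch of that top-k measurement follows. The names, rankings, and the `top_k_accuracy` helper are hypothetical illustrations, not the paper's code or data.

```python
def top_k_accuracy(rankings, truths, k):
    """Fraction of cases whose true user appears among the first k candidates."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


# Hypothetical ranked guesses for four posts (most likely candidate first).
rankings = [
    ["alice", "bob", "carol"],
    ["bob", "alice", "carol"],
    ["carol", "alice", "bob"],
    ["alice", "carol", "bob"],
]
# Hypothetical true authors of those four posts.
truths = ["alice", "alice", "carol", "bob"]

top1 = top_k_accuracy(rankings, truths, k=1)
top2 = top_k_accuracy(rankings, truths, k=2)
top3 = top_k_accuracy(rankings, truths, k=3)
print(top1, top2, top3)
```

Widening k can only keep or raise the score, which is why the paper's top-10 figure is higher than its top-1 figure.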

Four short links: 10 July 2018

Troubling Trends, Satellite Imagery, Management and Autonomy, and Brutalist Web Design

  1. Troubling Trends in Machine Learning Scholarship -- In this paper, we focus on the following four patterns that appear to us to be trending in ML scholarship: (i) Failure to distinguish between explanation and speculation. (ii) Failure to identify the sources of empirical gains—e.g., emphasizing unnecessary modifications to neural architectures when gains actually stem from hyper-parameter tuning. (iii) Mathiness: the use of mathematics that obfuscates or impresses rather than clarifies—e.g., by confusing technical and non-technical concepts. (iv) Misuse of language—e.g., by choosing terms of art with colloquial connotations or by overloading established technical terms.
  2. RoboSat -- Mapbox open-sourced their machine learning system that does semantic segmentation on aerial and satellite imagery. Extracts features such as: buildings, parking lots, roads, water.
  3. On Management and Autonomy -- in our experience, too many managers err on the side of mistrust. They follow the basic premise that their people may operate completely autonomously, as long as they operate correctly. This amounts to no autonomy at all. The only freedom that has any meaning is the freedom to proceed differently from the way your manager would have proceeded. So true! (Parents: this applies to children, as well)
  4. Brutalist Web Design -- a manifesto.

Four short links: 9 July 2018

DNA Neural Nets, Ethics of AI, Oblivious Search, and Physical Products

  1. Scaling up Molecular Pattern Recognition with DNA-based Winner-Take-All Neural Networks (Nature) -- they use two molecular bio techniques (DNA-strand-displacement and a "seesaw DNA gate motif") to implement a winner-takes-all type of neural network...with a soup of DNA. It recognizes digits from a 10x10 grid of pixels. The network successfully classified test patterns with up to 30 of the 100 bits flipped relative to the digit patterns "remembered" during training, suggesting that molecular circuits can robustly accomplish the sophisticated task of classifying highly complex and noisy information on the basis of similarity to a memory. (via The Next Web)
  2. The Ethics and Governance of Artificial Intelligence (MIT) -- video from three classes is online.
  3. Oblix: An Efficient Oblivious Search Index -- the new word I learned was "oblivious," meaning that the actions of the algorithm over encrypted data do not reveal which (encrypted) documents match the keyword being searched for. Paper a Day makes sense of their work.
  4. Sonos One and Amazon Alexa Teardowns -- using the physical product engineering to relate the smart speaker market positions of Sonos and Amazon. I work with mechanical and electrical engineers, and the invisible degrees of complexity in physical products always amaze me.
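
The winner-take-all classification in item 1 can be illustrated in software: score a noisy 100-bit pattern against each stored "memory" and let the highest score win. This is only a sketch of the general idea, assuming a plain bit-agreement count as the similarity measure. The two memories and the function names are hypothetical, and nothing here models the DNA chemistry used in the paper.

```python
def hamming_similarity(a, b):
    """Count positions where two equal-length bit lists agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def winner_take_all(pattern, memories):
    """Return the label of the stored memory most similar to the pattern."""
    scores = {label: hamming_similarity(pattern, mem) for label, mem in memories.items()}
    return max(scores, key=scores.get)


# Two hypothetical 100-bit memories (a 10x10 grid flattened to a list).
memories = {
    "left": [1] * 50 + [0] * 50,
    "right": [0] * 50 + [1] * 50,
}

# Corrupt the "left" memory by flipping its first 30 bits.
noisy = list(memories["left"])
for i in range(30):
    noisy[i] ^= 1

result = winner_take_all(noisy, memories)
print(result)
```

With 30 of 100 bits flipped, the noisy pattern still agrees with its source memory in 70 positions and with the other memory in only 30, so the original label wins.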

Four short links: 6 July 2018

REST vs. GraphQL, Chinese Sources, Popcorn Robots, and (Human) Learning Research

  1. Should You Migrate from REST to GraphQL? -- a nice precis of the good and bad parts of REST and GraphQL so you can make an informed decision about when to use each.
  2. Abacus News: Unboxing China -- interesting website that's a bit like TechMeme but for China, and more consumer focused. See also The ChinAI Newsletter where you can read Jeff Ding's weekly translations of writings on AI policy and strategy from Chinese thinkers—will also include general links to all things at the intersection of China and AI. (via CognitionX)
  3. Popcorn-Driven Robotic Actuators -- Popcorn is a cheap, biodegradable way to actuate a robot (once). Fun silliness.
  4. Self-Regulated Learning: Beliefs, Techniques, and Illusions -- In this review, we summarize recent research on what people do and do not understand about the learning activities and processes that promote comprehension, retention, and transfer. Share with the student or life-long learner in your life.

Four short links: 5 July 2018

Programming Language Ideas, Probability in Language, React Tutorial, and Open Plan Pain

  1. Papers on Programming Languages: Ideas from 1970s for Today -- I suspect a vanishingly small number of these are unimplementable in Perl 6.
  2. If You Say Something is Likely, How Likely Do People Think It Is? (HBR) -- more fascinating research into how people translate probabilities into language and back again. There is a serious possibility that you will enjoy this.
  3. React from Zero -- tutorial in the classic "just get something working, then hack on it" style. (via Simon Willison)
  4. The Impact of The "Open" Workspace on Human Collaboration? (Royal Society) -- Contrary to common belief, the volume of face-to-face interaction decreased significantly (approx. 70%) in both cases, with an associated increase in electronic interaction. In short, rather than prompting increasingly vibrant face-to-face collaboration, open architecture appeared to trigger a natural human response to socially withdraw from officemates and interact instead over email and IM.

Four short links: 4 July 2018

Engagement, Leadership, Code Viz, and Automation

  1. The Three Games of Customer Engagement Strategy -- know what the growth hacker behind your favorite apps is trying to get you to do. Either you are playing to win attention, transactions, or productivity.
  2. Founder to CEO: Matt's Book for Startups -- really good systems and mental models for being effective as a leader.
  3. A Human-Readable Interactive Representation of a Code Library -- The interactive document below is an alternate representation of Fuzzyset.js. I created it as an experiment to help me and other programmers understand the internal workings of the library. And I made it look like a page on GitHub to simulate what it might be like if these kinds of documents were commonly provided with programs.
  4. Manual Work Is a Bug -- Four phases: Document the steps, create automation equivalents, create automation, self-service and autonomous services.

Four short links: 3 July 2018

Automation and Employment, Matrices for Deep Learning, Tim Berners-Lee, and How to Read

  1. The Rise of the Robot Reserve Army: Automation and the Future of Economic Development, Work, and Wages in Developing Countries -- In an adaption of the Lewis model of economic development, the paper uses a simple framework in which the potential for automation creates “unlimited supplies of artificial labor,” particularly in the agricultural and industrial sectors due to technological feasibility. This is likely to create a push force for labor to move into the service sector, leading to a bloating of service-sector employment and wage stagnation but not to mass unemployment, at least in the short-to-medium term. (via Sam Kinsley)
  2. The Matrix Calculus You Need for Deep Learning -- We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks and wish to deepen their understanding of the underlying math.
  3. Solid: Recentralizing the Web -- Tim Berners-Lee's latest project. Solid (derived from "social linked data") is a proposed set of conventions and tools for building decentralized web applications based on linked data principles. (via Vanity Fair)
  4. How to Read (Robert Heaton) -- purposefully reading, with note-taking so you can write a review and build a memory deck.

Four short links: 2 July 2018

Soft Robots, Debugging Serverless, Map Privacy, and Building Footprints

  1. Adaptive and Resilient Soft Tensegrity Robots -- neat idea for soft robots that invent gaits with a minimum of physical trials, and the video is cute.
  2. Debugging Serverless -- what observability means in this context, and the things you have to pay attention to if you want observability.
  3. Apple Maps and Privacy -- buried in this piece on Apple rebuilding its Maps data using cars driving the streets: “We specifically don’t collect data, even from point A to point B,” notes Cue. “We collect data—when we do it—in an anonymous fashion, in subsections of the whole, so we couldn’t even say that there is a person who went from point A to point B. We’re collecting the segments of it. As you can imagine, that’s always been a key part of doing this. Honestly, we don’t think it buys us anything [to collect more]. We’re not losing any features or capabilities by doing this.”
  4. U.S. Building Footprints -- This data set contains 124,885,597 computer-generated building footprints in all 50 U.S. states. This data is freely available for download and use. Contributed by Microsoft. (via Bing blog)

Four short links: 29 June 2018

Stereotype Framework, Humanoid Robots, Dark Patterns, and Retro Terminal Emulator

  1. A Model of (Often Mixed) Stereotype Content: Competence and Warmth Respectively Follow From Perceived Status and Competition -- you can position stereotypes on two axes, warmth and competence. In different quadrants, status and competition predict the response to those stereotypes.
  2. Asimo Retired -- it sounds like the company wants to start focusing on how to apply the technology that it has to make robots that don't just promote its brand, but actually help out with things like elder care and disaster relief. Honda has done a lot of work on Asimo, but, as the article says, mainly for brand building. It was an iconic humanoid robot. I wonder if this represents (the beginning of?) the end of charismatic anthropomorphic robots.
  3. Deceived by Design -- dark patterns in Google, Facebook, and Windows products.
  4. Cool Retro Term -- open source Cathode-like app for Linux and Mac, emulating the old glass terminals with pixelated bitmap fonts making flickering green text on a black curved screen.

Four short links: 28 June 2018

Migrating Storage, Secure Coding, Barbie Roboticist, and Internet Developments

  1. Migrating Messenger Storage (Facebook Engineering) -- Once we decided to update the service and move to MyRocks, migrating data between storage systems while keeping Messenger up and running for more than one billion accounts proved to be an interesting challenge. It's amazing how much effort it takes to keep something looking the same.
  2. Secure Coding Practices in Java -- Researchers looked at StackOverflow answers. [W]e identified security vulnerabilities in the suggested code of accepted answers. The vulnerabilities included using insecure hash functions such as MD5, breaking SSL/TLS security through bypassing certificate validation, and insecurely disabling the default protection against Cross Site Request Forgery (CSRF) attacks. (via Paper a Day)
  3. Barbie's Latest Career is Robotics Engineering (Engadget) -- As of today, six free coding experiences are now available, as is a new STEM-themed doll -- Robotics Engineer Barbie. The lessons are geared toward beginners, kindergarten-aged and older, and aim to teach logic, problem-solving, and the basics of coding.
  4. Another 10 Years Later (Geoff Huston) -- what’s new, what’s old, and what’s been forgotten in another decade of the internet’s evolution. Very interesting for a software person like me to catch up on what's new in the networking world.

Four short links: 27 June 2018

Value Judgements, Bank Hacking, SDR for Engineers, and Licensing Bugs

  1. Automated Fact-Value Distinction in Court Opinions -- In an application, we show that the value segments of opinions are more informative than fact segments of the ideological direction of U.S. Circuit Court opinions.
  2. ATMs That Spray Money (Bloomberg) -- an entertaining tale of a criminal operation that spear phishes bank employees to get them to install malware, then takes over ATMs and (at a designated hour when their bag men are standing by) makes the ATMs spew money ... and the cops who caught them.
  3. Software Defined Radio for Engineers -- free book that aims to provide a hands-on learning experience using SDR for engineering students and industry practitioners who are interested in mastering the design, implementation, and experimentation of a communication system.
  4. Why Licensing Bugs Matter -- In this work, we report a study aimed at characterizing licensing bugs by (i) building a catalog categorizing the types of licensing bugs developers and other stakeholders face, and (ii) understanding the implications licensing bugs have on the software projects they affect. The presented study is the result of the manual analysis of 1,200 discussions related to licensing bugs carried out in issue trackers and in five legal mailing lists of open source communities. Our findings uncover new types of licensing bugs not addressed in prior literature, and a detailed assessment of their implications.

Four short links: 26 June 2018

Tracking Dots, Stuff That Matters, Reputation, and Pen Testing

  1. DEDA -- tracking Dots Extraction, Decoding, and Anonymization toolkit. Read and decode the tracking dots that commercial laser printers insert, and anonymize your own documents. The paper behind it is a good read, too.
  2. Episode 51: A Conversation with Tim O’Reilly -- podcast, with transcript. I’m really interested in framing the questions around life, liberty, and the pursuit of happiness for everyone. How does technology help us do that? How does it make it a better world, for everyone? Not just for a few, but for everyone? And if that’s our goal, how do we think differently about technology?
  3. Reputation System for Artificial Societies -- Understanding the principles of consensus in societies and finding ways to make consensus more reliable becomes critically important as connectivity and interaction speed increase in modern distributed systems of hybrid collective intelligences, which include both humans and computer systems. We propose a new form of reputation-based consensus with greater resistance to reputation gaming than current systems have. We discuss options for its implementation and provide initial practical results. I'm interested in all attempts to develop around empathy, consensus, and productive online engagement.
  4. SWORD Dropbox -- $15 OpenWRT-based DIY disposable pen-test tool.