Four short links: 30 August 2017

IRS Mining, Government as a Platform, Developing for Alexa, and Testing Reading Comprehension

By Nat Torkington

August 30, 2017

The IRS Is Mining Taxpayer Data On Social Media — Although historically, the IRS chose tax returns to audit based on internal mathematical mistakes or mismatches with third-party reports (such as W-2s), the IRS is now engaging in data mining of public and commercial data pools (including social media) and creating highly detailed profiles of taxpayers upon which to run data analytics. This article argues that current IRS practices, mostly unknown to the general public, are violating fair information practices. (via Slashdot)
Government as a Platform, Tranche 1 (Pia Waugh) — One of the dangers is that if you see something better than what you have, and assume it to be sufficient, then you miss the opportunity to leapfrog. I like Pia’s approach here (identify three concepts that came from the user research, and then from all the different things that users were trying to do, from their needs we identified a couple of juicy examples that would show and help us test those concepts).

Why it’s Hard to Develop a Conversational Alexa Skill — Alexa’s interaction model is not conversation-friendly. […] Lack of context is a common problem. […] Alexa’s language model has fewer problems with interpreting “female,” but it is not unusual to see “email” in the transcripts. This article has a lot of good detail about what goes wrong.
Adversarial Examples for Evaluating Reading Comprehension Systems — Our method tests whether systems can answer questions about paragraphs that contain adversarially inserted sentences, which are automatically generated to distract computer systems without changing the correct answer or misleading humans. In this adversarial setting, the accuracy of 16 published models drops from an average of 75% F1 score to 36%; when the adversary is allowed to add ungrammatical sequences of words, average accuracy on four models decreases further to 7%. This suggests that current software doesn’t understand the text; it merely performs well on the tests. So…as good as most high school students, then?
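
To make the F1 numbers in that last item concrete, here is a minimal sketch of a token-overlap F1 score, plus a toy illustration of how one inserted sentence can fool a shallow reader. Everything here is an assumption for illustration: the names `token_f1` and `toy_reader`, the made-up paragraph, and the word-overlap "model" are mine, not the paper's systems or its evaluation script.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens, punctuation dropped."""
    return re.findall(r"\w+", text.lower())


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall between two answer strings."""
    pred, ref = tokens(prediction), tokens(gold)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def toy_reader(question, sentences):
    """Hypothetical shallow 'model': pick the sentence sharing the most words
    with the question, then answer with that sentence's first two words."""
    q = set(tokens(question))
    best = max(sentences, key=lambda s: len(q & set(tokens(s))))
    return " ".join(re.findall(r"\w+", best)[:2])


# Invented example data (fictional people and book).
question = "Who wrote the novel Glass Harbor?"
gold = "Ada Quill"
original = [
    "Ada Quill wrote the novel Glass Harbor in 1901.",
    "It sold poorly at first.",
]
# The distractor echoes the question's wording but does not change the
# correct answer for a human reader.
adversarial = original + [
    "Tom Vane, who wrote reviews, called the novel Glass Harbor dull."
]

clean_score = token_f1(toy_reader(question, original), gold)
attacked_score = token_f1(toy_reader(question, adversarial), gold)
print(clean_score, attacked_score)
```

The toy reader answers correctly on the original paragraph and latches onto the distractor once it is appended, which is the failure pattern the item describes, only at cartoon scale.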
