- Learning to Generalize from Sparse and Underspecified Rewards (Google) -- the paper considers settings where an agent receives a complex input, such as a natural language instruction, and needs to generate a complex response, such as an action sequence, while only receiving binary success-failure feedback. Such success-failure rewards are often underspecified: they do not distinguish between purposeful and accidental success. Generalization from underspecified rewards hinges on discounting spurious trajectories that attain accidental success, while learning from sparse feedback requires effective exploration. [...] The MeRL approach outperforms our alternative reward learning technique based on Bayesian Optimization, and achieves the state-of-the-art on weakly-supervised semantic parsing. It improves previous work by 1.2% and 2.4% on WikiTableQuestions and WikiSQL datasets respectively. This is an important area of machine learning, because most successes and failures don't come with a root cause analysis.
- Generating Combinations -- Gosper's Hack is a very elegant piece of code for generating combinations. I love hacks like this. (This one first appeared in the classic MIT text, HAKMEM. Gosper is Bill Gosper, who also discovered the Game of Life glider gun, among his many claims to fame.)
- Manifold's Decision-Making Process -- there's nothing specific to Manifold here; this is just good advice about knowing who is making a decision and then involving people according to the consequence and irreversibility of the decision. Every organisation has to learn how to make decisions before its dysfunction grinds progress to a halt.
- Workism is Making Americans Miserable (The Atlantic) -- The economists of the early 20th century did not foresee that work might evolve from a means of material production to a means of identity production. They failed to anticipate that, for the poor and middle class, work would remain a necessity; but for the college-educated elite, it would morph into a kind of religion, promising identity, transcendence, and community. Call it workism. The punchline is great, and the journey there is hard to argue with: The vast majority of workers are happier when they spend more hours with family, friends, and partners, according to research. Work is not that.
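
For the curious, here is a minimal Python sketch of the idea behind Gosper's Hack from the Generating Combinations item: given a bitmask with k bits set, a few bit operations produce the next-larger integer with the same number of set bits. The function names `gosper_combinations` and `mask_to_indices` are made up for this illustration and do not come from the linked article, which may present the trick differently.

```python
def gosper_combinations(n, k):
    """Yield each k-element subset of range(n) as a bitmask, in increasing order.

    Illustrative sketch only; the name and interface are assumptions.
    """
    if k < 0 or k > n:
        return
    if k == 0:
        yield 0  # the empty subset is the only choice
        return
    x = (1 << k) - 1   # smallest mask with exactly k bits set
    limit = 1 << n     # masks at or above this use bits outside range(n)
    while x < limit:
        yield x
        c = x & -x     # isolate the lowest set bit
        r = x + c      # carry clears the lowest run of ones and sets the bit above it
        # The remaining ones from that run are moved down to the bottom of the mask.
        x = (((r ^ x) >> 2) // c) | r


def mask_to_indices(mask):
    """Convert a bitmask into the sorted list of bit positions that are set."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


if __name__ == "__main__":
    for mask in gosper_combinations(4, 2):
        print(format(mask, "04b"), mask_to_indices(mask))
```

Python's integers are arbitrary precision, so this sketch has no word-size limit; in C or assembly the same trick is bounded by the width of the machine word.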