Four short links: 25 December 2019
Multiple Regexps, NLP Beyond English, MSR Roundup, and Embedded Linux
- Hyperscan — Intel’s library for quickly testing a string against multiple regexps at once.
- Natural Language Isn’t Just English — English isn’t a great representative of the diversity of languages in the world: It’s a spoken language, not a signed language; it has a well-established, long-used roughly phone-based orthographic system; … with white space between words; … using (mostly) only lower-ascii characters; it has relatively little morphology; and, thus, fewer forms of each word; it has relatively fixed word order, etc. It just happens to have a massive training set.
- Microsoft Research 2019 Reflection — roundup of MSR’s work in ML, ethics, UI, security, and open source.
- Buildroot — a simple, efficient and easy-to-use tool to generate embedded Linux systems through cross-compilation.
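
To make the Hyperscan item concrete, here is a minimal sketch of the problem it addresses: deciding which of several regexps match a given string. This is not Hyperscan's API; it uses only Python's standard `re` module, and the pattern names and expressions are made up for illustration. The naive approach below searches the input once per pattern, whereas Hyperscan compiles the whole pattern set together so the input can be scanned against all of them simultaneously.

```python
import re

# Hypothetical pattern set; the names and expressions are illustrative only.
PATTERNS = {
    "ipv4": r"\b\d{1,3}(?:\.\d{1,3}){3}\b",
    "hex": r"\b0x[0-9a-fA-F]+\b",
    "email": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
}

# Compile each pattern once up front so repeated calls do not recompile.
COMPILED = {name: re.compile(expr) for name, expr in PATTERNS.items()}


def matching_patterns(text):
    """Return the sorted names of every pattern that matches somewhere in text.

    Naive baseline: one search over the input per pattern.
    """
    return sorted(name for name, rx in COMPILED.items() if rx.search(text))


if __name__ == "__main__":
    print(matching_patterns("ping 10.0.0.1 from bob@example.com"))
    print(matching_patterns("value 0xFF"))
```

The cost of this baseline grows with the number of patterns, which is the case a multi-pattern engine is meant to handle better.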