Ben Lorica

Ben Lorica is the Chief Data Scientist and Director of Content Strategy for Data at O'Reilly Media, Inc. He has applied Business Intelligence, Data Mining, Machine Learning and Statistical Analysis in a variety of settings including Direct Marketing, Consumer and Market Research, Targeted Advertising, Text Mining, and Financial Engineering. His background includes stints with an investment management company, internet startups, and financial services.

Where 2.0: The State of the Geospatial Web
by Brady Forrest, Ben Lorica, Roger Magoulas, Andrew Turner
June 2009
Ebook: $399.00

Virtual Worlds: A Business Guide
by Ben Lorica, Roger Magoulas
June 2009
Ebook: $249.00

Twitter and the Micro-Messaging Revolution: Communication, Connections, and Immediacy--140 Characters at a Time
by Abdur Chowdhury, Gregor Hochmuth, Ben Lorica, Roger Magoulas, Sarah Milstein, Tim O'Reilly
June 2009
Ebook: $99.00

Recent Posts | All O'Reilly Posts

Ben blogs at:

Welcome to Intelligence Matters

May 14 2014

Editor’s note: this post was co-authored by Ben Lorica and Roger Magoulas. Today we’re kicking off Intelligence Matters (IM), a new series exploring current issues in artificial intelligence, including the connection between artificial intelligence, human intelligence and the brain. IM … read more

The re-emergence of time-series

April 09 2013

My first job after leaving academia was as a quant for a hedge fund, where I performed (what are now referred to as) data science tasks on financial time-series. I primarily used techniques from probability & statistics, econometrics, and … read more

An update on in-memory data management

February 21 2013

By Ben Lorica and Roger Magoulas. We wanted to give you a brief update on what we’ve learned so far from our series of interviews with players and practitioners in the in-memory data management space. A few preliminary themes have … read more

Seven reasons why I like Spark

August 21 2012

A large portion of this week’s Amp Camp at UC Berkeley is devoted to an introduction to Spark – an open source, in-memory, cluster computing framework. After playing with Spark over the last month, I’ve come to consider it a … read more

Active Facebook users by region: November, 2010

November 16 2010

With Facebook unveiling an integrated messaging system for its more than 500 million users, I decided to update a few charts that break down its users by region. read more

Hiring trends among the major platform players

November 15 2010

After recently re-reading Tim's post on the major internet platform players, I looked at recent hiring trends* among the companies he highlighted. First I examined year-over-year changes in number of job postings (from Aug to Oct 2009 vs. Aug to Oct 2010). Consistent with the recent flurry of articles about… read more

Windows Phone apps are more expensive than iPhone apps

November 05 2010

The Windows Marketplace for Mobile now has about 1,400 apps spread across 16 categories. In this short post I'll provide some basic statistics and compare it with the granddaddy of app stores: the U.S. iTunes store. read more

Crowdsourcing Specific Microtasks

October 25 2010

Since the first-ever Mechanical Turk meetup a year ago, there has been an explosion in crowdsourcing services and a well-attended conference in San Francisco. I remain enthusiastic about crowdsourcing, but the number of companies has me worried about quality of work. Fortunately specialization is already occurring, so for particular tasks… read more

Amazon's cloud platform still the largest, but others are closing the gap

August 31 2010

Tim's recent tweet on the growing demand for Google App Engine skills inspired me to measure the popularity of the major cloud computing platforms. Elance is one of many job boards in our data warehouse of U.S. job postings, and I wanted to measure demand across many more job… read more

The number of Hadoop jobs continue to rise

August 08 2010

While still a small fraction of data management job postings, the number of job posts that mention "hadoop" continues to grow steadily. Year-over-year, there were 300% more such job posts in the first seven months of 2010 compared to the same period in 2009: The fraction of "hadoop" jobs posted… read more

Which Social Gaming companies are Hiring

July 29 2010

Disney's announced purchase of Mountain View gaming startup Playdom follows on the heels of EA's purchase of London-based Playfish last November. Based on active users Zynga remains by far the biggest online social gaming company, but what other independent companies are growing? To see which companies are expanding, I used… read more

Where Facebook's half a billion users reside

July 21 2010

Facebook announced that they now reach 500 million active users (just five and a half years after launching). But where do these half a billion users reside? Refreshing my post from February, the share of users from Asia continues to rise and now stands at 17% of all Facebook users. Over… read more

Popular iPhone games stay highly-ranked only for a few weeks

June 30 2010

With 40,000+ Games to choose from, the lists of Top 100 free and paid games are frequently scanned by iPhone gamers. In this short post, I'll share some basic statistics on popular games sold through the U.S. iTunes app store. read more

Actually, half of all iPad Books are Fiction

May 05 2010

Suggestions to my previous post inspired me to normalize our metadata for titles available through the U.S. iBooks app. A comment prompted me to roll up iBooks publishers into publishing conglomerates: Comments from other readers gave me the idea to map the 100+ iBooks categories to the more familiar BISAC categories.… read more

A few weeks in, a third of iPad Books are Fiction

April 29 2010

Measured in terms of number of titles, half of the over 46,000 (paid and free) books available through the iBooks app are from 6 categories. Fiction & Literature alone accounts for close to a third of all available iBooks titles: The current set of titles is indicative of the publishers… read more

Big Data shakes up the Speech Industry

April 23 2010

I spent a few hours at the Mobile Voice conference and left with an appreciation of Google's impact on the speech industry. Google's speech offerings loomed over the few sessions I attended. Some of that was probably due to Michael Cohen's keynote describing Google's philosophy and approach, but clearly Google… read more

Cookbooks: The highest priced iPad book category

April 21 2010

Just like the iTunes app store, the iBooks app on the iPad spotlights the Top Paid (and Top Free) books within each category. Here are some charts that compare the average price (by rank) across the major categories. The average price of the Top 50 titles across the major categories… read more

Big Data Analytics: From Data Scientists to Business Analysts

April 19 2010

The growing popularity of Big Data management tools (Hadoop; MPP, real-time SQL, NoSQL databases; and others) means many more companies can handle large amounts of data. But how do companies analyze and mine their vast amounts of data? The cutting-edge (social) web companies employ teams of data scientists who comb… read more

Twitter By The Numbers

April 14 2010

I collected some interesting stats from today's presentations at Chirp. Over a thousand people attended the conference and the numbers below attest to how vibrant the Twitter platform is. Today's announced API enhancements will make the Twitter ecosystem even more interesting: 1. # of registered users: 105,779,710 (1,500% growth over… read more

Games & Entertainment account for Half of all iPad apps

April 09 2010

98% of apps in the U.S. iTunes app store label themselves as "iPad compatible", but most were written for iPhones or iPods. One week into its launch there are about 2,300 apps† that run only on iPads. Measured in terms of number of unique apps, Games and Entertainment account for… read more

Google's New Marketplace Has over a Thousand Apps

March 17 2010

One week† into its public launch, the Google Apps Marketplace has just under 1,500 (enterprise) apps. Combined with's app exchange (also with over a thousand apps), enterprises interested in moving to cloud apps have an increasing number of software tools to choose from. Popular apps (measured in terms of… read more

1 in 4 Facebook Users Come From Asia or the Middle East

March 03 2010

Asia's share of the more than 400 million active Facebook users recently surged past 15%: With a market penetration of 1.7% in Asia and Africa, the company has barely scratched the surface in both regions. While the company continued to add users in Southeast Asia, there were an additional 2.3… read more

Long Tail iTunes Book Apps Are More Expensive

February 22 2010

In an earlier post, I examined the average price of the Top 100 PAID apps and noted that the relationship between price and popularity was somewhat dependent on the category. But in the Book category, I concluded that the Top 10 PAID apps were on average cheaper than those ranked… read more

The Most Efficient iPhone Developers

February 11 2010

Last week marked the first time the U.S. iTunes store had over 150,000 apps available. Close to 31,000 different developers (or "sellers") were responsible for those apps, with many offering one to five apps, while a few offered over a hundred different apps. Which developers consistently produce top-selling apps? I… read more

Manifold Learning, Calculus & Friendship, and Other Math Links

January 17 2010

One of the largest gatherings of mathematicians, the joint meetings of the AMS/MAA/SIAM, took place last week in San Francisco. Knowing that there were going to be over 6,000 pure and applied mathematicians at Moscone West, I took some time off from work and attended several sessions. Below are a… read more

Collecting, Aggregating, and Analyzing Data Exhaust

January 14 2010

Next week, O'Reilly's Research Director, Roger Magoulas, will lead an exciting panel discussion on Big Data†. The focus will be on the piles of data that companies have been collecting, and are just beginning to analyze: The internet and social media create a mountain of random, unstructured, and at times… read more

Apps Per Seller Across the US iTunes Categories

December 14 2009

Measured in terms of number of unique apps†, the Top 5 categories in the U.S. app store have been Games, Books, Entertainment, Travel and Utilities. But comparing categories in terms of number of apps doesn't capture the challenge of developing applications in different categories. As I noted in an earlier… read more

Asia Continues to be Facebook's Strongest Growth Region

November 20 2009

With Facebook topping 330 million active users over the past week, the company's strongest growth region continues to be Asia. Over the last 12 weeks, Facebook added close to 17M active users in Asia alone. Since my previous post, the share of active users from Asia grew by 2% (to… read more

Counting Unique Users in Real-time with Streaming Databases

November 11 2009

As the web increasingly becomes real-time, marketers and publishers need analytic tools that can produce real-time reports. As an example, the basic task of calculating the number of unique users is typically done in batch mode (e.g. daily) and in many cases using a random sample from the relevant log… read more

Games Top the Charts in the iPhone and Android App Markets

November 03 2009

While it might be true that the number of Book apps is growing at a faster rate, Games continue to dominate the list of popular U.S. iTunes Apps. Games accounted for about a fifth of all iTunes apps over the past week†, but the category continued to have a disproportionate… read more

Twitter Users Most Followed by the Web 2.0 Summit Crowd

October 28 2009

I took the set of users† who posted tweets containing the hashtag #w2s and determined who those users followed. Unlike the list of the most followed users in all of Twitter, the list isn't dominated by celebrities. (A few coders landed in the top 50.) Regular Radar readers will be… read more

Pipelining and Real-time Analytics with MapReduce Online

October 20 2009

Most of the news related to the real-time web these days centers around the adoption of decentralized, push-oriented† protocols (pubsubhubbub, rsscloud) designed to reduce latency in web publishing. Less discussed are the analytic tools that are capable of crunching through data in real-time. As more of the web moves… read more

Mechanical Turk app on the iPhone Provides Work for Refugees

October 13 2009

Mechanical Turk service provider CrowdFlower† and microwork non-profit Samasource have teamed up to make their services available to iPhone users. Users of CrowdFlower's mechanical turk platform can now opt to send their tasks to iPhone users. Previously, CrowdFlower users could choose between Amazon mechanical turks or CrowdFlower's stable of turks.… read more

The iPhone as a Gaming Platform: Share of Top Apps By Category

October 08 2009

As a follow-up to my recent post on the Top Grossing Apps list on iTunes, I examined three lists highlighted in the app store: the Top Paid, Top Free, and Top Grossing Apps. Believing that many users scan these lists, developers covet a spot on any of these Top 100… read more

The Price of The Top Grossing iTunes Apps

October 06 2009

In response to developer complaints that more expensive apps were getting buried at the bottom of popularity rankings, Apple recently introduced a separate ranking based on revenue. (The Top 100 Paid list ranks apps based on number of downloads.) In this post, I'll validate that compared to downloads, the… read more

There are Over a Million People Actively Using Facebook Right Now

September 24 2009

A little over a week ago Facebook reached a major milestone: 300 million active users. The fastest-growth region continues to be Asia, but growth in other overseas regions such as the Americas and Africa has also been strong. Currently reaching only 1% of potential users in Asia and Africa, Facebook… read more

Mobile Banks in the Developing World Prove Simpler is Better

September 17 2009

Recent initiatives designed to make U.S. consumer financial products simpler and intelligible to customers remind me of a study we did on Mobile Banks† in the developing world. Designed to work on the simplest mobile devices and originally targeting the unbanked, mobile banks evolved from simple services (transfer of mobile… read more

Resetting Expectations: Some Augmented Reality Links

September 09 2009

1. Mobile Devices and AR: Besides employing the location of users (Wikitude), there are generally two ways to overlay data onto the real world: through markers ((2D) bar codes) or through automatic object/image recognition algorithms ("markerless"). The Economist gives a good overview of the different mobile applications that are… read more

The Most Popular iTunes Apps Aren't Always The Cheapest

August 27 2009

While the most popular aren't always the cheapest, on average, the Top 10 Paid apps† tend to be cheaper than less popular ones (those ranked 45 to 55 or 91 to 100): The situation varies across categories and in this post I'll briefly examine a few of the larger ones.… read more

Compared to the US, Facebook is Younger in Asia and the Middle East

August 18 2009

Since my last post on the number of active Facebook users, the company once again doubled in Asia, adding more than 14 million active users over the last 12 weeks†. Through the latter part of last week, the company had over 266 million active users. As the company becomes more… read more

Big Data and Real-time Structured Data Analytics

August 13 2009

The emergence of sensors as sources of Big Data highlights the need for real-time analytic tools. Popular web apps like Twitter, Facebook, and blogs are also faced with having to analyze (mostly unstructured) data in near real-time. But as Truviso founder and UC Berkeley CS Professor Michael Franklin recently noted,… read more

The iTunes App Store Rolls with the Travel Season

August 10 2009

Sometime last week, the iTunes app store passed 70,000 unique apps (70K apps have appeared in the app store since it launched). One of the fastest-growing categories in the U.S. iTunes app store has been Travel, displacing Education to move into the top 5 largest categories. Welcome to summer vacation!… read more

Infographic of the Day: Who Came to the US in 2008

August 06 2009

From Good magazine's infographics team, a visual summary listing the 20 countries with the largest number of immigrants to the U.S. in 2008: (This recent interview lists similar examples from the pages of Good, including graphics that compare (American) museums and major subway systems.)... read more

The US Online Job Market Improved Slightly in July

August 05 2009

Measured in terms of online job postings, the U.S. job market† improved slightly in July. Here are two views of the number of job postings per day: note the slight uptick in July 2009 in both graphs. The worst year-over-year decline occurred in April; the online job market subsequently shed… read more

Infographic of the Day: The US as Manufacturing Leader

August 04 2009

The U.S. remains the world's largest manufacturer, but with growth projected to be flat, China is poised to overtake it by 2015. In 2007 the U.S. accounted for 20% of manufacturing, China had 12%. (From the Wall St. Journal.)... read more

Infographic of the Day: Market Share of Health Insurers

August 03 2009

In many U.S. states, recent consolidation in the Health Insurance industry has left consumers with fewer choices. In all but 3 states, the top 2 health plans have over 50% market share. Market Share of Two Largest Health Plans in Each State [Red = 80 to 100% , Salmon= 70… read more

iTunes App Store Incubation Period Increases In Most Categories

July 31 2009

Over the last few weeks, media coverage of the iTunes app store often touches on concerns about Apple's approval process. Some apps drew enough complaints that Apple pulled them off the app store. With thousands of developers wanting to launch apps and Apple unable to come up with a more… read more

HadoopDB: An Open Source Parallel Database

July 28 2009

The growing need to manage and make sense of Big Data has led to demand for analytic databases, which many companies are attempting to fill (Teradata, Netezza, Vertica, DATAllegro, Greenplum†, Aster Data, Infobright, Kickfire, Dataupia, ParAccel, Exasol, ...). As an alternative to current shared-nothing analytic databases, HadoopDB is a hybrid… read more

News Providers are Embracing the iPhone

July 16 2009

To mark another iPhone milestone (1.5 billion app downloads in a year), I checked our iTunes app store data warehouse†. I was expecting the Books category to continue to register the fastest growth but was instead greeted by an explosion in News (and to a lesser extent, Navigation) apps: On any… read more
