Errata for Mining the Social Web

This errata list records errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Each entry lists the version and location in the book, a description, the submitter, and the date submitted.
Ex 2-6

Hi, Example 2-6 displays code for microformats_mapquest_geo.py and suggests that the example URL should be http://local.mapquest.com/franklin-tn. However, MapQuest Local is no longer supported, and they suggest that we now use MapQuest Vibe (mqvibe.mapquest.com).

Femi Anthony  Jul 10, 2012 
Example 2-9
code section

The purpose of this example is to demonstrate parsing restaurant review information as delineated by the hReview tag. However, the suggested URL, http://www.yelp.com/biz/bangkok-golden-fort-washington-2, no longer makes use of such a tag, and no data is produced.

Femi Anthony  Jul 12, 2012 
Printed Page 108
~middle

When running the code, I am unsure where connections['values'] is defined.

Thank you.

Andrew M. Neiderer  Apr 02, 2016 
Printed Page 5
Example 1-3. Retrieving Twitter search trends

It seems that the last correction on this section was for API 1.0, but, at least for me, it no longer works even for that version without authentication. Here's what I had to do in order to get this example working:

First, I created an account on Twitter and registered an application at https://dev.twitter.com/apps/new.

Then, I used
>>> consumer_key, consumer_secret = "4papHqXEJLsqVkkq4zuUhO", "bgaopWSAIP7x1245a60kMXeQ8jNIo0BZZLl2aNKd2k"
to store my application's authentication codes (which of course are not these), as given on the app's page.

After that, I authorized the app to use my user account with
>>> twitter.oauth_dance("My App Name", consumer_key, consumer_secret, "token.txt")
. "token.txt" is the file name I chose to store the retrieved ouath data.

I then called
>>> oauth_token, oauth_secret = read_token_file("token.txt")
to load the recently stored data.

At this point, I created an API object as described by the last workaround:
>>> twitter_api = twitter.Twitter(domain="api.twitter.com", api_version='1', auth=twitter.OAuth(oauth_token, oauth_secret, consumer_key, consumer_secret))
but with the added authentication.

Then I again followed the workaround and did:
>>> world_trends = twitter_api.trends._(1) # Using 1 for global location
>>> [trend for trend in world_trends()[0]['trends']]

And everything worked fine. But it should be noted that this method uses the deprecated API 1.0, and using the newest version (1.1) requires little modification:

>>> twitter_api = twitter.Twitter(domain="api.twitter.com", api_version="1.1", auth=twitter.OAuth(oauth_token, oauth_secret, consumer_key, consumer_secret))
>>> world_trends = twitter_api.trends.places._(1) # Here's the major difference
>>> [trend for trend in world_trends()[0]['trends']]

André Sá de Mello  Jan 11, 2013
PDF Page 5, 7
Example 1-4, Example 1-6

I think there was some issue regarding the Twitter API changing. A similar issue appears to affect this section as well. The result of the code on my system is:

twitter.api.TwitterHTTPError: Twitter sent status 404 for URL: 1.1/search.json using parameters: (q=SNL&rpp=100&page=1)
details: {"errors":[{"message":"Sorry, that page does not exist","code":34}]}

Opening the link in the paragraph that followed works. After some searching around, I signed up for an API key and used OAuth when instantiating the Twitter object. I then had to drop a couple of the params being passed, as apparently they are no longer used either. My ending code is as follows:


=====
from twitter import *
from vars import * #private vars for auth
import json

twitter_api = Twitter(auth=OAuth(oauth_token, oauth_secret, consumer_key, consumer_secret))

search_results = twitter_api.search.tweets(q="SNL", count=2)
=====

I presume that this change may also be affecting the iteration on page 7, Example 1-6, which now fails with "TypeError: string indices must be integers".

I'd like some clarification on how to parse this data given the change in the search code.
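
My best guess so far, assuming the 1.1 response nests the tweets under a 'statuses' key, is to iterate over that instead of the old top-level results:

statuses = search_results['statuses']  # the tweets appear to live here in 1.1
print [ status['text'] for status in statuses ]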

Thanks!

jktravis  Apr 28, 2013 
PDF Page 9
Last Paragraph

All of the word tokens in the example are in lower case. Yet, when I run the code I get some results that are in upper or mixed case. I believe there's a missing call to lower(), perhaps as early as in Example 1-7's list comprehension. I suspect the third line of the Example 1-7 listing was intended as:

words += [ w.lower() for w in t.split() ]
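
As a quick illustration of the difference (the tweet strings here are made up):

words = []
for t in ["Obama Debate", "obama debate tonight"]:
    words += [ w.lower() for w in t.split() ]
print words  # ['obama', 'debate', 'obama', 'debate', 'tonight']

Without the lower() call, "Obama" and "obama" would be counted as two different tokens.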

Peter Haglich  Sep 20, 2012 
PDF Page 31
Example 2-6

MapQuest Local seems to have changed their URL format from

http://local.mapquest.com/franklin-tn

to

http://local.mapquest.com/us/tn/franklin/

Peter Haglich  Sep 21, 2012 
PDF Page 31
Example 2-6

MapQuest Local doesn't seem to embed geo microformat data any more.


$ python microformats__mapquest_geo.py http://local.mapquest.com/us/tn/franklin/
No location found

Peter Haglich  Sep 21, 2012 
PDF Page 103
Example 1-11

Last executable line of program is:
sorted(nx.degree(g))
This should produce the degree of each node, as shown in the book:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1......
However, in order to get that output, the line should be:
sorted(nx.degree(g).values())
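
A small check of the difference, using a toy graph of my own and assuming the NetworkX 1.x behaviour where nx.degree returns a dict of node -> degree:

import networkx as nx
g = nx.Graph()
g.add_edges_from([('a', 'b'), ('b', 'c')])
print sorted(nx.degree(g))           # ['a', 'b', 'c'] -- the sorted node labels
print sorted(nx.degree(g).values())  # [1, 1, 2] -- the sorted degrees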

abayomi king  May 22, 2012 
PDF Page 111
Example 1-12

The value of n1 for one tweet caused the following error:
UnicodeEncodeError: 'charmap' codec can't encode character u'\u201c' in position 14: character maps to <undefined>
File "C:\Users\me\Desktop\python programs\collective intell book\ex1.3.py", line 89, in <module>
print n1, n2, g[n1][n2]['tweet_id']
File "C:\Python27\Lib\encodings\cp850.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_map)

The value of n1 that caused this error was seen on mousing over a breakpoint as: u'@NfamousKaye :\u201c@RollingStone'
The n2 that corresponds is: HotFemaleRapper

I don't know enough of the regular expression notation to fix it, but removing the last + sign in:
re.compile(r"(RT|via)((?:\b\W*@\w+)+)")
prevents the error from occurring.
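
For what it's worth, my guess is that the failure is really in the print statement itself: the Windows console codepage (cp850) can't represent the curly quote u'\u201c'. Encoding the strings explicitly before printing presumably avoids the error without touching the regular expression:

print n1.encode('cp850', 'replace'), n2.encode('cp850', 'replace'), g[n1][n2]['tweet_id']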

abayomi king  May 22, 2012 
Printed Page 126
Example 5-4

I can't seem to run the script for Example 5-4.

I've had some problems installing couchdb, so I accept this may be a configuration issue, but I'm writing this to you just in case...

I ran the Example 5-3 script (latest from GitHub):

python the_tweet__harvest_timeline.py user 16 envagency

This creates the DB; the curl result is below:

curl http://127.0.0.1:5984/tweets-user-timeline-envagency
{"db_name":"tweets-user-timeline-envagency","doc_count":1624,"doc_del_count":0,"update_seq":1624,"purge_seq":0,"compact_running":false,"disk_size":4518001,"data_size":4459942,"instance_start_time":"1335271904150644","disk_format_version":6,"committed_update_seq":1624}

If I then run the Example 5-4 script (latest from GitHub), I get an error.

python the_tweet__count_entities_in_tweets.py tweets-user-timeline-envagency

(I note that you can supply a second parameter, FREQ_THRESHOLD, but I believe a default value is used if it's not supplied.) Anyway, the error is:

Traceback (most recent call last):
File "the_tweet__count_entities_in_tweets.py", line 85, in <module>
db.view('index/entity_count_by_doc', group=True)],
File "/usr/lib/pymodules/python2.6/couchdb/client.py", line 871, in __iter__
for row in self.rows:
File "/usr/lib/pymodules/python2.6/couchdb/client.py", line 893, in rows
self._fetch()
File "/usr/lib/pymodules/python2.6/couchdb/client.py", line 881, in _fetch
data = self.view._exec(self.options)
File "/usr/lib/pymodules/python2.6/couchdb/client.py", line 766, in _exec
resp, data = self.resource.get(**self._encode_options(options))
File "/usr/lib/pymodules/python2.6/couchdb/client.py", line 978, in get
return self._request('GET', path, headers=headers, **params)
File "/usr/lib/pymodules/python2.6/couchdb/client.py", line 1035, in _request
raise ServerError((status_code, error))
couchdb.client.ServerError: (500, ('EXIT', '{{badmatch,[]},\n [{couch_query_servers,new_process,3},\n {couch_query_servers,lang_proc,3},\n {couch_query_servers,handle_call,3},\n {gen_server,handle_msg,5},\n {proc_lib,init_p_do_apply,3}]}'))

If I go to Futon at

http://127.0.0.1:5984/_utils/database.html?tweets-user-timeline-envagency

I see that the View drop-down now offers Index > entity_count_by_doc (so this is created by the Example 5-4 script), but if I try the view from within Futon I get the same error:

<127.0.0.1>

Error: EXIT

{{badmatch,[]},
[{couch_query_servers,new_process,3},
{couch_query_servers,lang_proc,3},
{couch_query_servers,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}

Any idea what I haven't got configured correctly?

I'm running Apache CouchDB 1.2.0 on Ubuntu 10.04
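
One thing I haven't been able to rule out is whether the Python view server is registered with CouchDB at all; since the views are written in Python, my understanding is that local.ini needs an entry along these lines (the couchpy path is a guess for my machine), followed by a restart:

[query_servers]
python = /usr/bin/couchpy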

Many thanks

Anonymous  Apr 24, 2012 
PDF, Other Digital Version Page 176
2nd paragraph

The Jaccard distance implementation given as:

len(X.union(Y)) - len(X.intersection(Y)))/float(len(X.union(Y))

should be:

(len(X.union(Y)) - len(X.intersection(Y)))/float(len(X.union(Y)))
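
A quick check of the corrected expression (the sets here are my own example):

X, Y = set(['a', 'b', 'c']), set(['b', 'c', 'd'])
print (len(X.union(Y)) - len(X.intersection(Y)))/float(len(X.union(Y)))  # 0.5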


Zhitong He  Feb 19, 2012