
Identifying viral bots and cyborgs in social media

Analyzing tweets and posts around Trump, Russia, and the NFL using information entropy, network analysis, and community detection algorithms.

By Steve Kramer, PhD November 8, 2017 • 19 minute read


Paragon Science Twitter bot virality results (source: Courtesy of Steve Kramer, used with permission)

Over the last several years, researchers across a spectrum of scientific disciplines have studied the dynamics of social media networks to understand how information propagates as the networks evolve. Social media platforms like Twitter and Facebook include not only actual human users but also bots, or automated programs, that can significantly alter how certain messages are spread. While some information-gathering bots are beneficial or at least benign, the 2016 U.S. Presidential election and the 2017 elections in France made clear that bots and sock puppet accounts (that is, numerous social accounts controlled by a single person) were effective in influencing political messaging and propagating misinformation on Twitter and Facebook. It is thus crucial to identify and classify social bots to combat the spread of misinformation, especially the propaganda of enemy states and violent extremist groups. This article briefly summarizes my recent bot detection research. It describes the techniques I applied and the results of identifying battling groups of viral bots and cyborgs that seek to sway opinions online.

For this research, I have applied techniques from complexity theory, especially information entropy, as well as network graph analysis and community detection algorithms to identify clusters of viral bots and cyborgs (human users who use software to automate and amplify their social posts) that differ from typical human users on Twitter and Facebook. I briefly explain these approaches below, so deep prior knowledge of these areas is not necessary. In addition to commercial bots focused on promoting click traffic, I discovered competing armies of pro-Trump and anti-Trump political bots and cyborgs. During August 2017, I found that anti-Trump bots were more successful than pro-Trump bots in spreading their messages. In contrast, during the NFL protest debates in September 2017, anti-NFL (and pro-Trump) bots and cyborgs achieved greater successes and virality than pro-NFL bots.

Obtaining Twitter source data

The data sets for my Twitter bot detection research consisted of ~60M tweets that mentioned the terms “Trump,” “Russia,” “FBI,” or “Comey”; the tweets were collected via the free Twitter public API in separate periods between May 2017 and September 2017. I have made the source tweet IDs as well as many of our analysis results files available in a data project published at data.world. Researchers who wish to collaborate on this project at data.world should send a request email to datapartners@paragonscience.com.

Detecting bots using information entropy

Information entropy is defined as “the average amount of information produced by a probabilistic stochastic source of data.” As such, it is one effective way to quantify the amount of randomness within a data set. Because one can reasonably conjecture that actual humans are more complicated than automated programs, entropy can be a useful signal when one is attempting to identify bots, as a number of previous researchers have done. Of the recent research in social bot detection, particularly notable is the excellent work by groups of researchers from the University of California and Indiana University. Their “BotOrNot” system uses a random forest machine learning model that incorporates 1,150 features derived from user account metadata, friend/follower data, network characteristics, temporal features, content and language features, and sentiment analysis.

For the current work, I elected to adopt a greatly simplified approach for social bot detection using two types of information entropy scores: one based on the distribution of time lags between successive posts and a second based on the ordering of words within the posts. Accounts that send messages at uniform time intervals, or whose messages have unusually static or similar text content, might be bots or cyborgs.
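As a minimal sketch of the two kinds of score, Shannon entropy can be computed over binned time lags and over adjacent word pairs. This is not the exact formulation used in this research, which is not spelled out here: the bin width, the use of word bigrams, and all function names are illustrative assumptions. Under this plain definition, perfectly regular behavior yields an entropy of zero, so the sign convention may differ from the scores reported in this article.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def timing_entropy(timestamps, bin_seconds=1.0):
    """Entropy of the binned time lags between successive posts.

    timestamps: posting times in seconds.
    bin_seconds: assumed bin width for grouping similar lags.
    """
    ordered = sorted(timestamps)
    lags = [int((b - a) // bin_seconds) for a, b in zip(ordered, ordered[1:])]
    return shannon_entropy(lags)


def text_entropy(posts):
    """Entropy of adjacent word pairs (bigrams) pooled over a user's posts."""
    bigrams = []
    for post in posts:
        words = post.lower().split()
        bigrams.extend(zip(words, words[1:]))
    return shannon_entropy(bigrams)
```

An account that posts every two seconds produces a single lag value and therefore a timing entropy of zero, while irregular human posting spreads the lags over many bins.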

Next, I calculated the Z-scores of both the timing entropy and text entropy. In the results presented here, I set a minimum threshold of 10 social posts by a user in order to analyze said user’s posts, and then I applied a conservative threshold of 2.5 for the Z-score (that is, raw scores at or above 2.5 standard deviations above the mean) for either entropy metric in order to flag possible bots. By lowering the threshold I would, of course, detect more bots, but at the risk of false positives that might inadvertently flag actual human users as bots. In the future, I hope to calculate the ROC curve for my dual-entropy approach to characterize the tradeoffs between false positives and false negatives.
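The flagging rule can be sketched as follows, assuming each user is summarized by a post count and two raw entropy scores. The dictionary keys, the function names, and the choice to compute Z-scores only over users who meet the minimum post count are assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores of values relative to their own mean and std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_possible_bots(users, min_posts=10, z_threshold=2.5):
    """Return names of users whose timing or text Z-score meets the threshold.

    users: list of dicts with hypothetical keys 'name', 'n_posts',
    'timing_entropy', and 'text_entropy'.
    """
    # Only users with enough posts are analyzed at all.
    eligible = [u for u in users if u["n_posts"] >= min_posts]
    if not eligible:
        return []
    z_time = z_scores([u["timing_entropy"] for u in eligible])
    z_text = z_scores([u["text_entropy"] for u in eligible])
    # A user is flagged if either metric is extreme.
    return [
        u["name"]
        for u, zt, zx in zip(eligible, z_time, z_text)
        if zt >= z_threshold or zx >= z_threshold
    ]
```

Lowering `z_threshold` in this sketch flags more accounts, which mirrors the tradeoff between detection rate and false positives described above.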

Measuring the virality of bots using the k-core decomposition

The k-core of a graph is a maximal subgraph in which each vertex has at least degree k. The coreness of a vertex is k if it belongs to the k-core but not to the (k+1)-core. The k-core decomposition is performed by recursively removing all the vertices (along with their respective edges) that have degrees less than k. Previous research has suggested that the k-core decomposition of a network can be very effective in identifying the individuals within a network who are best positioned to spread or share information. I used the k-core decomposition in 2016 to analyze more than 120M tweets related to the 2016 U.S. Presidential elections to identify the most influential users. For this bot detection research, I performed a k-core decomposition of the heterogeneous user/hashtag/URL Twitter networks for each day on which I collected samples between May and September 2017.
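The peeling procedure can be sketched in a few lines of Python. This is a simple illustration for small undirected graphs stored as a dictionary of neighbor sets, not the implementation used for the networks in this article; the function name and data layout are assumptions.

```python
def coreness(adjacency):
    """Return {vertex: coreness} for an undirected graph.

    adjacency: dict mapping each vertex to the set of its neighbours.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Collect every remaining vertex whose current degree is at most k.
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue
            remaining.remove(v)
            core[v] = k
            # Removing v lowers its neighbours' degrees, which may
            # push them below the current level as well.
            for u in adjacency[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return core
```

For a triangle with one extra vertex hanging off it, the three triangle vertices receive a coreness of 2 and the pendant vertex a coreness of 1. For large graphs, a library routine such as `core_number` in NetworkX is likely a better choice than this sketch.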

By combining the entropy scores with the corresponding coreness values, I was able to identify which bots or cyborgs were most successful in prompting other users (some of whom were also bots) to share or react to their posts, thus attaining positions closer to the center of the daily Twitter networks. (Bessi and Ferrara used the k-core decomposition in a similar fashion to measure the embeddedness of social bots.)

The 3-D scatter plot in Figure 1 shows clearly that the vast majority of the identified social bots are unsuccessful, remaining at the outer realms of the networks with low coreness values because no or few other users interact with them. Successful bots achieved higher coreness values because other users retweeted or replied to their posts. Normal human users (not shown) would be near the origin, and it is easy to discern that the higher the Z-score of either entropy metric, the less successful the bots become. This is most likely due to the fact that human users are readily able to recognize the bots’ abnormal postings and thus do not tend to share those bots’ posts. In summary, the more human-like the bot’s behavior, the more likely it is that actual users will share that bot’s posts.

Unsuccessful bots

The most extreme value of the text entropy Z-score (outside the plot boundaries) is 143 (with a raw text entropy of 1.0) for the Twitter user @says_k_to_trump. A few sample tweets are shown below. Note that every tweet is the single letter “k” sent in reply to each of @realDonaldTrump’s tweets. That entropy Z-score reflects the fact that this user’s tweets’ contents are completely deterministic with no uncertainty. Understandably, no other user has interacted with @says_k_to_trump, so that bot has remained at the outermost edge of the network with a coreness of 1.

Figure 2. Twitter user screenshot. Courtesy of Steve Kramer.

The most extreme value of the timing entropy Z-score is 122.7 for the Twitter user @trade_debate. Note the very uniform timing pattern of that user’s tweets in Table 1. Starting with the second tweet, that user tweeted at a constant interval of two seconds.

Table 1: Most extreme timing entropy examples

Datetime	Tweet text
2017-08-14 20:58:30	RT @sdonnan: Donald Trump and the modern complexities of “Made in America”. My @FT “Big Read” ahead of this week’s #NAFTA talks. https://t.…
2017-08-14 20:59:04	RT @FoxNews: China implements UN sanctions against North Korea, as Trump trade probe looms https://t.co/RD4KwQigzO
2017-08-14 20:59:06	RT @FoxNews: Moments Ago: @POTUS signs measure that could result in severe trade penalties for China. https://t.co/OWIgslyi3f https://t.co/…
2017-08-14 20:59:08	RT @CNNPolitics: President Trump signs a memorandum on Chinese trade practices https://t.co/stNgqVwENW
2017-08-14 20:59:10	RT @MinhazMerchant: US set to launch investigation into Chinese theft of IPR as prelude to trade sanctions. Beijing put on notice https://t…
2017-08-14 20:59:12	RT @Reuters: Chinese state newspaper says Trump trade probe will ‘poison’ relations https://t.co/XhwibAKD4H https://t.co/eQMD58yRYj
2017-08-14 20:59:14	RT @Reuters: Chinese state newspaper says Trump’s order to investigate Chinese trade practices will “poison” relations https://t.co/RzgYm1o…
2017-08-14 20:59:16	RT @politico: The mayor of a small agricultural community in Iowa says Trump “fooled a lot of people” when he pulled out of TPP https://t.c…
2017-08-14 20:59:18	RT @BreitbartNews: Out: RESIST In: Trump was right back when I campaigned against but you should let me do stuff for him https://t.co/40iSi…
2017-08-14 20:59:20	RT @thehill: Trump tries to shifts focus from Charlottesville with tweets on trade, military, Dems: https://t.co/cuYRVJFuU5 https://t.co/MA…
2017-08-14 20:59:22	RT @DrDenaGrayson: TRUE PRIORITIES @WH confirms #Trump himself insisted on starting speech w/trade & economy, NOT #racist attack‼️ htt…
2017-08-14 20:59:24	RT @DrDenaGrayson: #Trump began his speech on trade deals & economy, then 2 days too late he finally condemned #bigotry, hatred & violenc…
2017-08-14 20:59:26	RT @nytimes: Trump suggested he’d take a lighter approach on trade issues with China if it does more to pressure North Korea https://t.co/O…
2017-08-14 20:59:28	RT @CNN: Beijing says US threats to get tough on trade with China won’t help solve the crisis over North Korea https://t.co/cBGRfWlRBV http…
2017-08-14 20:59:30	RT @XHNews: #BREAKING: Trump signs executive memorandum on China despite worries about potential harms to trade ties with China https://t.c…
2017-08-14 20:59:32	RT @christinawilkie: If you want to know who stands to benefit most from Trump’s saber rattling on China trade & IP theft, check out his gu…
2017-08-14 20:59:34	RT @christinawilkie: List of the defense contractors (and one kitchen counter maker) invited to White House today for Trump’s event launchi…
2017-08-14 20:59:36	RT @foxandfriends: President Trump to strike the first blow in U.S. trade war against China https://t.co/1T9MacNoMv
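The regularity in Table 1 is easy to verify by differencing the timestamps. The short check below uses the first six rows of the table; the variable names are illustrative.

```python
from datetime import datetime

# First six timestamps copied from Table 1.
stamps = [
    "2017-08-14 20:58:30",
    "2017-08-14 20:59:04",
    "2017-08-14 20:59:06",
    "2017-08-14 20:59:08",
    "2017-08-14 20:59:10",
    "2017-08-14 20:59:12",
]

times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
# Seconds elapsed between each pair of successive tweets.
gaps = [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
print(gaps)  # [34, 2, 2, 2, 2]
```

After the initial 34-second gap, every lag is exactly two seconds, which is the pattern that the timing entropy score picks up.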

Successful bots

In contrast, one of the most successful bots is @Bhola021, which achieved a coreness value of 96 on 2017-08-12. Several sample tweets are shown below in Table 2. This is primarily a digital marketing bot rather than a political or propaganda bot. Note, in particular, the behavior of retweeting other user accounts with similar names and very similar tweet text.

Table 2: Tweets from a successful marketing bot

Datetime	Tweet text
2017-08-12 2:49:36	Donald Trump’s 22-Year-Old Daughter Is The New Queen Of Instagram. https://t.co/PtzBUwujew
2017-08-12 2:50:13	Anonymous Is Taking Down Donald Trump On April 1 And There Is A Way You Can Be Part Of It. https://t.co/td6AGeuk44
2017-08-12 2:56:15	RT @bhola0957: Anonymous Is Taking Down Donald Trump On April 1 And There Is A Way You Can Be Part Of It. https://t.co/ipQrIsmo2r
2017-08-12 2:57:00	RT @bhola0957: Donald Trump’s 22-Year-Old Daughter Is The New Queen Of Instagram. https://t.co/XOg6YsZztA
2017-08-12 2:57:22	RT @bhola5033: Anonymous Is Taking Down Donald Trump On April 1 And There Is A Way You Can Be Part Of It. https://t.co/AwtEXGHdbq
2017-08-12 2:57:35	RT @bhola5033: Donald Trump’s 22-Year-Old Daughter Is The New Queen Of Instagram. https://t.co/RgkwPrdIc6
2017-08-12 2:57:57	RT @lovecommand102: Anonymous Is Taking Down Donald Trump On April 1 And There Is A Way You Can Be Part Of It. https://t.co/U2stRHl2dN
2017-08-12 2:59:01	RT @lovecommand102: Donald Trump’s 22-Year-Old Daughter Is The New Queen Of Instagram. https://t.co/AuEd85y7Wj
2017-08-12 2:59:28	RT @lovecommand103: Anonymous Is Taking Down Donald Trump On April 1 And There Is A Way You Can Be Part Of It. https://t.co/ObFn5wgGXp
2017-08-12 2:59:46	RT @lovecommand103: Donald Trump’s 22-Year-Old Daughter Is The New Queen Of Instagram. https://t.co/KV9J0ZRSgM

With the approach described above, one can identify potential bots and measure their degree of success, or embeddedness, within the evolving social networks. As we will see next, these results can be enhanced significantly with community detection algorithms.

Identifying communities of viral bots and cyborgs

To understand more clearly how the most successful viral bots and cyborgs function within the Twitter network, I created a sub-network based on the tweets sent by those bots, extracting user mentions and URLs from replies and retweets. In this example, I generated a network using the 16,057 tweets sent by the top 20 bot accounts from August 7-19, 2017. The generated network consists of 73,569 links among 2,949 nodes. A k-core decomposition of this network resulted in a maximum coreness of 20. I then applied the Louvain community detection algorithm to identify the relevant groups within the center of the network for all nodes with coreness ≥ 10.

In the Polinode interactive network displayed in Figure 3, each color represents a different community within the network. Among the top 20 bots, there is a highly interconnected network of bots with similar names (porantext, porantexts_, lovedemand101, lovecommand102, etc.) that retweet and share each other’s posts. These botnets are evidently commercial bots that attempt to drive click traffic to webpages with provocative titles. The top two article titles were “Donald Trump Kicked One Direction Out Of His Hotel And Here’s Why” and “We Will Ruthlessly Ravage US troops, North Korea Warns Donald Trump On The Sun’s Day.”

Figure 3. Network of top Trump viral bots and cyborgs in August 2017. Courtesy of Steve Kramer.
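The sub-network construction described above can be sketched as a function that turns each tweet into links from its author to the users and URLs it mentions. The regular expressions and the `tweet_edges` name are illustrative assumptions, not the extraction code used in this research; truncated URLs are kept exactly as they appear in the tweet text.

```python
import re

MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def tweet_edges(author, text):
    """Return (source, target) links from one tweet to its mentions and URLs."""
    edges = []
    for name in MENTION.findall(text):
        edges.append((author, "@" + name))
    for url in URL.findall(text):
        # Drop trailing punctuation that is not part of the link.
        edges.append((author, url.rstrip(".,;:")))
    return edges
```

The resulting edge list could then be loaded into a graph library for the k-core and community detection steps; recent versions of NetworkX, for example, provide a `louvain_communities` function.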

Because I am particularly interested in effects of social bots in spreading information and swaying public opinions in politics, I filtered the source tweets to include only those that include the word “Russia” in the tweet text. When I performed the k-core decomposition and entropy calculations on the Russia-related Twitter network, a different set of influential bots and cyborgs emerged for the period of August 7-19, 2017.

The Polinode network shown below in Figure 4 displays 17 different sub-groups in the network created by the top 20 Russia-related bots and cyborgs.

Figure 4. Network of top Russia-related viral bots and cyborgs in August 2017. Courtesy of Steve Kramer.

Community 1 is a pro-Trump group centered around the bot account named MyPlace4U (see Figure 5).

Figure 5. Community 1 (pro-Trump bots). Courtesy of Steve Kramer.

In contrast, Community 10 is an anti-Trump group centered around the Twitter account named RealMuckmaker (see Figure 6), which was actually the most successful cyborg in this data set.

Figure 6. Community 10 (anti-Trump bots). Courtesy of Steve Kramer.

Table 3 below lists the top 20 viral bots and cyborgs in the Trump/Russia Twitter network for August 7-19, 2017. Note that only six of the top 20 viral bots and cyborgs act to support Donald Trump. Trump-supporting users are marked “Y” in the “Pro-Trump?” column. To choose each user’s sample tweet text, I calculated the mean text similarity of each tweet to the rest of that user’s tweets, using the Levenshtein distance and the fuzzywuzzy Python module, and selected the tweet with the highest mean similarity.

Table 3: Top 20 Russia-related Twitter bots and cyborgs in August 2017

Rank	Twitter user	Coreness	Pro-Trump?	Sample tweet text
1	RealMuckmaker	20	N	RT @RealMuckmaker: Trump ‘surprised’ by Manafort raid in Russia probe @CNNPolitics https://t.co/CNdyvCzHMi
2	LedJEFFlin	18	N	RT @LedJEFFlin: ZEMBLA – The dubious friends of Donald Trump: the Russians https://t.co/3aTpoHnNDK via @YouTube
3	YourAnonCentral	13	N	RT @YourAnonCentral: @LouiseMensch @Plantflowes @MarcusC22973194 @PuestoLoco Russia is the broker of this conspiracy of tyranny, no less da…
4	Dax_x98	12	N	RT @Dax_x98: #Resistance #ImpeachTrump #TrumpLies #NotMyPresident #Resist #Trump #LockHimUp #FBR #TrumpRussia #TrumpSupporters #Republicans…
5	ActionTime	10	N	#TrumpRUSSIA White House Uses N.Korea To Distract US from Mueller’s Broadening Trump-Russia Probe.Trump’s HUMILIATED by Fellow Dictator Kim
6	natalikazadorn2	10	N	RT @OlehTyukov: #Красоты #Россия #Russia https://t.co/OM3cPDCQgB
7	newmirokliment1	10	Y	@mfa_russia @RusEmbUSA @natomission_ru @RussianEmbassy @ambruspresse @RusConsulGen @amrusbel @RusBotWien https://t.co/yxdW7zG3LX
8	OfficialNWM	10	N	RT @Im_TheAntiTrump: #TrumpRussia Cover lifted, a CIA spy offers his take on Trump & Russia & it’s fascinating. https://t.co/55hptGq9Yp
9	SoniaKatiMota	10	Y	Evidence – #Ukraine’s Gov’t Accusation of Russian Aggression VS The People of #Donbass. #DeepState #NATO #Russia https://t.co/KaC9p7M1n1
10	Vancelvania	10	N	@IlyaBeraha @RusEmbUSA @Russia @StateDept @statedeptspox @EURPressOffice @mfa_russia @tassagency_en @SputnikInt… https://t.co/8ggCFwreRf
11	Mario__Savio (suspended)	9	Y	#ICantBeTheOnlyPerson #FakeTerrorismExperts like Malcolm Nance named as “The Channel” for #Russia https://t.co/lH1YiY4ULI @BrianKarem #MAGA
12	mr70	9	N	RT @Joannetrueblue: New Trump-Russia emails could pose a ‘devastating’ legal entanglement for Paul Manafort #DemForce #TrumpRussia https:/…
13	MyPlace4U	9	Y	RT @SalamMorcos: New Report: The DNC hack was actually a leak, and not a hack from Russia. https://t.co/PShpW58mSa https://t.co/Ax44v9OhC4
14	11worldpeace	8	N	Impeach Trump: Forget Russia. Is Provoking a Nuclear War with North Korea Grounds for Impeachment? https://t.co/bNamUYmseO via @democracynow
15	Darnbunnies	8	N	@markets @ShoChandra Russia,Russia,Russia. We are not distracted. Comey/Flynn turned on you. Manafort is next. Muel… https://t.co/C5opmadwD4
16	KDS_APEDAI	8	Y	@Hariborn @SatyajitHINDUS1 @ALOKVj78 @DrKinKam @veerendrakumarr @alokg2k @Russia @china @adgpi weak to support a war
17	Lucyredrocks	8	N	RT @winterschild11: @RocqueinBTR @scooby_doo1 @Lucyredrocks @_Russia_HD_ @HeffronDrive @dbeltwrites @ktothe5th @YUMAPIG1 @kevingschmidt @Mi…
18	perfectsliders	8	Y	DNC Hack Was ‘Inside Job,’ Not by Russia <– @PamelaGeller
19	scooby_doo1	8	N	@RocqueinBTR @Lucyredrocks @winterschild11 @_Russia_HD_ @HeffronDrive @dbeltwrites @ktothe5th @YUMAPIG1… https://t.co/RR30icdRXc
20	92a312	7	Y	RT @SoniaKatiMota: Excellent! 2014 #Ukraine Crisis – What You’re Not Being Told. #NATO, #DeepSate #Russia https://t.co/sgQHUE3YW3
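The sample-tweet selection used for Table 3 can be illustrated with a small stand-in. The sketch below implements the Levenshtein distance directly with the standard library rather than calling fuzzywuzzy, and its normalized similarity is an assumption that will not match fuzzywuzzy’s ratio exactly; all function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]: one minus the length-normalized edit distance."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def most_representative(tweets):
    """Return the tweet with the highest mean similarity to the others.

    tweets: non-empty list of strings from a single user.
    """
    if len(tweets) == 1:
        return tweets[0]

    def mean_sim(i):
        others = [t for j, t in enumerate(tweets) if j != i]
        return sum(similarity(tweets[i], t) for t in others) / len(others)

    best = max(range(len(tweets)), key=mean_sim)
    return tweets[best]
```

Because every tweet is compared with every other tweet, the cost grows quadratically with the number of tweets per user, which is acceptable for a top-20 list but would need sampling for very prolific accounts.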

Tracking the battles among groups of Russia-related bots and cyborgs

To discern how successful the different groups of Russia-related bots and cyborgs were in spreading their messages on Twitter, I calculated the daily mean and maximum coreness values attained by the six pro-Trump users in Table 3 versus the remaining 14 anti-Trump (or neutral) users in Table 3. Figure 7 shows that, overall, the anti-Trump group was more successful in spreading its messages during the period of August 7-19, 2017, with the greatest peak on August 11 led by @RealMuckmaker, which promoted a link to a particular CNN Politics article regarding the FBI’s raid on the home of former Trump campaign manager Paul Manafort.

Figure 7. Maximum coreness values of groups of Russia-related Twitter bots/cyborgs. Courtesy of Steve Kramer.
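The daily comparison between the two groups amounts to a grouped mean and maximum. A minimal sketch, assuming the per-day coreness values are available as (date, user, coreness) tuples; the record layout, group labels, and function name are illustrative assumptions.

```python
from collections import defaultdict


def daily_group_coreness(records, pro_trump_users):
    """Summarize coreness per day for two groups of users.

    records: iterable of (date, user, coreness) tuples.
    pro_trump_users: set of user names treated as the pro-Trump group.
    Returns {date: {group: (mean, maximum)}} for groups 'pro' and 'other'.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for date, user, core in records:
        group = "pro" if user in pro_trump_users else "other"
        buckets[date][group].append(core)
    return {
        date: {g: (sum(v) / len(v), max(v)) for g, v in groups.items()}
        for date, groups in buckets.items()
    }
```

Plotting the maximum for each group against the date gives a chart of the same kind as Figure 7.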

Discovering prominent bots and cyborgs in the NFL protests controversy

I applied the same entropy-based bot detection and network analysis approach to over 1M tweets that included the terms “Trump” and “NFL” from September 14-25, 2017. The Polinode network shown below in Figure 8 displays 16 different sub-groups in the network created by the top 20 NFL-related bots and cyborgs. Nine of the groups are opposed to the NFL protests while seven are in favor of the NFL players who took a knee in protest.

Figure 8. Network of top Trump/NFL-related viral bots and cyborgs in September 2017. Courtesy of Steve Kramer.

As in the Russia-related example, I calculated the maximum daily coreness value for the pro-NFL and anti-NFL groups within the top 20 viral NFL-related bots. Figure 9 shows that the anti-NFL (and pro-Trump) bots and cyborgs were more successful in spreading their social content than the pro-NFL group. Refer to my data.world data project for further details.

Figure 9. Maximum coreness values of groups of NFL-related Twitter bots/cyborgs. Courtesy of Steve Kramer.

Uncovering Facebook bots and cyborgs during and after the 2016 U.S. Presidential elections

Given the increasing number of reports of Russian involvement in last year’s elections across multiple social platforms, I wanted to apply the entropy-based bot detection method to election-related Facebook data. Our friend and research colleague Jonathon Morgan, the CEO of Yonder and co-founder of Data for Democracy, kindly provided a data set of 10.5M public Facebook comments from Donald Trump’s Facebook page collected between July 2016 and April 2017.

Unfortunately, because I have only the text content and timestamps of the users’ Facebook comments, I do not have the full social network structure available as I did in the previous Twitter examples. Consequently, it is not possible to perform the same type of k-core decomposition. I found that the number of “likes” is not a particularly strong or reliable predictor of the degree of success for a bot or cyborg. The 20 Facebook users with the most extreme Z-scores of text entropy are listed in Table 4 below. The top user, Nadya Noor, had a text entropy score more than 253 standard deviations above the mean score for the rest of the users.

Table 4: Top 20 most extreme text bots and cyborgs from Trump Facebook comments

Facebook user	Text entropy score	Timing entropy score	Z-score text	Z-score timing	# of posts	Avg # of likes
Nadya Noor	6630.770	0.048	253.320	-0.465	39	0
Gold AL	556.234	0.024	21.244	-1.484	217	0.08755760369
Hanadi Kasem Agha	433.089	0.039	16.539	-0.873	35	1.485714286
Hafed Ali	128.920	0.045	4.919	-0.597	27	0
Gol Pamchal	105.875	0.078	4.038	0.757	13	0
David Haugen	105.467	0.019	4.023	-1.700	183	0.2295081967
Fred Bagnall	99.769	0.019	3.805	-1.697	178	0.3202247191
Lev Koshkin	91.340	0.049	3.483	-0.447	24	0
Ahmed Hamdi	90.650	0.039	3.457	-0.873	36	0
Yousry Girgis	85.234	0.050	3.250	-0.388	23	0
Alao Ahmad	84.678	0.091	3.228	1.289	11	0.1818181818
Elizabeth Dominguez	23.688	0.169	0.898	4.542	121	0.06611570248
Johnathan Morissette	19.068	0.327	0.722	11.114	63	0
Omid Omidi	19.068	0.181	0.722	5.035	15	0
Ricky Sujanani	12.192	0.138	0.459	3.271	20	0
Rizgar Kh Jacob	11.757	0.188	0.443	5.345	217	0.01843317972
Robin Van Doorn	8.790	0.126	0.329	2.741	16	0
Ana Ferreira	5.840	0.123	0.216	2.654	11	0.1818181818
Jose Antonio Guadarrama	5.460	1.611	0.202	64.530	358	0
David Quinlan	5.177	0.123	0.191	2.654	11	0.09090909091

The most extreme user based on text entropy, Nadya Noor, posted very similar texts in Arabic during late January and early February 2017 (see Table 5).

Table 5: Sample Facebook comments from most extreme text bot (Nadya Noor)

Comment	Datetime
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:13:05+0000
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:13:51+0000
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:14:02+0000
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:14:16+0000
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:14:36+0000
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:16:31+0000
ياالله العن أمريكا على مافعلته في العراق والعراقيين منذ٢٠٠٣والى الان ياالله ياالله ياالله	2017-01-28T01:16:45+0000
الله يلعن أمريكا الله يلعن بوش الله يلعن بلير وان شاءالله يبعث لكم خسفا اونارا بسبب مافعلتموه بالعراق الله يحرق أمريكا الله يحرق أمريكا	2017-02-02T07:13:33+0000
الله يلعن أمريكا على مافعلته بالعراق وشعب العراق كنا شعبامتالف متحاب متسامح لانعرف الطائفيه والأمن والامان في شوارعنا وبيوتنا ومحافظاتنا الله يلعنك أمريكا ان شاءالله الى الجحيم انت وشعبك الغدار	2017-02-02T16:10:07+0000
الله يلعن أمريكا على مافعلته بالعراق وشعب العراق كنا شعبامتالف متحاب متسامح لانعرف الطائفيه والأمن والامان في شوارعنا وبيوتنا ومحافظاتنا الله يلعنك أمريكا ان شاءالله الى الجحيم انت وشعبك الغدار	2017-02-02T19:28:25+0000
الله يلعن أمريكا على مافعلته بالعراق وشعب العراق كنا شعبامتالف متحاب متسامح لانعرف الطائفيه والأمن والامان في شوارعنا وبيوتنا ومحافظاتنا الله يلعنك أمريكا ان شاءالله الى الجحيم انت وشعبك الغدار	2017-02-02T19:28:54+0000
الله يلعن أمريكا على مافعلته بالعراق وشعب العراق كنا شعبامتالف متحاب متسامح لانعرف الطائفيه والأمن والامان في شوارعنا وبيوتنا ومحافظاتنا الله يلعنك أمريكا ان شاءالله الى الجحيم انت وشعبك الغدار	2017-02-02T19:29:14+0000

Figure 10 shows a Google translation of one of that user’s typical, strongly anti-American comments.

Figure 10. Google translation of sample comment by Nadya Noor. Screenshot courtesy of Steve Kramer.

In the future, I plan to apply community detection algorithms to the text content and embedded URLs in these Facebook bots’ posts to determine their primary discussion topics and political leanings.

Conclusions

In this article, I have demonstrated how it is readily possible to identify social bots and cyborgs on both Twitter and Facebook using information entropy and then to find groups of successful bots using network analysis and community detection. Given the extreme risks of disinformation and propaganda being spread through social media, it is our hope that this approach, along with the work of other researchers, will enable greater transparency and help protect democracy and the authenticity of online discourse. I invite researchers who wish to collaborate on studies of these data sets to request access to become collaborators on our data project hosted on data.world.
