Constella Intelligence

United Nations Migration Pact: Public Sphere Digital Abnormalities Analysis

Amid a European digital conversation that has devoted ample attention to immigration, particularly within the context of the EU Elections in which migration is at the core of the public debate, the Global Compact for Safe, Orderly and Regular Migration (GCM), also known as the “UN Migration Pact” or the “Marrakech Pact”, attracted voluminous interest in the digital sphere ahead of its signing and adoption. The pact was initially approved in July 2018 by all 193 member nations except the United States, which had already backed out; several European nations then withdrew before its formal adoption in December. On December 10th and 11th, the Intergovernmental Conference to Adopt the Global Compact for Safe, Orderly and Regular Migration was held in Marrakech, Morocco, with the goal of approving the Compact. Despite the opposition and withdrawals from several governments, the non-binding Compact was formally endorsed by the United Nations General Assembly on December 19th, 2018 through a vote by 164 UN member state signatories. The Compact outlines a plan to “prevent suffering and chaos” for global migration, advocating for a “cooperative framework” for dealing with the multifaceted challenges of international migration for both nations and migrants.

Key findings from Constella’s research include:

1) The 200 most active users in the debate show levels of activity similar to those of the small percentages of highly active users identified in other analyses across France, Germany, Italy, Poland, and Spain. In the case of the UN Migration Pact, 0.43% of users were responsible for 11.31% (21,163) of all messages over the period analyzed.

2) Several of the most used keywords, hashtags, and terms in the debate suggest that the Marrakech Pact was to some degree used as a proxy for attacks against Chancellor Angela Merkel and President Emmanuel Macron. Examples among the top 25 include #macron (#7, used 22,926 times), “macron” (#8, used 19,468 times), #giletsjaunes (#10, used 14,728 times), “généraux” (#16, used 9,364 times), “trahison” (#19, used 8,381 times), and #merkel (#21, used 8,362 times). The references to “généraux” and “trahison” make mention of a story spread in the wake of the Yellow Vests concerning French generals accusing President Macron of treason, as described in Constella’s content propagation analysis in France.

3) 533 users across the geographies and languages analyzed posted in more than one language on the topic of the UN Migration Pact. 508 of these 533 multilanguage users (95.3%) were non-geolocated, having chosen not to display their location publicly. This proportion differs sharply from the split between geolocated and non-geolocated users in the general debate, which stands at 45% to 55%, respectively.

4) Users identified as French, German, or Italian by either language or geolocation dominated the debate among the 533 multilanguage users, with French users participating considerably in the Italian and German conversations, Italians substantially participating in the French and German conversations, and Germans participating heavily in the French and Italian conversations. Some Spanish users participated in the Italian and French conversations, and vice-versa, although they were relatively few. The Polish conversation and the impact of users identified as Polish were considerably smaller than the others.

5) The 533 multilanguage users represent 1.15% of the 46,183 users and posted 4.77% (8,919) of the 187,128 messages over the period analyzed. These users actively retweet one another, with 141 of the 533 users (26.45%) retweeting another multilanguage user at least once. 32.65% (174) of the 533 multilanguage user accounts were created in either 2018 or 2019, supporting evidence from additional analyses that indicates a correlation between high-activity users and account creation dates within the past 2 to 3 years.

6) Sites such as philosophia-perennis.com (#4), RT.com (#8), tichyseinblick.de (#9), voxnews.info (#14), and infowars.com (#15) were among the most shared and were propagated by multilanguage users with high frequency. These sites all appeared in Constella’s previous analyses as domains with highly segmented audiences and were identified by Constella’s analysts as having aided in the production or diffusion of disinformation content.

7) By analyzing the most shared domains through the aggregation of all content shared, including videos, articles, and other media, Constella’s analysts observed that while all the domains were created between 1996 and 2018, 16.67% (25 domains) of the most shared domains were created in the last three years. This is consistent with other analyses signaling that recently created media is playing an increasingly important role in the public digital debate.

8) Based on an independent statistical analysis of the likelihood of the frequency, distribution, and behavior of the non-geolocated, multilanguage users, we conclude that behavior such as that reflected by the group of non-geolocated users who post in 2 or more languages has a chance of occurring 1 in 205,000 times, and the event of non-geolocated users posting in 3 or more languages only has a chance of occurring 1 in 6,000,000 times.

Digital Public Opinion Analysis

Constella’s team of data scientists analyzed 167,236 total results from 46,183 authors to understand the scope of the public digital conversation on the UN Migration Pact across Germany, France, Italy, Spain, and Poland and the role of high-activity users participating in the conversation from December 9th, 2018 to March 11th, 2019. Users participating in the conversation were identified based on their use of keywords and terms associated with the Pact and their participation in conversations referencing the Pact. 41.33% of total results and 38.34% of users were identified within the French-language conversation, 30.02% of total results and 21.84% of users within the German-language conversation, 20.55% of total results and 22.76% of users within the Italian-language conversation, 7.95% of results and 17.79% of users within the Spanish-language conversation, and 0.14% of results and 0.47% of users within the Polish-language conversation.

Most Used Keywords, Hashtags, and Terms

In analyzing top keywords and hashtags (terms) appearing in the general conversation, we observe several French, German, and Italian terms frequently referenced, with Spanish terms referenced to a lesser degree. By far the most referenced keyword or hashtag in the debate is #pactedemarrakech (FR), used 72,562 times. The second most referenced term, substantially behind #pactedemarrakech in frequency, is #migrationspakt (DE). Examples of other key terms linked to the debate on the Marrakech Pact and ranking among the top 25 most frequently used are #macron (#7, used 22,926 times), “macron” (#8, used 19,468 times), #giletsjaunes (#10, used 14,728 times), “généraux” (#16, used 9,364 times), “trahison” (#19, used 8,381 times), and #merkel (#21, used 8,362 times). The high volume of references to both German Chancellor Angela Merkel and French President Emmanuel Macron evidences a focus on these two political figures, who advocated for international support of the Pact. The references to “généraux” and “trahison” point to a story spread in the wake of the Yellow Vests concerning French generals accusing President Macron of treason, as detailed in our previous content propagation analysis in France. Consistent with the trend among users in the general debate, the terms most used by the multilanguage users (users posting in more than one language) include those previously highlighted, such as #merkel, #macron, and #giletsjaunes, among others.

Please find the full list of the top 150 most frequently used keywords, hashtags, and terms at the end of this analysis.

Most Active Users

Among these 46,183 unique authors, the most active geolocated users in the debate are primarily in Germany, France, and Italy. Out of the top 100 most active users in the debate, 29% are located in Germany, with 18% in France, and 3% in Italy. Out of the top 200 most active users in the debate, 24% are located in Germany, with 21.5% in France, and 2% in Italy. Over 50% of the top 200 highest activity users during the period of analysis do not express their geolocation.

When looking at the top 100, top 150, and top 200 most active users in the general debate, we see an increase in the proportion of users, relative to the overall conversation, that are geolocated in France. Furthermore, when looking at the top 100, top 150, and top 200 most active users in succession, Constella’s analysts observed an increase in the percentage of users who do not exhibit geolocation but are participating in the public digital discourse on the Marrakech Pact. Among the top 200 most active users during the 93-day period of analysis, the number of posts generated by any individual user ranged from 60 to 2,223. Of the 46,183 users in the entire debate on the UN Migration Pact, these 200 users represent 0.43%. Nevertheless, out of 187,128 posts in total, these users posted 21,163, or 11.31% of all messages in the entire conversation on the Marrakech Pact during the period analyzed, demonstrating disproportionate activity by less than half a percent of all users in the general conversation.
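The concentration figures in this section are simple ratios and can be reproduced directly. The following is a minimal sketch in Python that uses only the counts reported in this analysis; the helper name `share` is illustrative and is not part of Constella's tooling.

```python
def share(part: int, whole: int) -> float:
    """Return part / whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

TOTAL_USERS = 46_183     # unique authors in the conversation (from the text)
TOTAL_POSTS = 187_128    # posts over the period analyzed (from the text)
TOP_USERS = 200          # most active users considered
TOP_USER_POSTS = 21_163  # posts written by those 200 users (from the text)

user_share = share(TOP_USERS, TOTAL_USERS)
post_share = share(TOP_USER_POSTS, TOTAL_POSTS)
print(f"{user_share}% of users wrote {post_share}% of posts")
```

The same helper applies to the other proportions quoted in this analysis, for example `share(533, 46_183)` for the multilanguage users.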

Multilanguage Users and Abnormality Analysis

As noted previously, in analyzing the digital conversation around the Pact our analysts identified several users posting in multiple languages. These users were found to be sharing similar domains and content in addition to using similar keywords, hashtags, and terms. The analysis identified a network of 533 authors whose activity overlaps by country and/or language, showing single users posting across boundaries of language or location with a high concentration of effort in France, Germany, and Italy. Furthermore, 508 of these 533 users (95.3%) posting in multiple languages were non-geolocated, having chosen not to display their location publicly.

Among the users participating in more than one conversation based on language, we identified 234 (42.31%) as French-language users, 142 (25.68%) as Italian-language users, 127 (22.97%) as German-language users, 39 (7.05%) as Spanish-language users, and 11 (1.99%) as Polish-language users. Users were either self-geolocated or were classified by the language of their most frequent participation. As seen below, users identified as French either by language or geolocation demonstrated a substantial level of participation in the Italian-language conversation, and to a lesser but considerable degree in the German-language conversation, and they also posted in Spanish and Polish. Users identified as Italian either by language or geolocation exhibited notable activity in both German and French language conversations, as well as Spanish to a lesser degree, while users identified as German based on language or geolocation were most active in both the French and Italian conversations. Users identified as Spanish participated in both the Italian and French conversations and, to a much lesser degree, in the German conversation. The map below shows the flow of users, identified through language or geolocation, “interfering” or “overlapping” into other conversations through their digital activity.
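The identification step described above can be illustrated with a short sketch. It assumes a hypothetical list of post records of the form (author, language code, self-declared location or None); the record layout, the sample data, and the function name are assumptions made for illustration only. Only the classification rule, assigning each user the language of their most frequent participation, comes from the text.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (author, language code, self-declared location or None).
posts = [
    ("user_a", "fr", None),
    ("user_a", "it", None),
    ("user_a", "fr", None),
    ("user_b", "de", "Berlin"),
    ("user_c", "it", None),
    ("user_c", "de", None),
    ("user_c", "it", None),
]

def multilanguage_users(posts):
    """Map each author who posted in 2+ languages to (primary language, geolocated?)."""
    langs = defaultdict(Counter)   # author -> posts per language
    located = defaultdict(bool)    # author -> has any self-declared location
    for author, lang, location in posts:
        langs[author][lang] += 1
        located[author] = located[author] or location is not None
    return {
        author: (counts.most_common(1)[0][0], located[author])
        for author, counts in langs.items()
        if len(counts) >= 2        # keep only authors seen in more than one language
    }

result = multilanguage_users(posts)
print(result)
```

In this toy data, `user_a` and `user_c` qualify as multilanguage users while `user_b`, who posts only in German, does not.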

Network Activity of Abnormal Multilanguage Users

The graphic above shows a more detailed analysis of the 533 users who posted in several languages and participated in the conversation on the Marrakech Pact across the five countries analyzed. Constella’s data scientists created an ego-network of retweets and replies to these users in order to identify which other users they were able to attract, which communities emerged, which users influence each community, to what extent these 533 users demonstrated coordination, and the key issues addressed within each community.

  1. France & Italy Community (39.07% of users, 58.45% of comments): Focused on these countries and Germany. The users in this community have a tendency to share messages from Giorgia Meloni (an Italian politician and journalist, leader of Brothers of Italy) and Alessandro Meluzzi (former Forza Italia senator) to legitimize and validate their opinions and positions through these recognizable figures.
  2. Germany Community (29.53% of users, 20.72% of comments): Focused on Germany. The users in this community criticize Angela Merkel’s decisions regarding the Marrakech Pact.
  3. France Community (24.53% of users, 18.07% of comments): Focused on France and Emmanuel Macron’s decisions. They share disinformation and criticisms of Macron in addition to messages from Marine Le Pen.
  4. Spain & Italy Community (4.48% of users, 1.95% of comments): The users in this community criticize Pedro Sánchez’s decisions. They also tend to share messages from Alessandro Meluzzi and Giorgia Meloni.

The 533 multilanguage users represent 1.15% of the 46,183 users, and of the 187,128 total posts over the period analyzed, these 533 users posted 4.77% of all messages (8,919). Their tendency to interact with each other is notable: across the 4,529 total retweets by the multilanguage users, 141 of the 533 users (26.45%) retweeted another multilanguage user at least once. Among those retweeting in the ego-network represented above, most retweeters belong to the France & Italy community. The users whose content is most retweeted are also from this community. Across the 849 total replies, 23 of the 533 users (4.32%) replied to one another. This signals that multilanguage users retweet each other at a much higher rate than they reply to one another, with more than one in every four multilanguage users retweeting another multilanguage user. Of all multilanguage users retweeting, 51.34% are from the France & Italy community, while multilanguage users in the same community account for 63.3% of users being retweeted. This high frequency of propagation of content from other users posting in several languages in the same identified communities signals an elevated level of affinity among these abnormal users.

We also observed that most of the users that retweet or are retweeted are the original multilanguage users that were used to expand the network, that is, the abnormal users initially found participating in multiple languages. Most of the users that make or receive replies have emerged in an organic manner due to their relationship with the original multilanguage users.
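The in-group retweet rate reported in this section can be derived from a list of retweet pairs. The sketch below is illustrative only: the edge list, the account names, and the function name are hypothetical, and it is not a description of how Constella's data scientists built the ego-network.

```python
def in_group_retweeters(retweets, group):
    """Return group members who retweeted at least one other group member."""
    return {
        src
        for src, dst in retweets
        if src in group and dst in group and src != dst  # ignore self-retweets
    }

# Hypothetical multilanguage users and (retweeter, retweeted) pairs.
group = {"u1", "u2", "u3", "u4"}
retweets = [
    ("u1", "u2"),
    ("u2", "u1"),
    ("u3", "outsider"),   # retweet of an account outside the group
    ("outsider", "u4"),   # retweet by an account outside the group
    ("u1", "u1"),         # self-retweet, not counted
]

active = in_group_retweeters(retweets, group)
rate = round(100 * len(active) / len(group), 2)
print(f"{len(active)} of {len(group)} users ({rate}%) retweeted another group member")

# The reported figures follow the same ratio: 141 of 533 users.
reported_rate = round(100 * 141 / 533, 2)
```

Applying the same function to reply pairs instead of retweet pairs gives the corresponding reply rate.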

Creation Dates of the 533 Multilanguage Users’ Accounts

32.65% (174) of the 533 multilanguage user accounts were created in either 2018 or 2019, supporting evidence from additional analyses that indicate a correlation of high activity users and creation dates within the past 2 to 3 years.

533 Multilanguage Users – Most Shared Domains

In an analysis of the most shared domains by the 533 multilanguage users, several domains previously identified in Constella’s analyses of the public socio-political debate across Europe appeared. Many of these sites were identified to have been propagated with high intensity in specific communities due to the segmented editorial approaches of these domains. Sites such as philosophia-perennis.com (#4), RT.com (#8), tichyseinblick.de (#9), voxnews.info (#14), and infowars.com (#15) were among the most shared, propagated by multilanguage users with high frequency. These sites all appeared in Constella’s previous analyses as domains with highly segmented audiences and were identified by Constella’s analysts as having aided in the production or diffusion of disinformation content.

The full list of 150 domains and the most shared content during the period analyzed can be seen in the tables at the end of this analysis.

Top Shared Content Across Communities

Top shared content across various communities includes messages from political figures like Marine Le Pen, messages with anti-immigration and anti-Islam rhetoric from Italian public figures such as Alessandro Meluzzi and Giorgia Meloni, and messages bearing nationalist and xenophobic discourse from Italy’s Casa Pound. Content shared in these communities among the multilanguage users exhibited anti-immigrant and anti-Islam sentiment.

Domain Creation and Sharing Volume

In order to understand the role of media in this debate, Constella analyzed the top 150 most shared domains and assessed the temporal distribution of their creation dates. All of these domains were created between 1996 and 2018, and 16.67% (25 domains) were created in the last three years. 8 sites created in 2018 were among the top 150 most shared domains, while 9 sites created in 2017 and 9 sites created in 2016 also appeared among the most shared domains in the conversation, which indicates that recently created media is playing an increasingly important role in the public digital debate.

Temporal Analysis of Activity

A temporal analysis of activity around the Marrakech Pact shows that the vast majority of activity takes place between December 15th and 23rd, 2018. On the 19th of December, the Pact was officially endorsed by a UN vote despite several European nations withdrawing from the signing of the Pact. Although the Intergovernmental Conference to Adopt the Global Compact for Safe, Orderly and Regular Migration was held a week prior in Marrakech, the activity of the 533 multilanguage users peaks on the 19th, with 980 comments around the formal signing and adoption of the Pact, and then gradually decreases in intensity.

Statistical Analysis of Non-geolocated User Activity

The distribution of users who publicly exhibit geolocation differs significantly from that of users who do not. The most notable differences are among identities that post in 3 or more languages.
Assuming that the distribution of users who voluntarily show their geolocation and post in various languages reflects a distribution that would emerge organically in the public digital debate, we observe that it is highly unlikely to find behavior such as that exhibited by the non-geolocated users publishing in multiple languages. As all users must post in at least one language, Constella’s analysts modeled the additional languages in which users post as a Poisson distribution, with respective means for geolocated and non-geolocated users of 0.00253 and 0.02259, a difference that is statistically significant (p < 0.0001). Users’ behavior was then modeled using a negative binomial distribution in order to simulate how probable it is to find a group of non-geolocated users posting in multiple languages, such as the ones identified in this analysis.

We conclude that users in this group (non-geolocated and publishing in multiple languages) behave abnormally, given that:

  1. Non-geolocated users are 10 to 70 times more likely than geolocated users to publish in more than 1 language (10x for more than 1 language and 70x for 3 or more languages).
  2. Behavior such as that reflected by the group of non-geolocated users who post in 2 or more languages has a chance of occurring 1 in 205,000 times, and the event of non-geolocated users posting in 3 or more languages only has a chance of occurring 1 in 6,000,000 times.
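The order of magnitude of the first point can be checked from the two Poisson means quoted in this section. The sketch below is a simplified illustration under stated assumptions: it uses only the Poisson step and treats a user with k additional languages as posting in k + 1 languages in total. It does not reproduce the negative binomial simulation, whose parameters are not given in this analysis, so its output should be expected to match the reported ratios only approximately. The function name `poisson_tail` is illustrative.

```python
import math

MEAN_GEOLOCATED = 0.00253      # mean additional languages, geolocated users (from the text)
MEAN_NON_GEOLOCATED = 0.02259  # mean additional languages, non-geolocated users (from the text)

def poisson_tail(mean: float, k: int) -> float:
    """P(X >= k) for X ~ Poisson(mean), computed as 1 - P(X <= k - 1)."""
    cdf = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

# At least 1 additional language means 2 or more languages in total;
# at least 2 additional languages means 3 or more languages in total.
ratio_2_plus = poisson_tail(MEAN_NON_GEOLOCATED, 1) / poisson_tail(MEAN_GEOLOCATED, 1)
ratio_3_plus = poisson_tail(MEAN_NON_GEOLOCATED, 2) / poisson_tail(MEAN_GEOLOCATED, 2)
print(f"2+ languages: {ratio_2_plus:.1f}x, 3+ languages: {ratio_3_plus:.1f}x")
```

Under these assumptions the ratios come out at about 9 for 2 or more languages and just under 80 for 3 or more languages, the same order of magnitude as the 10x and 70x figures above.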

Annex 1: 533 Multilanguage Users – Most Shared Domains

Annex 2: Top 150 Most Used Keywords, Hashtags, or Terms in the General Conversation

Interested in our work? Please contact us at info@constellaintelligence.com.