By David Tayouri, M.Sc. in Computer Science, Cyber Intelligence Department Manager, Israel Aerospace Industries (IAI)
THE POTENTIAL AND THE CHALLENGES
The digital era has changed the way we communicate. These days, the relationships and conversations that are part of human nature take place on the web. Through social media we keep in touch with friends and family; share posts, messages, pictures and videos; exchange experiences; follow our friends’ statuses; support them when they need it; and read their statements.
From the perspective of Law Enforcement Agencies (LEAs), social media is a goldmine of intelligence. The Intelligence Officer can collect:
- Information on targets – email addresses, phone numbers, home address, education, employment history
- Activities – posts, tweets, comments on and likes of posts, pictures and pages
- Social connections – friends, followers, groups, etc.
The Intelligence Officer’s challenge is to extract meaningful intelligence from this goldmine. Social media presents several obstacles:
- Huge amount of data – Facebook users (1.5 billion) share 2.5 billion pieces of content each day; Twitter users (650 million) tweet 500 million times each day; Instagram users (180 million) share 58 million pictures each day; YouTube users (1 billion) upload 300 hours of video every minute.
- Different content types – there are different structured data types, unstructured texts, images, videos, etc.
- Complex relationships between the different entities in social networks – in the real world two persons may be related if they belong to the same family, are neighbours, attended the same school/university, worked at the same organisation, served in the same army unit, were at the same hotel at the same time, etc. The social network adds other possible relationships: friends in Facebook, following in Twitter, connected in LinkedIn, commenting on each other’s posts, writing in the same blog, and members of the same forum or social media group, to mention a few.
- Privacy settings – the default settings of most social network services prevent the passive collection of information on targets. Legacy Open Source Intelligence (OSINT) collection tools will therefore fail to gather the relevant intelligence. To gain access to the information of interest, the Intelligence Officer has to get close to the targets and become their friend.
- Multiple virtual identities – people may have several email accounts and social media profiles. To build the complete picture of a target, the Intelligence Officer needs to connect the pieces of information about the target that were collected from different sources or from different virtual identities.
SOCIAL MEDIA ANALYSIS
The answer to these challenges resides in a good analysis process. One example of social media analysis is behaviour analysis, which may make it possible to predict major events as they develop. Yet it seems that whenever a major global event occurs, some intelligence organisations are caught by surprise.
According to experts, social media played a significant role during the Arab Spring. A 2011 study on the subject[1] found that social media played a central role in shaping political debates, helped spread awareness of ongoing events all over the world, and that online revolutionary conversations often preceded mass protests on the ground. Nevertheless, the revolutions in the Arab countries were not anticipated.
Can social media analysis really predict emerging events? In 2010, researchers from HP Labs showed how social media expresses a collective wisdom that can be utilised to forecast future outcomes[2]. They managed to predict the collective behaviour of going to a movie by analysing the traces it left on Twitter. They found that with only eight features (the average hourly number of tweets related to the movie on each of the seven days before the opening, plus the number of theatres in which the movie opened), movie revenue can be predicted with high accuracy. This method may be extended to other topics, such as election outcomes, emerging riots against government policy, public opinion on military actions, etc.
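The eight features described above can be assembled as in the following minimal sketch. The function name, the input layout and the sample numbers are hypothetical and serve only as illustration; the study's own data pipeline and model fitting are not reproduced here.

```python
from collections import Counter
from datetime import datetime, timedelta


def build_features(tweet_times, opening, theatres):
    """Return eight features: seven daily average hourly tweet rates
    (oldest day first) followed by the number of opening theatres."""
    start = opening - timedelta(days=7)
    per_day = Counter()
    for ts in tweet_times:
        if start <= ts < opening:
            per_day[(ts - start).days] += 1
    # Average hourly rate for a day = tweets on that day / 24 hours.
    rates = [per_day[day] / 24 for day in range(7)]
    return rates + [theatres]


# Hypothetical sample: 48 tweets on the first day of the window and
# one tweet per hour on the last day before the opening.
opening = datetime(2010, 1, 8)
start = opening - timedelta(days=7)
tweets = [start + timedelta(minutes=m) for m in range(48)]
tweets += [opening - timedelta(hours=h) for h in range(1, 25)]
features = build_features(tweets, opening, theatres=2500)
```

Feature vectors of this kind, built for past releases with known revenue, would then be used to fit a regression model that is applied to upcoming releases.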
A 2010 German study[3], which analysed over 100,000 Twitter messages mentioning parties or politicians prior to the German federal election, demonstrated that Twitter can be considered a valid indicator of political opinion. The researchers found that Twitter is indeed used as a platform for political deliberation. The mere number of tweets mentioning a party reflects voter preferences and the election result, and comes close to traditional election polls, while the sentiment of Twitter messages closely corresponds to political programmes, candidate profiles, and evidence from the media coverage of the campaign trail. They also found that the sentiment profiles of politicians and parties plausibly reflect many nuances of the election campaign.
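The idea of using the share of mentions as a proxy for the vote share can be sketched as follows. This is an illustrative simplification, not the study's method in detail: the party names, tweets and "actual" shares are invented, and the substring matching is deliberately naive.

```python
from collections import Counter


def mention_shares(tweets, parties):
    """Share of party mentions per party, matched case-insensitively.

    Naive substring matching: a real system would need tokenisation
    and handling of abbreviations and candidate names."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for party in parties:
            if party.lower() in lowered:
                counts[party] += 1
    total = sum(counts.values())
    return {p: (counts[p] / total if total else 0.0) for p in parties}


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual shares."""
    return sum(abs(predicted[p] - actual[p]) for p in actual) / len(actual)


# Hypothetical parties, tweets and election result.
tweets = ["Alpha rally today", "alpha and BETA debate", "Beta wins?", "Alpha!"]
shares = mention_shares(tweets, ["Alpha", "Beta"])
error = mean_absolute_error(shares, {"Alpha": 0.55, "Beta": 0.45})
```

The mean absolute error is one simple way to compare such mention shares against poll figures or the final result.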
Social media analysis can also reveal the key people, stakeholders or influencers in different groups or major events. This can be achieved by analysing the behaviour of the individuals within a group. For example, content published by an influencer spreads much faster than content published by a regular group member. Another way of identifying an influencer is to analyse the connections within a group. Influencers generally have many direct friends and followers, but what makes them truly valuable is the number and relevance of their extended or indirect connections[4].
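The difference between direct and extended connections can be illustrated with a small sketch that ranks group members by the number of people they reach within two hops. The two-hop limit, the function names and the toy network are assumptions made for illustration only.

```python
def two_hop_reach(graph, person):
    """Number of distinct people reachable within two hops."""
    direct = set(graph.get(person, ()))
    extended = set()
    for friend in direct:
        extended.update(graph.get(friend, ()))
    return len((direct | extended) - {person})


def rank_by_reach(graph):
    """Members ordered by two-hop reach, widest first (ties by name)."""
    return sorted(graph, key=lambda p: (-two_hop_reach(graph, p), p))


# Hypothetical undirected friendship network as adjacency lists.
graph = {
    "ana": ["ben", "cal", "dee"],
    "ben": ["ana"],
    "cal": ["ana"],
    "dee": ["ana", "eli"],
    "eli": ["dee", "fay", "gus", "hal"],
    "fay": ["eli"],
    "gus": ["eli"],
    "hal": ["eli"],
}
ranking = rank_by_reach(graph)
```

In this toy network "dee" has only two direct connections, fewer than "ana" or "eli", yet ranks first because those two connections open up the widest extended audience.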
Social media analysis can help in understanding the motivations and politics behind groups and events. This can be done by content analysis: analysing the content published by the group members and searching it for keywords and sentiments. Features such as the presence of intensifiers, positive/negative/neutral emoticons and abbreviations are particularly useful in content analysis.
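A minimal sketch of such feature extraction is shown below. The word and emoticon lists are small hypothetical samples, and a real system would use far richer lexicons and proper tokenisation.

```python
import re

# Hypothetical sample lexicons, for illustration only.
INTENSIFIERS = {"very", "really", "extremely", "totally"}
POSITIVE_EMOTICONS = {":)", ":-)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":-("}


def content_features(text, keywords):
    """Count keyword hits, intensifiers and emoticons in one post."""
    raw_tokens = text.split()
    # Lower-case and strip punctuation so "Tax," matches "tax".
    words = [re.sub(r"\W", "", token.lower()) for token in raw_tokens]
    return {
        "keyword_hits": sum(w in keywords for w in words),
        "intensifiers": sum(w in INTENSIFIERS for w in words),
        "positive_emoticons": sum(t in POSITIVE_EMOTICONS for t in raw_tokens),
        "negative_emoticons": sum(t in NEGATIVE_EMOTICONS for t in raw_tokens),
    }


post = "Really angry about the new tax :( protest tomorrow!"
features = content_features(post, keywords={"protest", "tax"})
```

Counts like these can be aggregated per group member or per time window and then passed to a sentiment classifier or a simple scoring rule.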
Another field of social media analysis is identity resolution. People may have different virtual identities: several email accounts, one or more Facebook profiles, one or more Twitter profiles, etc. Identity resolution aims to connect the different virtual identities of the same physical person. This is needed so that LEAs can link the monitored activities of their targets.
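One simple way to connect virtual identities is to group profiles that share an identifier such as an email address or a phone number. The sketch below does this with a union-find structure; the profile names and identifiers are invented, and real identity resolution would also weigh fuzzy evidence such as names, writing style or shared contacts.

```python
def resolve_identities(profiles):
    """Group profile ids that share at least one identifier value."""
    parent = {pid: pid for pid in profiles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    owner = {}  # identifier value -> first profile seen with it
    for pid, identifiers in profiles.items():
        for value in identifiers:
            if value in owner:
                # Shared identifier: merge the two groups.
                parent[find(pid)] = find(owner[value])
            else:
                owner[value] = pid

    groups = {}
    for pid in profiles:
        groups.setdefault(find(pid), set()).add(pid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical profiles and the identifiers observed on each.
profiles = {
    "facebook:dana.k": {"dana@example.com", "+1-555-0100"},
    "twitter:@dk_news": {"dana@example.com"},
    "forum:nightowl": {"+1-555-0100", "owl@example.org"},
    "twitter:@other": {"someone@example.net"},
}
identities = resolve_identities(profiles)
```

Here the forum account is linked to the Twitter account only indirectly, through the Facebook profile that shares an identifier with each of them.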
According to a 2015 study published in the journal Science[5], researchers were able to take “anonymised” records and re-identify the people behind them, based on the uniqueness of their behaviour combined with publicly available information. The researchers analysed the credit card transactions of 1.1 million people shopping in 10,000 stores over a 3-month period. Each record included the date of the transaction, the amount charged and the name of the store. The records were “anonymised”, i.e. they did not include personal details such as names and account numbers. The uniqueness of people’s behaviour made it easy to single them out: knowing just four of a shopper’s transactions was enough to single out 90% of the shoppers as unique individuals. It is interesting to note that, according to the study, women are more re-identifiable than men in credit card metadata.
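The notion of uniqueness used in such studies can be sketched as the fraction of people whose trace is the only one containing a given set of known points. The sketch below is a simplified illustration with invented data: it uses the first k points of each sorted trace to stay deterministic, whereas a real measurement would sample the known points at random.

```python
def unique_fraction(traces, k):
    """Fraction of people singled out by k known (store, date) points."""
    unique = 0
    for person, points in traces.items():
        # Simplification: take the first k points in sorted order.
        known = set(sorted(points)[:k])
        matches = [p for p, pts in traces.items() if known <= pts]
        if matches == [person]:
            unique += 1
    return unique / len(traces)


# Hypothetical anonymised traces: user id -> set of (store, date).
traces = {
    "u1": {("bakery", "day-02"), ("cafe", "day-03")},
    "u2": {("bakery", "day-02"), ("gym", "day-03")},
    "u3": {("bakery", "day-02"), ("cafe", "day-03"), ("gym", "day-04")},
}
by_points = {k: unique_fraction(traces, k) for k in (1, 2, 3)}
```

Even in this tiny example the fraction of unique shoppers grows as more points are known, which is the effect that makes "anonymised" metadata re-identifiable.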
Another case of identity resolution is shown in an article published in Gawker[6] in 2014. The data set in this case consisted of cab rides in New York City in 2013, made public by New York City’s Taxi and Limousine Commission: a total of 173 million records. Each record included the date and time, geographic coordinates, fares, tips and cab details (obscured with MD5 hashes). A cab medallion (licence number) has a particular format, so the hashes were easily reversed by hashing every possible medallion and comparing the results. Since paparazzi photographers in New York City frequently capture celebrities entering or exiting yellow taxi cabs, and many of the pictures show the cab’s unique medallion number, combining the picture metadata with the data set records made it possible to re-identify celebrities in the data set.
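The weakness described above comes from the small number of possible medallion values: MD5 is a hash function, not encryption, so every candidate can simply be hashed in advance and looked up. The sketch below assumes, purely for illustration, a single format of one digit, one letter and two digits (for example 5X55); the formats actually used in the data set may differ.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def build_lookup():
    """Map MD5 hex digest -> medallion for every candidate value
    in the assumed digit-letter-digit-digit format."""
    table = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        medallion = f"{d1}{letter}{d2}{d3}"
        digest = hashlib.md5(medallion.encode("ascii")).hexdigest()
        table[digest] = medallion
    return table


lookup = build_lookup()  # only 10 * 26 * 10 * 10 = 26,000 candidates

# A hashed value as it might appear in a published record.
hashed = hashlib.md5(b"7B42").hexdigest()
recovered = lookup.get(hashed)
```

Because the whole candidate space fits in a small table, recovering a medallion from its hash is a single dictionary lookup.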
PRIVACY IN SOCIAL MEDIA?
The other side of the coin is privacy. Everyone would like to keep his/her personal information private, or to share it only with selected family and friends. This is the reason many social networks include strong privacy and data protection provisions. However, cyber-criminals, insurgents and terrorists may abuse these provisions to hide themselves and their activities.
In the age of digital media, do we really have any privacy? With so much information being shared on social media, a huge amount of personal data is exposed. Adults may be more cautious and share less; teenagers today, however, freely give up personal information. A 2013 report by Pew Research Center[7] shows that teens share a wide range of information about themselves on social media sites. For example, 40% of teen Facebook users do not keep their profiles private. According to this report, teens are sharing more information about themselves on social media sites than they did in the past.
It is important to mention that one does not necessarily need a social media profile to have a presence on social media. Even if you do not have a Twitter account, others may mention you in their tweets, and even if you do not have Facebook and/or Instagram accounts, others may tag you in the pictures they post. You may reduce your digital footprint, but it is almost impossible to leave no trace at all on the social networks.
SUMMARY
The digital era has changed the way we communicate. Social media has become the place where we share our experiences, opinions, statements, etc. For LEAs, social media is a goldmine of intelligence, where they can collect information on targets, their activities and their social connections. But extracting meaningful intelligence is not easy. Analysis can help predict emerging events, reveal the influencers in groups and major events, and uncover the motivations behind them. Identity resolution can help differentiate between two people with the same name and connect different virtual identities of the same person. On the other side of the coin, as with many other good things, social media comes with a price: loss of privacy.
With adequate tools and systems, the Intelligence Officer will be able not only to collect the relevant intelligence on targets and groups of interest from social media, but also to build the complete intelligence picture and predict emerging events.
REFERENCES
- Howard, Philip N.; Duffy, Aiden; Freelon, Deen; Hussain, Muzammil; Mari, Wil; Mazaid, Marwa (2011). “Opening Closed Regimes: What Was the Role of Social Media during the Arab Spring?”
- Asur, S., Huberman, B. A. (2010, August). “Predicting the future with social media”. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on (Vol. 1, pp. 492-499). IEEE.
- A. Tumasjan, T.O. Sprenger, P. G. Sandner, I. M. Welpe (2010). “Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment”, Technische Universität München Lehrstuhl für Betriebswirtschaftslehre Strategie und Organisation. Leopoldstraße 139, 80804 Munich, Germany http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1441/1852
- Hall, Taddy. “10 Essential Rules for Brands in Social Media”, 22 March 2010. http://adage.com/digitalnext/article?article_id=142907
- Y.A. de Montjoye, L. Radaelli, V.K. Singh, A. Pentland. “Unique in the Shopping Mall: On the Reidentifiability of Credit Card Metadata”, Science 30 Jan 2015, Vol. 347, Issue 6221, pp. 536-539, DOI: 10.1126/science.1256297 http://www.sciencemag.org/content/347/6221/536.full
- J.K. Trotter. “Public NYC Taxicab Database Lets You See How Celebrities Tip”, 23 October 2014. http://gawker.com/the-public-nyc-taxicab-database-that-accidentally-track-1646724546
- M. Madden, A. Lenhart, S. Cortesi, U. Gasser, M. Duggan, A. Smith, M. Beaton (2013). “Teens, Social Media, and Privacy”, Pew Research Center. http://www.pewinternet.org/2013/05/21/teens-social-media-and-privacy/
ABOUT THE AUTHOR
David Tayouri, M.Sc. in Computer Science, has 25 years of experience in software architecture, development and project management. David has been dealing with different aspects of the web since his thesis in 1997. David is one of the cyber activity leaders in Israel Aerospace Industries (IAI) and has managed the cyber intelligence department for the last three years.