OSINT Toolbox Talk: Investigating usernames and extracting user data and media from Instagram


Extracting user data and media from Instagram and investigating social media usernames


In this week’s OSINT Toolbox Talk we will round up the three most effective tools discussed earlier in the week. We will start by taking a close look at how you can extract Instagram user data using Instahunter. Then, we will look at Instaloader and how it can be used to extract public user media from Instagram accounts. Finally, we will introduce Social Analyzer, a highly recommended tool that can be used to investigate usernames online (including on social media). Social Analyzer is a top-notch tool that comes with an awesome user interface; it has received glowing reviews from many OSINT’ers and is definitely a must-have!


Investigating usernames with Social Analyzer https://github.com/qeeqbox/social-analyzer

One thing is certain when searching for OSINT tools: there is an abundance of online resources and scripts that can be used to identify social media accounts associated with usernames. Each has its benefits and drawbacks; however, Social Analyzer is in a league of its own and is a very impressive tool that I thoroughly recommend for Digital Investigators and OSINT practitioners.

Social Analyzer is an API, Command-Line Interface (CLI) and Web Application that is used for analysing and discovering a target’s profile across more than 800 social media platforms and websites. The tool comes complete with several analysis and detection modules which the user can select from during the discovery phase of any investigation. According to the tool’s GitHub repository, the detection modules use a rating process based on a range of detection techniques. This detection process provides the user with a rate value from zero to 100, with zero being an ‘unlikely’ match and 100 being a ‘highly likely’ match. The contributors to Social Analyzer further indicate that the tool can help in investigating profiles related to cyberbullying, online grooming, online child sexual exploitation, cyberstalking and disinformation.
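To illustrate how the zero-to-100 rate value might be used during triage, here is a minimal Python sketch. It assumes each detected profile has been reduced to a simple record with a `site` name and a `rate` number; those field names, the sample records and the `filter_by_rate` helper are hypothetical and are not taken from Social Analyzer’s actual output format.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, float]]

# Hypothetical records for illustration only; the real tool's output
# fields and values may look different.
sample_results: List[Record] = [
    {"site": "example-forum", "rate": 92.5},
    {"site": "example-photos", "rate": 40.0},
    {"site": "example-video", "rate": 100.0},
    {"site": "example-blog", "rate": 0.0},
]


def filter_by_rate(results: List[Record], minimum: float = 80.0) -> List[Record]:
    """Keep records rated at or above `minimum`, strongest match first."""
    if not 0.0 <= minimum <= 100.0:
        raise ValueError("minimum must be between 0 and 100")
    kept = [record for record in results if float(record["rate"]) >= minimum]
    return sorted(kept, key=lambda record: float(record["rate"]), reverse=True)


if __name__ == "__main__":
    for record in filter_by_rate(sample_results):
        print(record["site"], record["rate"])
```

A threshold of 80 is an arbitrary choice here; in practice the cut-off would depend on how many false positives the investigator is prepared to review by hand.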

The sources used by Social Analyzer vary greatly, and the tool is undoubtedly among the most far-ranging of its kind. Social media platforms that the tool checks against include Facebook, Twitter, Instagram, TikTok, Reddit, Vkontakte and Odnoklassniki (to name but a few). Additionally, the tool can be deployed on multiple operating systems including Linux, Windows and macOS.

THE COMMAND-LINE INTERFACE

Social Analyzer can be deployed within a CLI, either as a Python or Node.js script. Additionally, it is compatible with Docker, which adds greater flexibility for the Digital Investigator based on their individual preferences. Social Analyzer’s output remains very much the same whether the tool is run through the Python, Node.js or Docker CLI; it parses through social media platforms and webpages and outputs its results within the CLI itself. As a flexible tool, Social Analyzer also provides the investigator with a range of optional features and arguments, such as the ability to extract available metadata alongside primary search results, in addition to extracting profiles, URLs and patterns (if available).
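As a rough sketch of what a Python CLI invocation could look like, the helper below assembles the argument list for a run. The `--username`, `--metadata` and `--extract` arguments are taken from the project’s README as I understand it at the time of writing, so check `python3 -m social-analyzer --help` on your own installation; the helper name and the example username are hypothetical. The function only builds the command and does not run anything.

```python
import shlex
from typing import List


def build_social_analyzer_command(
    username: str, metadata: bool = False, extract: bool = False
) -> List[str]:
    """Assemble (but do not run) a Social Analyzer CLI command."""
    if not username.strip():
        raise ValueError("username must not be empty")
    command = ["python3", "-m", "social-analyzer", "--username", username]
    if metadata:
        # Assumed flag: extract available metadata alongside the results.
        command.append("--metadata")
    if extract:
        # Assumed flag: extract profiles, URLs and patterns if available.
        command.append("--extract")
    return command


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(build_social_analyzer_command("johndoe", metadata=True)))
```

Building the command as a list rather than a single string keeps usernames containing spaces or shell characters from being misinterpreted if the list is later passed to `subprocess.run`.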

THE WEB APPLICATION INTERFACE

Undoubtedly, this is the jewel in Social Analyzer’s crown. Using Mozilla Firefox’s Extended Support Release and Node.js, Social Analyzer can be deployed as a very slick web application that offers plenty of features in addition to the functionality available through the command-line interface. The web application is very easy to use from a UI/UX perspective: the investigator inputs the target username and configures the optional search parameters, which can display information including names extracted from the username alongside the origin of such names. Whilst this particular feature is not altogether useful when dealing with British/American names, it can prove useful when dealing with foreign or uncommon names. The main outputs from the web application interface include the detected profiles associated with the target username in addition to extracted metadata. However, as the image above indicates, the Social Analyzer web application also outputs a link chart showing the profiles associated with the target username, which is undoubtedly a brilliant feature. The web application interface also allows the investigator to save the results either within a JSON file or within a pre-compiled ZIP folder. Lastly, the link chart can be saved as a standalone image, which is useful in the event that it needs to be attached to a separate digital investigation report.

SUMMING UP

In my opinion, Social Analyzer deserves much more credit than it has already received; it is undoubtedly a very flexible, user-friendly and results-driven tool that should be included in every Digital Investigator’s toolbox. During my test of the tool, I could not find any major flaws. Admittedly, the UI for the web application interface could benefit from some slight tweaks, particularly with regards to the link chart. The tool could also benefit from allowing the investigator to search against multiple target usernames in order to identify common social media platforms used, in addition to similarities between extracted metadata. However, for what the tool is and what it provides, it is fantastic and comes highly recommended!
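Until something like the multi-username comparison suggested above exists in the tool, the idea can be approximated by hand once results for each username have been collected. The Python sketch below is only an illustration of that comparison: the `found` mapping, the platform names and the `common_platforms` helper are hypothetical, and the data would need to be assembled from separate searches.

```python
from typing import Dict, Set


def common_platforms(found: Dict[str, Set[str]]) -> Set[str]:
    """Return the platforms on which every username was detected."""
    platform_sets = list(found.values())
    if not platform_sets:
        return set()
    # Intersect all per-username sets to keep only shared platforms.
    return set.intersection(*platform_sets)


if __name__ == "__main__":
    # Hypothetical results gathered from three separate username searches.
    found = {
        "target_one": {"instagram", "reddit", "tiktok"},
        "target_two": {"instagram", "tiktok", "vkontakte"},
        "target_three": {"tiktok", "instagram"},
    }
    print(sorted(common_platforms(found)))
```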


Analysing Instagram user data with Instahunter https://github.com/Araekiel/instahunter

When it comes to OSINT and Facebook-associated platforms including Instagram, many consider the relationship a tedious one, often akin to ‘cat and mouse’, whereby Facebook continuously seeks to implement features designed to prevent us from collecting data from its platforms. As such, programmers are faced with the constant need to adjust their programmes and scripts in order to account for, or sometimes circumvent, such features.

There are plenty of tools and scripts that can be used to extract Instagram data, and in an upcoming OSINT Workflow article, I will provide a step-by-step guide on how Digital Investigators and OSINT practitioners can extract and visualise Instagram user followers – stay tuned! For now, we will take a close look at ‘Instahunter’, a Python-based script that we can use to extract Instagram user data.

Downloading and deploying Instahunter is incredibly easy. What I personally like about this tool is that there is no requirement to create a Sock Puppet account in order to use it. I should add that whilst Instahunter is quite effective, it only works with public profiles. As is to be expected, gaining user and post data from private profiles will require the use of a Sock Puppet and for the target to accept any request to connect. Our last OSINT Workflow article discusses the effective use of Sock Puppets in this regard.

Instahunter works by extracting posts and user data from Instagram’s frontend API in JSON or text format. In its current form, Instahunter provides the following key capabilities:

  • Fetch latest or top public posts with a hashtag
  • Fetch public user data with a username
  • Fetch recent public posts of a user with a username
  • Search for users on Instagram

As I have already pointed out, Instahunter is a very easy tool to deploy through Python; it does exactly what it says and enables you to quickly and efficiently extract data from public Instagram profiles. Unfortunately, it does not provide the capability to scrape media from public Instagram profiles. However, in an upcoming OSINT Tool Review, we will look at a tool that can be used in this regard.

To sum up, Instahunter does what it says: it produces results and can save you time in the event that you need to extract data from a public Instagram account.


Extracting user media from Instagram using Instaloader https://github.com/instaloader/instaloader

In our previous OSINT Tool Review, we took a close look at Instahunter, and how it can be used to extract user and post data from public Instagram profiles. That same article further indicated that the Python-based script was not able to extract media data from Instagram. In this OSINT Tool Review, we build on the last article by presenting Instaloader, another Python-based script that provides Digital Investigators and OSINT practitioners with the capability to extract media from Instagram profiles and output the media within a separate folder on your system.

Citing Instaloader’s GitHub repository, the contributors to the script indicate that it provides the following capabilities:

  • Download public and private profiles, hashtags, user stories, feeds and saved media
  • Download comments, geotags and captions of each post
  • Automatically detect profile name changes and rename the target directory accordingly
  • Allow fine-grained customisation of filters and where to store downloaded media
  • Automatically resume previously interrupted download iterations

While the script itself does have the capability to extract media from private Instagram profiles, this is only possible if, for instance, you have a Sock Puppet account that is following the private profile. For insight into developing effective Sock Puppets for investigative uses, you can read our OSINT Workflow article, which discusses this topic at great length.

Deploying Instaloader within your Python environment is incredibly easy. The script itself is very flexible; for example, users can instruct the script what data to extract, such as geotags, stories, hashtags, IGTV posts and tagged media. By default, the script downloads a profile’s posts and saves each post’s metadata as a JSON file alongside the corresponding media file; extras such as geotags, stories and tagged media are requested with their own options. Where you need to extract media and associated data from a target profile that has been updated since your last download, Instaloader lets you do this by invoking the ‘--fast-update’ option. Returning to the subject of private profiles, Instaloader will require you to input your username and password in order to extract the target media. However, when logging in the first time around, Instaloader stores the session cookies in a file in your temporary directory. This session file is then reused the next time the ‘--login’ option is invoked. This means that you can extract media from private profiles non-interactively when you already have a valid session cookie file.
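To make the options discussed above more concrete, the Python sketch below assembles an Instaloader command line for a target profile. The `--login`, `--fast-update`, `--stories`, `--geotags` and `--tagged` options are documented by Instaloader at the time of writing, but you should confirm them with `instaloader --help` for your installed version; the helper name and the account names are hypothetical. The function only builds the argument list and does not contact Instagram.

```python
import shlex
from typing import List, Optional


def build_instaloader_command(
    target_profile: str,
    login_user: Optional[str] = None,
    fast_update: bool = False,
    stories: bool = False,
    geotags: bool = False,
    tagged: bool = False,
) -> List[str]:
    """Assemble (but do not run) an Instaloader command for one profile."""
    if not target_profile.strip():
        raise ValueError("target_profile must not be empty")
    command = ["instaloader"]
    if login_user:
        # Log in as this account; a stored session file is reused if present.
        command += ["--login", login_user]
    if fast_update:
        # Stop at the first already-downloaded post to refresh an archive.
        command.append("--fast-update")
    # Optional extras; some of these may only work with a logged-in session.
    if stories:
        command.append("--stories")
    if geotags:
        command.append("--geotags")
    if tagged:
        command.append("--tagged")
    command.append(target_profile)
    return command


if __name__ == "__main__":
    cmd = build_instaloader_command(
        "target_profile", login_user="sock_puppet", fast_update=True
    )
    print(shlex.join(cmd))
```

Printing the command first, rather than executing it straight away, gives the investigator a chance to record exactly what was run against the target for their case notes.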

One issue that constantly arises with regards to the use of tools and scripts against Instagram targets is the rate limiting that exists on the platform, which ultimately prevents us from extracting media in bulk. To address this, Instaloader includes logic that keeps track of its requests to Instagram and defers subsequent ones, so that Instaloader stays within Instagram’s rate limits. However, as indicated in our previous article regarding Instahunter, Facebook continuously seeks to implement features designed to prevent us from collecting data from its platforms, including Instagram. This often means that programmers who produce quality tools and scripts such as Instaloader are faced with the constant need to address, or sometimes circumvent, such features.
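The general idea of tracking requests and deferring later ones can be sketched in a few lines of Python. This is not Instaloader’s actual implementation, only a minimal sliding-window throttle under assumed limits; the class name, the limit of requests per window and the injectable `clock` and `sleep` parameters are all hypothetical choices made for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestThrottle:
    """Allow at most `max_requests` within any `window_seconds` period."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_requests < 1 or window_seconds <= 0:
            raise ValueError("max_requests and window_seconds must be positive")
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._timestamps: Deque[float] = deque()

    def _purge(self, now: float) -> None:
        # Forget requests that have fallen out of the sliding window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def before_request(self) -> float:
        """Call before each request; returns how long the call was deferred."""
        now = self._clock()
        self._purge(now)
        delay = 0.0
        if len(self._timestamps) >= self.max_requests:
            # Wait until the oldest tracked request leaves the window.
            delay = self.window_seconds - (now - self._timestamps[0])
            self._sleep(delay)
            now = self._clock()
            self._purge(now)
        self._timestamps.append(now)
        return delay
```

In use, a script would create one throttle, for example `RequestThrottle(max_requests=20, window_seconds=60.0)`, and call `before_request()` immediately before every request it sends. The real limits enforced by Instagram are not published, so any numbers chosen here are guesses to be tuned conservatively.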

That aside, Instaloader is very easy to install, quick to deploy and does exactly what it is intended to do. During my tests, I found no issues in using the tool against public and private profiles. Looking through Instaloader’s GitHub repository, it is clear that the contributors have consistently updated the script to account for various safeguards that Instagram has implemented since 2016. Therefore, I have every confidence that the Instaloader team will continue to maintain the script for its users. All in all, I highly recommend this tool for all Digital Investigators and OSINT practitioners.

