In my last post in this mini-series on OSINT, I covered some of the techniques used in the sub-category of SOCMINT and how posting data across social media could allow a threat actor to uncover sensitive data.

In this post, I will take a look at metadata.

What is metadata?

Metadata is data that provides information about other data, but not the content of the data.

For example, take a book – the content of the book is the data, but information such as the ISBN, the author name, the publication date, the sale price, the publishing house, etc. would all be metadata.

You can get quite a lot of useful information about something just from its metadata.

Let's take a look at a typical web transaction where a user opens a web browser and visits a website.

The network transactions which take place would consist of the following items.

  • IP address of client
  • IP address of DNS server
  • DNS query
  • IP address of web server
  • HTTP GET/POST request
  • HTTP response

Just by examining this data we will be able to identify what website a user wishes to visit, what type of device they are using, and more.

In the Wireshark capture below, we can see the IP address of the client & the DNS server which handles the request, plus the DNS query being sent.

Note that in this example, both the client and server IP addresses are in my LAN. If this data is captured on an external connection, then the IP addresses would be the public IP addresses of the networks involved, but the principle is the same – you can see the IP addresses of the devices communicating.
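
To illustrate why an unencrypted DNS query gives so much away, here is a minimal Python sketch that builds a standard DNS query and then reads the hostname straight back out of the raw bytes, much as a packet capture tool would. The function names build_query and queried_name are my own for illustration, and the sketch assumes a single-question query with no name compression.

```python
import struct


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a plain (unencrypted) DNS query for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack(">HH", 1, 1)


def queried_name(packet: bytes) -> str:
    """Recover the hostname from a captured DNS query (no compression)."""
    pos = 12  # the question section starts after the 12-byte header
    labels = []
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


packet = build_query("neverssl.com")
print(queried_name(packet))
```

Nothing in the packet is hidden: the site name sits in the query as readable text, which is exactly what an encrypted DNS transport is designed to conceal.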

In the next Wireshark capture we can see the client connecting to the IP address of the neverssl.com web server. We also see the browser's request headers, such as User-Agent and Accept-Language, which detail a number of items such as the operating system of the client, what browser is being used, and what language they want the data in.
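
As a rough sketch of how little effort this takes, the Python below pulls the same items out of the raw bytes of an unencrypted HTTP request. The request itself, including the User-Agent string, is a made-up example rather than the one from my capture, and request_metadata is a hypothetical helper name.

```python
def request_metadata(raw: bytes) -> dict:
    """Extract the revealing parts of a plain-text HTTP request."""
    # Headers end at the first blank line; anything after is the body.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "method": method,
        "path": path,
        "host": headers.get("host"),
        "user_agent": headers.get("user-agent"),
        "accept_language": headers.get("accept-language"),
    }


# Hypothetical request, similar in shape to what a browser would send.
sample = (
    b"GET / HTTP/1.1\r\n"
    b"Host: neverssl.com\r\n"
    b"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
    b"Gecko/20100101 Firefox/115.0\r\n"
    b"Accept-Language: en-GB,en;q=0.5\r\n"
    b"\r\n"
)

for key, value in request_metadata(sample).items():
    print(f"{key}: {value}")
```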


As you can see, there is a fair amount of data to be obtained just from a simple web transaction. This is one reason why it's better to use HTTPS connections and, where possible, use a DNS provider that supports encrypted DNS.

If you use both of these technologies when surfing the web, anyone watching your connection will see far less of this metadata: the DNS query and the HTTP headers are encrypted, although the IP addresses you connect to remain visible.

Alternatively, use a VPN so that no-one on the network between you and the VPN provider can snoop on your data.

See my post about VPNs for more information on why a VPN can be a good thing to use.

Not just network data

Any file which is created digitally will have an element of metadata associated with it: emails, Office documents, PDFs, photos, music files, etc. They all have some metadata associated with them which could be useful to a threat actor performing OSINT.

Office documents

When you create an Office document (Word, Excel, PowerPoint, etc.), a number of data items automatically get stored alongside the file which you should be aware of.

In the image below which shows the information relating to a PowerPoint file, you can see items such as author name, modifier name, file path where the data has been saved, dates of creation / modification, and more.

When publishing files of this nature, it is important to remove all sensitive information like this. There is an option in Microsoft Office to inspect documents for such metadata and remove it before publication, but there are a huge number of files online with exactly this sort of data left in.
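
You do not need Office to see this data. Modern Office files (.docx, .xlsx, .pptx) are ZIP archives, and the author and date fields live in a small XML file inside called docProps/core.xml. The Python sketch below reads a few of those fields using only the standard library. The function name office_core_properties is my own, and it only looks at the core properties, so treat it as a quick check rather than a full inspection.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of an Office file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly label -> element name inside docProps/core.xml.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def office_core_properties(path_or_file) -> dict:
    """Return selected core properties; missing fields come back as None."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        label: root.findtext(tag, namespaces=NS)
        for label, tag in FIELDS.items()
    }


# Example call (hypothetical file name):
# print(office_core_properties("presentation.pptx"))
```

Running something like this against your own files before publishing them is a quick way to confirm that the document inspector really did remove what you expected.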

For example, a quick Google search for filetype:xlsx will return a swathe of Excel spreadsheets, many of which expose sensitive data.

Because I don’t wish to be landed with some form of lawsuit, I shall not give explicit examples here, but in less than 10 minutes I found the following data:

  • A list of specific Excel spreadsheets
  • An Excel spreadsheet with a person's name and an organisation's name
  • A link to a LinkedIn profile with a full name
  • The LinkedIn page

It’s easy to see how someone can use these types of OSINT searches to quickly identify individuals who work for specific companies, learn about those people and then use that data in a directed phishing attack, or fraud.

Photographs

When you take a photograph with a smart phone, a LOT of information is stored inside the image file itself as EXIF data (Exchangeable Image File Format).

Depending on the settings of your phone, this EXIF data can contain items such as:

  • Focal length
  • Flash firing mode
  • f-stop
  • ISO
  • Latitude & Longitude
  • Magnetic direction
  • Height above sea level
  • Device type
  • Device name
  • Metering mode

EXIF data showing location data

When you take a photo with your smart phone and upload it to a social media site, the site will usually strip the EXIF data out so that others cannot obtain it if they download your photo. However, not all sites do this – for example, many forums do not strip out EXIF data.

This image is taken from a forum I use and shows a car owner who has an issue with a broken bolt on their alloy wheel.

Forum photo

After downloading the image, I examined the EXIF data and identified the location data.

EXIF data from photo
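
EXIF stores latitude and longitude as degrees, minutes and seconds plus a hemisphere letter (N, S, E or W), so the values usually need converting to decimal degrees before they can be pasted into a mapping site. A minimal Python sketch of that conversion is below. The coordinates are invented for illustration and are not the ones from the forum photo, and dms_to_decimal is a hypothetical helper name.

```python
from fractions import Fraction


def dms_to_decimal(degrees, minutes, seconds, ref: str) -> float:
    """Convert degrees/minutes/seconds plus N/S/E/W into decimal degrees."""
    value = Fraction(degrees) + Fraction(minutes) / 60 + Fraction(seconds) / 3600
    # South and west are negative in decimal notation.
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)


# Invented example values, as they might appear in the GPS section of EXIF.
latitude = dms_to_decimal(51, 30, 0, "N")
longitude = dms_to_decimal(0, 7, 30, "W")
print(f"{latitude}, {longitude}")
```

The resulting pair of numbers can be typed directly into most online maps, which is all it takes to go from a photo to a street address.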

Using this data, I quickly identified the house where the car owner lives.

Location of vehicle owner

Ideally, if you are sharing photos online, ensure you either disable location tagging, or strip all sensitive data from the image before you post it online as you have no guarantee the website will do the same.
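
To show what stripping actually involves, here is a simplified Python sketch that removes the EXIF block from a JPEG. A JPEG is a series of marked segments, and the EXIF data sits in an APP1 segment that starts with the text "Exif", so the code copies every segment except that one. This is only a sketch under those assumptions: strip_exif is my own function name, it handles JPEG only, and other metadata blocks such as XMP are left in place, so a dedicated tool is the safer choice for real use.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of the JPEG bytes without its EXIF (APP1) segment."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")

    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a segment marker at offset {pos}")
        marker = jpeg[pos + 1]

        # Start of scan: the compressed image data follows, copy it as-is.
        if marker == 0xDA:
            out += jpeg[pos:]
            return bytes(out)

        # The 2-byte length counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        payload = jpeg[pos + 4:end]

        is_exif = marker == 0xE1 and payload.startswith(b"Exif\x00\x00")
        if not is_exif:
            out += jpeg[pos:end]
        pos = end

    raise ValueError("no start-of-scan marker found")


# Example call (hypothetical file names):
# with open("photo.jpg", "rb") as src, open("photo_clean.jpg", "wb") as dst:
#     dst.write(strip_exif(src.read()))
```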

Hiding in plain sight

Back in 2014, Russia was accused of deploying troops inside the borders of Ukraine, something Moscow steadfastly denied. However, they were proved liars when a number of troops uploaded photos of themselves and their armoured vehicles to sites such as Instagram & Twitter – all tagged with location data showing Lat/Long information.

In 2015, the Atlantic Council produced a full dossier of damning evidence of Russian activity in Ukraine entitled Hiding in Plain Sight, which details a number of items of proof of Russian cross-border activity, including photos uploaded by troops.

Russian troops exposed by Geo-tagged photos in Ukraine

Metadata is a fact of living in a digital world, but in many cases the existence of metadata can paint a picture many might not want painted, so before you post things to the Internet, check what “hidden” data is also being uploaded.