We’ve all heard of it, we all associate it with the nefarious world of hacking, drugs, child sexual exploitation, slavery, and so on, but what exactly is the dark web, and how does it work?

In this short series of blogs I will discuss what the dark web is, and some of the technologies that can be used to access the dark web offerings.

Layers of webs

The web is very often talked about as having 3 distinct layers.

Layer 1

This is the top layer of the web and is also known as the surface web. Sometimes this layer is also known as the indexed web, referring to the fact that this is the portion of the www which can be scanned and indexed by search engines such as Google, Bing, DuckDuckGo, etc.

It is impossible to put an exact figure on the size of the www as it changes constantly, however by analysing the amount of data Google indexes, we get a fair approximation to its size.

Back in mid February, Google was indexing approx. 65 billion web pages, so that gives us an idea of the size of the surface web.

Size of the surface web

Layer 2

This is the next layer down and is often called the deep web. This area of the web is much larger than the surface web, but it is impossible to put a figure on its size because this layer cannot be indexed by search engines, so there is no way to catalogue and analyse the data it holds.

This layer cannot be indexed by search engines because the data held at this level is typically behind some form of barrier, be that a paywall or a password-protected site. Search engines simply cannot see beyond the barrier.

Think of all the sites you use which require you to either enter a username / password combination, or pay to access their content, and you’ll begin to realise how much larger than the surface web this is.

All the data on your social media sites, the data in your banking sites, your health-care sites, insurance websites, corporate sites, etc. This layer is massive – and none of it can be accessed by a search engine.

Layer 3

Layer three is the Dark web. This layer is typically not accessible to most users via traditional web tools such as browsers, etc. In order to access the content held in the dark web, you need specific software and often invites from those already with privileged access.

Like the deep web, the data held in the dark web cannot be indexed, and so it is impossible to place a value on the size of the dark web. In addition to this fact, many websites in the dark web appear and disappear fairly quickly, so any size figures would be quickly out of date.

This layered approach to the www often sees people describe the structure of the web as being a bit like an iceberg: the surface web being above water for all to see, the deep web being the much larger piece under the waterline, and the dark web being right at the bottom of the berg in the dark, murky depths of the ocean.

Iceberg depiction of the layers of the www

So, how do you access the dark web?

There are various technologies which allow you to access the dark web. The one most people will be aware of is Tor, but there are other lesser-known utilities such as I2P, Freenet, GNUnet, freeserve, OpenBazaar, and others.

TOR

TOR used to be an acronym for The Onion Router, but nowadays is just called Tor.

The concept of how it works started life in the mid-1990s at the US Naval Research Laboratory with their work on onion routing.

In simple terms, onion routing involves encrypting your data payload with multiple layers of encryption, so that if you were to slice the communication through the middle you would have concentric layers of encryption – much like the concentric layers of an onion.

Diagram of encryption used in onion routing

When a message is transmitted in an onion-routed network, the message is initially encrypted using the key of the final node in the network, the one before the data’s destination. The result is then encrypted again using the key of the penultimate node, and finally encrypted using the key of the first node in the network.

When the data is transmitted to the first node, it decrypts the outer layer to reveal the address of the next node and then sends the data onward. The second node receives the data, decrypts its layer and sends the data onward. The third node receives the data, decrypts it and uses the information in the message to send it to the final destination.
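The wrapping and peeling described above can be sketched in a few lines of Python. This is only an illustration: the XOR keystream below is a toy stand-in for real encryption and is not secure, the node names and keys are made up, and real Tor negotiates a separate session key with each node rather than using fixed keys like these.

```python
import hashlib
import json


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256-derived keystream.

    Applying it twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, destination: str, route) -> bytes:
    """Build the onion. `route` is [(node_name, key), ...] from first to last node."""
    next_hop = destination
    payload = message
    # Work backwards: the last node's layer is innermost, the first node's outermost.
    for name, key in reversed(route):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = keystream_xor(key, layer)
        next_hop = name
    return payload


def peel(key: bytes, packet: bytes):
    """Remove one layer, revealing only the next hop and the still-wrapped inner data."""
    layer = json.loads(keystream_xor(key, packet))
    return layer["next"], bytes.fromhex(layer["data"])


# Hypothetical three-node route with made-up keys.
route = [("first", b"key-1"), ("second", b"key-2"), ("third", b"key-3")]
packet = wrap(b"hello", "example.com", route)

hops = []
for name, key in route:
    next_hop, packet = peel(key, packet)
    hops.append(next_hop)

print(hops)    # the next hop each node learned, in order
print(packet)  # the original message, visible only after the last layer is removed
```

Each node learns only where to send the data next. Only the last node sees the final destination, and only the first node sees where the data came from.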

Typical example of onion routing with tor

This way of operating provides a high level of anonymity for the sender and recipient and also a high level of confidentiality for the message being transmitted.

The US Naval Research Laboratory released the source code for Tor under a free license in 2004 for others to use and improve on.

The resulting product became the Tor network, which is now managed by the Tor Project.

Tor browser bundle

For many people, the easiest way to access the Tor network is to download the Tor browser bundle from the Tor Project website. This browser is a highly customised version of the Firefox browser which is configured to connect to the Tor network.

After installing the browser, you need to connect to the tor network. To do this simply click the Connect button.

Connecting to tor

Once connected, you can actually use the browser to surf the surface web just like any other browser – simply type the URL of where you want to go.

Clicking the padlock icon in the browser URL bar will show you the nodes through which your connection is being made to the website you are visiting.

visiting the bbc news website via tor

Using hidden services

Using the Tor browser to visit surface web pages does not constitute being on the dark web; you are simply using the Tor network to access the surface web.

To truly use Tor to visit the dark web, you need to access a hidden service (A.K.A. onion site).

Hidden services do not use traditional surface web domain names such as bbc.co.uk or amazon.com. Instead they use a special string which ends in .onion

When a service provider hosts a website on the Tor network, they generate a public/private keypair which identifies the service. This keypair uses the Edwards-curve Digital Signature Algorithm (EdDSA); more specifically, Tor uses the Ed25519 public-key signature system.

The onion address is generated from the public key and is base32 encoded as shown below:

 onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"

This results in an address which looks something like the ones shown here:

  • vww6ybal4bd7szmgncyruucpgfkqahzddi37ktceo3ah7ngmcopnpyyd.onion/
  • s4k4ceiapwwgcm3mkb6e4diqecpo7kvdnfr5gg7sph7jjppqkvwwqtyd.onion/
  • 2jwcnprqbugvyi6ok2h2h7u26qc6j5wxm7feh3znlh2qu3h6hjld4kyd.onion/
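The address formula can be sketched in Python using only the standard library. This follows my reading of the Tor onion service specification, where VERSION is the single byte 0x03 and CHECKSUM is the first two bytes of a SHA3-256 hash over the string ".onion checksum", the public key and the version. The 32 bytes used as a public key below are a placeholder, not a real Ed25519 key, and the function names are my own.

```python
import base64
import binascii
import hashlib

VERSION = b"\x03"  # version byte for current (v3) onion addresses


def _checksum(pubkey: bytes) -> bytes:
    # First two bytes of SHA3-256(".onion checksum" | PUBKEY | VERSION)
    return hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]


def onion_address(pubkey: bytes) -> str:
    """Encode a 32-byte Ed25519 public key as a .onion address."""
    if len(pubkey) != 32:
        raise ValueError("an Ed25519 public key is exactly 32 bytes")
    raw = pubkey + _checksum(pubkey) + VERSION  # 35 bytes in total
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def is_valid_onion_address(address: str) -> bool:
    """Check the length, version byte and checksum of a .onion address."""
    if not address.endswith(".onion"):
        return False
    label = address[: -len(".onion")]
    try:
        raw = base64.b32decode(label.upper())
    except (binascii.Error, ValueError):
        return False
    if len(raw) != 35:
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    return version == VERSION and checksum == _checksum(pubkey)


# Placeholder bytes standing in for a real public key.
placeholder_key = bytes(range(32))
address = onion_address(placeholder_key)
print(address)
print(is_valid_onion_address(address))
```

Because the 35 bytes always end with the version byte 0x03, every address of this type is 56 characters long and ends in the letter d, which matches the examples above.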

Visiting hidden services

To make a hidden service reachable, the service provider creates long-term Tor routes to a number of nodes and asks the nodes to act as introduction points. The service will only allow connections via these introduction points.

The service provider then creates an onion service descriptor which contains the list of introduction points and authentication keys. This descriptor is digitally signed with the service’s private key.

This service descriptor is then uploaded to a distributed hash table in the tor network so that it can be retrieved by tor clients.

When you enter the .onion URL in the Tor browser, the browser looks the address up in the distributed hash table, retrieves the service descriptor, verifies that it belongs to the service you want to access, and then builds a route to one of the introduction points listed in the descriptor.

At the same time, the browser creates a route through the Tor network to a node which it will use as a rendezvous point, and asks the introduction point to pass the details of this rendezvous point to the hidden service along with a secret one-time message for added security.

The service now builds a route to the same rendezvous node in order for the connection to be established.

Connecting to a tor hidden service

In this respect, two tor circuits have now been created – one from your device to the rendezvous point, and one from the service to the rendezvous point.

By working in this way it becomes extremely difficult for either party to identify where the other party is located as neither party knows the full route between those involved.
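The sequence of steps above can be modelled as a small Python sketch. It is a simplified illustration only: there is no networking, encryption or signature checking, a plain dictionary stands in for the distributed hash table, and every class, function and node name is hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Descriptor:
    """What the service publishes so that clients can find its introduction points."""
    onion_address: str
    introduction_points: List[str]
    # A real descriptor is signed with the service's private key; omitted here.


@dataclass
class RendezvousPoint:
    """A node that joins two circuits which present the same one-time secret."""
    name: str
    waiting: Dict[bytes, str] = field(default_factory=dict)

    def client_waits(self, secret: bytes, client: str) -> None:
        self.waiting[secret] = client

    def service_joins(self, secret: bytes, service: str) -> Optional[Tuple[str, str]]:
        client = self.waiting.pop(secret, None)
        return None if client is None else (client, service)


def connect(hash_table: Dict[str, Descriptor], address: str,
            rendezvous: RendezvousPoint) -> Optional[Tuple[str, str]]:
    # 1. The client looks up the service descriptor by its .onion address.
    descriptor = hash_table[address]
    # 2. The client picks a one-time secret and waits at the rendezvous point.
    secret = secrets.token_bytes(20)
    rendezvous.client_waits(secret, "client")
    # 3. The client asks one introduction point to pass the details to the service.
    intro_point = secrets.choice(descriptor.introduction_points)
    introduction = {"via": intro_point, "rendezvous": rendezvous.name, "secret": secret}
    # 4. The service builds its own route to the rendezvous point and presents the secret.
    return rendezvous.service_joins(introduction["secret"], descriptor.onion_address)


# The service publishes its descriptor, then a client connects.
hash_table: Dict[str, Descriptor] = {}
descriptor = Descriptor("example.onion", ["intro-a", "intro-b", "intro-c"])
hash_table[descriptor.onion_address] = descriptor

link = connect(hash_table, "example.onion", RendezvousPoint("relay-r"))
print(link)
```

The rendezvous point only matches up two parties which present the same secret. It never needs to know who the client is or where the service is hosted.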

A perfect place to hide

The method of connecting to a hidden service described above offers a perfect place for criminals to hide their online activities, and this is fully understood by all who use Tor.

However, we must not let that aspect of Tor cloud the positive reasons for such a service to exist.

Tor provides a safe and anonymous communications channel for all manner of activities where individuals fear for their safety and anonymity.

“We believe everyone should be able to explore the internet with privacy. We are the Tor Project, a 501(c)(3) US nonprofit. We advance human rights and defend your privacy online through free software and open networks.”

The Tor Project

Whistleblowers, law enforcement agencies, journalists, abuse victims, privacy advocates, those living under oppressive regimes, and many others rely on the secrecy offered by Tor to communicate with the outside world in a way that many others take for granted.