What is metadata collection? What types of metadata traces do we leave online?

Metadata is information about the data you create, especially through communication. It is collected about your location, when you communicate, the device you use, your search habits, the duration of your communication, and similar details about the people you communicate with. Taken together, it reveals relationships. It can be used by advertisers to understand what consumers do and want, and by police and government organisations in their investigations. Read on to learn what traces we leave behind online and what they mean for whistleblowing.

What happens when a whistleblower sends pictures, texts, or other files through your internal whistleblowing channel? How anonymous are they? The likely answer is not very, unless you use a whistleblowing tool to secure communication and protect their identity. Metadata can reveal much more than you think. While digital technology presents risks and opportunities, the power data holds is enormous. Military operations can kill people based on metadata that governments collect. National security agencies can collect data from service providers like banks and cell phone companies to prosecute entities or individuals. Your data gets around. Information is often shared between nations through international agencies and foreign intelligence entities that have developed data-sharing protocols.

What Is Metadata? 

Journalist John Battelle coined the term “database of intentions” to describe “the aggregate results of every search ever entered, every result list ever tendered, and every path taken as a result”. He argues digital data represents a placeholder for the intentions of humankind and a huge database of information that reflects the desires, needs, wants, and likes of people around the world. What many people don’t realize is that this information is accessible through subpoenas, archives, and tracking mechanisms, and can be exploited. 

There are three types of metadata: surveillance, business, and technical. Surveillance metadata is less protected because few regulations restrict access to it. Business metadata is the information companies collect about consumer behaviour, communication, and interaction through Internet services like online purchasing platforms and social media channels. Technical metadata is the information used to route a message from one place to another when it is sent.
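To make technical metadata concrete, here is a minimal sketch in Python that separates the routing headers of an email from its body. The message, addresses, and host names are invented for illustration, and `routing_metadata` is a hypothetical helper name. Only the standard-library `email` module is used.

```python
from email import message_from_string

# Hypothetical message: every address and host name below is invented.
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [203.0.113.7])
    by mx.example.com; Mon, 03 Mar 2025 09:15:02 +0000
From: alice@example.org
To: ethics-hotline@example.com
Date: Mon, 03 Mar 2025 09:14:58 +0000
Subject: Concern about invoices

The body is the content. Everything above this line is metadata.
"""


def routing_metadata(raw):
    """Return the header fields that say who wrote, when, and via which server."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "date": msg["Date"],
        # Each mail server that relays the message adds one Received header.
        "hops": msg.get_all("Received", []),
    }


meta = routing_metadata(RAW_MESSAGE)
for key, value in meta.items():
    print(key, "->", value)
```

Even without reading the body, the headers show the sender, the recipient, the time, and the IP address of the sending server.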

Business metadata is probably the most abundant. It’s collected through everyday actions like swiping your bus pass, using an ATM, or checking your Instagram feed during your work break. You create metadata when you text, make a phone call, use social media, do your banking, use public transportation, enter information in a business or government platform, use your apps, and browse the Internet.

Metadata collection is intentional and requires tools that transform it into meaningful information businesses can use. In the wrong hands, it can be used maliciously. For example, some governments collect huge amounts of data to conduct surveillance activities. In Russia, there has been an effort to block Telegram because the company refused to share data about its users with national security services. The EU has developed an international set of standards for sharing metadata between Member States to reduce the risk to the public.


The sheer volume of metadata collection means everyday transactions that may seem harmless can be used by entities to learn more about you. In one study, researchers were able to build an accurate profile of an individual’s life from the timestamped metadata of a mobile application. After tracking metadata for one week, they determined the individual’s work habits, personal interests, websites he visited, passwords, and social networks, based only on phone and e-mail activities (Tokmetzis, 2014).
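A small sketch shows how much timestamps alone can say. It is not the method used in the study cited above. The timestamps are made up, and `activity_by_hour` and `longest_quiet_window` are hypothetical helper names. The idea is to count events per hour of the day and find the longest stretch with no activity, which is a plausible guess at when someone sleeps.

```python
from collections import Counter
from datetime import datetime

# Hypothetical activity log: timestamps only, no message content.
TIMESTAMPS = [
    "2025-03-03T07:05", "2025-03-03T08:40", "2025-03-03T12:15",
    "2025-03-03T18:30", "2025-03-03T23:10",
    "2025-03-04T07:10", "2025-03-04T08:45", "2025-03-04T12:20",
    "2025-03-04T23:05",
    "2025-03-05T07:00", "2025-03-05T12:10", "2025-03-05T22:55",
]


def activity_by_hour(timestamps):
    """Count how many events fall into each hour of the day (0-23)."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def longest_quiet_window(hour_counts):
    """Return (start_hour, length) of the longest run of hours with no
    activity, allowing the run to wrap past midnight."""
    best_start, best_len = None, 0
    for start in range(24):
        if hour_counts.get(start, 0) > 0:
            continue
        length = 0
        while length < 24 and hour_counts.get((start + length) % 24, 0) == 0:
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    return best_start, best_len


hours = activity_by_hour(TIMESTAMPS)
start, length = longest_quiet_window(hours)
print(f"Busiest hours: {hours.most_common(2)}")
print(f"Longest quiet window starts at {start}:00 and lasts {length} hours")
```

With a week of real data, the same counting would start to separate workdays from weekends and commutes from lunch breaks.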

Can Metadata Collection Be Bad?

Metadata carries risks for both individuals and companies, and the first of these is the risk to your privacy. Metadata can be collected in bulk as part of a country’s surveillance mechanisms, which can violate the rights of citizens to privacy. Not all businesses notify Internet users that they are collecting metadata. The websites we visit create cookies that store sign-in information, preferences, and geographic information to enhance a user’s experience. In some cases, websites sell browsing data and cookies to third-party entities without asking for permission.

Metadata collection is controversial because it raises a few ethical concerns. It isn’t always clear how metadata is used and shared by entities that collect it. Some businesses collect metadata without sharing their privacy policies with the public. They may then sell the information to a third party or hand it over to a government agency such as law enforcement or intelligence without warning. How long entities retain metadata depends on their retention policy and existing regional or national laws. 

“An individual’s future location and activities can be predicted by looking for patterns in his friends’ and associates’ location history. A security expert also warned that identifying phone calls from key executives at a company to or from a competitor, an attorney, or a brokerage can reveal the potential for a corporate takeover before any public announcement is made” (ACLU, 2014).

Document metadata can expose changes or deletions made to protect sensitive information, the people who worked on the file, the templates used to create it, and the time spent editing, none of which was intended to be shared publicly. These revelations can have a negative impact on a company’s reputation or share price. For example, in a court case against Whole Foods, documents containing hidden information disclosed plans to close stores and how the company negotiates with suppliers to drive up costs for Wal-Mart. This arguably revealed the competitive strategies used by the company.
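As an illustration of how exposed this information is, the sketch below reads the core properties that Word-style .docx files store in `docProps/core.xml`. The helper names `read_core_properties` and `build_sample_docx` are hypothetical. The sample archive is a stand-in that holds only that one part and is not a complete document. The names and dates in it are invented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(docx):
    """Return author and revision fields from a .docx path or file-like object."""
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {name: root.findtext(tag, namespaces=NS) for name, tag in FIELDS.items()}


def build_sample_docx():
    """Build a stand-in archive that holds only docProps/core.xml (invented values)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}" xmlns:dcterms="{dcterms}">'
        "<dc:creator>J. Doe</dc:creator>"
        "<cp:lastModifiedBy>A. Reviewer</cp:lastModifiedBy>"
        "<cp:revision>14</cp:revision>"
        "<dcterms:created>2025-01-10T08:00:00Z</dcterms:created>"
        "<dcterms:modified>2025-02-02T17:30:00Z</dcterms:modified>"
        "</cp:coreProperties>"
    ).format(**NS)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    return buffer.getvalue()


props = read_core_properties(io.BytesIO(build_sample_docx()))
print(props)
```

Passing the path of a real .docx file to `read_core_properties` should return the same fields for that file, which is a quick way to check what a document reveals before it is shared.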

Metadata and Whistleblowing

What do these risks mean for whistleblowers? You should be aware that the channel you use to make your report could be collecting metadata. For example, analysis of telephone metadata shows that some telephone numbers reveal basic and often sensitive information about the caller. Using a hotline for whistleblowing means callers who reveal sensitive information can be at risk of having their identity revealed (Felten in U.S. Judiciary Hearing on Continued Oversight of the Foreign Intelligence Surveillance Act, 2013).

Whistleblowing tools like NorthWhistle can give whistleblowers full anonymity and protect the whistleblowing data they collect. They use fully encrypted communication between devices and the tool’s server and data storage. NorthWhistle also embeds GDPR processes in the technology to ensure your organisation stays compliant with existing data management laws. The interface is designed to be anonymous and confidential for people who disclose information, creating a sense of safety and reducing the risk of reprisals. We take a neutral position to ensure trust with employers and employees.  

Whistleblowing tools are the best way to ensure data privacy and security. Data encryption maintains confidentiality for the reporter. NorthWhistle uses a data minimisation approach and only collects necessary information. The tool uses anonymous endpoints that protect the identity of the whistleblower by removing the geographic location and other identifiers like employment details or an IP address. The most important part of a whistleblowing program is making sure employees remain anonymous when they disclose misconduct. By being aware of how your data (and metadata) is collected, stored, and used, you can be more effective at managing your internal whistleblowing channel.
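The general idea of data minimisation can be sketched in a few lines. This is not NorthWhistle's implementation. The field names, the `ALLOWED_FIELDS` allow-list, and the `minimise` helper are assumptions made up for illustration. The point is that a report is stored with only the fields it needs, and identifiers such as the IP address or location are dropped before anything is saved.

```python
# Hypothetical allow-list: the only fields a stored report is assumed to need.
ALLOWED_FIELDS = {"category", "description", "attachments"}


def minimise(submission):
    """Keep allow-listed fields only; anything not listed is never stored."""
    return {key: value for key, value in submission.items() if key in ALLOWED_FIELDS}


# Invented example of what a web form could capture alongside the report.
raw_submission = {
    "category": "fraud",
    "description": "Invoices were approved without purchase orders.",
    "ip_address": "203.0.113.7",
    "user_agent": "ExampleBrowser/1.0",
    "geo": "Oslo",
    "employee_id": "E-1042",
}

stored = minimise(raw_submission)
print(stored)
```

An allow-list is used here rather than a block-list so that a new identifying field stays out of storage unless someone deliberately adds it.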