Exploring the Browser Internationalization API: Unveiling the Fingerprinting Potential

2023-09-12

Browser fingerprinting has emerged as an insidious new threat to online privacy. This technique allows websites to identify and track users without their knowledge by extracting small configuration details from their browsers to generate a unique fingerprint. As traditional tracking methods like cookies become more restricted, fingerprinting has been gaining traction among advertisers, tech companies, and other entities hungry for data on browsing habits.

While most fingerprinting techniques exploit common browser features like screen resolution, installed fonts, or navigator properties, some have started looking at more obscure APIs with unexpected fingerprinting potential. One such API is the Browser Internationalization API (also known as the i18n API, and exposed to JavaScript as the global Intl object), designed to facilitate localization and internationalization of web content. At first glance, it may seem benign, but as we will explore in this post, this API can inadvertently leak information that can be abused for browser fingerprinting.

Understanding Browser Fingerprinting

Browser fingerprinting involves collecting configuration and settings information from a user’s browser to generate a unique identifier or fingerprint. This fingerprint can then be used to identify and track the user across the web without their knowledge or consent.

Some examples of information collected by fingerprinting scripts include:

  • Screen resolution and color depth
  • Timezone
  • Installed fonts
  • Browser name and version
  • Operating system
  • List of installed plugins
  • Browser language
  • Feature support

This data is combined and processed through a fingerprinting algorithm to generate a fingerprint unique to that particular browser on that particular device. Even small variations in things like font lists can produce a one-of-a-kind fingerprint.
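To make the "combine and hash" step concrete, here is a minimal sketch. The attribute names, the two made-up profiles, and the `fnv1a` and `fingerprintOf` helpers are all illustrative assumptions rather than code from any real fingerprinting script, and the 32-bit FNV-1a hash (applied to UTF-16 code units) stands in for whatever hash a real script would use.

```javascript
// Hypothetical helper: 32-bit FNV-1a over the string's UTF-16 code units,
// returned as 8 hex characters. Short and synchronous, but not collision resistant.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Serialize attributes in sorted key order so the same configuration
// always produces the same string, then hash that string.
function fingerprintOf(attributes) {
  const canonical = Object.keys(attributes)
    .sort()
    .map((key) => `${key}=${JSON.stringify(attributes[key])}`)
    .join(";");
  return fnv1a(canonical);
}

// Two made-up browser profiles that differ by a single installed font.
const profileA = { screen: "1920x1080x24", timezone: "Europe/Berlin", fonts: ["Arial", "Helvetica"] };
const profileB = { ...profileA, fonts: ["Arial", "Helvetica", "Fira Code"] };

console.log(fingerprintOf(profileA));
console.log(fingerprintOf(profileB));
```

The two printed values should differ even though only one font was added, which is the property that makes small configuration differences useful to a tracker.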

The threats around browser fingerprinting are obvious. It allows any entity to identify and monitor a user’s online activity across the web without their knowledge. The collected data can be misused for targeted advertising, user profiling, and price discrimination, or be sold to third parties. Fingerprinting undermines user privacy and works against initiatives to block cookies and other traditional trackers.

Introducing the Browser Internationalization API

The Browser Internationalization API, also referred to as the i18n API, was introduced to help websites adapt and provide localized content and formatting based on the user’s language, region, and cultural preferences.

Some of the key features it provides include:

  • Language detection - Detect the language set in the browser to serve content accordingly
  • Number formatting - Format numbers as per conventions in the user’s region (like adding commas to large numbers)
  • Date formatting - Display dates in formats specific to the user’s locale
  • Unit formatting - Allow units (like currency) to be adapted as per user’s region
  • Text direction - Support right-to-left text in some languages
  • Text segmentation - Segment text into words, lines and sentences based on language rules

Overall, the Internationalization API aims to provide a more global browsing experience by adapting content to the user’s locale and language preferences. Website developers use it, together with the browser’s language settings, to choose the right translation, format dates, numbers, and currencies, and adapt layouts. This allows visitors to interact with web content in their own cultural context and native language without any third-party translation extensions.
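As a rough illustration of the formatting features listed above, the sketch below formats one number and one date for a few explicitly chosen locales. The locale tags, the sample values, and the `describeLocale` helper are arbitrary choices made for this example.

```javascript
// Format the same values for explicitly chosen locales.
// Passing no locale at all would use the browser's default instead.
const amount = 1234567.891;
const when = new Date(Date.UTC(2023, 8, 12)); // 12 September 2023, UTC

function describeLocale(locale) {
  return {
    number: new Intl.NumberFormat(locale).format(amount),
    currency: new Intl.NumberFormat(locale, { style: "currency", currency: "EUR" }).format(amount),
    date: new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(when),
  };
}

for (const locale of ["en-US", "en-GB", "de-DE"]) {
  console.log(locale, describeLocale(locale));
}
```

With full locale data available, en-US groups digits with commas while de-DE uses periods, and the date field order differs between en-US and en-GB. These are exactly the regional conventions the API was built to respect.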

The Surprising Connection: Internationalization and Fingerprinting

At first glance, the Internationalization API appears well-intentioned - helping deliver a localized experience to users worldwide. However, the same information that aids localization can be used to generate fingerprintable data about a user’s browser configuration.

The API exposes several browser and user preferences that may vary by individual, inadvertently adding to browser fingerprinting:

Language: The preferred language(s) set by the user in their browser can help identify and distinguish them.

Regional conventions: Date, time, number, and currency formatting based on the user’s region provides fingerprintable data.

Text direction: Whether text is rendered left-to-right or right-to-left depends on the user’s language.

Text segmentation: Breaking text into words/lines differs across languages based on writing system.

Timezones: Interpreting dates/times relies on knowing the user’s timezone offset.

Calendar: Some regions follow various local or religious calendars that impact date localization.

If you think about it, even two users in the same country may have different browser language, regional, and display settings based on their individual preferences and profiles. These differences can be extracted using the i18n API and used to generate unique fingerprints.
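One way to see which of these settings a script can read is to ask the Intl constructors what defaults they resolved. The sketch below is only an illustration: the `collectIntlDefaults` name and the choice of fields are assumptions for this post, while `resolvedOptions()` itself is a standard method on these formatters.

```javascript
// Read the defaults the runtime resolved for several Intl constructors.
// No arguments are passed, so every value reflects the environment's own settings.
function collectIntlDefaults() {
  const dateTime = new Intl.DateTimeFormat().resolvedOptions();
  const number = new Intl.NumberFormat().resolvedOptions();
  const collator = new Intl.Collator().resolvedOptions();
  return {
    locale: dateTime.locale,
    calendar: dateTime.calendar,
    numberingSystem: number.numberingSystem,
    timeZone: dateTime.timeZone,
    collation: collator.collation,
  };
}

console.log(collectIntlDefaults());
```

No permission prompt is involved in any of these calls, which is part of why such values are attractive as fingerprinting inputs.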

Demonstrating Fingerprinting Potential

To demonstrate how the i18n API can be exploited, let’s walk through some example code snippets:

Detect language

// Get browser's preferred language(s)
const lang = window.navigator.languages;

This returns an ordered array of language tags, such as ["en-GB", "en", "fr"], and unusual combinations can help differentiate users.

Check text direction

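// Determine if text direction is LTR or RTL (Intl.Locale text info is not available in every browser)
const loc = new Intl.Locale(navigator.language);
const textInfo = loc.getTextInfo?.() ?? loc.textInfo;
const rtl = textInfo?.direction === 'rtl';

The text direction follows from the language’s script, so as a fingerprinting signal it mainly separates users of right-to-left languages such as Arabic or Hebrew from everyone else.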

Get regional date format

// Get regional date formatting
const date = new Date().toLocaleDateString();

Date formats like MM/DD/YYYY vs DD/MM/YYYY hint at the user’s locale and regional settings, though not their precise location.
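Rather than parsing the formatted string, a script could read the field order directly with `formatToParts`. The sketch below is one hedged way to do that; the `dateFieldOrder` helper and the sample date are assumptions for illustration.

```javascript
// Derive the order of the day, month and year fields a locale uses for numeric dates.
// With no locale argument, the runtime's default locale is used.
function dateFieldOrder(locale) {
  const sample = new Date(Date.UTC(2023, 8, 12));
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC" })
    .formatToParts(sample)
    .filter((part) => ["day", "month", "year"].includes(part.type))
    .map((part) => part.type)
    .join("-");
}

console.log(dateFieldOrder("en-US")); // explicit locale
console.log(dateFieldOrder());        // whatever the environment defaults to
```

The second call is the one that matters for fingerprinting, because its result depends on the visitor's own configuration rather than on a locale the site picked.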

Extract timezone

// Get timezone offset
const tz = new Date().getTimezoneOffset();

The timezone offset, given in minutes from UTC, narrows down the region the user’s clock is set to rather than their exact location.
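The Intl API itself exposes a more specific value than the numeric offset: the IANA time zone name. The short sketch below reads both; the variable names are arbitrary.

```javascript
// IANA zone name resolved by the Intl API, a string of the form "Area/City" or "UTC".
const zoneName = new Intl.DateTimeFormat().resolvedOptions().timeZone;

// Offset from UTC in minutes for the current moment (positive means behind UTC).
const offsetMinutes = new Date().getTimezoneOffset();

console.log({ zoneName, offsetMinutes });
```

Many zones share the same offset (Europe/Berlin and Europe/Paris, for instance), so the zone name generally carries more identifying information than the offset alone.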

As you can see, all this information can be read through the i18n API and closely related locale-aware features such as navigator.languages and Date, then used as inputs to a fingerprinting algorithm. When combined with canvas fingerprinting, font enumeration, and other techniques, it can contribute to a reliable fingerprint.

Your Internationalization Fingerprint

Fingerprint: 0000019000790089065058000000380092002400

Code Used

// This async function takes an array and a hashing algorithm as input,
// hashes the JSON representation of the array, and returns the hash as a hex string.
async function hashArray(array, algorithm) {
  // Convert each element in the array to a JSON string representation.
  const stringArray = array.map((element) => JSON.stringify(element));

  // Join all the JSON strings into a single data string.
  const dataString = stringArray.join("");

  // Create a TextEncoder to encode the data string into bytes.
  const encoder = new TextEncoder();
  const data = encoder.encode(dataString);

  // Use the Web Crypto API to calculate the hash of the data.
  const hashBuffer = await crypto.subtle.digest(algorithm, data);

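  // Convert the hash buffer into a plain array of bytes.
  // (Calling map on a Uint8Array would coerce the hex strings back into numbers.)
  const hashBytes = Array.from(new Uint8Array(hashBuffer));

  // Return the hash as a hex string.
  return hashBytes.map((b) => b.toString(16).padStart(2, "0")).join("");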
}

// Get the user's language preferences and timezone offset.
const lang = window.navigator.languages;
const tz = new Date().getTimezoneOffset();

// Create an array to store the internationalization-related information.
let i18n = [lang, tz];

// Check if the Intl.supportedValuesOf function is available (only in modern browsers).
if (typeof Intl.supportedValuesOf === 'function') {
  // Retrieve additional internationalization information and add it to the array.
  i18n.push(
    Intl.supportedValuesOf('calendar'),
    Intl.supportedValuesOf('collation'),
    Intl.supportedValuesOf('currency'),
    Intl.supportedValuesOf('numberingSystem'),
    Intl.supportedValuesOf('timeZone'),
    Intl.supportedValuesOf('unit')
  );
}

// Create a copy of the i18n array and calculate its hash using SHA-256.
const i18n2 = Array.from(i18n);
hashArray(i18n2, "SHA-256").then((fingerprint) => {
  // Display the fingerprint on the webpage.
  document.getElementById("hash").textContent = `Fingerprint: ${fingerprint}`;

  // Log the fingerprint to the console.
  console.log(`Fingerprint: ${fingerprint}`);
});

The simplified example above demonstrates how the Browser Internationalization API can be used to generate a fingerprint. It hashes the user’s language list and timezone offset together with the calendars, collations, currencies, numbering systems, time zones, and units the browser reports as supported. While basic, this snippet illustrates the concerning fingerprinting potential.
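To get a feel for how much data the `Intl.supportedValuesOf` calls contribute, the sketch below simply counts the entries returned for each key. The `supportedValueCounts` helper is a hypothetical name used only here. These lists come from the browser engine and its bundled locale data rather than from user settings, so they tend to distinguish browser builds more than individual people.

```javascript
// Count how many values the runtime reports for each Intl.supportedValuesOf key.
function supportedValueCounts() {
  // Older engines do not implement Intl.supportedValuesOf at all.
  if (typeof Intl.supportedValuesOf !== "function") return null;
  const keys = ["calendar", "collation", "currency", "numberingSystem", "timeZone", "unit"];
  return Object.fromEntries(
    keys.map((key) => [key, Intl.supportedValuesOf(key).length])
  );
}

console.log(supportedValueCounts());
```

Comparing the counts between two browsers or two versions of the same browser is a quick way to check whether these lists would change the resulting hash.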

More sophisticated algorithms would extract even more i18n data points and blend them with other sources like canvas, fonts, and plugins to assemble full fingerprints capable of tracking users without consent across the web. This highlights the need for responsible use of APIs that seem innocuous but contain privacy risks upon closer inspection.

Privacy Implications and Ethical Concerns

The ability to use an API designed for localization for surreptitious fingerprinting and tracking raises some serious privacy issues and ethical concerns.

Some of the key issues to consider:

  • Users are unaware their language and regional settings are being used for fingerprinting.
  • No clear way for users to opt-out of such fingerprinting.
  • Circumvents restrictions like tracker blocking.
  • Violates the original spirit of the API in enhancing user experience.
  • No transparency about how the data will be used after fingerprinting.
  • Opens the door for cross-site tracking without consent.

Compared to better-known fingerprinting techniques like canvas fingerprinting, the i18n API can gather fingerprinting data quietly in the background without raising user suspicion.

Overall, exploiting this API for fingerprinting appears ethically dubious, akin to a surveillance technique. Browser vendors never intended such data to be abused for tracking users without their knowledge.

Mitigating and Counteracting Internationalization-based Fingerprinting

Given the privacy risks posed by fingerprinting through the i18n API, what are some ways to mitigate and counteract it?

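User mitigation

  • Lock down high-entropy settings where possible.
  • Use tracker and fingerprint blocking tools.
  • Use a privacy-focused browser.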

Developer measures

  • Avoid extracting unnecessary language and regional data.
  • Use i18n API data only for legitimate localization needs.
  • Adopt differential privacy techniques to obfuscate data.

Browser changes

  • Restrict API access to first-party contexts only.
  • Anonymize or reduce precision of returned information.
  • Honor user restrictions like private browsing mode.
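As a thought experiment on what "reduce precision" could look like, the sketch below coarsens two of the signals discussed earlier. It is purely illustrative: `coarsenSignals` is a hypothetical function, and this does not describe how any browser actually implements its protections.

```javascript
// Hypothetical illustration of reducing precision: keep only the primary
// language subtag of the first preference and truncate the offset to whole hours.
function coarsenSignals(languages, offsetMinutes) {
  const primary = languages.length > 0 ? new Intl.Locale(languages[0]).language : "und";
  return {
    language: primary,
    offsetHours: Math.trunc(offsetMinutes / 60),
  };
}

console.log(coarsenSignals(["en-GB", "fr", "de"], -330));
```

Here a three-entry language list collapses to "en" and a half-hour offset collapses to a whole hour, so many more users end up sharing the same values. The trade-off is that sites receive less detail for legitimate localization.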

Regulatory intervention

  • Recognize such fingerprinting as a harm and enact restrictions on it.
  • Require clear opt-in consent before allowing fingerprinting scripts.

With collective responsibility from all stakeholders, the privacy risks from the i18n API can be reduced without sacrificing its localization benefits.

Balancing Internationalization and Privacy

It is evident that the i18n API increases the risk of browser fingerprinting. At the same time, the API serves the legitimate purpose of adapting websites to different languages and locales. How do we strike a balance between these two conflicting needs?

For website developers, only extract the bare minimum data needed to provide localization functionality. Avoid fingerprinting-related information unless required for the user’s experience. Also, be open about how you plan to use data obtained through the API.

For browser vendors, provide granular controls over what information is exposed through the API, honor restrictions like private browsing mode, and anonymize data wherever possible.

Users should also take steps to lock down high-entropy information, use tracker and fingerprint blocking tools, and switch to a privacy-focused browser. With responsible implementation, the API can facilitate localization without turning into a fingerprinting weapon.

Future Outlook

As browsers expand functionality through new APIs, we will probably encounter more cases of unintended consequences. Features introduced for one purpose get exploited for unintended tracking or surveillance.

Browser fingerprinting itself is only likely to grow as traditional tracking methods like cookies and pixels get blocked. We will likely see more sophisticated methods emerge that take advantage of new APIs as they are released.

On the regulatory front, authorities are starting to recognize browser fingerprinting as a harm. We may see more jurisdictions introduce GDPR-like restrictions requiring explicit opt-in consent before extracting fingerprinting data.

Conclusion

The Browser Internationalization API illuminates the privacy issues of features intended for other purposes. While designed to aid localization, this API allows new fingerprinting methods by exposing language, locale, and text direction settings. As this post showed, developers must carefully consider the data exposed by each web API to prevent misuse and unintended data collection.

This example highlights the need for greater awareness around the privacy and ethical risks of browser APIs, even those that seem harmless. Developers must implement APIs responsibly, while browsers need to architect them with privacy in mind from the start. With collective oversight by everyone involved, we can ensure technologies enhance website experience through cultural adaptation without compromising user privacy and security.