Word count: ~2600 · Reading time: 14 minutes

Text Isolation and BiDi Solutions: Radical Treatment for Hybrid Texts

From diagnosis to programmatic treatment: How to use the HTML5 arsenal and invisible Unicode controls to enforce visual firewalls and protect characters from breaking.

Introduction: From Crisis Diagnosis to Treatment Engineering (Beyond Rendering Engines)

In the previous two articles of this series on Zy Yazan Platform, we explained the digital system of text, starting with the Unicode standard that gives characters their hidden numerical identity, followed by the Bidirectional (BiDi) algorithm that controls the arrangement of characters and words behind the scenes. We saw how the absence of a defined base paragraph direction leads to visual disasters that jumble punctuation and break line consistency.

However, the picture is not complete here. All this logical data remains locked within the system until it reaches the rendering engine, the component responsible for converting code into visible pixels on the screen (visual rendering). Knowing the cause of the problem is not enough to build flexible platforms; a developer cannot simply add dir="rtl" to static elements, because modern web interfaces are full of dynamic data coming from multiple users and hybrid databases.

The real issue emerges when inserting foreign terms, usernames, or technical symbols (like @, /, and brackets) inside pure Arabic sentences, which confuses the rendering engine and leads to incorrect rendering. Here, we need treatment engineering based on two core principles: Bidirectional Isolation (BiDi Isolation) at the user interface (UI) level, and planting invisible Unicode controls at the backend and database level, to force rendering engines to render correctly.

This article is your practical guide to move from a theoretical understanding of rendering flaws to comprehensive programmatic empowerment, producing clean, stable code that is visually fortified against all types of hybrid linguistic distortions.

First: BiDi Isolation: The Concept of a Visual Firewall

To grasp the meaning of “isolation” in text processing, let us imagine a common programming context: a social media platform displaying an activity feed in which each line starts with a username followed by a fixed notice. The logical order of the sentence in the HTML template is as follows:

[username] added a new comment.

If the username is purely Arabic like “سامي”, the interface will display the sentence smoothly from right to left. But what if the username is dynamic and written in English or starts with a number, such as “99Alex”?

In the absence of isolation, when the browser’s text engine reads the string, the username is resolved together with the characters around it. The numbers at the beginning of the name might attach to preceding Arabic words, or adjacent punctuation marks might jump and tangle with strong Latin characters, causing a complete break in the visual flow of the sentence. This article refers to this phenomenon as “Directional Coloration Spillover.”

This is where BiDi Isolation comes into play. The programmatic principle relies on telling the web browser: “Process this dynamic text block inside a box completely isolated from its visual surroundings. Calculate its internal direction independently, protect external words from being affected by its direction, and protect its internal content from collapsing due to the general paragraph direction.” This isolated box prevents the leakage of character directional strength, ensuring that hybrid text remains stable regardless of the nature of the characters stuffed inside it.
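To see why a name like “99Alex” behaves differently from a purely Arabic name, it can help to inspect the Bidi class that Unicode assigns to each character. The following is a minimal sketch using Python's standard unicodedata module; the helper name bidi_classes is hypothetical and chosen only for this illustration.

```python
import unicodedata
from typing import List, Tuple


def bidi_classes(text: str) -> List[Tuple[str, str]]:
    """Return (character, Bidi class) pairs for every character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Digits are weak (EN), Latin letters are strong (L), Arabic letters are
# strong (AL), and symbols such as "/" or "(" carry no direction of their own.
for ch, cls in bidi_classes("99Alex س /("):
    print(repr(ch), cls)
```

Only the strong classes (L, R, AL) carry a direction; everything else is resolved from its neighbors, which is exactly what isolation is meant to contain.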

Second: The HTML5 Arsenal for Programmatic Interface Isolation

The HTML5 specification provided developers with revolutionary tools to control this isolation without the need to write complex JavaScript code or use futile styling tricks. This arsenal includes specialized tags and attributes whose core differences we must understand:

1- The Automatic Isolation Tag `<bdi>` (Bidirectional Isolation)

This tag is considered the ultimate lifesaver when dealing with data of unknown direction retrieved from a database. When you wrap any text in a <bdi> tag, you enforce two automatic actions on the browser:

Applying the bidirectional isolation property to the internal text, so it does not affect the direction of the words preceding or following it inline.
Allowing the browser to automatically scan the internal text based on the “first strong character” rule to determine its internal direction (whether RTL or LTR) without being affected by the direction of the surrounding parent element.
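The “first strong character” rule can be approximated in a few lines. This is a simplified sketch, not the browser's actual implementation: the function name first_strong_direction is hypothetical, and returning None when no strong character exists is a choice made here for clarity.

```python
import unicodedata
from typing import Optional


def first_strong_direction(text: str) -> Optional[str]:
    """Return "ltr" or "rtl" from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (Latin, etc.)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return None                          # only digits, spaces, or symbols


print(first_strong_direction("2026Developer"))  # digits are skipped, "D" decides
print(first_strong_direction("ياسين"))
```

Digits and punctuation are skipped because they are not strong characters, so a name such as “2026Developer” is still detected as left-to-right.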

Practical Example: Suppose we are displaying a user leaderboard on our site:

<ul dir="rtl">
  <li>First Place: <bdi>Sami</bdi> - 1500 points.</li>
  <li>Second Place: <bdi>ياسين</bdi> - 1200 points.</li>
  <li>Third Place: <bdi>2026Developer</bdi> - 900 points.</li>
</ul>

Thanks to the <bdi> tag, browsers will display all three names stably. The name “Sami” will be isolated and read as Latin, the name “2026Developer” will not cause its initial number to flip the order of the surrounding Arabic line, and punctuation marks and commas will remain exactly in their correct positions.
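In practice the names come from a database rather than a static template. The sketch below shows one way the same list could be generated on the server; the function name render_leaderboard and the shape of the rows are assumptions made for this example, and html.escape is used so that user-supplied names cannot inject markup.

```python
from html import escape
from typing import List, Tuple


def render_leaderboard(rows: List[Tuple[str, str, int]]) -> str:
    """Build the <ul> markup, wrapping every dynamic name in <bdi>."""
    items = [
        f"  <li>{escape(rank)}: <bdi>{escape(name)}</bdi> - {points} points.</li>"
        for rank, name, points in rows
    ]
    return '<ul dir="rtl">\n' + "\n".join(items) + "\n</ul>"


print(render_leaderboard([
    ("First Place", "Sami", 1500),
    ("Second Place", "ياسين", 1200),
    ("Third Place", "2026Developer", 900),
]))
```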

2- The Strict Override Tag `<bdo>` (Bidirectional Override)

In contrast to the automatic isolation tag, the <bdo> tag is a strict, dictatorial tool. It does not calculate the direction of its content automatically; instead, it overrides the BiDi algorithm's reordering for the enclosed characters and forces the browser to display them in a fixed physical order dictated by the dir attribute, which must always accompany this tag.

If you write <bdo dir="ltr">العربية</bdo>, the browser will visually reverse the characters to display them letter by letter from left to right, appearing like this: “ة ي ب ر ع ل ا”. This tag is used in very rare cases, such as displaying encrypted text, reversed programming code, or when wanting to force the display of certain numerical sequences that the standard algorithm refuses to format as the design requires.

3- The Core Difference Between `dir="auto"` and the `<bdi>` Tag

Many developers resort to a quick trick by placing the attribute dir="auto" on elements like paragraphs <p dir="auto"> or input fields <input dir="auto">. This attribute is excellent and makes the entire element flip its direction (right or left) based on the first strong character entered.

However, the core difference lies in the scope of influence. Placed on a block element such as a paragraph or an input field, the dir="auto" attribute switches the direction of the entire element as a block (Block Level), changing text alignment and the side from which the content starts.

Conversely, the <bdi> tag works at the inline level (Inline Level). It does not move the paragraph from its place or flip the interface layout; rather, it creates an invisible shield that protects the internal sequence of words and prevents symbols from ruining the line.

The Golden Rule: Use dir="auto" for entire dynamic fields and paragraphs, and use <bdi> for mixed words and names nested within static lines.
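The golden rule can be sketched as a small server-side helper: the username sits inside a static line and gets <bdi>, while the free-form comment body is a whole block and gets dir="auto". The function name render_comment and the markup layout are illustrative assumptions, not a prescribed structure.

```python
from html import escape


def render_comment(author: str, body: str) -> str:
    """Render one comment: <bdi> for the inline name, dir="auto" for the block."""
    # Inline level: the name is isolated inside a fixed sentence.
    header = f"<p><bdi>{escape(author)}</bdi> added a new comment.</p>"
    # Block level: the whole paragraph follows its own first strong character.
    content = f'<p dir="auto">{escape(body)}</p>'
    return header + "\n" + content


print(render_comment("99Alex", "مرحبا بالجميع"))
```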

Third: Invisible Unicode Controls: Settling the Battles of Brackets and Special Characters

In many instances, the issue of broken hybrid text appears outside the scope of web browsers; it can occur within titles sent in emails, push notifications on phones, or in plain text files where no <bdi> tag exists to save us. Here, we must activate the secret weapon of the Unicode standard: invisible directional controls.

These characters are actual numerical values (Code Points) stored in computer memory and transmitted with the text, but they possess no visual representation (Glyph). This means they disappear entirely from the reader’s eye, yet their programmatic effect is like a magnet that enforces its directional strength on the algorithm to resolve the behavior of neutral characters such as brackets and symbols like (@, /).

1- Traditional Control Marks (LRM & RLM)

Left-to-Right Mark (LRM): Carries the numerical value U+200E and is written as &lrm; in web code. This hidden mark behaves exactly like an authentic strong Latin character.
Right-to-Left Mark (RLM): Carries the numerical value U+200F and is written as &rlm; in web code. It behaves exactly like an authentic strong Arabic character.
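Both marks can be inspected directly from Python's standard library. The short sketch below only reads their Unicode properties and confirms that the HTML entities decode to the same code points; the constant names LRM and RLM are local to this example.

```python
import html
import unicodedata

LRM = "\u200e"  # Left-to-Right Mark
RLM = "\u200f"  # Right-to-Left Mark

for mark in (LRM, RLM):
    print(
        f"U+{ord(mark):04X}",
        unicodedata.name(mark),
        unicodedata.category(mark),       # Cf: an invisible format character
        unicodedata.bidirectional(mark),  # L or R: a strong direction
    )

# The entities used in web code decode to exactly these characters.
print(html.unescape("&lrm;") == LRM, html.unescape("&rlm;") == RLM)
```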

How do they solve the nightmare of reversed brackets and programming symbols?
Let us take a famous example that breaks on thousands of websites: writing a programming folder path or a website link containing a forward slash (/) inside an Arabic context. If we write the following sentence logically in memory:

قم بالدخول إلى المجلد الخاص بالاعدادات config/settings لتعديل البيانات.

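Since the string is governed by a general Arabic direction, every neutral character in it is resolved from its neighbors. In this exact form the path survives: the forward slash (/) sits between two strong Latin words (config and settings), so the algorithm resolves it as left-to-right and keeps config/settings intact. The break appears as soon as a neutral character lands on the edge of the Latin run. Add a trailing slash (config/settings/) and that last slash now sits between a Latin letter and an Arabic word, so it follows the overall Arabic flow and jumps to the opposite end of the path, which is displayed as /config/settings. The same happens when one segment of the path is written in Arabic: the segments are reordered visually, which destroys the technical path and misleads the developer.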

Programmatic Treatment with LRM: To solve this dilemma, we inject an &lrm; mark directly before and after the special characters or nested brackets to confirm the Latin context. When planting this invisible mark, the BiDi engine sees that the neutral symbol is surrounded on both sides by strong left-to-right characters (the actual Latin character and the hidden LRM mark), settling the symbol into its correct position without reversal. The exact same applies to the email @ symbol when it intersects with Arabic and foreign usernames; planting &rlm; or &lrm; based on the handle’s direction ensures the symbol does not jump randomly from right to left.
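A minimal sketch of this treatment for plain-text output: the Latin fragment is wrapped in LRM marks so that any neutral character on its edge, such as a trailing slash, is flanked by strong left-to-right characters on both sides. The helper name protect_ltr_fragment is hypothetical, and the path with a trailing slash is an illustrative value.

```python
LRM = "\u200e"  # Left-to-Right Mark, invisible but strongly LTR


def protect_ltr_fragment(fragment: str) -> str:
    """Surround a Latin fragment with LRM so edge symbols stay attached to it."""
    return f"{LRM}{fragment}{LRM}"


path = protect_ltr_fragment("config/settings/")
sentence = f"قم بالدخول إلى المجلد {path} لتعديل البيانات."

# The marks are invisible, so the difference only shows up in the length.
print(len("config/settings/"), len(path))
```

For HTML output the same wrapping could be done with the &lrm; entity instead of the raw character.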

2- Zero-Width Non-Joiner (ZWNJ)

This unique character carries the numerical value U+200C and is known in web code as &zwnj;. Its primary function in digital typography is to break the mandatory visual connection between adjacent cursive letters (like Arabic or Persian characters) without creating an actual horizontal space.

However, in the context of hybrid texts, ZWNJ must not be mistaken for a directional control. Its Bidi class is Boundary Neutral (BN), so the BiDi algorithm ignores it when ordering characters, and it cannot by itself keep a bracket from flipping; that job belongs to LRM, RLM, or the isolation controls. What ZWNJ does offer is a “structural buffer” at the shaping level: placing it between an Arabic letter and the character that follows it prevents the text rendering engine from joining the two. This keeps the letter in its final form and its shape stable in place, which helps especially when dealing with vocalized text or custom fonts that suffer from alignment issues at edge boundaries.
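The properties of ZWNJ can be checked the same way as the directional marks. The sketch below reads them from Python's unicodedata module and inserts the character between two parts of a Persian word, a common real-world use; the helper name insert_zwnj is hypothetical. Note that the reported Bidi class is BN (Boundary Neutral), not a strong direction, so any directional effect has to come from marks such as LRM or RLM.

```python
import unicodedata

ZWNJ = "\u200c"  # Zero-Width Non-Joiner

print(unicodedata.name(ZWNJ))
print(unicodedata.category(ZWNJ))       # Cf: invisible format character
print(unicodedata.bidirectional(ZWNJ))  # BN: carries no direction


def insert_zwnj(left: str, right: str) -> str:
    """Join two fragments while blocking the cursive connection between them."""
    return f"{left}{ZWNJ}{right}"


# "می" + "خواهم": with ZWNJ the prefix keeps its final letter form.
separated = insert_zwnj("می", "خواهم")
print(len("میخواهم"), len(separated))
```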

3- Comprehensive Transition: Text Isolation at the Backend Level Using FSI & PDI

All the previous surgical solutions (like LRM and ZWNJ) treated localized cases within the user interface. However, when systems scale and we transition to processing large amounts of data and dynamic text on the server side (Backend) before sending it to the browser, we need a stronger and more automated arsenal.

Here, the Unicode standard provides a newer generation of comprehensive isolation controls. These characters function exactly like the <bdi> tag we explained in interfaces, but they come in the form of raw, hidden text characters, most notably:

FSI (First Strong Isolate – U+2068): A hidden character placed at the beginning of dynamic text to force the system to isolate it and guess its internal direction completely independently.
PDI (Pop Directional Isolate – U+2069): The hidden isolation closing character, placed at the end of dynamic text to terminate the isolation effect and close the visual box.

To apply this practically without manual effort, we can utilize server-side languages like Python. We write a simple programmatic function that automatically injects these characters around any hybrid variable retrieved from the database to protect and prepare it before rendering it in the final HTML template. Example:

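FSI = "\u2068"  # First Strong Isolate: opens the isolation box
PDI = "\u2069"  # Pop Directional Isolate: closes the isolation box

def isolate_hybrid_text(text: str) -> str:
    """Wrap text in FSI...PDI so its direction is resolved independently."""
    return f"{FSI}{text}{PDI}"

# The result will be fortified text surrounded by an invisible Unicode isolation wall,
# for example isolate_hybrid_text("99Alex") == "\u206899Alex\u2069"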
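One detail worth considering with dynamic data is that user input may already contain directional controls, and a stray PDI inside the text would close the isolation box early. The sketch below is one possible hardening under that assumption: it strips existing embedding, override, and isolate controls before wrapping. The names isolate_user_text and build_notification are hypothetical, and whether stripping is appropriate depends on the application.

```python
import re

FSI = "\u2068"  # First Strong Isolate
PDI = "\u2069"  # Pop Directional Isolate

# Embeddings and overrides (U+202A..U+202E) and isolates (U+2066..U+2069).
_BIDI_CONTROLS = re.compile("[\u202a-\u202e\u2066-\u2069]")


def isolate_user_text(text: str) -> str:
    """Remove pre-existing directional controls, then wrap in FSI...PDI."""
    cleaned = _BIDI_CONTROLS.sub("", text)
    return f"{FSI}{cleaned}{PDI}"


def build_notification(username: str) -> str:
    """Compose the notification line from the earlier example."""
    return f"{isolate_user_text(username)} added a new comment."


print(repr(build_notification("99Alex")))
```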

Fourth: Practical Comparison Table: When to Use Each Tool?

To facilitate engineering decisions during development and localization, the following table outlines the optimal scenario for using each of the BiDi and text isolation tools we have dissected:

| Programmatic Tool | Default Environment | Ideal Use Case | Expected Visual Result |
| --- | --- | --- | --- |
| `<bdi>` tag | Web Interfaces (HTML5) | Injecting usernames of unknown direction into the middle of a static inline sentence. | Isolates the name completely and calculates its direction automatically without destroying the order of adjacent words. |
| `dir="auto"` attribute | Interface Templates & Input Fields | Preparing text fields (Input/Textarea) or full comment paragraphs written by users in different languages. | Flips the alignment and direction of the entire block (right or left) based on the first strong character entered. |
| `&lrm;` / `&rlm;` marks | Plain Text, Web Interfaces, Notifications | Protecting punctuation, nested brackets, and special characters like (`@`, `/`) from flipping. | Tricks the algorithm into treating a neutral symbol as if it is adjacent to a strong character, locking it in its correct position. |
| `FSI` & `PDI` controls | Backend Data Processing (Python) | Sanitizing and fortifying dynamically composed text strings before sending them to be rendered in interfaces or emails. | Creates an invisible logical isolation box that travels with the text and protects it in all digital display environments. |

Summary: Crossing Over to Presentation Engineering

By mastering bidirectional text isolation and gaining the ability to plant invisible Unicode controls to protect brackets and technical symbols from breaking, we now have a firm grip on the “logic and science” of processing hybrid texts, both as strings held in memory and as HTML5 markup. We have moved from a stage of struggle and random attempts at formatting lines to a stage of deliberate control based on strict Unicode rules.

However, the logical processing of text is incomplete unless accompanied by “visual flexibility” at the presentation and external layout level of the site. A developer cannot live solely within the realm of mathematical calculations; these solutions must be integrated smoothly within comprehensive style sheets so that user interfaces can grow and template dimensions remain stable in either direction.

In the next article (the fourth article) of our educational series, we will cross an important technical bridge to discover “CSS Logical Properties.” We will open our style files to eliminate traditional, biased thinking based on physical left and right, and learn how to build bidirectional programmatic interfaces that flip themselves, their dimensions, and their layout architecture (Flexbox & Grid) automatically without writing a single line of redundant code, and without the historical burden of rtl.css files.



