Permalink Engineering: Decoding Arabic Percent-Encoded URLs
Arabic URLs turn into unreadable encoded strings that hurt CTR, crawl budget, and link equity. Learn how to engineer a clean permalink structure that works for everyone.
Arabic URL Structure Guide: Fix Percent-Encoding & Boost Crawl Budget
A URL Nobody Can Read Is a URL Nobody Can Trust
Try it right now: open any Arabic website, navigate to an article with an Arabic title, then copy the URL from the address bar and paste it into a text message or email. What are you actually sending?
Probably something like this:
https://example.com/%D9%87%D9%86%D8%AF%D8%B3%D8%A9-%D8%A7%D9%84%D8%B1%D9%88%D8%A7%D8%A8%D8%B7/
That string of numbers, letters, and percent signs isn’t a bug — it’s a technical standard called percent-encoding (also called URL encoding). It’s how the internet represents non-Latin characters in web addresses.
Your modern browser displays Arabic characters in the address bar as a convenience for you. But what actually travels between servers, what appears when you copy a link, and what Google’s crawler receives is that encoded string.
The gap between what we see and what actually happens is the core of the problem this article addresses.
How Percent-Encoding Works
The URI standard (RFC 3986) allows only a limited set of characters to appear literally in web addresses: uppercase and lowercase Latin letters, digits, and the symbols - _ . ~, plus a handful of reserved delimiters such as / ? # & that carry structural meaning. Any other character, including every Arabic letter, must be represented as pairs of hexadecimal digits, each pair preceded by a % sign.
The Arabic letter “ه” becomes %D9%87. The word “هندسة” (engineering) becomes %D9%87%D9%86%D8%AF%D8%B3%D8%A9. The two-word slug shown earlier, 12 Arabic letters and one hyphen, comes out at 73 characters.
To understand why this happens, we need to go one layer deeper: character encoding. Arabic text is represented in computers using the Unicode standard. When embedded in a URL, it’s first converted into a UTF-8 sequence of bytes, then each byte is encoded as %XX. A standard Arabic letter occupies two bytes in UTF-8, so each letter becomes six characters in the URL.
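The two steps just described (text to UTF-8 bytes, then each byte to %XX) can be sketched in a few lines of Python. The helper name `percent_encode_utf8` is made up for illustration; `urllib.parse.quote` is the standard-library function that does the same job in practice.

```python
from urllib.parse import quote

def percent_encode_utf8(text: str) -> str:
    """Illustrative helper: write every UTF-8 byte of `text` as %XX."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))

letter = "ه"
word = "هندسة"
slug = "هندسة-الروابط"

print(len(letter.encode("utf-8")))  # 2  (one Arabic letter, two UTF-8 bytes)
print(percent_encode_utf8(letter))  # %D9%87
print(quote(word))                  # %D9%87%D9%86%D8%AF%D8%B3%D8%A9
print(len(slug), len(quote(slug)))  # 13 73
```

Note that the illustrative helper encodes every byte, including ASCII ones, while `quote` leaves unreserved characters such as the hyphen untouched. That is why the 13-character slug grows to 73 characters rather than 75.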
One subtle point worth clarifying: the readable Arabic in the address bar is display only. The request the browser sends carries the encoded path, so servers and crawlers work with the percent-encoded form. The browser hides this complexity in everyday use.
Three Direct Impacts on Technical SEO
Percent-encoding itself isn’t an error — it’s a correct and necessary standard. The problem appears when a site’s permalink structure is built on Arabic text in the slug (the final segment of the URL), because that produces three cumulative effects on technical SEO:
1 — Lower Click-Through Rate in Search Results
When your page appears in Google’s results, users see three things: the title, the description, and the URL. A readable URL — even in English — gives a clear contextual signal about what’s on the page. An encoded URL gives no signal at all, and looks suspicious to many users who’ve learned that strange-looking links might not be safe.
The CTR difference between a clean URL and an encoded one isn’t theoretical. Reported A/B tests across multiple sites suggest that improving URL structure alone can lift click-through rate by roughly 5% to 15%, though the effect varies from site to site.
2 — Crawl Budget Waste
Crawl budget is the number of pages Google crawls on your site within a given time window. That budget isn’t unlimited — especially for mid-size sites.
Encoded URLs complicate the crawler’s work in more than one way: Google may treat the encoded and decoded versions of the same URL as two different addresses in some cases — creating potential duplicate indexing and doubling the budget waste. On top of that, longer URLs consume more resources during parsing and processing.
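When auditing a crawl export or a backlink list, it helps to collapse the different spellings of one address into a single form before counting duplicates. The sketch below is one possible approach, not a description of how Google normalizes URLs; `normalize_url` is a hypothetical helper name.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    """Map readable, upper-hex, and lower-hex spellings of a URL to one form."""
    parts = urlsplit(url)
    # Decode first, then re-encode, so every variant ends up as uppercase %XX.
    # Caveat: an encoded slash (%2F) in the original would become a real "/".
    path = quote(unquote(parts.path), safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment)
    )

variants = [
    "https://example.com/هندسة/",
    "https://example.com/%D9%87%D9%86%D8%AF%D8%B3%D8%A9/",
    "https://example.com/%d9%87%d9%86%d8%af%d8%b3%d8%a9/",
]
normalized = {normalize_url(u) for u in variants}
print(len(normalized))  # 1
```

All three spellings collapse to the uppercase-encoded form, so any URL that still appears more than once after normalization is a genuine duplicate rather than an encoding variant.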
3 — Link Equity Fragmentation
When someone shares a link to your article on an external site, forum, or social platform, they sometimes share the version displayed in the browser (readable Arabic) and sometimes the copied version (encoded). The result: backlinks pointing to two technically different addresses — splitting your link equity instead of concentrating it.
The Wrong Approach: Leaving the Slug in Arabic
Many Arabic site owners choose Arabic slugs for a good reason: they want the URL to match the page content in the same language. The logic is understandable.
But that choice produces exactly the encoded URLs we’ve been discussing. And there’s an additional problem that doesn’t get talked about enough: Arabic slugs are sensitive to diacritics and spacing. The word “هِندسة” (with a kasra) and “هندسة” (without diacritics) can produce two different URLs in some systems — another source of link fragmentation and duplicate content.
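The diacritics problem is easy to reproduce. The following sketch shows the two spellings producing different encoded slugs, and one possible way to normalize them by dropping combining marks. The helper `strip_diacritics` is an illustrative name, and removing every mark in Unicode category Mn is an assumption that suits Arabic harakat but would also strip combining accents in other scripts.

```python
import unicodedata
from urllib.parse import quote

def strip_diacritics(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which include Arabic harakat."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

with_kasra = "ه\u0650ندسة"  # the same word with a kasra (U+0650) on the first letter
plain = "هندسة"

print(quote(with_kasra) == quote(plain))                    # False
print(quote(strip_diacritics(with_kasra)) == quote(plain))  # True
```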
To be clear: Arabic slugs aren’t technically wrong. Google supports them and can read them. But they create a chain of technical complications that accumulate over time and make maintenance harder. A better option exists.
The Right Approach: Semantic English Slugs
The approach used by the highest-performing Arabic websites in SEO is straightforward: English slugs, short, semantic, and connected to the content.
“Semantic” means it’s not a literal translation of the Arabic title — it’s an extraction of the core idea that a reader or crawler would search for.
A comparison of approaches:
| Approach | Example | Assessment |
|---|---|---|
| Arabic slug | %D9%87%D9%86%D8%AF%D8%B3%D8%A9-%D8%A7%D9%84%D8%B1%D9%88%D8%A7%D8%A8%D8%B7/ | ❌ Encoded, long, unreadable |
| Long literal translation | engineering-of-permalinks-and-decoding-percent-encoded-arabic-urls/ | ⚠️ Readable but too long |
| Short semantic slug | technical-seo-permalink-engineering/ | ✅ Short, readable, semantic |
| Numbers only | article-3/ or p=4521/ | ❌ No semantic value for search engines |
The practical rule for writing a good slug: extract the core keyword of the article, add one or two words of context, and keep the total to 3–5 English words separated by hyphens.
# Good slug examples for Arabic content
arabic-rtl-css-guide
web-localization-seo
arabic-font-performance
permalink-seo-arabic-sites
rtl-layout-audit
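The 3–5 word rule is simple enough to check automatically, for example in a publishing script. This is a minimal sketch under the assumption that a "word" is a run of lowercase ASCII letters or digits; `is_good_slug` and `SLUG_PATTERN` are illustrative names.

```python
import re

# 3 to 5 lowercase ASCII words or numbers, separated by single hyphens.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+){2,4}")

def is_good_slug(slug: str) -> bool:
    """Return True when the slug follows the 3-5 word, hyphen-separated rule."""
    return SLUG_PATTERN.fullmatch(slug) is not None

candidates = [
    "arabic-rtl-css-guide",
    "rtl-layout-audit",
    "article-3",
    "engineering-of-permalinks-and-decoding-percent-encoded-arabic-urls",
    "%D9%87%D9%86%D8%AF%D8%B3%D8%A9",
]
for slug in candidates:
    print(slug, is_good_slug(slug))
```

The first two pass; the last three fail for being too short, too long, and percent-encoded respectively. The check covers shape only. Whether the slug is semantic still needs a human.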
URL Hierarchy and Its Effect on Indexing
The slug isn’t the only decision in permalink engineering. The full URL path structure has its own independent effect on how Google understands your site’s architecture.
There are two main patterns:
Flat Structure
https://example.com/arabic-font-performance/
https://example.com/rtl-css-guide/
https://example.com/permalink-seo/
All articles sit at one level directly under the domain. Good for single-topic blogs and small sites. The crawler reaches pages in one step from the homepage.
Hierarchical Structure
https://example.com/guides/arabic-font-performance/
https://example.com/workshops/rtl-css-guide/
https://example.com/tools/permalink-generator/
Articles are organized under categories. Good for sites with distinct sections. The URL itself tells Google the categorical context of the page.
Which is better? It depends on your site. As a general rule, the shallower a page sits in the hierarchy, the easier it is for the crawler to reach and the higher its perceived importance. What the crawler actually experiences is the number of links it must follow from the homepage; the number of forward slashes in the path is only a proxy for that, accurate when your navigation mirrors your URL structure.
A common rule of thumb among SEO practitioners: don’t go deeper than 3 levels in the hierarchy. On a site whose navigation follows its URLs, a page at level four (/a/b/c/d/slug/) takes Google at least 4 steps to reach from the homepage, which tends to mean lower crawl priority and slower indexing.
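To flag deep URLs in a sitemap or crawl export, path depth can be counted directly. The sketch below assumes depth means the number of path segments above the final slug, which matches the /a/b/c/d/slug/ example being called level four; `category_depth` and `MAX_DEPTH` are illustrative names.

```python
from urllib.parse import urlsplit

MAX_DEPTH = 3  # the rule of thumb quoted above

def category_depth(url: str) -> int:
    """Count the path segments that sit above the final slug."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return max(len(segments) - 1, 0)

urls = [
    "https://example.com/arabic-font-performance/",
    "https://example.com/guides/arabic-font-performance/",
    "https://example.com/a/b/c/d/slug/",
]
for url in urls:
    depth = category_depth(url)
    print(depth, "too deep" if depth > MAX_DEPTH else "ok", url)
```

This measures URL depth only. Click depth, the number of links from the homepage, has to come from a crawl.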
What to Do If Your Arabic URLs Are Already Live
This is the question that worries most people: what if my site already has hundreds of articles with Arabic slugs? Do I change them all?
The short answer: don’t change a published URL unless you’re ready for what comes with it.
Changing a slug on a published page, even for the better, has a real cost. Without a redirect, it breaks every external link pointing to the page and loses all accumulated link equity. Even with one, the page can temporarily drop in rankings while it gets re-indexed.
If you decide to go ahead, the correct path is:
Step 1 — Set Up 301 Redirects
A permanent redirect (301) tells Google the page has moved to a new address permanently, and passes the link equity with it. In WordPress you can do this with the Redirection plugin or the redirect manager in Yoast SEO Premium. In .htaccess on Apache servers:
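# Redirect the old Arabic slug to the new English slug.
# mod_alias matches the %-decoded path, so write the old slug as
# readable Arabic (file saved as UTF-8), not as %D9%87...
Redirect 301 "/هندسة-الروابط/" https://example.com/new-english-slug/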
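With hundreds of articles, writing these rules by hand invites typos, so the lines can be generated from a slug mapping. This is a rough sketch: the mapping, the `redirect_lines` helper, and the example.com origin are all hypothetical. It writes old slugs in decoded form on the assumption that Apache's `Redirect` directive compares against the %-decoded request path, as the mod_alias documentation describes. Test the output on a staging server before deploying.

```python
def redirect_lines(slug_map: dict[str, str], origin: str) -> list[str]:
    """Build one Apache 'Redirect 301' line per old-slug -> new-slug pair."""
    lines = []
    for old_slug, new_slug in slug_map.items():
        # Old path is quoted and left as readable Arabic (decoded form).
        lines.append(f'Redirect 301 "/{old_slug}/" {origin}/{new_slug}/')
    return lines

# Hypothetical mapping for illustration only.
slug_map = {
    "هندسة-الروابط": "technical-seo-permalink-engineering",
}

for line in redirect_lines(slug_map, "https://example.com"):
    print(line)
```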
Step 2 — Update Internal Links
Search your site’s database for every internal link pointing to the old slug and update it. In WordPress, the Better Search Replace plugin handles this safely across the entire database.
Step 3 — Resubmit Your XML Sitemap
After the change, ask Google to re-crawl your sitemap via Google Search Console to speed up indexing of the new URLs.
Step 4 — Wait
A 301 redirect passes link equity, but Google needs time to process the change and update its index. A temporary ranking dip is normal and expected in the first few weeks. Don’t panic and reverse course.
Handling Duplicate Pages with Canonical Tags
Even with good practices in place, your site may automatically generate multiple versions of the same URL. In WordPress, for example, a single article might be accessible at:
https://example.com/technical-seo-permalink-engineering/
https://example.com/?p=4521
https://example.com/workshops/technical-seo-permalink-engineering/
Three addresses for the same content. Without guidance, Google may treat them as three separate pages and spend crawl budget on all three instead of one.
The fix: the canonical tag — it tells Google which address is the “original” that should be indexed and credited.
<!-- In your page's <head> -->
<link rel="canonical" href="https://example.com/technical-seo-permalink-engineering/" />
In WordPress, Yoast SEO, Rank Math, and All in One SEO all add this tag automatically for the page’s primary URL — confirm the setting is enabled and pointing to the right address.
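To confirm what a page actually declares, the canonical tag can be pulled out of the HTML with the standard library. The sketch below is one way to do it; `CanonicalFinder` and `find_canonicals` are illustrative names, and the HTML string stands in for a page you would fetch yourself.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])

def find_canonicals(html: str) -> list[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals

page = """<html><head>
<link rel="canonical" href="https://example.com/technical-seo-permalink-engineering/" />
</head><body></body></html>"""

print(find_canonicals(page))
```

A healthy page returns exactly one URL. An empty list or more than one entry is worth investigating.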
A Special Case: Bilingual Sites
Zy Yazan itself is a live example of this pattern: every article exists in two versions — Arabic and English. How should URLs be structured in this scenario?
The best practice is a shared slug with a language suffix:
# Arabic version
https://zyyazan.sy/technical-seo-permalink-engineering-ar/
# English version
https://zyyazan.sy/technical-seo-permalink-engineering/
Both URLs are English, short, semantic, and content-connected. The -ar suffix is clear, consistent, and distinguishes the two versions without ambiguity.
But this alone isn’t enough. Google needs to know these two pages are language equivalents — not duplicate content. That’s what the hreflang tag is for:
<!-- In the <head> of both the Arabic and the English versions -->
<link rel="alternate" hreflang="ar" href="https://zyyazan.sy/technical-seo-permalink-engineering-ar/" />
<link rel="alternate" hreflang="en" href="https://zyyazan.sy/technical-seo-permalink-engineering/" />
<link rel="alternate" hreflang="x-default" href="https://zyyazan.sy/technical-seo-permalink-engineering/" />
These three lines tell Google: “This page exists in Arabic and in English, and the English version is the default for anyone whose language doesn’t match either.” The result: Google serves the right version to the right user in search results.
The hreflang tag is one of the most error-prone elements in technical SEO. The most common mistake: adding it to the Arabic version only, without adding it to the corresponding English version. The rule: if you add it to one page, it must appear on all linked versions. A one-sided hreflang is effectively no hreflang at all, because Google ignores annotations that aren’t confirmed by a return link from the other page.
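The return-link rule can be checked mechanically once you have each page's hreflang annotations. The sketch below assumes you have already collected them into a dictionary of page URL to {language: href}; that data structure and the `missing_return_links` helper are illustrative, not part of any SEO tool.

```python
def missing_return_links(
    hreflang_map: dict[str, dict[str, str]],
) -> list[tuple[str, str]]:
    """Return (page, alternate) pairs where `alternate` does not link back to `page`."""
    problems = []
    for page, alternates in hreflang_map.items():
        for target in sorted(set(alternates.values())):
            if target == page:
                continue  # a self-reference needs no return link
            if page not in hreflang_map.get(target, {}).values():
                problems.append((page, target))
    return problems

AR = "https://zyyazan.sy/technical-seo-permalink-engineering-ar/"
EN = "https://zyyazan.sy/technical-seo-permalink-engineering/"
tags = {"ar": AR, "en": EN, "x-default": EN}

one_sided = {AR: tags, EN: {}}     # tags added to the Arabic page only
reciprocal = {AR: tags, EN: tags}  # the same three tags on both pages

print(missing_return_links(one_sided))   # one problem: EN never points back to AR
print(missing_return_links(reciprocal))  # []
```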
Practical Tools for Auditing Your URL Structure
Before you close this article, here are three tools that help you diagnose the current state of your site’s links right now:
1 — Google Search Console
In the Pages report, look for pages flagged as “Duplicate” or “Crawled – currently not indexed.” The root cause is often a URL structure problem or a missing canonical tag. The report shows you exactly which URLs Google found and what it decided to do with them.
2 — Screaming Frog SEO Spider
A local crawl tool that runs on your computer and generates a full report of all your site’s URLs, including each URL’s length, status, and how many pages link to it. The free version covers up to 500 URLs per crawl, enough to diagnose a small site or to sample a larger one.
3 — Online URL Decoder
If you want to read what an encoded string actually says, paste it into any URL decode tool online and it’ll translate immediately to the original text. Useful for understanding exactly what the crawler is seeing when it hits one of your encoded URLs.
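If you'd rather not paste URLs into a third-party site, the same decoding is a single call in Python's standard library. A minimal example, using the encoded URL from the start of this article:

```python
from urllib.parse import unquote

encoded = (
    "https://example.com/"
    "%D9%87%D9%86%D8%AF%D8%B3%D8%A9-%D8%A7%D9%84%D8%B1%D9%88%D8%A7%D8%A8%D8%B7/"
)
decoded = unquote(encoded)
print(decoded)  # https://example.com/هندسة-الروابط/
```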
Closing: A URL Is a Page’s Identity
A permalink isn’t just a technical address — it’s the page’s identity in digital space. It’s what you share with others, what the browser saves in your bookmarks, and what Google uses as its primary reference during crawling and indexing.
Building that identity correctly from the start saves a lot of work later. And when URLs aren’t ideal, as is the case on many existing sites, understanding how to handle the current situation (with canonicals and redirects) turns a performance problem into one that is manageable and fixable.
In the final article of this series, we bring everything together in a complete practical project: auditing a real Arabic landing page, diagnosing its problems, and measuring the performance jump after applying everything we’ve learned.
— Digital Typography & Its Impact on Technical SEO —
Previous article: 2 — The Expansion Dilemma: Controlling Arabic Layout Shift
Current article: 3 — Permalink Engineering: Decoding Arabic Percent-Encoded URLs
Next article: 4 — Practical Project: Optimizing an Arabic Landing Page
