<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<html lang="en"><head><title>Arabic mathematical notation</title><style type="text/css">
|
|
code { font-family: monospace; }
|
|
|
|
div.constraint,
|
|
div.issue,
|
|
div.note,
|
|
div.notice { margin-left: 2em; }
|
|
|
|
li p { margin-top: 0.3em;
|
|
margin-bottom: 0.3em; }
|
|
|
|
div.exampleInner pre { margin-left: 1em;
|
|
margin-top: 0em; margin-bottom: 0em}
|
|
div.exampleOuter {border: 4px double gray;
|
|
margin: 0em; padding: 0em}
|
|
div.exampleInner { background-color: #d5dee3;
|
|
border-top-width: 4px;
|
|
border-top-style: double;
|
|
border-top-color: #d3d3d3;
|
|
border-bottom-width: 4px;
|
|
border-bottom-style: double;
|
|
border-bottom-color: #d3d3d3;
|
|
padding: 4px; margin: 0em }
|
|
div.exampleWrapper { margin: 4px }
|
|
div.exampleHeader { font-weight: bold;
|
|
margin: 4px}
|
|
</style><link type="text/css" rel="stylesheet" href="http://www.w3.org/StyleSheets/TR/W3C-IG-NOTE.css"></head><body><div class="head"><p><a href="http://www.w3.org/"><img width="72" height="48" alt="W3C" src="http://www.w3.org/Icons/w3c_home"></a></p>
|
|
<h1><a id="title" name="title"></a>Arabic mathematical notation</h1>
|
|
<h2><a id="w3c-doctype" name="w3c-doctype"></a>W3C Interest Group Note 31 January 2006</h2><dl><dt>This version:</dt><dd>
|
|
<a href="http://www.w3.org/TR/2006/NOTE-arabic-math-20060131">http://www.w3.org/TR/2006/NOTE-arabic-math-20060131</a>
|
|
</dd><dt>Latest version:</dt><dd><a href="http://www.w3.org/TR/arabic-math/">http://www.w3.org/TR/arabic-math/</a></dd><dt>Previous version:</dt><dd>This is the first version</dd><dt>Editors:</dt><dd>Azzeddine Lazrek, with Mustapha Eddahibi and Khalid Sami, Cadi Ayyad University - Marrakech, Morocco
|
|
|
|
</dd><dd>Bruce R. Miller, National Institute of Standards and Technology, USA</dd></dl><p>This document is also available in these non-normative formats: <a href="arabic.xhtml">XHTML+MathML version</a>.</p><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright"> Copyright</a> ©2006 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p></div><hr><div>
|
|
<h2><a id="abstract" name="abstract"></a>Abstract</h2><p>
|
|
This Note analyzes potential problems with the use of MathML for the
|
|
presentation of mathematics in the notations customarily used with Arabic,
|
|
and related languages. The goal is to clarify avoidable implementation details that hinder such presentation,
|
|
as well as to uncover genuine limitations in the specification.
|
|
These limitations in the MathML specification may require extensions in future versions of the specification.</p></div><div>
|
|
<h2><a id="status" name="status"></a>Status of this Document</h2><p><em>This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the <a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em></p><p>This Note is a self-contained discussion of Arabic mathematical notation in
|
|
MathML. It provides guidelines for the handling of Arabic mathematical
|
|
presentation using MathML 2
|
|
Recommendation (2nd Edition) <a href="#MathML22e">[MathML22e]</a>
|
|
and suggests extensions for a future revision. </p><p>This Note has been written by participants in the <a href="http://www.w3.org/Math/Group/">Math Interest Group</a> (W3C
|
|
members only) which is part of the <a href="http://www.w3.org/Math/Activity">W3C Math activity</a>. Please direct
|
|
comments and report errors in this document to <a href="mailto:www-math@w3.org">www-math@w3.org</a>, a mailing list with a public <a href="http://lists.w3.org/Archives/Member/member-math/">archive</a>.
|
|
</p><p>Publication as an Interest Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p></div><div class="toc">
|
|
<h2><a id="contents" name="contents"></a>Table of Contents</h2><p class="toc">1 <a href="#Introduction">Introduction</a><br>
|
|
2 <a href="#ArabicScript">Some Features of Arabic Script</a><br>
|
|
2.1 <a href="#N100F4">Text Direction</a><br>
|
|
2.2 <a href="#GlyphShaping">Glyph Shaping</a><br>
|
|
2.3 <a href="#Mirroring">Mirroring</a><br>
|
|
2.4 <a href="#NumberSystems">Number Systems</a><br>
|
|
3 <a href="#Comparison">Comparison of Mathematical Notations</a><br>
|
|
3.1 <a href="#Moroccan">Arabic Notation; Moroccan Style</a><br>
|
|
3.2 <a href="#Maghreb">Arabic Notation; Maghreb Style</a><br>
|
|
3.3 <a href="#Machrek">Arabic Notation; Machrek Style</a><br>
|
|
3.4 <a href="#N106E1">Additional Arabic Notations</a><br>
|
|
3.5 <a href="#Persian">Persian</a><br>
|
|
4 <a href="#Proposals">Proposals and Clarifications</a><br>
|
|
4.1 <a href="#BiDiProposal">Clarification of bidirectional Algorithm for MathML</a><br>
|
|
4.2 <a href="#GlyphShapingProposal">Glyph Shaping</a><br>
|
|
4.3 <a href="#N10951">Additional Mathvariants</a><br>
|
|
4.4 <a href="#MirroringProposal">Mirroring</a><br>
|
|
4.5 <a href="#N10A2B">Horizontal Stretchiness</a><br>
|
|
4.6 <a href="#N10A3B">Additional Constructs</a><br>
|
|
5 <a href="#N10A49">Conclusions and Future Work</a><br>
|
|
6 <a href="#N10A58">Acknowledgments</a><br>
|
|
7 <a href="#N10A5F">Production Notes</a><br>
|
|
</p>
|
|
<h3><a id="appendices" name="appendices"></a>Appendices</h3><p class="toc">A <a href="#Localization">Localization Issues</a><br>
|
|
A.1 <a href="#NumberSystem2">Number Systems</a><br>
|
|
A.2 <a href="#SymbolsChoice">Symbols Choice</a><br>
|
|
B <a href="#Implementation">Implementation Issues</a><br>
|
|
B.1 <a href="#CharactersEncoding">Character Encoding</a><br>
|
|
B.2 <a href="#MathematicalFonts">Mathematical Fonts</a><br>
|
|
B.3 <a href="#N10AEE">Symbol Stretching</a><br>
|
|
B.4 <a href="#SoftwareTools">Software Tools</a><br>
|
|
C <a href="#N10B92">Bibliography</a><br>
|
|
</p></div><hr><div class="body"><div class="div1">
|
|
<h2><a id="Introduction" name="Introduction"></a>1 Introduction</h2><p>As the World Wide Web becomes more world wide, inclusion of the world's many languages,
|
|
scripts and cultures becomes critical. Although the development of the Mathematical
|
|
Markup Language (MathML) <a href="#MathML22e">[MathML22e]</a> was neither intentionally nor
|
|
explicitly exclusive of non-European languages and scripts,
|
|
the focus was on the notational schema used with European languages. Indeed, most of these
|
|
notations are used unchanged in many other contexts. However, there are variations introduced
|
|
in some languages, either for historical reasons, or to fit within various writing systems,
|
|
which MathML should accommodate for improved international support (in particular educational
|
|
material requiring these variations, or historical documents).</p><p>While European languages are written left to right (LTR), Arabic, among others, is
|
|
written right to left (RTL). We will see that in Arabic mathematical texts many of the
|
|
same notational constructs are used, but may be reversed or <a href="#Mirroring">mirrored</a>,
|
|
depending on the cultural context; what we will call a <em>mathematical directionality</em>.
|
|
The mathematical directionality is not necessarily the same as the text directionality.
|
|
Moreover, since the mathematical material may commonly contain text and symbols coming from
|
|
both Arabic and European languages, the question of how the Unicode bidirectional algorithm
|
|
<a href="#UnicodeBiDi">[UnicodeBiDi]</a> should be applied arises.
|
|
Finally, several additional symbols and writing styles may be used in special ways.</p><p><img src="arabic-images/khtout.png" alt="[Arabic Script samples]"></p><p>Arabic calligraphy is enriched by a variety of writing styles,
just as European writing benefits from a variety of fonts. The graphic above illustrates
|
|
a variety of Arabic calligraphic styles; each word is the name of the corresponding style.
|
|
In the same way that European mathematics broadens the set of distinct symbols available by
|
|
using bold face, Fraktur or other styles, so does Arabic mathematics, but typically
by varying strokes or adding tails or other extensions.</p><p>A given piece of mathematics marked up in
|
|
<a href="http://www.w3.org/TR/MathML2/chapter4.html">Content MathML</a>
|
|
(<a href="#MathML22e">[MathML22e]</a>, chapter 4) is generally language-neutral, although the
choices for variable names may imply a cultural context;
it is intended to represent the universal meaning of the mathematics.
|
|
A given piece of mathematics marked up in
|
|
<a href="http://www.w3.org/TR/MathML2/chapter3.html">Presentation MathML</a>
|
|
(<a href="#MathML22e">[MathML22e]</a>, chapter 3),
|
|
on the other hand, conveys the visual appearance of the expression. That appearance
|
|
necessarily targets a specific language and its notational conventions, and even those of
the scientific discipline involved.
|
|
In this Note, we amplify and formalize this segregation of concerns:
|
|
Presentation MathML should be a fairly literal
|
|
representation of the visual notation to be used.</p><p>We relegate all <a href="#Localization">localization</a> issues
|
|
— which symbol to use for summation, which name to use for tangent, what
|
|
format to use for numbers — to the generator of the Presentation MathML,
|
|
rather than the renderer. This avoids guessing, perhaps wrongly, what number is
|
|
intended while deciding whether to replace periods by commas, for example. Thus,
|
|
localization entails the choice of what
|
|
text content to place within MathML's token elements, but that choice is already fixed
|
|
within a given piece of Presentation MathML.</p><p>In this Note, we have attempted to examine all notational conventions in current
|
|
use with Arabic and languages written using Arabic script, without giving preference
|
|
to one form over another.
|
|
We aim to clarify the specification of MathML, proposing extensions where needed,
|
|
so that MathML has the broadest coverage possible. Nevertheless, an in-depth analysis of issues
|
|
affecting other languages, particularly those written top to bottom, is a topic for future study.
The emphasis on Arabic languages is partly a reflection of an increased interest in, and
usage of, MathML in Arabic-language contexts, which has highlighted the issues described here.
|
|
Another topic for future study is how Content MathML might best support the transformation
|
|
to appropriately localized Presentation MathML.</p></div><div class="div1">
|
|
<h2><a id="ArabicScript" name="ArabicScript"></a>2 Some Features of Arabic Script</h2><p>Before delving into mathematical notations, it will help to describe some
|
|
of the features of Arabic script, and how Unicode deals with these features.</p><div class="div2">
|
|
<h3><a id="N100F4" name="N100F4"></a>2.1 Text Direction</h3><p>While European languages are written from left to right (LTR), Arabic is written from
|
|
right to left (RTL). Unicode supports these scripts by not only defining codepoints
|
|
for the individual characters of these languages, but by recording the directionality
|
|
of each character.</p><p>When a mixture of LTR and RTL characters appears in text (i.e. bidirectional or BiDi text,
such as an English text that includes Arabic words),
Unicode's bidirectional algorithm <a href="#UnicodeBiDi">[UnicodeBiDi]</a> describes the order in which
the characters will be displayed. All adjacent strongly-typed RTL characters (such as those
in a single Arabic word) will be presented in right-to-left order, and vice versa for
|
|
strongly-typed LTR characters. A cluster of characters with the same directionality
|
|
is called a <em>directional run</em>.</p><p>Within any given "paragraph", directional runs are then ordered according to the
|
|
overall <em>directional context</em>. The bidirectional algorithm allows for higher-level
|
|
protocols to determine which <em>segments</em> of a structured text constitute "paragraphs"
|
|
in this sense. For example, in HTML, block-level elements are taken as the
paragraph segments. The top-level <code>html</code> tag determines the directional context,
which can be changed on lower-level elements using the <code>dir</code> attribute.</p><p>For a gentle introduction to bidirectional text, see
|
|
<a href="#UnicodeBiDiIntro">[UnicodeBiDiIntro]</a>.</p></div><div class="div2">
|
|
<h3><a id="GlyphShaping" name="GlyphShaping"></a>2.2 Glyph Shaping</h3><p>As Arabic is a calligraphic script, letters within words are typically joined together.
|
|
When text in such calligraphic scripts is specified by character sequences, a
|
|
process called <em>shaping</em> is used to blend, or connect, the character glyphs.
|
|
In Arabic words consisting of a single character, that character is drawn in the "isolated"
|
|
style. In multi-character words, alternative shapes are generally used depending on position:
|
|
the first (rightmost) character is drawn in its "initial" shape,
|
|
the last (leftmost) character gets its "final" shape, and any characters in the middle
|
|
are of the "medial" shape.</p><p>Compare the isolated characters غ ي ر
|
|
to the result of glyph shaping غير.</p></div><div class="div2">
|
|
<h3><a id="Mirroring" name="Mirroring"></a>2.3 Mirroring</h3><p>Some characters, viewed abstractly, have the same meaning in many languages,
|
|
but the forms used in RTL languages are roughly the mirror image of the
forms used in LTR languages. Parentheses and quotation marks are such characters.
Unicode deals with these cases by marking some codepoints as mirrored, meaning
that an alternate glyph will be used for the character if it appears in an RTL context.</p><p>Note that mirrored symbols are not required by Unicode (see
|
|
<a href="http://www.unicode.org/reports/tr9/#Mirroring">Mirroring</a>
|
|
in <a href="#UnicodeBiDi">[UnicodeBiDi]</a>, section 6) to be literally
|
|
the exact mirror image. Indeed, it is considered an important point of Arabic calligraphy
|
|
that they are not: the nib of the reed pen (kalam) is a flat rectangle. The writer holds the pen so that
the longest side makes an angle of approximately 70° with the baseline.
This orientation is kept throughout the process of drawing the character.
Furthermore, as Arabic writing goes from right to left, strokes running from
top left toward the bottom right come out bold, while
strokes running from top right to the bottom left come out thin.
|
|
Thus, the Arabic sum symbol <img src="arabic-images/sigmaa.png" alt="Arabic Sigma">,
|
|
for example, is not simply the mirror image
|
|
<img src="arabic-images/sigman.png" alt="Mirrored Sigma">
|
|
of sigma <img src="arabic-images/sigmal.png" alt="Sigma">.</p></div><div class="div2">
|
|
<h3><a id="NumberSystems" name="NumberSystems"></a>2.4 Number Systems</h3><p>There are several decimal numeral systems in use in Arabic: </p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">System</th><th rowspan="1" colspan="1">Unicode</th><th colspan="10" rowspan="1">Digits</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">Regions</th></tr></thead><tbody>
<tr><th rowspan="1" colspan="1">European</th><td rowspan="1" colspan="1">U+0030-U+0039</td><td rowspan="1" colspan="1">0</td><td rowspan="1" colspan="1">1</td><td rowspan="1" colspan="1">2</td><td rowspan="1" colspan="1">3</td><td rowspan="1" colspan="1">4</td><td rowspan="1" colspan="1">5</td><td rowspan="1" colspan="1">6</td><td rowspan="1" colspan="1">7</td><td rowspan="1" colspan="1">8</td><td rowspan="1" colspan="1">9</td><td rowspan="1" colspan="1"></td><td rowspan="1" colspan="1">Maghreb Arab (e.g. Morocco), as well as European</td></tr>
<tr><th rowspan="1" colspan="1">Arabic-Indic</th><td rowspan="1" colspan="1">U+0660-U+0669</td><td rowspan="1" colspan="1">٠</td><td rowspan="1" colspan="1">١</td><td rowspan="1" colspan="1">٢</td><td rowspan="1" colspan="1">٣</td><td rowspan="1" colspan="1">٤</td><td rowspan="1" colspan="1">٥</td><td rowspan="1" colspan="1">٦</td><td rowspan="1" colspan="1">٧</td><td rowspan="1" colspan="1">٨</td><td rowspan="1" colspan="1">٩</td><td rowspan="1" colspan="1"><img src="arabic-images/arind.png" alt="[Image of Arabic-Indic Digits]"></td><td rowspan="1" colspan="1">Machrek Arab (e.g. Egypt)</td></tr>
<tr><th rowspan="1" colspan="1">Eastern Arabic-Indic</th><td rowspan="1" colspan="1">U+06F0-U+06F9</td><td rowspan="1" colspan="1">۰</td><td rowspan="1" colspan="1">۱</td><td rowspan="1" colspan="1">۲</td><td rowspan="1" colspan="1">۳</td><td rowspan="1" colspan="1">۴</td><td rowspan="1" colspan="1">۵</td><td rowspan="1" colspan="1">۶</td><td rowspan="1" colspan="1">۷</td><td rowspan="1" colspan="1">۸</td><td rowspan="1" colspan="1">۹</td><td rowspan="1" colspan="1"><img src="arabic-images/esarind.png" alt="[Image of Eastern Arabic-Indic Digits]"></td><td rowspan="1" colspan="1">Iran</td></tr></tbody></table></div></div><div class="div1">
|
|
<h2><a id="Comparison" name="Comparison"></a>3 Comparison of Mathematical Notations</h2><p>We will explore the spectrum of notations by choosing some samples of mathematical
|
|
content and comparing how they would typically be rendered for different languages and cultures.
|
|
We begin with an expression formatted as it might be seen in both English
|
|
and French contexts.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"><img src="arabic-images/expren.png" alt="[Image of formula in English style]"></td><td rowspan="1" colspan="1">
|
|
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
|
|
<mrow>
|
|
<mrow>
|
|
<mi>f</mi>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mi>x</mi>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
<mo>=</mo>
|
|
<mrow>
|
|
<mo>{</mo>
|
|
<mtable>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<munderover>
|
|
<mo movablelimits="false">∑</mo>
|
|
<mrow>
|
|
<mi>i</mi>
|
|
<mo>=</mo>
|
|
<mn>1</mn>
|
|
</mrow>
|
|
<mi>s</mi>
|
|
</munderover>
|
|
<mo>⁡</mo>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext> if </mtext>
|
|
<mi>x</mi>
|
|
<mo>&lt;</mo>
|
|
<mn>0</mn>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<msubsup>
|
|
<mo>∫</mo>
|
|
<mn>1</mn>
|
|
<mi>s</mi>
|
|
</msubsup>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
<mo>⁢</mo>
|
|
<mi>d</mi>
|
|
<mo>⁡</mo>
|
|
<mi>x</mi>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext> if </mtext>
|
|
<mi>x</mi>
|
|
<mo>∈</mo>
|
|
<mi mathvariant="normal">S</mi>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<mi>tan</mi>
|
|
<mo>⁡</mo>
|
|
<mi>π</mi>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext> otherwise </mtext>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mtext>with </mtext>
|
|
<mi>π</mi>
|
|
<mo>≃</mo>
|
|
<mn>3.141</mn>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
</mtable>
|
|
</mrow>
|
|
</mrow>
|
|
</math></pre>
|
|
</td></tr><tr><th rowspan="1" colspan="1">French</th><td rowspan="1" colspan="1"><img src="arabic-images/exprfr.png" alt="[Image of formula in French style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
|
|
<mrow>
|
|
<mrow>
|
|
<mi>f</mi>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mi>x</mi>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
<mo>=</mo>
|
|
<mrow>
|
|
<mo>{</mo>
|
|
<mtable>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<munderover>
|
|
<mo movablelimits="false">∑</mo>
|
|
<mrow>
|
|
<mi>i</mi>
|
|
<mo>=</mo>
|
|
<mn>1</mn>
|
|
</mrow>
|
|
<mi>s</mi>
|
|
</munderover>
|
|
<mo>⁡</mo>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext> si </mtext>
|
|
<mi>x</mi>
|
|
<mo>&lt;</mo>
|
|
<mn>0</mn>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<msubsup>
|
|
<mo>∫</mo>
|
|
<mn>1</mn>
|
|
<mi>s</mi>
|
|
</msubsup>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
<mo>⁢</mo>
|
|
<mi>d</mi>
|
|
<mo>⁡</mo>
|
|
<mi>x</mi>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext> si </mtext>
|
|
<mi>x</mi>
|
|
<mo>∈</mo>
|
|
<mi mathvariant="normal">E</mi>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<mi>tg</mi>
|
|
<mo>⁡</mo>
|
|
<mi>π</mi>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext> sinon </mtext>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mtext>avec </mtext>
|
|
<mi>π</mi>
|
|
<mo>≃</mo>
|
|
<mn>3,141</mn>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
</mtable>
|
|
</mrow>
|
|
</mrow>
|
|
</math></pre>
|
|
</td></tr></table><p>Structurally, the expressions are identical. The differences in names,
|
|
number formatting and of course the language used for the connecting words are all
|
|
due to localization. They are effected purely by
|
|
differing textual content within the MathML token elements.</p><p>In the following sections, we will examine three common styles used
|
|
for mathematics within Arabic texts. The terms Moroccan, Maghreb and Machrek will be
|
|
used to indicate the general geographic areas where these styles are used, but
|
|
there are no clearly defined borders between the regions.</p><div class="div2">
|
|
<h3><a id="Moroccan" name="Moroccan"></a>3.1 Arabic Notation; Moroccan Style</h3><p>The current way of writing mathematical expressions in Morocco
|
|
is closely related to the French style:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Moroccan</th><td rowspan="1" colspan="1"><img src="arabic-images/exprfrm.png" alt="[Image of formula in Moroccan style]"></td><td rowspan="1" colspan="1">
|
|
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
|
|
<mrow>
|
|
<mrow>
|
|
<mi>f</mi>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mi>x</mi>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
<mo>=</mo>
|
|
<mrow>
|
|
<mo>{</mo>
|
|
<mtable>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<munderover>
|
|
<mo movablelimits="false">∑</mo>
|
|
<mrow>
|
|
<mi>i</mi>
|
|
<mo>=</mo>
|
|
<mn>1</mn>
|
|
</mrow>
|
|
<mi>s</mi>
|
|
</munderover>
|
|
<mo>⁡</mo>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext>إذاكان </mtext>
|
|
<mi>x</mi>
|
|
<mo>&lt;</mo>
|
|
<mn>0</mn>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<msubsup>
|
|
<mo>∫</mo>
|
|
<mn>1</mn>
|
|
<mi>s</mi>
|
|
</msubsup>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
<mo>⁢</mo>
|
|
<mi>d</mi>
|
|
<mo>⁡</mo>
|
|
<mi>x</mi>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext>إذاكان </mtext>
|
|
<mi>x</mi>
|
|
<mo>∈</mo>
|
|
<mi mathvariant="normal">E</mi>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<mi>tg</mi>
|
|
<mo>⁡</mo>
|
|
<mi>π</mi>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext>غيرذلك </mtext>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mi>π</mi>
|
|
<mo>≃</mo>
|
|
<mn>3,141</mn>
|
|
<mtext>مع</mtext>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
</mtable>
|
|
</mrow>
|
|
</mrow>
|
|
</math></pre>
|
|
</td></tr></table><p>Although the mathematics would be embedded within an RTL language (Arabic),
its directionality is still LTR. The connecting words and phrases within the math, however,
are RTL Arabic, and <em>should</em> be subject to <a href="#GlyphShaping">glyph shaping</a>
(although some current MathML renderers do not do this).
|
|
Thus these phrases should appear as
|
|
"إذاكان" (for "if"),
|
|
"غيرذلك" (for "otherwise")
|
|
and "مع" (for "with").</p><p>This example also suggests that the bidirectional algorithm <a href="#UnicodeBiDi">[UnicodeBiDi]</a> should be
|
|
applied to individual text and token elements, rather than at a higher level as in HTML;
|
|
that is, the token elements act as paragraph segments.
|
|
Even with these considerations, the ordering of phrases within the last clause
|
|
(for "otherwise (with pi=3.141)") is problematic. The obvious markup sandwiching
|
|
an <code>mrow</code> for "pi=3.141" between two <code>mtext</code> elements for "otherwise (with" and ")", respectively,
|
|
would yield an incorrect ordering. A correct rendering seems to require the possibility
|
|
of embedding <code>math</code> within <code>mtext</code>, which is not possible in MathML 2.0.
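</p><p>For illustration, the sandwiched markup described above might be written as in the following sketch. It reuses the Arabic phrases from the example table and reflects our reading of the description; it is not markup taken from a tested renderer:</p><pre><mrow>
  <mtext>غيرذلك (مع </mtext>
  <mrow>
    <mi>π</mi>
    <mo>≃</mo>
    <mn>3,141</mn>
  </mrow>
  <mtext>)</mtext>
</mrow></pre><p>Since the formula's directionality is LTR, these three children are placed left to right, whereas the Arabic clause should read right to left as a whole; hence the appeal of embedding the relation inside a single <code>mtext</code>, were that permitted.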
|
|
But even then, the desired ordering would need to be marked up as two separate <code>mtext</code> elements:
|
|
one for "otherwise", and one for "(with pi=3.141)". The Math Interest Group is currently
|
|
considering the possibilities of such embedding. The example above was marked up by
|
|
artificially placing the Arabic word for "with" <em>after</em> the "pi=3.141".</p><p>Given such issues, it is sometimes advantageous to minimize the use of
|
|
connecting phrases, in favor of simple punctuation, such as:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Moroccan</th><td rowspan="1" colspan="1"><img src="arabic-images/exprfrn.png" alt="[Image of simplified formula in Moroccan style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
|
|
<mrow>
|
|
<mrow>
|
|
<mi>f</mi>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mi>x</mi>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
<mo>=</mo>
|
|
<mrow>
|
|
<mo>{</mo>
|
|
<mtable>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<munderover>
|
|
<mo movablelimits="false">∑</mo>
|
|
<mrow>
|
|
<mi>i</mi>
|
|
<mo>=</mo>
|
|
<mn>1</mn>
|
|
</mrow>
|
|
<mi>s</mi>
|
|
</munderover>
|
|
<mo>⁡</mo>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext>; </mtext>
|
|
<mi>x</mi>
|
|
<mo>&lt;</mo>
|
|
<mn>0</mn>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<msubsup>
|
|
<mo>∫</mo>
|
|
<mn>1</mn>
|
|
<mi>s</mi>
|
|
</msubsup>
|
|
<mo>⁡</mo>
|
|
<mrow>
|
|
<msup>
|
|
<mi>x</mi>
|
|
<mi>i</mi>
|
|
</msup>
|
|
<mo>⁢</mo>
|
|
<mi>d</mi>
|
|
<mo>⁡</mo>
|
|
<mi>x</mi>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext>; </mtext>
|
|
<mi>x</mi>
|
|
<mo>∈</mo>
|
|
<mi mathvariant="normal">E</mi>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
<mtr>
|
|
<mtd>
|
|
<mrow>
|
|
<mi>tg</mi>
|
|
<mo>⁡</mo>
|
|
<mi>π</mi>
|
|
</mrow>
|
|
</mtd>
|
|
<mtd>
|
|
<mrow>
|
|
<mtext>; </mtext>
|
|
<mrow>
|
|
<mo>(</mo>
|
|
<mi>π</mi>
|
|
<mo>≃</mo>
|
|
<mn>3,141</mn>
|
|
<mo>)</mo>
|
|
</mrow>
|
|
</mrow>
|
|
</mtd>
|
|
</mtr>
|
|
</mtable>
|
|
</mrow>
|
|
</mrow>
|
|
</math></pre>
|
|
</td></tr></table></div><div class="div2">
|
|
<h3><a id="Maghreb" name="Maghreb"></a>3.2 Arabic Notation; Maghreb Style</h3><p>The Maghreb style of notation is widely used in North Africa:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Maghreb</th><td rowspan="1" colspan="1"><img src="arabic-images/exprare.png" alt="[Image of formula in Maghreb style]"></td><td rowspan="1" colspan="1">Not yet attempted</td></tr></table><p>Here, the most striking difference is that the overall mathematical
|
|
layout is the mirror image of the preceding examples, that is,
|
|
the mathematical directionality is RTL. Further, some symbols
|
|
(e.g. ∑, &lt;, ∈) are mirrored as well.
|
|
Thus, we need a means of specifying the mathematical directionality,
|
|
and of ensuring that the appropriate symbols are available in Unicode and are marked as mirrored.
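</p><p>One possible way to express this is sketched below, under the assumption that a <code>dir</code> attribute is accepted on the <code>math</code> element. That attribute appears in this Note's own examples in section 3.4 but is not defined by MathML 2. The markup stays in logical order, and the variable name س is an illustrative choice rather than something taken from the Maghreb image:</p><pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <mrow>
    <mi>س</mi>
    <mo>&lt;</mo>
    <mn>0</mn>
  </mrow>
</math></pre><p>A renderer with RTL support would be expected to place the variable on the right and to draw the mirrored form of the relation; a renderer without that support would lay the same markup out left to right.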
|
|
</p><p>The remaining differences are due to a more pronounced use of Arabic symbols:
|
|
DAL <img src="arabic-images/dal.png" alt="DAL"> (as the initial
|
|
of <img src="arabic-images/dalt.png" alt="DALT"> = "function" in Arabic);
|
|
the Arabic letter BEH <img src="arabic-images/beh.png" alt="BEH">,
|
|
and the letters of the function name abbreviation <img src="arabic-images/tah.png" alt="TAH">
|
|
for tangent (without dots). Again, these differences fall into the category of localization,
|
|
but reinforce the idea that the Unicode bidirectional algorithm, along with glyph shaping, should apply individually
|
|
to token elements.</p></div><div class="div2">
|
|
<h3><a id="Machrek" name="Machrek"></a>3.3 Arabic Notation; Machrek Style</h3><p>As the final Arabic example, we consider the Machrek style generally used in the Middle East.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Machrek</th><td rowspan="1" colspan="1"><img src="arabic-images/exprarw.png" alt="[Image of formula in Machrek style]"></td><td rowspan="1" colspan="1">Not yet attempted</td></tr></table><p>Most differences between the Machrek and Maghreb styles are essentially due to localization:
|
|
a specifically Arabic symbol <img src="arabic-images/mg.png" alt="MG"> is used for the summation
|
|
(initial of <img src="arabic-images/mgmue.png" alt="MGMUE"> = "sum" in Arabic);
|
|
a different letter <img src="arabic-images/teh.png" alt="TEH"> is used for the function
|
|
(initial of <img src="arabic-images/tabet.png" alt="TABET">, also "function" in Arabic);
|
|
the letters of the elementary function name abbreviation
|
|
<img src="arabic-images/dah.png" alt="DAH"> are written with dots;
|
|
and a number format using Arabic-Indic digits and a comma for the decimal separator (but not
|
|
the same as the Arabic comma used in text).</p><p>Note that the symbol used for summation should probably be a mathematical symbol
|
|
with a codepoint distinct from the Arabic letter, as the European summation symbol is
|
|
distinct from the Greek Sigma. This point also applies to the Arabic product.</p></div><div class="div2">
|
|
<h3><a id="N106E1" name="N106E1"></a>3.4 Additional Arabic Notations</h3><p>Two additional unique notations involve combinatorics, namely the factorial and
|
|
binomial coefficients:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"><img src="arabic-images/drbcen.png" alt="[Image of 12 factorial in English style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
|
|
<mrow><mn>12</mn><mo>!</mo></mrow>
|
|
</math></pre>
|
|
</td></tr><tr><th rowspan="1" colspan="1">Arabic</th><td rowspan="1" colspan="1"><img src="arabic-images/drbc.png" alt="[Image of 12 factorial in Arabic style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <menclose notation="madruwb">
    <mn>12</mn>
  </menclose>
</math></pre>
|
|
</td></tr></table><p>The argument of the factorial is enclosed in a shape resembling the
character LAM (ل), which must
be stretched in both directions to accommodate it. A new <code>menclose</code> notation,
<code>madruwb</code>, is proposed for this case.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"> <img src="arabic-images/arrangaen.png" alt="[Image of binomial(5,12) in English style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mo>(</mo><mtable><mtr><mtd><mn>5</mn></mtd></mtr><mtr><mtd><mn>12</mn></mtd></mtr></mtable><mo>)</mo>
  </mrow>
</math></pre>
|
|
</td></tr><tr><th rowspan="1" colspan="1">Arabic</th><td rowspan="1" colspan="1"> <img src="arabic-images/arranga.png" alt="[Image of binomial(5,12) in Arabic style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <mmultiscripts><mo>ل</mo>
    <mn>12</mn><none/>
    <mprescripts/>
    <none/><mn>5</mn>
  </mmultiscripts>
</math></pre>
|
|
</td></tr></table><p>Finally, although stacked fractions are rendered the same way in both European and Arabic notation,
bevelled fractions in RTL Arabic will appear, as one would expect, with the terms in RTL order;
that is, A divided by B appears as "B/A".
In some locales, the preference is for the slash to be mirrored as well, as "B\A". For these cases,
we suggest that authors employ explicit markup using the REVERSE SOLIDUS \, such as
<mrow><mi>A</mi><mo>\</mo><mi>B</mi></mrow>
.</p></div><div class="div2">
|
|
<h3><a id="Persian" name="Persian"></a>3.5 Persian</h3><p>Persian generally uses the Arabic script (written RTL), but with
LTR mathematical directionality, similar to the Moroccan style.
|
|
We are aware of only one mathematical notation unique to Persian writing, the notation used
|
|
for limits:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"><img src="arabic-images/limw.png" alt="[Image of limit formula in English style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mrow>
      <munder>
        <mo movablelimits="false">lim</mo>
        <mrow>
          <mi>x</mi>
          <mo>→</mo>
          <mfrac bevelled="true">
            <mi>π</mi>
            <mn>10</mn>
          </mfrac>
        </mrow>
      </munder>
      <mo>&#x2061;</mo>
      <mrow>
        <mi>sin</mi>
        <mo>&#x2061;</mo>
        <mi>x</mi>
      </mrow>
    </mrow>
    <mo>=</mo>
    <mrow>
      <mfrac>
        <mn>1</mn>
        <mn>4</mn>
      </mfrac>
      <mo>&#x2062;</mo>
      <mrow>
        <mo>(</mo>
        <msqrt>
          <mn>5</mn>
        </msqrt>
        <mo>-</mo>
        <mn>1</mn>
        <mo>)</mo>
      </mrow>
    </mrow>
  </mrow>
</math></pre>
|
|
</td></tr><tr><th rowspan="1" colspan="1">Persian</th><td rowspan="1" colspan="1"> <img src="arabic-images/limf.png" alt="[Image of limit formula in Persian style]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mrow>
      <munder>
        <mo movablelimits="false">حد</mo>
        <mrow>
          <mi>x</mi>
          <mo>→</mo>
          <mfrac bevelled="true">
            <mi>π</mi>
            <mn>۱۰</mn>
          </mfrac>
        </mrow>
      </munder>
      <mo>&#x2061;</mo>
      <mrow>
        <mi>sin</mi>
        <mo>&#x2061;</mo>
        <mi>x</mi>
      </mrow>
    </mrow>
    <mo>=</mo>
    <mrow>
      <mfrac>
        <mn>۱</mn>
        <mn>۴</mn>
      </mfrac>
      <mo>&#x2062;</mo>
      <mrow>
        <mo>(</mo>
        <msqrt>
          <mn>۵</mn>
        </msqrt>
        <mo>-</mo>
        <mn>۱</mn>
        <mo>)</mo>
      </mrow>
    </mrow>
  </mrow>
</math></pre>
|
|
</td></tr></table><p>While the overall notation is similar to the Moroccan model (LTR), it uses the
Eastern Arabic-Indic digits. The word "حد" (for "limit") is
used; this word should not only undergo <a href="#GlyphShaping">glyph shaping</a>
but also be stretched horizontally to match the width of the underscript.</p></div></div><div class="div1">
|
|
<h2><a id="Proposals" name="Proposals"></a>4 Proposals and Clarifications</h2><div class="div2">
|
|
<h3><a id="BiDiProposal" name="BiDiProposal"></a>4.1 Clarification of the Bidirectional Algorithm for MathML</h3><p>The following summarizes how directionality should be applied to MathML
and, in particular, describes how the bidirectional algorithm should be applied
(it falls into class HL4; see <a href="http://www.unicode.org/reports/tr9/#HL4">Higher Level
Protocols: HL4</a> in <a href="#UnicodeBiDi">[UnicodeBiDi]</a>, section 4.3).</p><ul><li><p>The overall <em>mathematical directionality</em> should be determined by
|
|
a (new) <code>dir</code> attribute on the outermost <code>math</code> element
|
|
which takes one of the values <code>ltr</code> or <code>rtl</code>;
|
|
the default is <code>ltr</code>.
|
|
If this attribute is <code>rtl</code>, the layout of all Layout, Script, Limit,
|
|
Table and Matrix schemata should proceed from right to left. This includes
|
|
such effects as the surd of an <code>mroot</code> starting from the right.
|
|
When the mathematical directionality is <code>ltr</code>, the layout should conform
|
|
to the current MathML specification.</p></li><li><p>The text content of each Token element should be treated as a separate
|
|
directional segment and the bidirectional algorithm should be applied to each independently.
|
|
The initial directional context for each Token element is determined
|
|
by the mathematical directionality. This latter property should ensure that
|
|
individual mirrored symbols are treated correctly.</p></li></ul><p>As an example, consider the MathML fragment:</p><p>
<mn>1</mn>
<mo>+</mo>
<mi><img src="arabic-images/behp.png" alt="BEHP"></mi>
<mo>-</mo>
<mn>2</mn>
</p><p>Some browsers misapply the bidirectional algorithm to the expression as a whole, as in HTML.
Applying the HTML algorithm would set the first two items LTR, but then switch direction upon
encountering the letter <img src="arabic-images/behp.png" alt="BEHP">;
|
|
thus the last three items are reversed.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Right</th><td rowspan="1" colspan="1"> <img src="arabic-images/direction1.png" alt="[Image of expression rendered correctly]"></td><td rowspan="1" colspan="1">
|
|
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mn>1</mn><mo>+</mo><mi>ب</mi><mo>-</mo><mn>2</mn>
</math></pre>
|
|
</td></tr><tr><th rowspan="1" colspan="1">Wrong</th><td rowspan="1" colspan="1"><img src="arabic-images/direction2.png" alt="[Image of expression rendered incorrectly]"></td><td rowspan="1" colspan="1"></td></tr></table></div><div class="div2">
|
|
<h3><a id="GlyphShapingProposal" name="GlyphShapingProposal"></a>4.2 Glyph Shaping</h3><p>Glyph shaping rules apply not only to the textual content of an <code>mtext</code>,
but also to Arabic character sequences used as mathematical symbols (particularly in
<code>mi</code> and <code>mo</code>). This shaping is the visual cue that
distinguishes a single symbol from a sequence of symbols, which might represent a product.
This is analogous to the use of roman font in European mathematics to distinguish, for example,
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>sin</mi></math></pre>
from <pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>s</mi><mi>i</mi><mi>n</mi></math></pre>.
|
|
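<p>A rough Arabic counterpart of this distinction is sketched below. The letters BEH and SEEN are chosen only for illustration, and the <code>dir</code> attribute is the one proposed in this Note rather than part of MathML 2. The first expression is a single two-letter identifier, whose letters join under glyph shaping; the second is a product of two one-letter identifiers, each shaped on its own:</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <!-- illustrative only: one token, so BEH and SEEN are joined by shaping -->
  <mi>بس</mi>
</math></pre>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <!-- illustrative only: two tokens separated by INVISIBLE TIMES, shaped independently -->
  <mrow><mi>ب</mi><mo>&#x2062;</mo><mi>س</mi></mrow>
</math></pre>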
<p>Thus, implementors should apply shaping to each character sequence within the text content of
any token element.</p><p>Certain Arabic characters (ا د ذ ر ز و)
|
|
have no unique initial or medial shapes. Their use in the middle of a mathematical symbol
|
|
would tend to make the symbol look like the product of two shorter symbols.
|
|
Thus, to avoid confusion, authors should avoid using these characters
|
|
in the middle of mathematical symbols.</p></div><div class="div2">
|
|
<h3><a id="N10951" name="N10951"></a>4.3 Additional Mathvariants</h3><p>For single character tokens, additional styles, besides isolated, are used
|
|
to enlarge the set of available distinct symbols, just as the bold and Fraktur styles are
|
|
used in European mathematics. The styles used in Arabic mathematics
|
|
are "tailed", "looped" and "stretched", in addition to the "initial" style applied to
|
|
the individual character. Furthermore, the "double-struck" style is commonly used.
|
|
The following table shows the character JEEM in the various styles, in both
|
|
dotted and undotted forms (see below):</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1"></th><th rowspan="1" colspan="1">isolated</th><th rowspan="1" colspan="1">initial</th><th rowspan="1" colspan="1">tailed</th><th rowspan="1" colspan="1">looped</th><th rowspan="1" colspan="1">stretched</th><th rowspan="1" colspan="1">double-struck</th></tr></thead><tbody><tr><th rowspan="1" colspan="1">dotted</th><td rowspan="1" colspan="1"><img src="arabic-images/jeemf.png" alt="Dotted JEEM isolated form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeemi.png" alt="Dotted JEEM initial form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeemt.png" alt="Dotted JEEM tailed form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeeml.png" alt="Dotted JEEM looped form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeems.png" alt="Dotted JEEM stretched form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeemd.png" alt="Dotted JEEM double-struck"></td></tr><tr><th rowspan="1" colspan="1">undotted</th><td rowspan="1" colspan="1"><img src="arabic-images/hahf.png" alt="Undotted JEEM ISOLATED"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahi.png" alt="Undotted JEEM initial form"></td><td rowspan="1" colspan="1"><img src="arabic-images/haht.png" alt="Undotted JEEM tailed form"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahl.png" alt="Undotted JEEM looped form"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahs.png" alt="Undotted JEEM stretched form"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahd.png" alt="Undotted JEEM double-struck"></td></tr></tbody></table><p>It is proposed to consider the <code>mathvariant</code> "normal",
|
|
when applied to Arabic, to mean the result of glyph shaping, and in particular,
|
|
the "isolated" style for single character tokens. It is also proposed to
|
|
add the following values allowed for <code>mathvariant</code>:
|
|
"initial", "tailed", "looped" and "stretched".</p><p>It is not expected to be meaningful to apply the "bold", "italic", "fraktur", "script",
|
|
"sans-serif" or "monospace" mathvariants (or combinations) to Arabic (although there is some
|
|
sentiment for allowing "bold" and "italic"). Nor is it meaningful to apply any mathvariant
|
|
other than "normal" to multicharacter tokens, which should have glyph shaping applied.
|
|
The current MathML specification points out that the only combinations of characters and
|
|
mathvariant that have an unambiguous interpretation are those that correspond to the
|
|
SMP Math Alphanumeric Symbols. An analogous argument is to be made for Arabic and the proposed
|
|
Arabic Math Alphabetic Symbols <a href="#UnicodeProposition">[UnicodeProposition]</a> (not yet part of Unicode).</p><p>Both dotted and undotted alphabetic symbols are encountered in this Note.
|
|
The choice of which type to use is a matter of local preference; however, documents use
either dotted or undotted symbols, but not a mixture, and in particular, the dots are not used
to indicate semantic distinctions. Thus, dotting is not felt to be a good
candidate for a mathvariant value; rather, it should be accommodated by the choice of
symbol fonts available to the user's browser, or possibly through CSS.</p></div><div class="div2">
|
|
<h3><a id="MirroringProposal" name="MirroringProposal"></a>4.4 Mirroring</h3><p>The MathML attributes <code>lspace</code>, <code>rspace</code>,
|
|
<code>lquote</code> and <code>rquote</code> should be interpreted as opening and closing,
|
|
rather than strictly left and right. This historical anomaly is analogous to
|
|
the standard Unicode names for the parentheses:
|
|
The <code>LEFT PARENTHESIS</code> and <code>RIGHT PARENTHESIS</code>
|
|
are marked as <code>mirrored</code> and are taken to represent
|
|
<code>OPENING PARENTHESIS</code> and <code>CLOSING PARENTHESIS</code>, respectively.
|
|
</p><p>The Math Working Group, and other interested parties, should work to ensure
that the necessary codepoints for Arabic mathematics are not only available, but
appropriately marked for mirroring.
It is also to be hoped that suitable fonts will become available, and will
respect the calligraphic qualities regarding mirroring.</p></div><div class="div2">
|
|
<h3><a id="N10A2B" name="N10A2B"></a>4.5 Horizontal Stretchiness</h3><p>In Arabic mathematics, the sum, product and limit are commonly stretched horizontally
|
|
to the same width as the limits (over or under) that apply to them. Such stretching
|
|
does occasionally appear, but is rare, in European mathematics.
|
|
The <a href="http://www.w3.org/TR/MathML2/chapter3.html#id.3.2.5.8.3">Horizontal
Stretching Rules of MathML</a>
(<a href="#MathML22e">[MathML22e]</a> section 3.2.5.8.3) allow for such horizontal stretching
of some symbols at the discretion of the rendering agent. In this Note, we
|
|
simply encourage developers to implement this feature for the appropriate Arabic symbols.</p></div><div class="div2">
|
|
<h3><a id="N10A3B" name="N10A3B"></a>4.6 Additional Constructs</h3><p>The Arabic notation for factorial is a sort of enclosure.
|
|
We propose to add an additional allowed value <code>madruwb</code> (transliteration
|
|
of the Arabic مضروب for factorial) for
|
|
the <code>notation</code> attribute of <code>menclose</code>.</p></div></div><div class="div1">
|
|
<h2><a id="N10A49" name="N10A49"></a>5 Conclusions and Future Work</h2><p>This Note describes the notational issues encountered in presenting
|
|
mathematics within Arabic and other RTL languages, in particular focusing on
|
|
how these notations differ from the model described by MathML2. To the best of
|
|
our knowledge, the unique notations described here cover all known differences.</p><p>This Note also proposes enhancements to be considered in a future revision
|
|
of the MathML specification. These enhancements would allow Presentation MathML to be
|
|
used to conveniently incorporate mathematics into Arabic documents in a style
|
|
conventionally used by Arabic speaking authors.</p><p>The successful use of mathematics in Arabic texts will also require,
|
|
in addition to the extensions proposed here, that the appropriate codepoints
|
|
are included in Unicode, and that those codepoints are correctly marked as
|
|
mirrored. Some proposals (<a href="#UnicodeProposition">[UnicodeProposition]</a>,<a href="#ArabicMathUnicode">[ArabicMathUnicode]</a>) have already been made.</p></div><div class="div1">
|
|
<h2><a id="N10A58" name="N10A58"></a>6 Acknowledgments</h2><p>This document has been produced by the members of the Math Interest
|
|
Group. The chairs of this Interest Group are David Carlisle (invited
|
|
expert) and Robert Miner (Design Science, Inc.). Other members of the
|
|
Working Group are (at the time of writing): Isam Ayoubi (invited
|
|
expert), Laurent Bernardin (Waterloo Maple Inc.), Stephane Dalmas
|
|
(Institut National de Recherche en Informatique et en Automatique),
|
|
Stan Devitt (invited expert), Max Froumentin (W3C), Patrick D F Ion
|
|
(invited expert), Azzeddine LAZREK (invited expert), Paul Libbrecht
|
|
(German Research Center for Artificial Intelligence), Manolis Mavrikis
|
|
(University of Edinburgh), Bruce Miller (National Institute of
|
|
Standards and Technology), Luca Padovani (University of Bologna), Neil
|
|
Soiffer (Design Science, Inc.), Stephen Watt (Waterloo Maple Inc.).</p><p>The editors would also like to thank Richard Ishida for initiating
the contacts that led to the writing of this Note, and for many
constructive comments on a draft of it.</p></div><div class="div1">
|
|
<h2><a id="N10A5F" name="N10A5F"></a>7 Production Notes</h2><p>The images of Arabic and Persian expressions were composed using the RyDArab
|
|
system <a href="#RyDArab">[RyDArab]</a>, and the FarsiTeX system <a href="#FarsiTeX">[FarsiTeX]</a>, respectively.
|
|
</p></div></div><div class="back"><div class="div1">
|
|
<h2><a id="Localization" name="Localization"></a>A Localization Issues</h2><p>This section discusses some of the localization issues encountered in this Note.
|
|
Authors of MathML may want to consider these issues when composing documents.
|
|
Additionally, it may be worth parameterizing converters from Content MathML
|
|
to Presentation MathML so that they take into account the target language, locale,
|
|
and conceivably the scientific discipline involved as well.</p><div class="div2">
|
|
<h3><a id="NumberSystem2" name="NumberSystem2"></a>A.1 Number Systems</h3><p>Assuming that the text content of <code>cn</code> elements can be unambiguously
|
|
interpreted as a number, the locale selection must be able to choose not only the set of
digits to use, but also the decimal and thousands separators.
Generally, the comma is used as a decimal separator with both the European and Arabic-Indic digits,
but note that such a comma is distinct from the
Arabic comma "،"
used to separate items in a list.</p></div><div class="div2">
|
|
<h3><a id="SymbolsChoice" name="SymbolsChoice"></a>A.2 Symbols Choice</h3><p>Two kinds of symbols are used, literal and mirrored, depending on
the locale:
<ul><li><p>the sum operator is presented in two ways:
<img src="arabic-images/mgmuec.png" alt="[Image of literal summation]"> and
<img src="arabic-images/mgmues.png" alt="[Image of symbolic summation]">;</p></li><li><p>the product operator is presented in two ways:
<img src="arabic-images/gdaac.png" alt="[Image of literal product]"> and
<img src="arabic-images/gdaas.png" alt="[Image of symbolic product]">;</p></li><li><p>the limit operator is presented in two ways:
<img src="arabic-images/nhaytc.png" alt="[Image of literal limit]"> and
<img src="arabic-images/nhaytf.png" alt="[Image of limit in Persian style]">.
This last notation is used in Persian.</p></li><li><p>the factorial operator is presented in two ways:
<img src="arabic-images/drbc.png" alt="[Image of literal factorial]"> and
!12.</p></li></ul>
|
|
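<p>As an illustration of how such a locale choice might surface in markup, the two presentations of the factorial of 12 could be written as sketched below. This is only a sketch: it relies on the <code>dir</code> attribute and the <code>madruwb</code> notation proposed in this Note, neither of which is part of MathML 2. The first form uses the literal enclosure; in the second, the assumed RTL layout of the <code>mrow</code> places the exclamation mark to the left of the number, giving "!12":</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <!-- literal form: assumes the proposed madruwb notation -->
  <menclose notation="madruwb"><mn>12</mn></menclose>
</math></pre>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
  <!-- symbolic form: assumes the proposed dir attribute reverses the layout order -->
  <mrow><mn>12</mn><mo>!</mo></mrow>
</math></pre>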
<p>These stretched operators can be compared to
mathematical stretchy accents,
except that the roles are reversed. The square root construction is
another similar case.
</p></div></div><div class="div1">
|
|
<h2><a id="Implementation" name="Implementation"></a>B Implementation Issues</h2><p>This section describes issues that an implementor of an Arabic-enhanced
|
|
MathML specification would encounter, and possible strategies for dealing with them.</p><div class="div2">
|
|
<h3><a id="CharactersEncoding" name="CharactersEncoding"></a>B.1 Character Encoding</h3><p>Even though some local symbols, used in mathematics written in an Arabic
|
|
notation, can be obtained via mirroring of already existing symbols,
|
|
there are many symbols found in Arabic mathematical handbooks that are not
|
|
yet part of the Unicode Standard and cannot be obtained through a simple mirroring
|
|
<a href="#ArabicMathUnicode">[ArabicMathUnicode]</a>.
|
|
Some of these special characters have been submitted for inclusion in the Unicode
Standard <a href="#UnicodeProposition">[UnicodeProposition]</a>.</p></div><div class="div2">
|
|
<h3><a id="MathematicalFonts" name="MathematicalFonts"></a>B.2 Mathematical Fonts</h3><p>Some font families are designed to meet the requirements of typesetting
|
|
mathematical documents in an Arabic notation.
|
|
The RamzArab Arabic mathematical font <a href="#RamzArab">[RamzArab]</a> aims to provide a complete and
|
|
homogeneous Arabic font family, in the OpenType format, respecting Arabic calligraphy rules.</p><p>Although letters in "tailed" and "stretched" forms are semantically distinct
|
|
from the "initial" forms, they can be simulated by connecting with a particular final
|
|
form of HEH and the final form of ALEF, respectively, and applying glyph shaping. This technique
|
|
may be useful when an insufficient variety of fonts is available.</p><p>Implementors are encouraged to make it feasible for users to
|
|
choose dotted or undotted mathematical symbol fonts easily in accord with local tastes.</p></div><div class="div2">
|
|
<h3><a id="N10AEE" name="N10AEE"></a>B.3 Symbol Stretching</h3><p>In the cases where operators need to be stretched to match
|
|
the width of sub- or superscripts, the lengthening should be done using
|
|
curves rather than straight lines.
|
|
This curved lengthening is called curved <em>kashida</em>. It is one of the most
important aspects of Arabic calligraphy.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Good</th><th rowspan="1" colspan="1">Bad</th></tr></thead><tr><td rowspan="1" colspan="1"><img src="arabic-images/mgmuec.png" alt="[Image of properly stretched summation]"></td><td rowspan="1" colspan="1"><img src="arabic-images/mgmuel.png" alt="[Image of poorly stretched summation]"></td></tr><tr><td rowspan="1" colspan="1"><img src="arabic-images/gdaac.png" alt="[Image of properly stretched product]"></td><td rowspan="1" colspan="1"><img src="arabic-images/gdaal.png" alt="[Image of poorly stretched product]"></td></tr><tr><td rowspan="1" colspan="1"><img src="arabic-images/nhaytc.png" alt="[Image of properly stretched limit]"></td><td rowspan="1" colspan="1"><img src="arabic-images/nhaytl.png" alt="[Image of poorly stretched limit]"></td></tr><tr><td rowspan="1" colspan="1"><img src="arabic-images/drbc.png" alt="[Image of properly stretched factorial]"></td><td rowspan="1" colspan="1"><img src="arabic-images/drbl.png" alt="[Image of poorly stretched factorial]"></td></tr></table><p>These curvilinear extensible symbols were generated by the CurExt application
for the T<sub>E</sub>X system with a PostScript font generator <a href="#RamzArab">[RamzArab]</a>.</p><p>Although horizontal stretching of sum and product operators
|
|
is rare in European mathematics:
|
|
<img src="arabic-images/mgmuegl.png" alt="[Image of stretched summation]"> and
|
|
<img src="arabic-images/gdaagl.png" alt="[Image of stretched product]">,
|
|
this stretching is more common, and more desired, in Arabic mathematics:
|
|
<img src="arabic-images/mgmuega.png" alt="[Image of stretched mirrored summation]"> and
|
|
<img src="arabic-images/gdaaga.png" alt="[Image of stretched mirrored product]">.
|
|
</p><p>[Note: the broken corner in these symbols
|
|
is a known flaw to be repaired in a future version of RyDArab
|
|
<a href="#RyDArab">[RyDArab]</a>].</p></div><div class="div2">
|
|
<h3><a id="SoftwareTools" name="SoftwareTools"></a>B.4 Software Tools</h3><p>The Dadzilla system, an adapted version of Mozilla, allows using MathML for
|
|
Arabic mathematical notation <a href="#Dadzilla">[Dadzilla]</a>.</p></div></div><div class="div1">
|
|
<h2><a id="N10B92" name="N10B92"></a>C Bibliography</h2><dl><dt class="label"><a id="MathML22e" name="MathML22e"></a>MathML22e</dt><dd>David Carlisle, Patrick Ion, Robert Miner, Nico Poppelier,
|
|
<em>Mathematical Markup Language (MathML) Version 2.0 (2nd Edition)</em>,
World Wide Web Consortium Working Draft, 19 December 2002.
(<a href="http://www.w3.org/TR/MathML2/">http://www.w3.org/TR/MathML2/</a>)
|
|
</dd><dt class="label"><a id="UnicodeBiDiIntro" name="UnicodeBiDiIntro"></a>UnicodeBiDiIntro</dt><dd>
|
|
Richard Ishida,
|
|
<em>What you need to know about the bidi algorithm and inline markup</em>.
(<a href="http://www.w3.org/International/articles/inline-bidi-markup/">http://www.w3.org/International/articles/inline-bidi-markup/</a>)
|
|
</dd><dt class="label"><a id="UnicodeBiDi" name="UnicodeBiDi"></a>UnicodeBiDi</dt><dd>
|
|
<a href="http://www.unicode.org/reports/tr9/">http://www.unicode.org/reports/tr9/</a>
|
|
</dd><dt class="label"><a id="UnicodeProposition" name="UnicodeProposition"></a>UnicodeProposition</dt><dd>
|
|
<a href="http://www.ucam.ac.ma/fssm/rydarab/english/unicode.htm">
|
|
http://www.ucam.ac.ma/fssm/rydarab/english/unicode.htm</a>
|
|
</dd><dt class="label"><a id="ArabicMathUnicode" name="ArabicMathUnicode"></a>ArabicMathUnicode</dt><dd>Mohamed Jamal Eddine Benatia, Azzeddine Lazrek, Khalid Sami,
|
|
<em>Arabic mathematical symbols in Unicode</em>, IUC 27, Berlin, Germany, April 6-8, 2005.
|
|
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/unicodem.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/unicodem.pdf</a>)
|
|
</dd><dt class="label"><a id="RyDArab" name="RyDArab"></a>RyDArab</dt><dd>Azzeddine Lazrek,
|
|
<em>RyDArab-Typesetting Arabic mathematical expressions</em>,
|
|
TUGboat, Volume 25 (2004), No. 2.
|
|
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugryd.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugryd.pdf</a>)
|
|
</dd><dt class="label"><a id="RamzArab" name="RamzArab"></a>RamzArab</dt><dd>Mostafa Banouni, Mohamed Elyaakoubi, Azzeddine Lazrek,
|
|
<em>Dynamic Arabic mathematical fonts</em>, LNCS, Volume 3130, pp. 149-157, 2004.
|
|
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugfontm.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugfontm.pdf</a>)
|
|
</dd><dt class="label"><a id="FarsiTeX" name="FarsiTeX"></a>FarsiTeX</dt><dd>Behdad Esfahbod, Roozbeh Pournader,
|
|
<em>FarsiTeX and the Iranian TeX community</em>.
|
|
(<a href="http://www.tug.org/TUGboat/Articles/tb23-1/farsitex.pdf">http://www.tug.org/TUGboat/Articles/tb23-1/farsitex.pdf</a>)
|
|
</dd><dt class="label"><a id="Dadzilla" name="Dadzilla"></a>Dadzilla</dt><dd>Mustapha Eddahibi, Azzeddine Lazrek, Khalid Sami,
|
|
<em>Arabic mathematical e-documents</em>, LNCS, Volume 3130, pp. 158-168, 2004.
|
|
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugmathm.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugmathm.pdf</a>)
|
|
</dd></dl></div></div></body></html>