As mentioned in the first post, we will discuss several areas, the first of which is Arabic as a programming language. For this to happen, code written in Arabic letters must be encoded in a standard, agreed-upon character set to ensure data integrity, reliable transmission, consistency and security.
There have been many attempts to encode Arabic in different character sets since the early mainframes. These changed over the years, and sometimes the encoding even differed between Arab countries! We have now reached some maturity by adhering to a standardized character set provided by the Unicode Consortium. Still, there is a but…
There are issues with the encoding design used in Unicode. For example, if you search for “أحمد” in a database that uses Unicode, a plain code-point comparison will return only the names that begin with “أ” and miss those written with “ا”, which is a problem, since in practice they are the same name. The same problem arises with “ي” and “ى”, and with “أ ؤ ئ” and “ء”, etc. Each of these has a different Unicode code point, and the encoding was not designed with these situations in mind.
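To make the mismatch concrete, here is a minimal sketch in Python of the workaround that software has to apply today: fold the variants to one form before comparing. The table `FOLD` and the helpers `fold` and `same_word` are hypothetical names chosen for illustration, and which variants should be merged is an assumption, not an agreed standard.

```python
# Hypothetical folding table: maps letter variants to one base form for search.
# The choice of pairs is an illustrative assumption, not a standard.
FOLD = str.maketrans({
    "\u0623": "\u0627",  # أ (alef with hamza above) -> ا (bare alef)
    "\u0625": "\u0627",  # إ (alef with hamza below) -> ا
    "\u0622": "\u0627",  # آ (alef with madda)       -> ا
    "\u0649": "\u064A",  # ى (alef maksura)          -> ي (yeh)
    "\u0624": "\u0621",  # ؤ (waw with hamza)        -> ء (hamza)
    "\u0626": "\u0621",  # ئ (yeh with hamza)        -> ء
})


def fold(text: str) -> str:
    """Return text with the variants above replaced by their base form."""
    return text.translate(FOLD)


def same_word(a: str, b: str) -> bool:
    """Compare two words after folding, so spelling variants match."""
    return fold(a) == fold(b)


if __name__ == "__main__":
    with_hamza = "\u0623\u062D\u0645\u062F"     # أحمد
    without_hamza = "\u0627\u062D\u0645\u062F"  # احمد
    print(with_hamza == without_hamza)          # raw code-point comparison
    print(same_word(with_hamza, without_hamza)) # comparison after folding
```

Note that this needs a lookup table and an extra pass over the text, because the code points of the related letters have no arithmetic relationship to each other.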
Now this is what I actually want to discuss: the encoding was not designed with search algorithms or regular expressions in mind. Look at the ASCII table: you can search for a word regardless of case by ANDing each letter with 0x5F, because the ASCII table was designed with this in mind (upper- and lowercase letters differ in a single bit). This makes comparison, regular expressions and search operations very fast, since they are performed as low-level logic operations.
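The ASCII trick can be sketched in a few lines of Python. The names `CASE_MASK`, `fold_ascii_letters` and `equal_ignoring_case` are made up for this illustration, and the sketch assumes the input contains ASCII letters only.

```python
CASE_MASK = 0x5F  # clears bit 5 (0x20), the only bit that separates 'a' from 'A'


def fold_ascii_letters(word: str) -> bytes:
    """AND every byte with 0x5F so lowercase letters become uppercase.

    Assumes the word consists of ASCII letters only.
    """
    data = word.encode("ascii")
    return bytes(b & CASE_MASK for b in data)


def equal_ignoring_case(a: str, b: str) -> bool:
    """Case-insensitive comparison using only the bit mask."""
    return fold_ascii_letters(a) == fold_ascii_letters(b)


if __name__ == "__main__":
    print(fold_ascii_letters("Hello"))            # b'HELLO'
    print(equal_ignoring_case("Ahmed", "aHMED"))  # True
```

The mask is only meaningful for letters: digits and punctuation are changed into unrelated bytes (for example, "{" and "[" become equal), so real code has to check the character range first. The point is that no lookup table is needed, because the relationship between the two cases is built into the code values themselves.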
What I am suggesting here is a project to create a new encoding for Arabic letters that takes into consideration all the scenarios that arise when dealing with Arabic text. That means we have to open a discussion about the origin of the Arabic letters, which should lead to constructing a code table that makes the expected text operations in data systems and digital contexts easier.
What are your comments on this matter?