Linguistics, Psycholinguistics and Semantics

Language, the storehouse of all human knowledge, is represented by words and meanings. Even though words change, the concept of meaning remains the same across languages and their respective communications. Yet "meanings" are always understood by human beings based on contextual, relative, tonal and gestural variances. Dictionary meanings 'as is' are rarely taken into consideration; thus human language is ambiguous in one sense and very flexible in another.

Computers, on the other hand, are hard-coded to go by dictionary meanings. Thus teaching (programming) computers to understand natural language (human language) has been the biggest challenge haunting scientists ever since the idea of Artificial Intelligence (AI) came into existence. In addition, this has led to the obvious question of "What is intelligence?" from a computational perspective. Because intelligence cannot be defined precisely, this field of study is nowadays referred to more as "Machine Learning" than as Artificial Intelligence.

Sanskrit, as the most scientific and structured language, has had many algorithms built in, as part of vast scientific treatises, for analysing "meanings" or "word sense" from many perspectives since time immemorial. As scientists put it: "It is our job to discover and convert the scientific methods inherent in Sanskrit into usable computational models and tools for Natural Language Processing rather than reinventing the wheel." This blog's purpose is to expose some of the many intricate tools and methodologies used in Sanskrit for centuries to derive precise meanings of human language to a larger audience, particularly computational linguists, for further study, analysis and deployment in Natural Language Processing.

In addition, Sanskrit, even though flexible as a human language, is the least ambiguous, as the structure of the language is precisely defined from both a semantic and a syntactic point of view.

Thursday, April 18, 2013

"Zero" is in Veda itself...

When we count from 1 onwards and go beyond 9, how can we proceed if we don't have the number 10? To have the number 10 we must have the number "0". Otherwise how can 10 be written, if not by first writing 1 followed by a 0? We are not familiar with any other method of writing in the decimal system (the decimal system originated in ancient India). If that were not so, how could the Vedic rishis have mentioned such large numbers as ayuta (अयुत) for ‘ten thousand’, niyuta (नियुत) for ‘hundred thousand’, prayuta (प्रयुत) for ‘million’, arbuda (अर्बुद) for ‘ten million’, nyarbuda for ‘hundred million’, etc. (these are used in the Yajur Veda)?
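The role of "0" as a placeholder can be illustrated with a small Python sketch. It is only an illustration: the function names are hypothetical ones of my own, and the exponents simply follow the glosses given above for each number name.

```python
# Number names from the paragraph above, mapped to powers of ten.
# The exponents follow the glosses given in the text (an assumption of this sketch).
SANSKRIT_POWERS = {
    "ayuta": 4,     # ten thousand
    "niyuta": 5,    # hundred thousand
    "prayuta": 6,   # million
    "arbuda": 7,    # ten million
    "nyarbuda": 8,  # hundred million
}


def value_of(name: str) -> int:
    """Return the integer value of a number name from the table above."""
    return 10 ** SANSKRIT_POWERS[name]


def digits(n: int) -> list[int]:
    """Split a non-negative integer into decimal digits, most significant first."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    return [int(d) for d in str(n)]
```

Written out in place-value notation, every one of these values is a 1 followed only by zeros, so none of them can be written in this notation without a symbol for "0".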


Today all encyclopedias wrongly attribute the invention of "0" to Babylonian mathematics in the 7th century BC, and give only a passing remark about Acharya Pingala in the 3rd century BC as the one who used "0" in the Chandas shastram. The Chandas shastram is a Vedanga, a limb of Veda. Acharya Pingala's Chandas shastram, like Paninian grammar, was written for both the Vedic and the worldly branches of Samskritam. The original Chandas shastram was a part of Veda itself in the earlier era. Thus it is evident that "0" was there from the time of Veda, which is time immemorial.

What is evident from the above is that "0" as part of a place value system, and also "0" as a number, was there from Vedic times, which in other words means anaadi (time immemorial). So let's not keep repeating the mistake that Sri Aryabhatta invented "0". Sri Aryabhatta was a great mathematician and scientist, but saying that he invented "0" would be an insult to our scientific advancements before him. During the Mahabharata time, an Astra (missile) launch needed calculations which use "0", just as today a missile launch can't be done without precise calculations requiring the use of "0".

Maharishi Vyasa wrote slokas on celestial maps with references to three sequential solar eclipses and to planetary positions. The reference to the first solar eclipse comes in the Sabha Parva (79.29). The second solar eclipse, just before the Mahabharata war, is in the Bhisma Parva (3.29), following a lunar eclipse occurring within the same fortnight. He warns that these successive eclipses are a sign of bad times (we can now use these celestial positions to build a detailed astronomical map and to date the Mahabharata war precisely). All such complex calculations require the usage of "0"; thus "0" was in use in the Mahabharata time and even before.

The English word zero came via → French zéro, which is from → Venetian zero, which came (together with cipher /cypher) via → Italian zefiro, which came from → Arabic صفر (ṣafira = "is empty"; ṣifr = "zero", "nothing"). This was a translation of → the Samskritam word shoonya (शून्य), meaning "empty".

The etymological chain confirms that only the word "Shoonya" (which denotes "0" as a valueless number) travelled, and the same word is used for all other purposes of "0" even today, such as "0" as a valueless number, in the place value system, in fractions, etc. Though the various mathematical calculations using "0" for other purposes travelled later, the other Samskritam words didn't travel until the 20th century. Later in the 20th century, words such as Void (from the Sanskrit word व्योम, Vyoma) started to be used in computer programming languages.

In Samskritam we have many words for "0" depending on its value. They are below:

पूज्य, /सत् (poojya /sat) = Holy (complete) - from the word Wholly
शून्य, रिक्त, रन्द्र (shunya, rikta, randra) = Valueless
आभु, अव्यक्त (Aabhu, avyakta) = Inexpressible (value can't be determined)
पूर्ण, अनन्त (purna, ananta) = Complete, full, endless (infinite value)
ख, दिब, व्योम, (kha, diba, vyoma) = Infinity
बिन्दु (bindu) = Point /Dot (used in fractions)
अव्यय, (avyaya) = NaN / Indeclinable
साङ्खेय, द्रबिणम् (saankheya, drabinam) = Ordinal (while counting "0" as a number)

Such a wide variety of names for "0" is found in many places, starting from the Vedas, Kalpa sutras, Chandas shastra and many other treatises. Many of the mathematicians of ancient Bharatam were Vaiyakaranaas, as the entire vyakarana sutras of Maharishi Panini are themselves based on Bija Ganita (algebra) principles.

The Ganita shaastra (mathematics) developed into a separate branch of study very long ago, starting with the Shulba sutras (Sri Bodhayanacharya) and Jyotisha shastra times. "0" was in wide usage for a very long time even before the development of Ganita as a separate branch of study. Sri Aryabhatta, Sri Bhaskara, Sri Brahmagupta, Sri Neelakanta Somayaji, etc. were Ganita Shastragnas after the period of Sri Gautama Buddha.

Both before and after the period of Sri Gautama Buddha, Jain mathematicians were quite popular, and even before Jainism came, Vaiyakaranaas were great mathematicians as well as linguists, as the entire Samskritam language is based on mathematics and is thus most suitable for computing.

In ancient times the Ganita shaastra (mathematics) had the following branches: Geometry (Gyamiti), the study of shapes and their applications; Algebra (Bija Ganita), the study of operations and their applications; Trigonometry (Trikonamiti), the study of triangles and the relationships between their sides and angles; and Calculus (chalana-kalana Ganita), the study of change.

Saturday, March 23, 2013

What is "Sabda" - Shikshaa and Vyakarana in Samskrit - Science of Sound


शब्दानुशासनम् व्याकरणम् । The science of sound is Vyakaranam.

संप्रत्ययः शब्दः । ध्वानिः शब्दः  - इति महाभाष्ये - both knowledge (meaning) and the original "sound" are associated with "Sabda" in Samskrit. The sound is given more importance in the Samskrit language than the word (पदम्), as the natural sound in itself carries an inherited meaning. The "word", being its derivative with a suffix added, conveys the derived /modified meaning of the original sound. Thus the language is also its derivative. Although vyakarana is usually taken to mean grammar, it is more than just grammar, as vyaakarana deals with all the derivations of the primordial sounds, such as words, word-sense, phrases, sentences, figures of speech, etc. (यद्यपि पदशास्त्रम् इति विश्रुतं तथापि शब्दसाधुत्वासादुत्वविषयैव अस्य वेदाङ्गस्य महत्वम् इति). The shaastra that deals with 'sound' in its basic form is called Shikshaa (शिक्षा), which is primarily a Vedanga (a part of Veda, like Vyaakarana).

Shikshaa shaastra (शिक्षाशास्त्रम्) is the foundation for studying the two branches of the Samskrit language (भाषा नाम संस्कृतं - वैदिकं लौकिकञ्च): Vaidika (Vedic Samskrit) and Laukika (Classical Samskrit). Laukika is the part of the language that is in use for all purposes other than Vedic, including science, literature, medicine and all other worldly things.

Shikshaa (शिक्षा) shaastra in its full form is a complex and intricate science based on the human vocal system. It is used in its full capacity in the Vedic part of Samskrit (वैदिकसंस्कृतं). The same shaastra is also used in a limited manner in the all-purpose non-Vedic part of Samskrit (लौकिकसंस्कृतं).

Even though Sandhi (सन्धिः) is studied along with grammar, Sandhi deals only with the pronunciation of syllables (Varna, वर्णः) with respect to factors such as Place (स्थानम्), Effort (प्रयत्नम्), Duration (मात्रा), Pitch (स्वरः), etc. (स्थानादयः). All these deal with syllables and phonological aspects (which are part of Shikshaa) rather than with words and meanings. Thus Sandhi is primarily a subject of Shikshaa rather than a grammatical process. Even though Sandhi rules are given in grammar texts (व्याकरणम्), the place, etc. (स्थानादयः) are elaborated in the 'Varnochaarana shikshaa' and the 'Paniniya Shiksha' of Maharishi Panini. The rules for changing syllables when they are joined, though found in grammar texts, are in essence part of Shikshaa /phonology.
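To show how mechanical such sound-joining rules are, here is a minimal Python sketch of two well-known vowel Sandhi rules on romanised (IAST) text: two similar simple vowels merge into the long vowel (hima + ālaya = himālaya), and a /ā followed by i /ī or u /ū gives e or o (deva + indra = devendra). The function name and the tables are my own hypothetical ones; every other combination is simply left unjoined, so this is far from a full Sandhi engine.

```python
import unicodedata

# Short simple vowels and their long counterparts (IAST transliteration).
LONG = {"a": "ā", "i": "ī", "u": "ū"}
SHORT = {long: short for short, long in LONG.items()}
# a/ā followed by i/ī or u/ū merges into the guna vowel.
GUNA = {"i": "e", "u": "o"}


def _short(vowel: str) -> str:
    """Map a long simple vowel to its short form; leave anything else as is."""
    return SHORT.get(vowel, vowel)


def join(first: str, second: str) -> str:
    """Join two non-empty IAST words, applying only the two rules above."""
    first = unicodedata.normalize("NFC", first)
    second = unicodedata.normalize("NFC", second)
    left, right = _short(first[-1]), _short(second[0])
    if left in LONG and left == right:
        merged = LONG[left]    # similar vowels merge into the long vowel
    elif left == "a" and right in GUNA:
        merged = GUNA[right]   # a + i -> e, a + u -> o
    else:
        return first + second  # not covered by this sketch
    return first[:-1] + merged + second[1:]
```

For example, `join("deva", "indra")` gives `devendra` and `join("hima", "ālaya")` gives `himālaya`. Note that the function looks only at the two sounds that meet and never at the meanings of the words, which is the point made above.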

Thus the linguistic treatise Ashtaadyayi deals not only with grammar (which is primarily syntax and semantics - विभक्तिः कारकम् च) but also with the rules of phonetics (Shikshaa) and, in general, all aspects of "sound" (शब्दः) which form the basis of language, including morphology, etc.

Not just Sandhi: the fundamental formations in Sanskrit, roots + suffixes (प्रकृतिः + प्रत्ययः), and the word generation (व्युत्पत्तिः) processes are essentially based on phonetics (शिक्षा), such as Vriddhi, Guna, Samprasaaranam, etc. (वृद्धिः गुण सम्प्रसारणम् इत्यादिप्रक्रियाः). These are processes of expansion of syllables, which are purely natural sound modifications that occur while joining syllables. They are evident across word formation, in both noun forms and verb forms, from singular to plural forms and also in declensions. The Vriddhi, Guna, etc. in primary and secondary noun derivatives from noun roots, nouns and verb roots (वृद्धिगुणादयः - कृत्तद्धितेषु) again strictly follow the rules of phonetics. In addition, phonetic features such as Natvam and Shatvam (णत्वम् षत्वम् उभयमपि) are also part of shikshaa. The letter 'h' (ह्) becoming the fourth letter of the group consonants (वर्गीयचतुर्तम् अक्षरम्) is again shikshaa. Similarly, almost all the Dhaatus (निज-धातुः), the verbal roots, are single-syllable phonetic (sound) forms, as are the suffixes (प्रत्ययः). The prefixes (उपसर्गः) are mostly two-syllable sound forms.
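The Guna and Vriddhi expansions mentioned above are regular enough to be written as a lookup table. The Python sketch below, on romanised (IAST) roots, strengthens only the final vowel of a root; the table holds the standard Guna (a, e, o, ar) and Vriddhi (ā, ai, au, ār) grades, while the function name and the restriction to the final vowel are simplifying assumptions of mine, not the procedure of the Ashtaadyayi.

```python
import unicodedata

# Simple vowel -> (guna grade, vriddhi grade), in IAST transliteration.
GRADES = {
    "a": ("a", "ā"),
    "i": ("e", "ai"),
    "ī": ("e", "ai"),
    "u": ("o", "au"),
    "ū": ("o", "au"),
    "ṛ": ("ar", "ār"),
}


def strengthen(root: str, grade: str) -> str:
    """Replace the final vowel of a root by its guna or vriddhi grade."""
    if grade not in ("guna", "vriddhi"):
        raise ValueError("grade must be 'guna' or 'vriddhi'")
    root = unicodedata.normalize("NFC", root)
    final = root[-1]
    if final not in GRADES:
        return root  # root does not end in a simple vowel: left unchanged here
    guna, vriddhi = GRADES[final]
    return root[:-1] + (guna if grade == "guna" else vriddhi)
```

For the root kṛ this gives kar (as in kartā) for Guna and kār (as in kāraka) for Vriddhi; for nī it gives ne (as in netā) and nai.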


There are many shikshaa shaastras (शिक्षाशास्त्राणि); more than 40 have been found so far for the four Vedas (चतुर्वेदाः) and their shaakhaas. Of these, the Paniniya shikshaa for the Laukika (non-Vedic) branch of Samskrit is famous. Maharishi Panini integrated all these branches of 'the science of language' in his monumental work Ashtaadyayi. It appears that the entire work of Maharishi Panini is to make rules for pronouncing the "Word" correctly, as the "Word" in itself has an inseparable meaning attached to it; thus the perfect pronunciation of just one "Word" takes you to heaven, as per Maharishi Patanjali.

The word "vyaakaranam - vi+aa+kr+lyut (suffix)" (व्याकरणम् = वि+आ+कृ+ल्युट्) itself means a "special form" (of language), with stress on the verb (creation of the form). Two similar words, (1) "aakaarah - aa+kr+ghan (suffix)" (आ+कृ+घञ्) and (2) "aakritih - aa+kr+ktin (suffix)" (आ+कृ+क्तिन्), represent form and shape respectively in common usage. The extra prefix 'vi' (वि उपसर्गः) gives the meaning of "special". Thus the word vyaakaranam itself means the entire science of the creation of the language, which includes abiding by the natural phonetic capabilities of the human vocal faculties and also reflecting the natural and eternal "sound-meaning" combination (शब्दार्थयोः निसर्गनित्यसम्बन्धत्वम्).

The entire Shikshaa shaastram is based on human vocal anatomy, and its primary purpose in the laukika part of the language is to make pronunciation easy and natural (similar to Veda), in addition to the shortening, softening, replacing, adding, etc. of syllables based on the natural movement of the tongue and the natural functioning of the vocal cords. This has also helped in making the entire language musical, which in turn helped in the easy communication and retention of huge volumes of treatises over thousands of years, generation after generation, based on the most natural and easy-to-remember phonological sounds.

The natural intertwining of phonology and language, music and literature: a true wonder! I hope we understand it, hold it dear (on our tongues) and preserve it by passing it to the next generation without any deterioration...

This shikshaa shaastra is primarily a Vedanga, which means a part of Veda, and is also used in Yoga, Tantra and the Shastras. The natural relationship between language and phonology proves that Samskrit is a well-constructed (not by humans) language and is indeed the greatest gift to mankind. From whom? Who else...!
------------

Personally, this has led me to the conclusion that originally all 6 Vedangaas (Shikshaa, Chandas, Nirukta, Vyaakaranam, Jyotisha and Kalpa) must have been a single shaastra (maybe called Vyaakaranam, based on the Yogaartha of the word) and must have been an integral part of Veda in the earlier era (Dwapara Yuga), when Veda was just one!

Basic details of Shikshaa can be found here: https://vedavichara.com/the-vedas/vedangas-the-limbs-of-vedas.html
and
http://en.wikipedia.org/wiki/Shiksha


To continue...

Friday, March 1, 2013

Why Sanskrit? in Computational Linguistics - Part 2

First of all, the confusion that needs to be cleared up is whether Sanskrit is best suited for computing or for computer programming; my view is both. Yet this paper is not about Sanskrit as a computer programming tool, even though there are scientists and academicians who are developing programming languages based on Paninian principles. This paper deals with Sanskrit as a computing tool. Computing here refers to concepts, algorithms and methodologies.

Computer programming is an entirely different thing, as it deals with a human being writing code in a high-level computer language, which is in turn translated to low-level code by compilers /linkers, which in turn is translated to operating system instructions, which in turn are translated to microprocessor instructions (based on the CPU instruction set), which are internally converted into binary instructions, which are further converted to digital electronic (electrical) signals for flip-flops, counters, etc.

The entire chain of programming is based on mathematics /symbol language and not on any human language, spoken or written, even though the symbols include a few human-readable characters such as the numerals 0-9, the letters a-z, and some mathematical signs such as +, -, /, %, etc. All these are part of ASCII, which defines 128 characters (its 8-bit extensions have 256), so each symbol fits into a single byte. These symbols are assigned to certain operative values in digital electronics; thus programming languages are not human languages. That is precisely why human beings want Natural Language Processing, or human language processing capabilities, in computers, which literally means our languages being understood by computers. So far computers understand only computer languages and human beings only human languages.
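The point that every programming symbol is just a small number can be seen directly in Python. The sketch below assumes standard 7-bit ASCII (128 code points, numbered 0 to 127); the two helper names are hypothetical ones of my own.

```python
def ascii_codes(text: str) -> list[int]:
    """Return the one-byte ASCII code of each character in the text."""
    # str.encode raises UnicodeEncodeError if a character is outside ASCII.
    return list(text.encode("ascii"))


def fits_in_ascii(text: str) -> bool:
    """True if every character is one of the 128 standard ASCII code points."""
    return all(ord(ch) < 128 for ch in text)
```

For instance, `ascii_codes("a+1")` gives `[97, 43, 49]`, while `fits_in_ascii("शून्य")` is `False`, since Devanagari letters lie outside ASCII and need a larger encoding such as UTF-8.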

With respect to computer languages, the instructions (let's say commands /actions /verbs) are very limited. About 15 are widely used, such as go to, break, compare, copy, reverse, assign, operators (+, -, *, /), receive, display, etc. A few other actions (verbs), such as sort and list, can be written as functions. Thus the capability of a computer language in comparison with human language is very limited.

In comparison, in the Ashtadyayi, Panini's thousands-of-years-old Sanskrit grammar treatise, the meta language used is not Sanskrit itself but uses certain words and rules of Sanskrit, and it is used to teach Sanskrit grammar to the readers of the Ashtadyayi. That meta language has more instruction sets, yet without using any explicit verb. Thus if one can make a high-level programming language exactly mimicking the meta language of the Ashtadyayi, we will have a powerful tool with which a computer can generate words, form sentences, etc.; yet associating meanings will be the biggest challenge.

The hypothesis is this: if there is a highly structured human language, can that language be used for Natural Language Processing? The answer is yes, and to a wonderful degree, covering complex human sentences. How? In linguistics, and most importantly in computational linguistics, the following are essential for scientific analysis (for computers to do the analysis) of a language, which consists of sentences, and sentences have inherent meanings.

How then can the analysis take place?
Conveying the sentence meaning without ambiguity is the first and most important thing, because computers don't have intelligence. A computer's understanding of language is based on a particular structure (let's say a word or phrase and its meaning), and it tries to combine or mix and match these to arrive at a particular meaning. Here again the computer doesn't care about the meaning; it responds to a question which has a particular meaning, and from a set of answers the most suitable one is chosen and given, which is in turn based on a particular individual, popularity, number of occurrences, place, time, etc.

In natural language for example sentences like -

a- The committee chair chairs the meetings where the chair is elected as the chair for one more chair-term.
b- All committees' chairs chair their meetings to elect the chair and the past chair is elected as the new chair for the next chair-term.

Now, we can easily understand the meanings of these sentences, but a computer can't. This is where the language's ambiguities matter: word meanings, the declensions and usage of words, phrase meanings, and how phrases combine with other phrases within a sentence.

English is the most complex language: it takes even a native speaker 8-10 years to achieve proficiency. It takes just 2 years to achieve proficiency in Sanskrit, and another 2 years in literary Sanskrit. In addition, the written form of English is non-phonetic, which adds its own problems in converting text to voice. Due to many borrowed words, spellings and pronunciations also differ and add complexity. Moreover, there are the regional flavours.

The interpretation of a text in computing goes through parsing, part-of-speech tagging, lexical analysis, morphological analysis, syntactic analysis and then semantic analysis. Thus the more structured and scientific the language is, the fewer the problems in computer-based natural language processing and its applications, such as Machine Translation and Machine Assisted Translation.

In general, a language means a collection of sentences; a sentence means a collection of meaningfully associated words such as subject, object, verb, etc. Each word in a sentence should have clear and easily understandable verbal and nominal declensions; if not, confusion starts. In the above example the word "chair" is both a verb and a noun, which causes enormous confusion for a computer.
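The confusion caused by "chair" can be made concrete with a toy Python sketch. The tiny lexicon below is hand-made and purely hypothetical (a real part-of-speech tagger uses a large dictionary plus statistics); it only counts how many tag sequences a computer must choose between before any meaning is considered.

```python
from itertools import product

# A hand-made toy lexicon (an assumption of this sketch, not a real tagger's data).
LEXICON = {
    "the": {"DET"},
    "committee": {"NOUN"},
    "chair": {"NOUN", "VERB"},
    "chairs": {"NOUN", "VERB"},
    "meetings": {"NOUN"},
}


def candidate_taggings(sentence: str) -> list[tuple[str, ...]]:
    """List every combination of part-of-speech tags the lexicon allows."""
    options = [sorted(LEXICON.get(word.lower(), {"UNK"})) for word in sentence.split()]
    return list(product(*options))
```

For "The committee chair chairs the meetings" there are already 4 candidate taggings from just two ambiguous words, and the number multiplies with each further ambiguous word in the sentence.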

Let's explore further. Human language is highly ambiguous, primarily because of the ambiguities of word sense (meaning): the complexities multiply from the meaning of a word on its own, to its meaning in association with another word (in a phrase), to its meaning in association with verbs and other words in a sentence. With quotations and idioms the complexities only increase further. Now add acronyms, and what do we get? The most complex thing known to human beings, next only to the human mind (or are language and mind one and the same?).

Thus, in linguistic terms, the complexity increases exponentially at each step through the 6 most important levels of linguistics. These levels, in order, and how they are handled in Sanskrit, are given below.

(1) Phonetics and Phonology: knowledge about linguistic sounds. In Sanskrit this is known as Shiksha shastra. Sanskrit has over 40 Shikshaas for the shaakhaas of the Vedas, but for language in general the Paniniya shiksha is most suitable, as it connects directly to the grammar, and the rules of the grammar also abide by the rules of the phonetics.

(2) Morphology: knowledge of the meaningful components of words, from stems to their generation and usage. In Sanskrit this is called 'pada vyutpatti'. There are 4 types of vrittis (word generators) for this purpose, namely Krit, Taddhita, Samasa and Sannaadyanta. In addition, the method for generating words is explained step by step in Panini's Ashtadyayi like a mathematical equation; thus programming a computer to generate words is easiest.

(3) Lexical: knowledge of meanings and equivalent words. Every Sanskrit lexical item has a one-to-one correspondence, so a particular word used in one place means the same when used elsewhere too, from a semantic point of view. The Amara Kosha, Nirukta and Nighantu together contain a complete lexical database of Sanskrit words and associated word connections.

(4) Syntax: knowledge of the structural relationships between words, i.e. the declensions of nominal forms /stems. In Sanskrit the Vibhaktis play this role; the rules are very tight, thus there is no ambiguity. Also, Sanskrit is a language without prepositions, so a major complexity is removed. This too is explained step by step in the Ashtadyayi like a mathematical equation.

(5) Semantics: knowledge of the meaning of words in a sentence. In Sanskrit this is discussed in detail in many works, and in Sanskrit vyakarana it is called "Kaarakam". Many ways of analysing sentence meanings on a scientific basis are available in Sanskrit in the different schools of linguistic science, such as Vyakarana, Nyaya and Mimamsa.

(6) Pragmatics: knowledge of the relationship of meaning to context. This is the most complex, as meanings change based on context and many other factors. In Sanskrit there is a wonderful Vyakarana treatise on pragmatics called "Vakyapadiyam" by Maharishi Bhartrhari. It is a pity that many Sanskritists are not aware of it, though the treatise is very popular among European linguists, in particular German, Belgian and French ones.

Almost all kinds of meaning analysis based on the relationship between 2 words (sameness, opposites, connection, association, context, etc.) are dealt with in detail in the "Vakyapadiyam". In the West, scholars have been inspired by this in building their theories of semantics and pragmatics.
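Point (4) above, that the Vibhakti endings fix the role of each word, can be sketched in Python with the sentence rāmaḥ grāmam gacchati ("Rama goes to the village"). The lookup table of inflected forms is hand-made and hypothetical; a real analyser would derive the case from the ending by rule, and mapping the nominative to the agent holds only for an active sentence like this one.

```python
import unicodedata

# Inflected form -> (stem, case or verb form, role). A hand-made toy table:
# a real analyser would compute these from the endings instead of listing them.
FORMS = {
    "rāmaḥ": ("rāma", "nominative singular", "agent"),
    "grāmam": ("grāma", "accusative singular", "object"),
    "gacchati": ("gam", "present, 3rd person singular", "action"),
}


def roles(sentence: str) -> dict[str, str]:
    """Map each role to its stem, using only the word forms, not their order."""
    result = {}
    for word in sentence.split():
        stem, _form, role = FORMS[unicodedata.normalize("NFC", word)]
        result[role] = stem
    return result
```

Whichever way the three words are ordered, `roles` returns the same answer, `{"agent": "rāma", "object": "grāma", "action": "gam"}` up to key order, because the role is carried by the ending and not by the position.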

Some reference from Wikipedia follows (the link as on March 1st, 2013); further references from linguistic journals and scientific publications are given below.

"Pāṇini's work became known in 19th-century Europe, where it influenced modern linguistics initially through Franz Bopp, who mainly looked at Pāṇini. Subsequently, a wider body of work influenced Sanskrit scholars such as Ferdinand de Saussure, Leonard Bloomfield, and Roman Jakobson. Frits Staal (1930-2012) discussed the impact of Indian ideas on language in Europe. After outlining the various aspects of the contact, Staal notes that the idea of formal rules in language – proposed by Ferdinand de Saussure in 1894 and developed by Noam Chomsky in 1957 – has origins in the European exposure to the formal rules of Pāṇinian grammar. In particular, de Saussure, who lectured on Sanskrit for three decades, may have been influenced by Pāṇini and Bhartrihari; his idea of the unity of signifier-signified in the sign somewhat resembles the notion of Sphoṭa. More importantly, the very idea that formal rules can be applied to areas outside of logic or mathematics may itself have been catalyzed by Europe's contact with the work of Sanskrit grammarians


de Saussure

Pāṇini, and the later Indian linguist Bhartrihari, had a significant influence on many of the foundational ideas proposed by Ferdinand de Saussure, professor of Sanskrit, who is widely considered the father of modern structural linguistics. Saussure himself cited Indian grammar as an influence on some of his ideas. In his Memoire sur le systeme primitif des voyelles dans les langues indo-europennes (Memoir on the Original System of Vowels in the Indo-European Languages) published in 1879, he mentions Indian grammar as an influence on his idea that "reduplicated aorists represent imperfects of a verbal class." In his De l'emploi du genitif absolu en sanscrit (On the Use of the Genitive Absolute in Sanskrit) published in 1881, he specifically mentions Pāṇini as an influence on the work."

Sanskrit is referred to as Devabhasa (the language of the Gods) not because it is the oldest language but because it is so perfect in its structure, morphology, semantics, etc., which have not changed for thousands of years. Only a good linguist understands that such a perfect language can't have been created by human beings or cavemen, nor can it have evolved. What's the evidence? We have seen English evolve from a structured language, with some formal structures and usage initially, to one that now has no structure and is highly ambiguous, not with respect to human understanding but with respect to linguistics. With respect to human understanding it has become easy and flexible. As a result, can we say that the human mind has become unstructured and dull? Maybe: scientists of past centuries found many things without labs and tools. The mathematician Ramanujan's tools were just a pencil and paper!

References:

  1. The science of language, Chapter 16, in Gavin D. Flood, ed. The Blackwell Companion to Hinduism Blackwell Publishing, 2003, 599 pages ISBN 0-631-21535-2, ISBN 978-0-631-21535-6. p. 357-358
  2. George Cardona (2000), "Book review: Pâṇinis Grammatik", Journal of the American Oriental Society 120 (July–September, 2000): 464–5, JSTOR 606023
  3. Leonard Bloomfield (1927). "On some rules of Pāṇini". Journal of the American Oriental Society (American Oriental Society) 47: 61–70. doi:10.2307/593241. JSTOR 593241
  4. Ashtadyayi Reference: http://avagraha.wordpress.com/
  5. Sanskrit Programming - Reference 2 sites both contains lot of information - (1) http://vagartham.blogspot.in/ and
    (2) http://uttishthabharata.wordpress.com/
  6. Functional programming - Reference: http://vishk.wordpress.com/2007/02/11/backus-naur-form-and-ashtadhyayisanskrit-grammar/
  7. Parser /Tokenizer for Samasa - Vaakkriti: Sanskrit Tokenizer, Aasish Pappu and Ratna Sanyal, Indian Institute of Information Technology, Allahabad (U.P.), India, Proceedings from the paper submitted in Third International Joint Conference on Natural Language Processing, 2008, Hyderabad, India
  8. The methods used inside the ashtadyayi are similar to today's arrays, inheritance (including multiple inheritance), polymorphism, etc. used in OOPS - Reference: Recent Research in Science and Technology, 2011, 3(7): 109-111,  ISSN: 2076-5061, www.scholarjournals.org
  9. Computational Linguistics - Reference: Hyman Malcolm D., “From Pāninian Sandhi to Finite State Calculus”, Sanskrit Computational Linguistics: First and Second International Symposia, Revised Selected and Invited Papers,  ISBN:978-3-642-00154-3, Springer-Verlag, 2009.

Friday, February 22, 2013

Lost in Translation - Yogaartha vs. Rooddyartha

Meanings are lost in translation. This generally happens, and is accepted to some degree, in other languages. But with Sanskrit, translations can sometimes be completely wrong, particularly with respect to the shastras (sciences). What's so special here, and why?

Prakritih (root: both the verb root, called Dhatu, and the noun root, called Praatipadika), when joined with Pratyayah (which can loosely be termed a suffix, but is more than just a suffix), gives "Padam", the word in Sanskrit. The entire Sanskrit language is nothing but a mixture of Prakritih and Pratyayah. Here the word Prakritih is of feminine gender and the word Pratyayah of masculine gender (connecting with the higher principle of Prakriti - Purusha).

Both Prakritih and Pratyayah contribute meaning to a Padam (word). One conveys the conceptual (root) meaning and the other its meaning in worldly usage (vyavahara). The original meaning of a word (Prakritih + Pratyayah) is called Yogaarthah. The word is Yogah (not Yogh or Yogaa; both are wrong pronunciations, one widely used in Northern India and the other by people in Western countries). Yogah means enjoinment; thus Yogaarthah is the original meaning of a word when Prakritih and Pratyayah are enjoined.

However, when a word is used over a long time for a specific purpose, the meaning of the word gets associated with that purpose. That superimposing of a meaning on a word is called Rooddyarthah. This superimposing (change of meaning) is dealt with in Sanskrit texts that are more than 2000 years old; this is another proof that the language is very ancient and has long been widely used.

When we look at dictionaries, the first choice of meaning is always the Rooddyarthah and not the Yogaarthah. But in Sanskrit shastras (scientific treatises) the Yogaarthah is what is invariably used, and not the Rooddyarthah; this is the case even with shastric texts written as late as the 17th century. Thus when we read /translate Sanskrit scientific texts we have to be mindful of the yogaartha and also very careful about the contextual meaning. The Bhagavadgita, which is a Yogashastra as well as the Gitopanishad, and the Yogasutra of Patanjali Maharishi are, when translated, also subject to these rules, as well as to the important "rule of studying shastra" in Sanskrit.

The rule of shastra study is that before one embarks on a study of Vedanta one should study Vyakarana, Mimamsa and Nyaya; to understand the Shankara bhashyam of the Bhagavadgita one needs these three shaastras. But nowadays people who have not studied even the basics are doing free-flowing translations of the Bhagavadgita, Yogasutra, Yogavaashista and many other texts with the help of somebody else's translation, which is again based on Rooddyarthas. This is wrong, as meanings get diluted.

One needs a strong understanding of the meanings of Dhatu, Upasarga (prefix) and Pratyayah, in addition to the Sanjna /Paribhasha (the nomenclature and codewords /acronyms) of the specific shastra that is being translated. Without such elaborate preparation the translation, and the effort, become unworthy and useless.

Some examples of Yogaartha vs. Rooddyartha.

The word "Ooha" is generally used with the meaning "guess"; its Yogaartha is "application". The word "Laavanyam" is used to describe exceptional beauty; in Yogaartha it actually means "saltiness". The word "Vyakti" is used to refer to a person, whereas its Yogaartha is "manifest" or "known". Even the most talked-about word, "Yogah", is currently used for "exercise", whereas its Yogaartha is "union" or "enjoinment". Similarly, the Yogaartha of the word "Bhoo" is "be", and its derivative Bhoota means "being" (in the sense of life and life-form: life is eternal and always exists; only the forms get formed or changed). Similarly the word Dhyaanam is popularly used for meditation, whereas its Yogaartha is "brood over". Similarly the word "gamanam" (gam /gach dhatu) means not going /travel, but reaching or attaining.

In the same way, Upadesah, Upavaasah and Upanyaasah - all three are kridanta words - in Yogaartha mean being /placing near the object of focus (God), yet in Rooddyartha Upadesam means advice, Upavaasam means abstaining from food and Upanyaasam means spiritual discourse. Similarly Avataara means descent /getting down, but it has come to mean manifestation, and now, after the popular movie, it has come to mean one's image in a digital /virtual world.

Another interesting point is that there are degrees of meaning for a word. E.g.: the word "Shariram" generally means body, but when the body of a youth is referred to it is called "Dehah"; a man's body is called "Gaatram", then "Kaaya", and an old man's body "Kalevaram"; Form /Devata forms - male & female - and also in some cases the female body are called "Vapuh" (Vapuz stem), and the female body "Tanuh". Also "Aakaara" is used for Form and "Aakritih" is used for body in a general sense.

Another corrupted word is "Aarya" - "Ri" Dhatu + Nyat pratyayanta kridanta roopam = Aryam (Aaryam yasya sah = Aaryah). Aaryah as per yogaartha is the one who instills order, yet in rooddyartha it is applied to a noble person or a person of a higher race. Here it is to be noted that as per yogaartha only a kshatriya in pravritti (in action) or God as an Avataarah (again in action) can be Aarya - like Sri Rama, Sri Krishna, Maharajah Vikramaditya, Maharajah Bhoja, Maharajah Shivaji, Maharaja Krishnadevaraya - as they have established Dharma. A renunciate cannot, because renunciates have gone beyond Dharma and are on the path of Moksha or have attained it. If they happen to be social reformers also then they can be addressed as Aarya. If they are pure enlightened beings then they can be considered equivalent to God but not Aarya. Nature is the biggest Aaryaa.

The earlier rooddyartha of Aarya was noble person; later, due to the influence of Western indologists, it became invaders. Now, as per the convenience of Tamil Nadu politicians, Aarya means a fair-skinned person (North Indian) who displaced the so-called native population down south - the height of ignorance and gullibility!

Prithvee - "prith" Dhatu - unaadi - "Prithu" - its Stree lingam (feminine gender) is Prithvee - yogaartha = manifold (that which is one yet manifolds into many - vyakarana itself teaches vedanta!). In Rooddyarthah this word is used for Earth or Big.

Ajinam - this word (taddhita compound) means something connected with a sheep - used for sheep's wool (sweater) etc. or sheepskin. Later this word came to mean skin in general, etc. Now people associate Ajinam with tiger skin.

Similarly the word avagamanam means understanding; gnanam means awareness; Buddhih means intellect; Matam means opinion or abhiprayah - this has now become Religion, etc. There are more words for the various degrees /grades of human knowledge, such as pratipattih, prateetih, sampratyayah, dheeh, bodha, samvit, gnanam, etc. - each one of a higher order than the previous one. There are no equivalent English words for these, thus it is difficult or impossible to translate Sanskrit Shastras into other languages.

Thus for a correct understanding of a particular Shastra one has to study it in its original language. Translation is such a poor alternative that in some cases we would be better off not studying it at all. E.g.: if one tries to translate the khaNDana-khaNDa-khAdya of the great Poet Sriharsha, one will understand. Similarly many scholars admit that there is only one good translation (in English) of Sri Nagarjuna's Moola Madhyamika kaarika in all these years of Buddhist studies - no wonder Buddha is misunderstood - and so there is a fight between 2 wrong understandings, which is still going on! (The same is the case with the writings of J Krishnamurthi, which are in English; to translate them into a non-European language would be a herculean task.)

All these collectively highlight the mistakes in our understanding born out of not studying Sanskrit properly. Our entire culture is based on Sanskrit, yet we don't learn it! How then will we know the hidden values and Scientific rationale behind our culture? Not knowing is certainly a shame.

Widespread study of Sanskrit shastras stopped in the middle of the 19th Century. Some of our fathers and grandfathers in the past 2-4 generations must have studied Vedas /Sanskrit but they didn't study the shastras, particularly Vyakarana - which is the foundation stone of the language. Though Veda paaTashaalas somehow survived, many of the shastra paaTashaalas were closed at the beginning of the last century. Thanks to the efforts of many traditional MaTas and Acharyaas some shastra paaTashaalas were revived - where traditional shaastraas are taught in the traditional way. My humble salutations to them - but for them I wouldn't be writing this, and I'm merely a pipe carrying the thoughts of teachers.


By saying all this I'm not saying Rooddyartha is wrong. All I'm saying is that one should be mindful of which meaning is used in a particular Shastra in a particular context and translate accordingly. This is also important for Computational Linguists who are developing Machine Translation systems.
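
For Machine Translation, this point can be sketched in a few lines of Python. This is only a minimal illustration and not a real system: the LEXICON table, the gloss function and the shastric_context flag are hypothetical names, and the glosses are simply the examples given earlier in this post.

```python
# Hypothetical two-sense lexicon; the glosses come from the examples in this post.
LEXICON = {
    "ooha": {"yogaartha": "application", "rooddyartha": "guess"},
    "laavanyam": {"yogaartha": "saltiness", "rooddyartha": "exceptional beauty"},
    "vyakti": {"yogaartha": "manifest", "rooddyartha": "person"},
    "dhyaanam": {"yogaartha": "brood-over", "rooddyartha": "meditation"},
}


def gloss(word, shastric_context):
    """Pick the derived sense (yogaartha) for shastric text,
    otherwise the conventional sense (rooddyartha)."""
    senses = LEXICON[word.lower()]
    key = "yogaartha" if shastric_context else "rooddyartha"
    return senses[key]


if __name__ == "__main__":
    print(gloss("Ooha", shastric_context=True))    # sense for a shastra
    print(gloss("Ooha", shastric_context=False))   # everyday sense
```

A real system would of course have to decide the context by itself, which is exactly where knowledge of the specific Shastra comes in.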

Thursday, February 7, 2013

Why Sanskrit? in Computational Linguistics - Part 1

This is a concise introduction to "How is Sanskrit the most suitable language for Computing?" and now "In what way is Sanskrit suitable for Computational Linguistics?"

When I first heard a few years back that Sanskrit is the language most ideally suited for Computing, I was curious to know how - I couldn't get any straightforward answer. Later I found out on my own, with a bit of research on the Web and discussions with Linguistic scholars.

Two linguists, Dr. Leonard Bloomfield and Dr. Zellig Harris, who were active in the first half of the 20th Century, were responsible for the theories of Structural Linguistics - a main reason for the development of Computer programming languages. These theories were widely used in the first and second generations of Programming languages.

These two linguists - Leonard Bloomfield and Zellig Harris - I found, both went to Germany in the early 20th century and intensely studied both Vedic Grammar (Pratisakhyam) and the Paninian system - for 7 years! - in their post-Doctoral research /studies. They both studied in detail the works of Dr. Otto von Böhtlingk - a German Indologist and Sanskrit Scholar specializing in Vyakarana.

From Wikipedia page - http://en.wikipedia.org/wiki/Otto_von_B%C3%B6htlingk
"Böhtlingk was one of the most distinguished scholars of the nineteenth century, and his works are of pre-eminent value in the field of Indian and comparative philology. His first great work was an edition of the Sanskrit grammar of Panini, Aṣṭādhyāyī, with a German commentary, under the title Acht Bücher grammatischer Regeln (Bonn, 1839–1840)."

From Wikipedia page - http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form
"The idea of describing the structure of language with rewriting rules can be traced back to at least the work of Pāṇini (before the 4th century BC), who used it in his description of Sanskrit word structure. American linguists such as Leonard Bloomfield and Zellig Harris took this idea a step further by attempting to formalize language and its study in terms of formal definitions and procedures (around 1920–60)."



IAL (International Algebraic Language), later renamed ALGOL 58, was one of the early Computer Programming Languages. John Backus, a programmer at IBM, developed a notation to describe its syntax, using rewriting rules of the kind found in Sanskrit Grammar. Later, when ALGOL 58 was developed further into ALGOL 60, Peter Naur adopted the notation in the ALGOL 60 report and named it Backus Normal Form (BNF Notation) - it became a huge success and brought major developments to the computer field.

From Wikipedia page - http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form
"Further development of ALGOL led to ALGOL 60; in its report (1963), Peter Naur named Backus's notation Backus Normal Form, and simplified it to minimize the character set used. However, Donald Knuth argued that BNF should rather be read as Backus–Naur Form, as it is "not a normal form in any sense" unlike, for instance, Chomsky Normal Form. The name Pāṇini Backus form has also been suggested in view of the facts that the expansion Backus Normal Form may not be accurate, and that Pāṇini had independently discovered a similar notation centuries earlier"


Later, programming languages and linguistics developed further when Noam Chomsky introduced Generative Grammar. (Noam Chomsky was a student of Dr. Zellig Harris - Linguist and Sanskrit Vyakarana scholar.) The Sanskrit language's speciality is itself its Generative Grammar & Morphology.
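
To make the idea of "rewriting rules" and "generative" grammar a little more concrete, here is a small Python sketch. It is only an illustration under my own assumptions: the toy grammar (a "number" of one or two binary digits), the GRAMMAR table and the expand function are hypothetical, and it is neither Panini's notation nor the actual ALGOL grammar. Each rule says how a symbol in angle brackets may be rewritten, in the style of BNF, and the function generates every string the rules allow.

```python
from itertools import product

# Hypothetical toy grammar written as BNF-style rewriting rules:
#   <number> ::= <digit> | <digit> <digit>
#   <digit>  ::= "0" | "1"
GRAMMAR = {
    "<number>": [["<digit>"], ["<digit>", "<digit>"]],
    "<digit>": [["0"], ["1"]],
}


def expand(symbol):
    """Return every terminal string derivable from `symbol`.

    Works only for non-recursive grammars such as the toy one above,
    otherwise the rewriting would never terminate.
    """
    if symbol not in GRAMMAR:
        return [symbol]  # a terminal: nothing left to rewrite
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then join all combinations.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append("".join(combo))
    return results


if __name__ == "__main__":
    print(expand("<number>"))
```

A handful of rules is enough to generate the whole "language" - this is the sense in which a grammar is generative.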

Thus it is very clear that Maharishi Panini not only helped protect Sanskrit grammar by writing his linguistic canon, the "Ashtadhyayi", but also helped create Computer Programming languages. Panini - the first Computer Scientist.

Part 2 - How the rules of the Ashtadhyayi helped Programming languages, or how many of Panini's ideas are used "as is" in programming languages.