Tuesday, September 11, 2007

Sardinian - Sassarese: languages, or language and dialect?

Well, there is a nice website that can help us with that question ... and it comes from the institution that is officially responsible for this - the Region of Sardinia.

When it comes to the Limba Sarda Comuna used on the current Sardinian Wikipedia, there is no doubt that the language exists, but we must appreciate that it is an artificial language that was created out of the living languages of Sardinia. The website of the Region of Sardinia states:

Limba sarda comuna: una lingua realmente esistente: Sa Limba sarda comuna è naturale per il 92,8 per cento, è in posizione mediana rispetto a tutti i dialetti del sardo e può ancora essere migliorata per farla diventare la lingua ufficiale dei sardi.

Limba sarda comuna: a language that really exists: Sa Limba sarda comuna is 92,8 per cent natural, it is in an intermediate position with respect to all the Sardinian dialects and can still be improved so that it becomes the official language of the Sardinian people.

So they still want to improve the language ... nice ... 92,8 per cent of it is natural, which means 7,2 per cent is not natural. If I compare these percentages to what translators work with every day, that is the "matches" we get in our CAT tools, then 92,8 per cent is a low percentage of being "natural". It seems to be high, but in fact it is not ...

Let's say I translate any kind of text (a sentence, for example) and my analysis software tells me that the text is 93% equal to another sentence I translated before. This means that I cannot leave the sentence as is: I will need to change at least one word to make it a proper translation of what is there.

Just to give you an example:
The house on the hill is green - that is what was translated before. Now I get a high fuzzy match for a sentence like: the tree on the hill is green. If I left the old translation as is, it would state something completely different.
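
To put a rough number on that example, here is a minimal Python sketch using difflib from the standard library. It is only an illustration: real CAT tools calculate their fuzzy matches in their own ways, and both the word-by-word comparison and the function name word_match are assumptions of this sketch.

```python
from difflib import SequenceMatcher


def word_match(old: str, new: str) -> float:
    """Word-level similarity in percent (illustrative, not a CAT tool's formula)."""
    a = old.lower().split()
    b = new.lower().split()
    # ratio() is 2 * (matching words) / (total words in both sentences)
    return 100 * SequenceMatcher(None, a, b).ratio()


previous = "The house on the hill is green"
current = "The tree on the hill is green"
score = word_match(previous, current)
print(f"{score:.1f}% match")
```

With these short sentences one differing word out of seven weighs heavily, so the score stays below 92,8%; a longer sentence with a single changed word would land closer to that figure, and the meaning would still be different.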

You can also look at it like this:
The house on the hill is nice and green. - that is 100% English
The house on the hill is nice and vert. - that is approx. 89% English + 11% French
(it is just a matter of playing with the number of words to get the 92,8%)
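
The arithmetic behind these percentages can be sketched in a few lines of Python. The assumption here is that every word counts equally, which is a simplification; the function name natural_share is made up for this illustration.

```python
def natural_share(total_words: int, foreign_words: int = 1) -> float:
    """Percentage of words in a sentence that are not foreign."""
    return 100 * (total_words - foreign_words) / total_words


# Nine-word sentence with one French word ("vert")
print(round(natural_share(9), 1))

# Fourteen words with one foreign word comes close to the quoted 92,8%
print(round(natural_share(14), 1))
```

So a sentence of about fourteen words with one foreign word in it is already at the level of "naturalness" quoted for the Limba sarda comuna.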

So what these 92,8% tell us: even if a huge part of it is considered to be built out of the "natural language part", it is still an artificial language.

But what is a language and what is a dialect? Well: that very much depends on the POV from which you look at things. But ISO has set some rules for determining what is a language and what is not. That is, before you can get an ISO 639 code for a language you need to prove that this language complies with the standard. Of course there are living languages that don't have an ISO code, because up to now nobody has cared for them - I am just thinking about Griko Salentino, a language spoken and written in Italy - but if people care about that language, they will ask for a code.

What is a dialect ...

a) a language without an army
b) a way of speaking that developed out of a language and that has some differences, for example in pronunciation or in some expressions, while sharing the same basics when it comes to grammar (just to mention one example)

So could
Campidanese (ISO 639-3: sro)
Gallurese (ISO 639-3: sdn)
Logudorese (ISO 639-3: src)
Sassarese (ISO 639-3: sdc)
be dialects of the Common Sardinian Language? Well ... from a purely logical POV this is not possible, because they were there long before the Common Sardinian Language was created ...

By having their ISO 639 code, they have shown that, when the code was requested, they complied with the requirements of the International Organization for Standardization (ISO). Therefore, on an international level, they are considered to be languages, each with its own ISO code.

Please let me repeat: there are languages that don't have one, but a code can be requested for them ...

When it comes to the language committee, we had to draw a line somewhere, and this line should not come from us. That is: it is NOT up to the members of the language committee to decide what is a language and what is not. We needed some kind of standard to apply, and the clearest one was and still is the ISO standard. So if somebody wants to complain and say that the four languages above are in fact dialects of Sardinian and not languages, we should kindly invite them to prepare their documentation and contact ISO directly to have the ISO 639-3 language codes taken away ... it is NOT up to the language committee to take such decisions.

There is another thing people should then consider: UNESCO also states that these four are languages, and they are listed in its Red Book of Endangered Languages. So whoever states that they are not languages, and is so sure about it, should also contact UNESCO. It is NOT up to the language committee to take decisions such as deleting four languages from the list of endangered languages ...

Sorry for being so ironic, but: when such discussions about what is and what is not a language come up ... well: before you come to us, please go to the INTERNATIONAL bodies that deal with the question.

We are only normal people who base their decisions on standards and can tell people where to go to request their code, but we can neither create that code nor influence what is recognised on an international level. (Nor do we want to do that.)

Now to the question of sc.wikipedia ... I remember that, at the beginning, sc.wikipedia tried to host all of the Sardinian languages; then someone came along and decided to make sc.wikipedia a Limba Sarda Comuna Wikipedia only. Well: the Limba Sarda Comuna is being used by the Sardinian authorities to facilitate their work.

In any case, the code "sc" stands for the macrolanguage Sardinian and not for the Limba Sarda Comuna, so there is no reason why the Limba Sarda Comuna should have the right to claim that code for itself. That is, the Limba Sarda Comuna, like any other language in the world that wants recognition by ISO, must request its own ISO 639 code. It is not an option to simply say: now let's take that one since it is there ... well, the one that is there stands for something else.

The question of the current sc.wikipedia came up because people were telling us that Sassarese is not a language but a dialect of Sardinian, and that the Limba Sarda Comuna (Common Sardinian Language) is the only "right language" of Sardinia.

Well, again: it is not up to us to decide whether Sassarese and the other three are languages or not - we rely on ISO 639-3 codes, since we had to draw a line and avoid simply asserting things. Nor is it up to us to decide whether the Limba Sarda Comuna is going to get an ISO 639 code. If you, who read this, are interested in this matter, it is up to you to get things under way.

See: the decision to base whatever we do on ISO 639-3 was one of the wisest decisions ever taken within the language committee ... imagine the fights (almost all politically motivated) we would have if we did not do this.

Just to make things clear - let me repeat:

a) we do NOT decide if something is a language or not
b) we base our decisions on ISO 639-3
c) we actually need a solution for various scripts used for one language
d) we would love to see Multilingual MediaWiki there, since it could be used to create easily sustainable communities
e) we are not going to keep discussing whether Sassarese is a language or not (it has a code)
f) we will need to find a solution for the Limba Sarda Comuna, which does NOT have an ISO 639 code and is using the sc code in an improper way.
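
The rule behind points a), b), e) and f) can be sketched as a simple lookup. This is only an illustration in Python: the table holds just the codes mentioned in this post (a real check would consult the ISO 639-3 registry), and the function name committee_view and its wording are made up for this sketch.

```python
# Codes as listed in this post, not a complete registry
ISO_639_3 = {
    "sro": "Campidanese",
    "sdn": "Gallurese",
    "src": "Logudorese",
    "sdc": "Sassarese",
}

# "sc" stands for the macrolanguage Sardinian, not for one variety
MACROLANGUAGE_CODES = {"sc": "Sardinian"}


def committee_view(code: str) -> str:
    """Return how a code would be treated under the 'we rely on ISO' rule."""
    if code in ISO_639_3:
        return f"{ISO_639_3[code]}: has an ISO 639-3 code, status is not up for debate here"
    if code in MACROLANGUAGE_CODES:
        return f"{MACROLANGUAGE_CODES[code]}: macrolanguage code, not the code of a single variety"
    return "no code known here: please request one from ISO first"


for code in ("sdc", "sc", "lsc"):  # "lsc" is a made-up placeholder, not a real code
    print(code, "->", committee_view(code))
```

The point of the sketch is that the committee only looks things up; it does not add or remove entries in the table.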

Thank you for your patience and understanding.
