Suffixes of Georgian last names

Posted on 12-11-2013 by Maarten Marx | TI

In a project for Transparency International Georgia we are creating a database of Asset Declarations of Georgian public officials. Besides a lot of valuable information about their income and their relations to companies, such a large database also gives us the opportunity to do some fun linguistic research.

Here we report on the names occurring in this database. These are the names of Georgian public officials and their reported relatives. Georgian names are simple: they always consist of two tokens, “firstname, surname”. There are 1604 different first names and 3883 different surnames in our database, drawn from a total of 19522 different full names.
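
As a minimal sketch of how such counts can be obtained, the snippet below splits two-token names and counts the distinct first names and surnames. The sample names, the whitespace-separated format and the helper `split_name` are assumptions made for illustration; they are not taken from the actual database.

```python
# Hypothetical sample; the real database holds 19522 different names.
full_names = [
    "Giorgi Beridze",
    "Nino Beridze",
    "Giorgi Kapanadze",
    "Tamar Giorgadze",
    "Nino Lomidze",
]


def split_name(full_name):
    """Split a two-token name into (first name, surname)."""
    first, last = full_name.split()  # raises ValueError if not exactly two tokens
    return first, last


distinct_names = set(full_names)
pairs = [split_name(name) for name in distinct_names]
first_names = {first for first, _ in pairs}
surnames = {last for _, last in pairs}

print(len(distinct_names), len(first_names), len(surnames))  # prints: 5 3 4
```

Unpacking into exactly two variables makes the sketch fail loudly on any name that does not follow the two-token pattern, which is a cheap way to test the claim that all names do.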

The ending of a Georgian surname is a good indication of the region that a person (or their ancestors) comes from. Among the 3883 different surnames there are only 396 different 4-letter endings and just 149 different 3-letter endings.
Perhaps unsurprisingly, the endings follow a Zipfian distribution. The wordcloud below is generated using the logarithm of the frequencies. The raw counts of the 14 most frequent 4-letter endings are listed next to it.

Wordle: GeorgianLastNameEndings4      
vili:1407 
adze:634 
idze:496 
iani:180 
auri:45 
lava:36 
ania:30 
aria:25 
elia:23 
hava:22 
auli:17 
khia:16 
nava:16 
laia:16
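
The counting of endings can be sketched as follows. The sample surnames and the helper `ending_counts` are hypothetical and only illustrate the idea; the two published counts at the end (vili and laia) are copied from the list above to show what taking the logarithm does to the weights.

```python
import math
from collections import Counter

# Hypothetical sample of transliterated surnames (not the real data).
surnames = [
    "Beridze", "Kapanadze", "Giorgadze", "Lomidze",
    "Mamulashvili", "Gelashvili", "Kvaratskhelia", "Gelovani",
]


def ending_counts(names, k):
    """Count the final k letters over the distinct surnames."""
    return Counter(name[-k:].lower() for name in set(names) if len(name) >= k)


counts4 = ending_counts(surnames, 4)
print(counts4.most_common(3))

# Effect of the log scale on the published counts:
# vili occurs 1407 times, laia 16 times.
raw_ratio = 1407 / 16
log_ratio = math.log(1407) / math.log(16)
print(round(raw_ratio, 1), round(log_ratio, 1))
```

On the raw counts the most frequent ending is almost 90 times heavier than the fourteenth; after taking logarithms the factor drops below 3, which keeps the rarer endings readable in a wordcloud.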

The 14 endings in Wikipedia

The 14 endings mentioned in the Wikipedia article on Georgian surnames are distributed as follows over our list of 3883 different surnames.
Only 343 of these (less than 10%) do not end in one of these 14 endings.
The wordcloud below was generated using the logarithms of the frequencies in the accompanying table.
If we used the raw frequencies, shvili and dze would dwarf the rest.

     
ending;count;share of surnames
shvili;1398;36% 
dze;1153;30% 
ia;385;10% 
ani;210;5% 
ava;151;4% 
uri;68;2% 
eli;66;2% 
ua;57;1% 
uli;27;1% 
shi;6;0% 
oni;6;0% 
ti;6;0% 
khi;5;0% 
kva;2;0%
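
The figures in this table can be checked with a few lines of Python. The counts are copied from the table above; the function `wikipedia_ending` is a hypothetical helper that sketches how a surname could be assigned to one of the 14 endings, and is not necessarily how the original counts were produced.

```python
# (ending, count) pairs copied from the table above.
WIKIPEDIA_ENDINGS = [
    ("shvili", 1398), ("dze", 1153), ("ia", 385), ("ani", 210),
    ("ava", 151), ("uri", 68), ("eli", 66), ("ua", 57), ("uli", 27),
    ("shi", 6), ("oni", 6), ("ti", 6), ("khi", 5), ("kva", 2),
]
TOTAL_SURNAMES = 3883

matched = sum(count for _, count in WIKIPEDIA_ENDINGS)
unmatched = TOTAL_SURNAMES - matched
shares = {
    ending: round(100 * count / TOTAL_SURNAMES)
    for ending, count in WIKIPEDIA_ENDINGS
}

print(matched, unmatched)  # prints: 3540 343
print(round(100 * unmatched / TOTAL_SURNAMES, 1))  # share of unmatched surnames, in %


def wikipedia_ending(surname):
    """Return the longest listed ending that the surname ends with, or None."""
    lowered = surname.lower()
    # Longest first, as a precaution in case one ending is a suffix of another.
    for ending in sorted((e for e, _ in WIKIPEDIA_ENDINGS), key=len, reverse=True):
        if lowered.endswith(ending):
            return ending
    return None


print(wikipedia_ending("Beridze"))  # prints: dze
```

The 14 counts add up to 3540, which leaves exactly the 343 surnames mentioned above, and the rounded shares reproduce the percentage column.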

Georgian First Names

In a follow-up post we will describe some research on Georgian first names. Here we just show the frequency distribution of the 54 most frequent first names. We intend to create a list stating the gender for (at least the top K most frequent of) our 1604 first names.

     
Giorgi:2600 
Nino:1635 
Davit:1188 
Tamar:818 
Ana:782 
Mariam:711 
Mariami:684 
Aleqsandre:626 
Nikoloz:608 
Maia:554 
Irakli:545 
Zurab:501 
Natia:472 
Nana:466 
Elene:459 
Qetevan:457 
Levan:453 
Ekaterine:416 
Luka:395 
Tamari:382 
Lasha:347 
Khatuna:328 
Sofio:328 
Manana:325 
Salome:309 
Tea:305 
Zaza:276 
Mikheil:266 
Nikolozi:263 
Marina:249 
Tinatin:244 
Vakhtang:240 
Lia:238 
Mamuka:236 
Lali:228 
Teimuraz:223 
Daviti:222 
Saba:217 
Shota:213 
Anastasia:210 
Tornike:201 
Nika:195 
Irma:187 
Marine:186 
Gocha:185 
Lela:185 
Lizi:185 
Givi:184 
Maka:181 
Natalia:181 
Shalva:181 
Beqa:180 
Rusudan:180 
Qetevani:172 
