What can we learn about the grammar of traditional Georgian vocal music from computational score analysis?

At the 7th International Symposium on Traditional Polyphony in Tbilisi in 2014, I met the French ethnomusicologist Simha Arom for the first time. Best known for his work on African Polyphony and Polyrhythm, he had also begun working on traditional Georgian music, trying to decipher its chordal syntax using manually derived chord-progression statistics (Arom and Vallejo, 2008; 2010). I had come across these papers before the meeting and, simply out of curiosity, had replicated their approach in Mathematica, which sped up the analysis significantly. Our meeting in Tbilisi was the beginning of a long-term collaboration aimed at understanding the chord-progression structure of traditional Georgian vocal music by analyzing sheet music in Western five-line staff notation.

As an important milestone, we have now developed a generative grammar model based on the self-learning Kohonen model (Kohonen, 1989) within a prefix-tree framework (Antonov, 2018; 2023), as described in Scherbaum et al. (2025). This represents a significant improvement over the classical Markov model, because the length of the context that influences each chord in a sequence is allowed to vary from chord to chord. We used this model to generate a large number of chord sequences, all conforming to the same grammatical production rules as our corpus. These sequences then served as training data for an artificial neural network, to test whether, as in large language models (LLMs), 'linguistic relationships' could be identified by visually analyzing the embedding space of the network. The results for chord-to-chord relationships are inconclusive, as the spatial structure of the embedding map for individual chords cannot be interpreted unambiguously. The embedding map for whole songs, however, shows a pronounced spatial clustering that reflects the different classes of our corpus. This suggests that the structure of the embedding map reflects the similarities and dissimilarities in the chordal syntax of the individual songs, which the network has learned in an unsupervised way.
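The idea of letting the context length vary from chord to chord can be illustrated with a minimal Python sketch. This is not the model of Scherbaum et al. (2025) and does not reproduce Kohonen's learning rule or Antonov's Mathematica package; it is a simplified stand-in that stores follower frequencies for contexts of every length up to `max_context` and, when generating, backs off to the longest context that occurs in the corpus. The chord labels, the toy corpus, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def build_context_counts(sequences, max_context=3):
    """Count which chord follows every context of length 0..max_context."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, chord in enumerate(seq):
            for k in range(max_context + 1):
                if i - k < 0:
                    break  # not enough preceding chords for this context length
                counts[tuple(seq[i - k:i])][chord] += 1
    return counts


def next_chord(counts, history, max_context=3, rng=random):
    """Sample the next chord from the longest context found in the counts."""
    for k in range(min(max_context, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            followers = counts[context]
            chords = list(followers)
            weights = [followers[c] for c in chords]
            return rng.choices(chords, weights=weights, k=1)[0]
    raise ValueError("the model contains no counts")


def generate(counts, start, length, max_context=3, seed=0):
    """Extend `start` chord by chord until the sequence has `length` chords."""
    rng = random.Random(seed)
    seq = list(start)
    while len(seq) < length:
        seq.append(next_chord(counts, seq, max_context, rng))
    return seq


# Hypothetical toy corpus; the labels are placeholders, not real corpus data.
corpus = [
    ["I", "VII", "I", "IV", "I"],
    ["I", "IV", "V", "I"],
    ["I", "VII", "I", "IV", "V", "I"],
]
counts = build_context_counts(corpus, max_context=3)
print(generate(counts, start=["I"], length=8))
```

In this toy corpus the context ("I", "VII") is always followed by "I", so that continuation is forced, whereas after ("I", "IV") two continuations compete. A fixed-order Markov model would treat every position with the same context length; the back-off step above is one simple way to let the available evidence decide how much context to use.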

The 452 scores in the analyzed corpus fall into seven classes, representing different regions and/or schools. These classes occupy different areas of the embedding map. For details, see Scherbaum et al. (2025).

References

Antonov, Anton. (2018). Tries With Frequencies Mathematica package. Retrieved from github.com/antononcube/MathematicaForPrediction

Antonov, Anton. (2023). Using Prefix trees for Markov chain text generation. Retrieved from community.wolfram.com/groups/-/m/t/2819012

Arom, Simha; Vallejo, Polo. (2008). "Towards a theory of the chord syntax of Georgian Polyphony." Proceedings of the 4th International Symposium on Traditional Polyphony. [The Fourth International Symposium on Traditional Polyphony]. Eds. Rusudan Tsurtsumia and Joseph Jordania: pp. 321–335. Tbilisi: International Research Center for Traditional Polyphony of Tbilisi State Conservatoire.

Arom, Simha; Vallejo, Polo. (2010). "Outline of a syntax of chords in some songs from Samegrelo." Proceedings of the 5th International Symposium on Traditional Polyphony. [The Fifth International Symposium on Traditional Polyphony]. Eds. Rusudan Tsurtsumia and Joseph Jordania: pp. 266–277. Tbilisi: International Research Center for Traditional Polyphony of Tbilisi State Conservatoire.

Kohonen, Teuvo. (1989). "A self learning musical grammar, or 'Associative memory of the second kind'." International Joint Conference on Neural Networks. 1: pp. 1–5. Washington: International Neural Network Society.

Scherbaum, Frank; Arom, Simha; Caron Darras, Florent. (2025). "What can we learn about the grammar of traditional Georgian vocal music from computational score analysis?" Musicologist. 9(1): pp. 1–29.