From Oral Tradition to Online Access

Traditional Georgian polyphonic vocal music, which was originally passed down from generation to generation by oral tradition, is currently facing the disappearance of the old ways of immersion in traditional music making. On the other hand, since UNESCO proclaimed it part of the "Intangible Cultural Heritage of Humanity" (2001), international interest in traditional Georgian singing among scientists and music lovers has been growing. Professional and amateur ensembles around the world have included traditional Georgian vocal music in their repertoires, "music tourism" to Georgia has been booming in recent years, and Georgian singing teachers regularly perform and teach abroad. Moreover, new ways of merging traditional Georgian music with various other styles of music making have emerged, e.g. in the form of Georgian folk-fusion music (Lomsadze, 2019). And, fortunately, one can also observe (and listen to) a new generation of talented young Georgian musicians who are devotedly cultivating and continuing traditional Georgian music.

Therefore, it does not look as if traditional Georgian vocal music as such is an endangered genre. What has obviously changed, however, is the way in which the music is passed on from teachers to students, especially since the COVID-19 pandemic, when personal contacts had to be minimized. Nowadays, virtual internet meetings have become a popular teaching approach, and studying from scores and/or YouTube videos has become a common learning method. At a more sophisticated level, teaching CDs, often accompanied by scores in 5-line staff notation, provide audible access to individual voice tracks (usually accompanied by a stereo mix of all voices). All of these are important tools, created with a great deal of love for the music and lots of effort, which deserve grateful recognition. But should we leave it at that?

The Austrian composer Gustav Mahler (1860-1911) is quoted as having said "Keeping tradition alive does not mean to pray to the ashes but to pass on the flame". The question we have asked ourselves in this context is how the computer-based methods and web-based techniques developed in our project can contribute to "passing on the flame" in ways that do justice to the essential musical aspects of the traditional music and overcome some of the limitations of present teaching materials.

Three of the main limitations which we see in the type of teaching material currently popular are:

Western 5-line staff notations (scores) do not do justice to the non-tempered tuning systems used by many traditional Georgian singers. In addition, many practitioners of traditional Georgian music cannot read scores (and we feel they should not have to learn to).
A traditional Georgian song is more than the combination of three melodies sung at the same time. Teaching CDs necessarily have to represent the individual voices as isolated melodies. Details of the harmonic interaction between singers, which in live performances can be perceived as very fine-grained mutual (harmonic) intonation adjustments, usually get lost this way.
Classical teaching materials, e.g. scores, but also YouTube videos and teaching CDs, are static representations of the songs in the sense that there are very few options to interact with the material.

To overcome these limitations, we have developed a new web-based video/audio interface for interacting with multi-track audio/video recordings. Its purpose is to let users immerse themselves as deeply as possible in the full polyphonic (often non-tempered) soundscape of traditional Georgian singing, using state-of-the-art computer tools.

Design Features

Photo: Frank Scherbaum

Recording setup used for the collection of the GVM dataset (Scherbaum et al., 2019).

The new web-based video/audio interface, which we refer to as "GVM-Interface", was developed within the GVM (Georgian Vocal Music) project at the University of Potsdam and was initially designed for the multimedia structure of the GVM dataset (Scherbaum et al., 2019), consisting of video, conventional stereo, headset, and larynx microphone recordings, all synchronized to a common time code.

For the following illustration of the current features of the GVM-Interface, we use the five songs for which the raw field recordings were already presented at the ISMIR Conference in 2018 (Scherbaum et al., 2018) and for which the original GVM data are already freely accessible through what we refer to as the AudioLabs-Interface.

The main modification of the GVM-Interface with respect to the AudioLabs-Interface is that the volumes of the individual audio tracks can be chosen freely. This new property, which is essential for the features discussed below, is made possible by the pywebaudioplayer (Pauwels & Sandler, 2018), implemented by Reza Dokht Dolatabadi (2020). It facilitates the perception of the polyphonic structure of the songs and allows the generation of various "pseudo ensemble tracks" to sing along with or to study the interaction of the singers with each other.
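The idea of a "pseudo ensemble track" can be illustrated with a minimal Python sketch. It is not the code of the GVM-Interface or of the pywebaudioplayer; the function name mix_tracks, the track names, and the sample values are hypothetical and serve only to show how per-track volumes turn the same time-aligned recordings into different mixes.

```python
from typing import Dict, List


def mix_tracks(tracks: Dict[str, List[float]], gains: Dict[str, float]) -> List[float]:
    """Sum time-aligned tracks sample by sample, each scaled by its own gain.

    Illustrative only: 'tracks' maps a (hypothetical) voice name to its samples,
    'gains' maps the same names to a volume factor (0.0 = muted, 1.0 = unchanged).
    """
    lengths = {len(samples) for samples in tracks.values()}
    if len(lengths) != 1:
        raise ValueError("all tracks must have the same number of samples")
    mix = [0.0] * lengths.pop()
    for name, samples in tracks.items():
        gain = gains.get(name, 1.0)  # voices without an explicit gain stay at full volume
        for i, value in enumerate(samples):
            mix[i] += gain * value
    return mix


# Hypothetical toy data standing in for three synchronized headset tracks.
tracks = {
    "top": [0.5, 0.5, 0.0, 0.0],
    "middle": [0.0, 0.25, 0.25, 0.0],
    "bass": [1.0, 1.0, 1.0, 1.0],
}

# A sing-along mix for a student practising the top voice:
# top voice muted, middle voice unchanged, bass voice at half volume.
sing_along = mix_tracks(tracks, {"top": 0.0, "middle": 1.0, "bass": 0.5})
print(sing_along)
```

Changing only the gain dictionary yields a different pseudo ensemble from the same recordings, which is what continuous volume control offers over simple muting.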

In addition, in order to overcome the limitations of classical score notation, we visually represent the melodic and harmonic progression of the songs in an intuitive and unbiased way as multi-voice F0-trajectories (pitch trajectories) and note-track plots (cf. Scherbaum & Mzhavanadze, 2020). This representation is not tied to any tuning system and does not require the ability to read scores. Furthermore, we calculate and display the actually sung harmonic intervals between the three voices in real time in order to encourage harmonic perception ("vertical thinking").

Finally, as an additional innovative feature, the lyrics of the song for each voice are displayed in real time on top of the notes and, in some of the display modes, also close to the faces of the individual singers. This can be done in Georgian letters as well as in transcribed Roman letters (as is done in the examples below).

Interface Main Menu and Display Modes

The song selection starts from the main menu, of which a static screenshot is shown below. Research and experience indicate that learning is facilitated when teachers use a variety of techniques that are purposefully selected to achieve particular learning goals (National Research Council, 2002). To accommodate this strategy in the current version of the GVM-Interface, we have tentatively implemented five different display modes which can be seen as representing different teaching/learning scenarios.

Photo: Frank Scherbaum

Record Session mode

In this display mode, the recording location, the relative position of the singers as well as their (non-verbal) communication during the recording session can be observed while listening to their performance.

Photo: Frank Scherbaum

Hitting the corresponding Play button in the active main menu starts the display of the long shot video of the original recording session of the selected song. The chosen example shows the Svan song Elia Lrde (ID: GVM031), performed by Gigo Chamgeliani (left, top voice), Murad Pirtskhelani (center, middle voice), and Givi Chamgeliani (right, bass voice).

Audio Mix mode

Photo: Frank Scherbaum

Hitting the corresponding Play button in the active main menu starts the pywebaudioplayer (Pauwels & Sandler, 2018) for the three headset microphone tracks, for which the cross talk has been somewhat reduced by using the information about the voice activity of each singer contained in the corresponding larynx microphone recording.
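How voice activity from a larynx microphone could be used to suppress cross talk can be sketched as a simple gate. This is only an assumption about one possible approach, not the procedure actually used for the GVM tracks; the function names, the frame length, and the threshold are hypothetical.

```python
from typing import List


def voice_activity(larynx: List[float], frame: int, threshold: float) -> List[bool]:
    """Flag each frame as active if the mean absolute larynx amplitude exceeds the threshold."""
    flags = []
    for start in range(0, len(larynx), frame):
        chunk = larynx[start:start + frame]
        level = sum(abs(x) for x in chunk) / len(chunk)
        flags.append(level > threshold)
    return flags


def gate_headset(headset: List[float], larynx: List[float],
                 frame: int = 4, threshold: float = 0.1, floor: float = 0.0) -> List[float]:
    """Attenuate headset samples in frames where the singer's own larynx signal is inactive.

    Hypothetical sketch: 'floor' is the gain applied while the singer is silent,
    i.e. while the headset microphone mostly picks up the other singers.
    """
    if len(headset) != len(larynx):
        raise ValueError("headset and larynx tracks must be time-aligned and equally long")
    flags = voice_activity(larynx, frame, threshold)
    return [x if flags[i // frame] else floor * x for i, x in enumerate(headset)]


# Toy example: the singer is silent in the first frame and sings in the second.
larynx = [0.0, 0.0, 0.0, 0.0, 0.5, -0.5, 0.5, -0.5]
headset = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
gated = gate_headset(headset, larynx)
print(gated)
```

A hard gate like this removes cross talk only while a singer is silent; it cannot separate voices that sound simultaneously, which is consistent with the cross talk being only partly reduced.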

F0-Trajectory mode

Photo: Frank Scherbaum

In this display mode, the GVM-Interface is started with a multi-voice note-track video on the right side, showing the sung notes (horizontal lines color-coded red, blue, and black for the top, middle, and bass voice, respectively) and the F0-trajectories (pitch tracks) within those notes, and with close-ups of the three singers’ faces on the left side.

Since pitch is a psychoacoustic quantity which cannot be measured directly, F0-trajectories can be seen as a quantitative approximation to the actually sung pitches. It needs to be emphasized, however, that F0-trajectories capture not only sung notes but also other details of the intonation process, as well as events such as swallowing or clearing the throat. See for example the sliding phases at the beginnings and ends of notes, which are quite typical for the Svan intonation style. The cover image shows the complete song, which during playback is replaced by shorter time windows (as shown here).

The lyrics are displayed on top of the notes as well as in the form of subtitles for the individual singers’ close-up face videos shown on the left side of the screen. The red cursor marks the current position within the song. The green number in the ellipse in the upper right of the plot shows the F0-value of the lowest voice at the cursor position in cents (relative to the chosen reference frequency of 55 Hz). The numbers displayed as sub- and superscripts give the harmonic interval of the second lowest voice with respect to the bass voice and the harmonic interval of the highest voice with respect to the bass voice, respectively. Finally, the distribution on the right side of the plot shows the frequency distribution of the F0-values from within all the notes in the song. We refer to this as "pitch distribution" or "pitch inventory". It is one of the ways to display the tuning system used in that song. Each peak corresponds to a scale degree. For a detailed explanation see Scherbaum et al. (2020). The tilted numbers in gray between the peaks of this distribution show the intervals between the different scale degrees in cents. As a reminder, 100 cents correspond to a semitone. This way, any tuning system used can be displayed in an undistorted way.
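The cent values described above follow from the standard logarithmic conversion of a frequency ratio, 1200 times the base-2 logarithm of the ratio. The following Python sketch illustrates the arithmetic with the 55 Hz reference mentioned in the text; the function names and the example frequencies are hypothetical and are not taken from the GVM-Interface code.

```python
import math

REFERENCE_HZ = 55.0  # reference frequency stated in the text


def hz_to_cents(f0_hz: float, reference_hz: float = REFERENCE_HZ) -> float:
    """Convert an F0 value in Hz to cents relative to the reference frequency."""
    return 1200.0 * math.log2(f0_hz / reference_hz)


def harmonic_intervals(bass_hz: float, middle_hz: float, top_hz: float):
    """Return the intervals (in cents) of the middle and top voice above the bass voice."""
    bass_cents = hz_to_cents(bass_hz)
    return hz_to_cents(middle_hz) - bass_cents, hz_to_cents(top_hz) - bass_cents


# Hypothetical F0 values at one cursor position (not measured data).
bass, middle, top = 110.0, 165.0, 220.0
print(hz_to_cents(bass))                       # bass pitch in cents above 55 Hz
print(harmonic_intervals(bass, middle, top))   # intervals of middle and top voice above the bass
```

Because cents are a logarithmic measure, the interval between two voices is simply the difference of their cent values, independent of the chosen reference frequency.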

Photo: Frank Scherbaum

Pseudo Score mode

Photo: Frank Scherbaum

In this display mode, the note pitches have all been mapped to the center pitches of their corresponding pitch groups and the display of the F0-trajectories has been omitted (as an unnecessary detail). In contrast to the F0-trajectory mode, which displays exactly the pitches sung by the singers, this mode is an interpretation of what pitch the singers might have wanted to sing, assuming that they were aiming for the exact scale pitch. The motivation for this mode comes from the observation that pitch perception is categorical (cf. Ganguli & Rao, 2019). It is a mode which makes it easier for students to recognise which scale degree was sung. This is why we tentatively refer to it as Pseudo Score mode. We anticipate this to be the most attractive mode for learning a song, in particular in combination with the volume control for the individual audio tracks.
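The mapping of sung note pitches to pitch-group centers can be sketched as a nearest-neighbour assignment. This is a simplified illustration under the assumption that each note belongs to the closest scale degree; how pitch groups are actually determined in the GVM-Interface is not specified here, and the center values and note pitches below are hypothetical.

```python
from typing import List


def snap_to_scale(note_cents: float, centers: List[float]) -> float:
    """Return the pitch-group center (in cents) closest to the sung note pitch."""
    return min(centers, key=lambda center: abs(center - note_cents))


# Hypothetical pitch-group centers, e.g. peaks of a pitch distribution (in cents).
centers = [0.0, 170.0, 350.0, 500.0, 700.0]

# Hypothetical sung note pitches, each slightly off its scale degree.
sung = [12.0, 158.0, 361.0, 489.0, 705.0]

pseudo_score = [snap_to_scale(note, centers) for note in sung]
print(pseudo_score)

# Intervals between adjacent scale degrees, as shown between the peaks of the pitch distribution.
steps = [upper - lower for lower, upper in zip(centers, centers[1:])]
print(steps)
```

Since the centers come from the song itself rather than from a fixed tuning, the snapped notes stay within the tuning system actually used by the singers.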

Photo: Frank Scherbaum

Karaoke mode

Photo: Frank Scherbaum

In this mode, we mimic a karaoke situation by displaying only the lyrics together with close-ups of the singers’ faces. As in the F0-Trajectory mode and the Pseudo Score mode, the volume of the audio tracks can be individually adjusted (and not just muted as e.g. in the AudioLabs-Interface).

Photo: Frank Scherbaum

Meta Information

Photo: Frank Scherbaum

Each recording session during the 2016 field expedition was accompanied by extensive interviews, not only with the singers but also with other informants from the villages, to collect contextual information on the background and history of the singers, local customs, etc.

Selecting the Show Info button under the Meta Information entry will bring up a small subset of this information, e.g. the singers’ names and recording dates.

The complete field reports can be obtained elsewhere, e.g. from the LaZAR-Database (open access).

Photo: Frank Scherbaum

☞ Click Here to Access the GVM-Interface (Version: 2022/08/30)

Please note that this is still work in progress and the online version might be changed without prior notification.

In order to improve the interface, your feedback is extremely important! Please send your comments to me at: fs@geo.uni-potsdam.de.

One more thing ...

Photo: Frank Scherbaum

Nana Mzhavanadze singing a single-singer version of Guruli Sabodisho as a potential teaching scenario.

Although originally designed to access the GVM dataset (Scherbaum et al., 2019), the GVM-Interface can also be used for audio/video tracks which are produced in overdubbing mode by individual singers. This makes it possible to create internet-based teaching/learning scenarios in which students can sing along with selected individual voices or virtual ensembles for which the volumes of the individual voices can be controlled interactively in real time. For the recording of the demonstration below, the free QuickTime software was used.

☞ Click Here to Access Single Singer Example.

References

Ganguli, K.K., and Rao, P. (2019). On the perception of raga motifs by trained musicians. The Journal of the Acoustical Society of America 145 (4): 2418 – 2434. ISSN: 0001-4966.

Georgian polyphonic singing. Proclaimed as Intangible cultural heritage in 2001, ich.unesco.org/en/RL/georgian-polyphonic-singing-00008, (last accessed 6 Nov. 2020)

National Research Council. (2002). Learning and Understanding: Improving Advanced Study of Mathematics and Science in U.S. High Schools. doi.org/10.17226/10129

Pauwels, J., & Sandler, M. B. (2018). pywebaudioplayer: Bridging the gap between audio processing code in Python and attractive visualisations based on web technology. In Web Audio Conference WAC-2018, September 19–21, 2018, Berlin, Germany.

Lomsadze, T. (2019). When tradition meets modernity – Georgian folk-fusion music, Folk Life, 57:2, 122-140, DOI: 10.1080/04308778.2019.1656791

Scherbaum, F., Rosenzweig, S., Müller, M., Vollmer, D., & Mzhavanadze, N. (2018). Throat Microphones for Vocal Music Analysis. In Demos and Late Breaking News of the International Society for Music Information Retrieval Conference (ISMIR), 2018.

Scherbaum, F., Mzhavanadze, N., Rosenzweig, S., & Müller, M. (2019). Multi-media recordings of traditional Georgian vocal music for computational analysis. Proceedings of the 9th International Workshop on Folk Music Analysis, 2-4 July, 2019, Birmingham.

Scherbaum, F., & Mzhavanadze, N. (2020). Svan Funeral Dirges (Zär): Musical Acoustical Analysis of a New Collection of Field Recordings. Musicologist, accepted subject to revisions.