
TubeWork - the new occupational field of YouTubers in Germany

Inequality and self-economization in algorithm-based markets


Camera and person
Photo: Jodie Cook via Unsplash

Project: TubeWork
Funding: German Research Foundation (DFG)
Duration: 2022-2025

 

YouTube is currently one of the most popular internet platforms worldwide. In Germany, the platform stands for the digital transformation of society. The research project "TubeWork" examines the professional practice of YouTubers.


Project applicants: Prof. Dr. Roland Verwiebe and Dr. Lena Seewann


Former project staff: Marie Theres-Hesse (MA)


Aims

The project pursues two sets of questions. On the one hand, it focuses on the self-perception of YouTubers as an occupational group and asks:

  • What meaning do YouTubers attribute to their actions on the platform? How are the boundaries between recreational and professional activities defined?
  • What opportunities and obstacles does this new occupational field offer?
  • Which strategies do YouTubers use to work on the platform professionally?

 

On the other hand, the project examines the inequality structures that YouTubers face and how these structures come about. This raises the following questions:

  • Which (work) biographies are typical for people using the platform?
  • What role does the YouTube algorithm (as a non-human actor) play in inequalities of both visibility and income?
  • How strongly do characteristics such as age, gender, educational level, origin or occupation influence inequalities on YouTube?
  • How do sociotechnical skills or forms of organization affect positioning on YouTube?

 

Current questions are: 

  • Which content-related patterns of sentiment and discrimination can be identified on YouTube?
  • What differences exist by socio-structural characteristics (e.g. gender, migration background) and by characteristics of the platform (sector of the YouTube channel, and reach or success as measured by subscriptions, views and likes)?
  • What individual coping strategies do content creators develop to deal with sentiment and discrimination practices on social media? How do content creators perceive YouTube's algorithm in their work? To what extent do content creators perceive the algorithmic system of social media platforms as an independent entity with agency of its own?

Watch-time statistics for YouTube
Photo: Szabo Viktor via Unsplash

Data basis

The research project uses a range of different research methods:

  • 30 problem-centred interviews were conducted between October 2022 and April 2023 with women and men who represent typical YouTube industries and operate a YouTube channel for professional purposes
    • The sampling of YouTubers also took into account channel reach and age as well as socio-demographic characteristics such as gender, age, migration background and educational background
    • The field access was created via various entry points to reach different parts of the YouTube community
    • The interviews were conducted via Zoom
  • A total survey of YouTube channels from the D-A-C-H region (Germany, Austria, Switzerland) was carried out using web scraping via the YouTube API. Relevant platform-specific characteristics and activities such as video frequency, views, channel descriptions and the number of subscribers were recorded. 
    • In total, the data set contains channel information from 120,000 channels
    • This is supplemented by further data-intensive information such as the comments under certain videos or the automatically generated subtitles under certain videos
  • With the help of a self-designed classification survey, data from 5,000 randomly selected content creators from the scraped data set were classified with regard to professional practice (success/reach on YouTube, income, industry) and the social composition of content creators (age, gender, education, migration background). (→ see classification survey)
  • The next step will be an online survey of around 2,000 German YouTubers. The aim is to obtain information on everyday working life, job satisfaction and existing values
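
The random selection of channels for the classification survey could be sketched as follows. This is a minimal illustration, not the project's actual procedure: the function name `draw_sample`, the seed value and the placeholder channel IDs are assumptions made for the example.

```python
import random


def draw_sample(channel_ids, k, seed=42):
    """Draw k distinct channel IDs; a fixed seed keeps the draw reproducible."""
    rng = random.Random(seed)  # private generator, does not touch global state
    return rng.sample(list(channel_ids), k)


# Placeholder IDs standing in for the scraped channel list (hypothetical).
population = [f"channel_{i}" for i in range(120_000)]
sample = draw_sample(population, 5_000)
```

Fixing the seed means the same 5,000 channels can be redrawn later, which helps when coders need to be assigned the same cases again.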

Classification survey 

A classification survey is used to generate variables that are partially visible on the platform but cannot be scraped. With the help of this survey, student assistants systematically classified the characteristics of 5,000 channels: personal characteristics (e.g. gender, age, migration background) as well as channel-specific characteristics such as the number of people involved, other sources of income such as Patreon or sponsored videos, and the specific topic of the channel. The survey presents the coders with three stimuli in sequence (the channel description, the most recent video and finally other sources such as Wikipedia or LinkedIn), which makes it possible to trace where individual pieces of information come from. Each person within a channel is recorded individually (see image).

  • These data extend the scraped data set and make it possible to analyse sociological questions about YouTube. Because the data set includes demographic characteristics of the channels, it is unique in the German-speaking world.
  • The next step will be to expand this data set using machine learning methods.

Methods and current projects

In addition, a variety of analytical methods from the fields of qualitative and quantitative social research, mixed methods and computational social science are used, including:

- A thematic analysis (Braun & Clarke, 2012) to explore the question of how content creators perceive the YouTube algorithm in their work

  • Initial findings show that content creators perceive the algorithm as an independent entity with agency. Four themes recur: the unpredictability of the algorithm, the dynamics of the algorithm, the power of the algorithm and inequality through the algorithm.

- A typology construction (Kelle & Kluge, 1999) of the coping strategies that content creators develop to deal with sentiment and discrimination practices on social media

  • First findings (→ see conferences)

- Dictionary, machine learning (e.g. Lasso regression) and deep learning (e.g. BERT) methods for generating variables such as gender, as well as sentiment or hate speech, from YouTube comments

→ Based on this, various regression and cluster analyses

  • First findings (→ see conferences & publications)
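
To illustrate the simplest of these approaches, a dictionary method scores a comment by looking up its words in a sentiment lexicon. The sketch below is only an illustration under stated assumptions: the tiny English `LEXICON` and the function names are hypothetical, and a real analysis would rely on a validated sentiment dictionary for the language of the comments.

```python
import re

# Toy lexicon for illustration only (hypothetical entries and weights).
LEXICON = {
    "great": 1, "love": 1, "helpful": 1,
    "boring": -1, "hate": -1, "stupid": -1,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def sentiment_score(comment, lexicon=LEXICON):
    """Mean lexicon weight per token; 0.0 for comments without tokens."""
    tokens = tokenize(comment)
    if not tokens:
        return 0.0
    return sum(lexicon.get(token, 0) for token in tokens) / len(tokens)
```

Scores of this kind can then serve as variables in regression or cluster analyses. Dictionary methods ignore negation and irony, which is one reason to complement them with machine learning and deep learning models.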

Web scraping

To access data from the YouTube platform, we use the YouTube Data API. An API (application programming interface) allows external programs, written in various programming languages, to use the functions and data of a website, for example to integrate them into other websites or apps. In our case, we use Python to retrieve data through this interface and store it in a structured way. This allows us to collect information on channels (e.g. name, description, number of subscribers, number of videos, profile picture), videos (e.g. number of views, number of likes, topics) and comments (e.g. content, number of likes, name of the channel). The corresponding Python scripts are available here.
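
A request for channel information could look roughly like the following sketch. It is not the project's script: it assumes the `channels` endpoint of the YouTube Data API v3 with the `snippet` and `statistics` parts, and the helper names (`build_channels_url`, `parse_channels`, `fetch_channels`) are hypothetical. A valid API key is needed to actually send the request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/channels"


def build_channels_url(channel_ids, api_key):
    """Build a request URL for a batch of channel IDs."""
    params = {
        "part": "snippet,statistics",
        "id": ",".join(channel_ids),
        "key": api_key,
    }
    return f"{API_URL}?{urlencode(params)}"


def parse_channels(payload):
    """Flatten the JSON response into one plain dict per channel."""
    rows = []
    for item in payload.get("items", []):
        snippet = item.get("snippet", {})
        stats = item.get("statistics", {})
        rows.append({
            "channel_id": item.get("id"),
            "name": snippet.get("title"),
            "description": snippet.get("description"),
            # Counts arrive as strings and may be missing (e.g. hidden).
            "subscribers": int(stats["subscriberCount"])
            if "subscriberCount" in stats else None,
            "videos": int(stats["videoCount"])
            if "videoCount" in stats else None,
        })
    return rows


def fetch_channels(channel_ids, api_key):
    """Send the request; requires network access and a valid API key."""
    with urlopen(build_channels_url(channel_ids, api_key)) as response:
        return parse_channels(json.load(response))
```

Separating URL construction and response parsing from the network call makes it possible to check both steps offline with a saved response.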

 

In addition, the source code of the channel pages was read, since it contains additional information that cannot be retrieved via the YouTube Data API.

 

Since December 2022, data have been scraped daily from the 120,000 channels; the collection now amounts to almost 200 GB. These are cross-sectional data, although a longitudinal data set has also been built up for selected channels and videos since November 2023.
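
Repeated daily snapshots of the same channel can be turned into longitudinal measures such as day-over-day subscriber change. The following sketch shows the idea under assumptions: the tuple layout `(channel_id, day, subscribers)`, the function name and all values are hypothetical and do not describe the project's storage format.

```python
from datetime import date


def subscriber_growth(snapshots):
    """Return day-over-day subscriber changes for each channel.

    snapshots: iterable of (channel_id, day, subscribers) tuples.
    """
    by_channel = {}
    for channel_id, day, subscribers in snapshots:
        by_channel.setdefault(channel_id, []).append((day, subscribers))

    growth = {}
    for channel_id, series in by_channel.items():
        series.sort()  # chronological order within each channel
        growth[channel_id] = [
            (later_day, later - earlier)
            for (_, earlier), (later_day, later) in zip(series, series[1:])
        ]
    return growth


# Hypothetical snapshots, deliberately out of order.
snapshots = [
    ("UC_a", date(2023, 11, 2), 120),
    ("UC_a", date(2023, 11, 1), 100),
    ("UC_b", date(2023, 11, 1), 5),
]
growth = subscriber_growth(snapshots)
```

A channel with a single snapshot yields an empty list, which reflects the difference between a cross-sectional observation and a longitudinal series.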


Results

Conferences

  • Philipp, Aaron; Verwiebe, Roland; Weißmann, Sarah (2023): „Sentiment und Diskriminierung auf Social Media? Ein Mixed-Methods-Ansatz zur Untersuchung von Kommunikations- und Debattenstrukturen innerhalb der YouTube-Community“, DGS Section "social inequality and social structure analysis" - Spring Conference 2023
  • Philipp, Aaron; Verwiebe, Roland; Weißmann, Sarah (2023): “Sentiment and hate-speech on social media. An analysis of the relevance of gender and race for the emergence of new communication structures within the YouTube Community”, General Online Research (GOR 23)

 

Publications

 

Further information


Funding

The project has been funded for the duration of three years by the German Research Foundation (DFG).


YouTube
Photo: NordWood via Unsplash