
la-bank: Resources for Research and Teaching

A collection of language resources from the Romance Linguistics section (French and Italian) at the University of Potsdam

Contact: langage -at- uni-potsdam.de

eBay petites annonces

The corpus

A collection of 1256 online auction listings (petites annonces) from the platform eBay.fr, covering a time span of 13 years. The corpus is split into four subcorpora. The first subcorpus consists of 300 listings from 2005, all from private users. The second and third subcorpora are from 2017 and contain 300 listings each, from private and professional users respectively. The fourth subcorpus was created in 2018 and contains 356 listings from private users.

How the listings were collected


The corpus has been annotated for various features. The tags and their meanings are as follows:

In addition to these tags, which are used consistently throughout all four subcorpora, the first subcorpus (2005) contains extra tags.


Average length of listings:
e05p: 43 tokens, e17p: 49 tokens, e17x: 177 tokens, e18v: 97 tokens.
The length of a listing varies depending on its category.
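As an illustration of how per-subcorpus averages like those above could be computed, here is a minimal sketch. It assumes simple whitespace tokenisation, which may differ from the tokenisation actually used for the corpus, and the sample texts are invented for the example rather than taken from the corpus.

```python
from collections import defaultdict

# Invented toy listings for illustration only; they are not corpus data.
SAMPLE = [
    ("e05p", "Livre ancien en bon état"),
    ("e05p", "Vends vélo enfant"),
    ("e17x", "Lot de dix cartes postales anciennes"),
]

def average_length(listings):
    """Mean number of whitespace-separated tokens per subcorpus."""
    totals = defaultdict(lambda: [0, 0])  # subcorpus -> [token sum, listing count]
    for subcorpus, text in listings:
        totals[subcorpus][0] += len(text.split())  # assumption: split on whitespace
        totals[subcorpus][1] += 1
    return {name: tokens / count for name, (tokens, count) in totals.items()}

averages = average_length(SAMPLE)
for name, value in sorted(averages.items()):
    print(f"{name}: {value:.1f} tokens")
```

Grouping by category instead of subcorpus would only require passing the category label as the first element of each pair.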

Distribution of categories in the first three subcorpora (e05p, e17c, e17p):

The XML file contains various metadata for each listing: a unique ID, the year and month in which it was collected, and the category the listing belongs to. Some subcorpora have additional metadata, listed below:

File format

The corpus is available for download in XML format. The file contains all four subcorpora. Each listing has a unique ID, the year and month in which it was collected, the category it belongs to, and some additional tags (as explained above). PDFs of screenshots of the listings are also available for all subcorpora, although for technical reasons only the first 249 are available for the 2018 subcorpus.
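The following sketch shows one way the per-listing metadata could be read with Python's standard library. The element names (corpus, subcorpus, listing), the attribute names (id, year, month, category, name), the IDs and the category values are all assumptions made for illustration. They are not the corpus's documented schema and would need to be adapted to the downloaded file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: element names, attribute names, IDs and categories
# are illustrative assumptions, not the real schema of the corpus file.
SAMPLE_XML = """\
<corpus>
  <subcorpus name="e05p">
    <listing id="e05p_001" year="2005" month="06" category="Livres">
      Texte de la première annonce
    </listing>
    <listing id="e05p_002" year="2005" month="06" category="Jouets">
      Texte de la deuxième annonce
    </listing>
  </subcorpus>
  <subcorpus name="e18v">
    <listing id="e18v_001" year="2018" month="03" category="Livres">
      Texte de la troisième annonce
    </listing>
  </subcorpus>
</corpus>
"""

def read_listings(xml_text):
    """Return one dict of metadata plus text for each <listing> element."""
    root = ET.fromstring(xml_text)
    listings = []
    for sub in root.iter("subcorpus"):
        for item in sub.iter("listing"):
            listings.append({
                "subcorpus": sub.get("name"),
                "id": item.get("id"),
                "year": int(item.get("year")),
                "month": int(item.get("month")),
                "category": item.get("category"),
                # itertext() also collects text inside nested annotation tags
                "text": "".join(item.itertext()).strip(),
            })
    return listings

listings = read_listings(SAMPLE_XML)
per_subcorpus = Counter(entry["subcorpus"] for entry in listings)
per_category = Counter(entry["category"] for entry in listings)
print(per_subcorpus)
print(per_category)
```

For the real file, ET.parse on the downloaded path would replace ET.fromstring, and the names in read_listings would be changed to match the actual tags.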

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Source: Gerstenberg, Annette, Valerie Hekkel & Freya Hewett. 2019. Online Auction Listings Between Community and Commerce. In Julien Longhi & Claudia Marinica (eds.), Proceedings of the 7th Conference on CMC and Social Media Corpora for the Humanities (CMC-Corpora2019), 9–10 September 2019, Cergy-Pontoise University, France, 1–5. Cergy-Pontoise: scienceconf.org.

eBay.fr-corpus = Gerstenberg, Annette & Freya Hewett. 2019. A collection of online auction listings from 2005 to 2018 (anonymised). University of Potsdam: LA-bank. https://www.uni-potsdam.de/langage/la-bank/ebay.php
Please always cite this corpus, using the source given above, if you use it in further work.

Thank you for your interest in our corpora. To download this corpus, please enter the email address you have already registered with us (in exactly the same form that you used in the registration form):


If you haven't registered with us yet, please fill in the form here.

I have read and accept the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.