Search results

The research and teaching corpus of spoken German – FOLK

Author: Schmidt, Thomas

Published: 2014

Publisher: Institut für Deutsche Sprache, Bibliothek, Mannheim

Access: resolving system; long-term archiving (Nationalbibliothek); publisher (free of charge)

Source:	DNB subject group German Language and Literature
Language:	English
Media type:	Unspecified
Format:	Online
Other identifiers:	urn:nbn:de:bsz:mh39-24434
Keywords:	German; spoken language; corpus <linguistics>
Additional keywords:	Forschungs- und Lehrkorpus Gesprochenes Deutsch = FOLK
Extent:	Online resource
Note(s):	In: Proceedings of the ninth conference on international language resources and evaluation (LREC’14). - Reykjavik : European Language Resources Association (ELRA), 2014, pp. 383-387

Multimedia Corpora (Media encoding and annotation) : Draft submitted to CLARIN WG 5.7. as input to CLARIN deliverable D5.C3 “Interoperability and Standards”

Author: Schmidt, Thomas ; Elenius, Kjell ; Trilsbeek, Paul

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2234 https://ids-pub.bsz-bw.de/files/2234/Schmidt_Multimedia%20corpora_2010.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22341

Source:	BASE (German studies subset)
Language:	English
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	spoken language; corpus; notation; standardisation; computational linguistics; multimedia
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The research and teaching corpus of spoken German – FOLK

Author: Schmidt, Thomas

Published: 2014

Full text:	https://d-nb.info/1135918678/34 http://www.lrec-conf.org/proceedings/lrec2014/summaries/290.html https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2443
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-24434

Source:	BASE (German studies subset)
Language:	English
Media type:	Unspecified
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Forschungs- und Lehrkorpus Gesprochenes Deutsch = FOLK
License:	free of charge

The database for spoken German - DGD2

Author: Schmidt, Thomas

Published: 2014

Publisher: Institut für Deutsche Sprache, Bibliothek, Mannheim

Access: resolving system; long-term archiving (Nationalbibliothek); publisher (free of charge)

Source:	Union catalogs
Language:	English
Media type:	Book (monograph)
Format:	Online
Other identifiers:	urn:nbn:de:bsz:mh39-24425
DDC classification:	Language (400)
Keywords:	spoken language; corpus <linguistics>
Additional keywords:	Datenbank für gesprochenes Deutsch = DGD
Extent:	Online resource
Note(s):	In: Proceedings of the ninth conference on international language resources and evaluation (LREC’14). - Reykjavik : European Language Resources Association (ELRA), 2014, pp. 1451-1457

The research and teaching corpus of spoken German – FOLK

Author: Schmidt, Thomas

Published: 2014

Publisher: Institut für Deutsche Sprache, Bibliothek, Mannheim

Access: resolving system; long-term archiving (Nationalbibliothek); publisher (free of charge)

Source:	Union catalogs
Language:	English
Media type:	Book (monograph)
Format:	Online
Other identifiers:	urn:nbn:de:bsz:mh39-24434
DDC classification:	Germanic languages; German (430)
Keywords:	German; spoken language; corpus <linguistics>
Additional keywords:	Forschungs- und Lehrkorpus Gesprochenes Deutsch = FOLK
Extent:	Online resource
Note(s):	In: Proceedings of the ninth conference on international language resources and evaluation (LREC’14). - Reykjavik : European Language Resources Association (ELRA), 2014, pp. 383-387

Best Practices for Spoken Corpora in Linguistic Research

Author: Haugh, Michael

Published: 2014

Publisher: Cambridge Scholars Publishing, Newcastle upon Tyne

Frankfurt/Main: Hessisches BibliotheksInformationsSystem HeBIS

Location: Hessisches BibliotheksInformationsSystem HeBIS

Interlibrary loan: not available

Link to union catalog: Hessisches BibliotheksInformationsSystem (HeBIS)

Frankfurt/Main: Universitätsbibliothek J. C. Senckenberg, Zentralbibliothek (ZB)

Location: Universität Frankfurt, electronic resources

Interlibrary loan: not available

Link to union catalog: Hessisches BibliotheksInformationsSystem (HeBIS)

A key concern of researchers involved in the creation and sharing of language resources is to attain maximum usability, reliability and longevity of these resources for present and future researchers in the language sciences. The view developed in this volume is that spoken corpora construction and sharing are major research endeavours that should also be laid open to academic debate in a manner that is more visible than is currently the case in corpus linguistics. The present volume brings...

Source:	Union catalogs
Contributors:	Schmidt, Thomas; Wörner, Kai; Ruhi, Şükriye
Language:	English
Media type:	E-book
Format:	Online
ISBN:	9781443860338; 9781443865548 (secondary edition)
RVK classification:	ES 115 ; ES 900 ; ER 765
Extent:	282 p.
Note(s):	Description based upon print version of record. Online edition:

EXMARaLDA

Author: Schmidt, Thomas

Published: 2014

Mannheim: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek

Location: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek

Interlibrary loan: not available

Link to union catalog: Südwestdeutscher Bibliotheksverbund (SWB)

Source:	Leibniz-Institut für Deutsche Sprache, Bibliothek
Contributors:	Wörner, Kai
Language:	English
Media type:	Article in an edited volume
Format:	Print
Contained in:	The Oxford handbook of corpus phonology; Oxford [et al.] : Oxford University Press, 2014; (2014), pages [402]-419; XVI, 662 pp.

Introduction: putting practices in spoken corpora into focus

Published: 2014

Mannheim: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek

Location: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek

Interlibrary loan: not available

Link to union catalog: Südwestdeutscher Bibliotheksverbund (SWB)

Source:	Leibniz-Institut für Deutsche Sprache, Bibliothek
Contributors:	Ruhi, Şükriye; Schmidt, Thomas
Language:	English
Media type:	Article in an edited volume
Format:	Print
Contained in:	Best practices for spoken corpora in linguistic research; Newcastle upon Tyne : Cambridge Scholars Publ., 2014; (2014), pages [1]-17; VIII, 272 pp.

(More) common ground for processing spoken language corpora?

Author: Schmidt, Thomas

Published: 2014

Mannheim: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek

Location: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek

Interlibrary loan: not available

Link to union catalog: Südwestdeutscher Bibliotheksverbund (SWB)

Source:	Leibniz-Institut für Deutsche Sprache, Bibliothek
Language:	English
Media type:	Article in an edited volume
Format:	Print
Contained in:	Best practices for spoken corpora in linguistic research; Newcastle upon Tyne : Cambridge Scholars Publ., 2014; (2014), pages [249]-265; VIII, 272 pp.

A TEI-based approach to standardising spoken language transcription

Author: Schmidt, Thomas

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2225 https://ids-pub.bsz-bw.de/files/2225/Schmidt-a-tei-based-approach-to-standardising-spoken-language-transcription_2011.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22256

This paper formulates a proposal for standardising spoken language transcription, as practised in conversation analysis, sociolinguistics, dialectology and related fields, with the help of the TEI guidelines. Two areas relevant to standardisation are identified and discussed: first, the macro structure of transcriptions, as embodied in the data models and file formats of transcription tools such as ELAN, Praat or EXMARaLDA; second, the micro structure of transcriptions as embodied in transcription conventions such as CA, HIAT or GAT. A two-step process is described in which first the macro structure is represented in a generic TEI format based on elements defined in the P5 version of the Guidelines. In the second step, character data in this representation is parsed according to the regularities of a transcription convention, resulting in a more fine-grained TEI markup which is also based on P5. It is argued that this two-step process can, on the one hand, map idiosyncratic differences in tool formats and transcription conventions onto a unified representation. On the other hand, differences motivated by different theoretical decisions can be retained in a manner which still allows a common processing of data from different sources. In order to make the standard usable in practice, a conversion tool, TEI Drop, is presented which uses XSL transformations to carry out the conversion between different tool formats (CHAT, ELAN, EXMARaLDA, FOLKER and Transcriber) and the TEI representation of transcription macro structure (and vice versa) and which also provides methods for parsing the micro structure of transcriptions according to two different transcription conventions (HIAT and cGAT). Using this tool, transcribers can continue to work with software they are familiar with while still producing TEI-conformant transcription files. The paper concludes with a discussion of the work needed in order to establish the proposed standard.
It is argued that both tool formats and the TEI guidelines are in a sufficiently mature ...

Source:	BASE (German studies subset)
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	spoken language; transcription; standardisation
License:	creativecommons.org/licenses/by-nd/3.0/de/ ; info:eu-repo/semantics/openAccess

Refining and Exploiting the Structural Markup of the eWDG

Author: Schmidt, Thomas ; Geyken, Alexander ; Storrer, Angelika

Published: 2014

Publisher: Barcelona : Institut Universitari de Linguistica Aplicada, Universitat Pompeu Fabra

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2258 https://ids-pub.bsz-bw.de/files/2258/Schmidt_Geyken_Storrer_Refining_and_Exploiting_the_Structural_Markup_2008.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22582

In this paper, the authors describe a semi-automated approach to refine the dictionary-entry structure of the digital version of the Wörterbuch der deutschen Gegenwartssprache (WDG, en.: Dictionary of Present-day German), a dictionary compiled and published between 1952 and 1977 by the Deutsche Akademie der Wissenschaften that comprises six volumes with over 4,500 pages containing more than 120,000 headwords. We discuss the benefits of such a refinement in the context of the dictionary project Digitales Wörterbuch der deutschen Sprache (DWDS, en.: Digital Dictionary of the German language). In the current phase of the DWDS project, we aim to integrate multiple dictionary and corpus resources in the German language into a digital lexical system (DLS). In this context, we plan to expand the current DWDS interface with several special-purpose components, which are adaptive in the sense that they offer specialized data views and search mechanisms for different dictionary functions (e.g. text comprehension, text production) and different user groups (e.g. journalists, translators, linguistic researchers, computational linguists). One prerequisite for generating such data views is the selective access to the lexical items in the article structure of the dictionaries which are the object of study. For this purpose, the representation of the eWDG has to be refined. The focus of this paper is on the semi-automated approach used to transform eWDG into a refined version in which the main structural units can be explicitly accessed. We will show how this refinement opens new and flexible ways of visualizing and querying the lexicographic content of the refined version in the context of the DLS project.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Dictionaries (413)
Keywords:	computer-assisted lexicography
License:	creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

Interfacing Lexical and Ontological Information in a Multilingual Soccer FrameNet

Author: Schmidt, Thomas

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2263 https://ids-pub.bsz-bw.de/files/2263/schmidt_interfacing_lexical_2006.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22635

This paper presents ongoing work on a multilingual (English, French, German) lexical resource of soccer language. The first part describes how lexicographic descriptions based on frame-semantic principles are derived from a partially aligned multilingual corpus of soccer match reports. The remainder of the paper then discusses how different types of ontological knowledge are linked to this resource in order to provide an access structure to the resulting dictionary. It is argued that linking lexical resources and ontologies in such a way offers a dictionary user novel ways of navigating a domain vocabulary.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Dictionaries (413)
Keywords:	sports language; language for special purposes; football; computer-assisted lexicography; dictionary; German; English; French; corpus
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Avoiding Data Graveyards : from Heterogeneous Data Collected in Multiple Research Projects to Sustainable Linguistic Resources

Author: Schmidt, Thomas ; Chiarcos, Christian ; Lehmberg, Timm ; Rehm, Georg ; Witt, Andreas ; Hinrichs, Erhard

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2268 https://ids-pub.bsz-bw.de/files/2268/Schmidt%20etc_Avoiding%20Data%20Graveyards_2006.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22687

This paper describes a new research initiative addressing the issue of sustainability of linguistic resources. The initiative is a cooperation between three collaborative research centres in Germany – the SFB 441 “Linguistic Data Structures” in Tübingen, the SFB 538 “Multilingualism” in Hamburg, and the SFB 632 “Information Structure” in Potsdam/Berlin. The aim of the project is to develop methods for sustainable archiving of the diverse bodies of linguistic data used at the three sites. In the first half of the paper, the data handling solutions developed so far at the three centres are briefly introduced. This is followed by an assessment of their commonalities and differences and of what these entail for the work of the new joint initiative. The second part then sketches seven areas of open questions with respect to sustainable data handling and gives a more detailed account of two of them – integration of linguistic terminologies and development of best practice guidelines.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Linguistics (410)
Keywords:	research data; linguistics; standardisation; long-term archiving
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Sustainability of Linguistic Resources

Author: Dipper, Stefanie ; Hinrichs, Erhard ; Schmidt, Thomas ; Wagner, Andreas ; Witt, Andreas

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2271 https://ids-pub.bsz-bw.de/files/2271/Schmidt%20etc_Sustainability_2006.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22718

This paper describes a new research initiative addressing the issue of sustainability of linguistic resources. This initiative is a cooperation between three linguistic collaborative research centres in Germany, which comprise more than 40 individual research projects altogether. These projects are involved in creating manifold language resources, especially corpora, tailored to their particular needs. The aim of the project described here is to ensure an effective and sustainable access of these data by third-party researchers beyond the termination of these projects. This goal involves a number of measures, such as the definition of a common data format to completely capture the heterogeneous information encoded in the individual corpora, the development of user-friendly and sustainably usable tools for processing (e.g. querying) the data, and the specification of common inventories of metadata and terminology. Moreover, the project aims at formulating general rules of best practice for creating, accessing, and archiving linguistic resources.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Linguistics (410)
Keywords:	research data; linguistics; computational linguistics; long-term archiving
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Time-based data models and the Text Encoding Initiative’s guidelines for transcription of speech

Author: Schmidt, Thomas

Published: 2014

Publisher: Hamburg : Universität

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2272 https://ids-pub.bsz-bw.de/files/2272/Schmidt_Time%20Based%20Data%20Models%20and%20the%20Text%20Encoding%20Initiative_2006.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22729

Source:	BASE (German studies subset)
Language:	English
Media type:	Report
Format:	Online
Keywords:	spoken language; transcription; computational linguistics; conversation analysis; software
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The Kicktionary – A Multilingual Lexical Resource of Football Language

Author: Schmidt, Thomas

Published: 2014

Publisher: Berlin ; New York, NY : de Gruyter

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2348 https://ids-pub.bsz-bw.de/files/2348/Schmidt_The_Kicktionary_2009.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-23482

Source:	BASE (German studies subset)
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Linguistics (410)
Keywords:	computer-assisted lexicography; frame theory; sports language; football
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The Kicktionary : Combining corpus linguistics and lexical semantics for a multilingual football dictionary

Author: Schmidt, Thomas

Published: 2014

Publisher: Tübingen : Narr

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2349 https://ids-pub.bsz-bw.de/files/2349/Schmidt_The_Kicktionary_2008.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-23491

This paper presents the Kicktionary, a multilingual (English - German - French) electronic lexical resource of the language of football. In the Kicktionary, methods from corpus linguistics and two approaches to lexical semantics - the theory of frame semantics and the concept of semantic relations - are combined to construct a lexical resource in which the user can explore relationships between lexical units in various ways. This paper explains the theoretical background of the Kicktionary, sketches the data and methods which were used in its construction, and describes how the resulting resource is presented to users via a set of hyperlinked webpages.

Source:	BASE (German studies subset)
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Dictionaries (413)
Keywords:	computer-assisted lexicography; sports language; dictionary; football; corpus
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The Kicktionary: A Multilingual Resource of the Language of Football

Author: Schmidt, Thomas

Published: 2014

Publisher: Tübingen : Narr

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2350 https://ids-pub.bsz-bw.de/files/2350/Schmidt_The_Kicktionary_A_Multilingual_Resource_2007.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-23500

This paper presents the Kicktionary, a multilingual (English - German - French) electronic lexical resource of the language of football. It explains how a corpus of football match reports was analysed according to the FrameNet and WordNet approaches and how the result of this analysis is presented to a dictionary user via a website.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Dictionaries (413)
Keywords:	computer-assisted lexicography; sports language; football; corpus; frame theory
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Collaborative Commentary: Opening Up Spoken Language Databases

Author: MacWhinney, Brian ; Martell, Craig ; Schmidt, Thomas ; Wagner, Johannes ; Wittenburg, Peter ; Brugman, Hennie ; Broeder, Daan

Published: 2014

Publisher: Paris : ELRA

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2369 https://ids-pub.bsz-bw.de/files/2369/MacWhinney_Martell_Schmidt_Wagner_Wittenburg_Collaborative_Commentary_%20Opening_Up_Spoken_Language_Databases2004.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-23695

We define collaborative commentary as the involvement of a research community in the interpretive annotation of electronic records. The goal of this process is the evaluation of competing theoretical claims. The process requires commentators to link their comments and related evidentiary materials to specific segments of either transcripts or electronic media. Here, we examine current work in the construction of technical methods for facilitating collaborative commentary through browser technology. To illustrate the relevance of this approach, we examine seven spoken language database projects that have reached a level of web-based publication that makes them good candidates as targets of collaborative commentary technology. For each database, we show how collaborative commentary can advance the relevant research agendas.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Linguistics (410)
Keywords:	video recording; interaction analysis; annotation
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The transcription system EXMARaLDA: An application of the annotation graph formalism as the basis of a database of multilingual spoken discourse

Author: Schmidt, Thomas

Published: 2014

Publisher: Philadelphia : University of Pennsylvania - Institute for Research in Cognitive Science

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2373 https://ids-pub.bsz-bw.de/files/2373/Schmidt_Transcription_system_EXMARaLDA_2001.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-23735

This paper describes EXMARaLDA, a system for computer transcription of spoken discourse developed and used by the SFB "Mehrsprachigkeit" at the University of Hamburg. EXMARaLDA consists of several DTDs for XML coding of transcription data and some input and output tools for these formats. Apart from being a transcription system in its own right, EXMARaLDA also plays the role of a mediator between older existing data formats at the SFB and between these formats and a planned database of multilingual spoken discourse.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Linguistics (410)
Keywords:	spoken language; transcription; computational linguistics
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The database for spoken German - DGD2

Author: Schmidt, Thomas

Published: 2014

Publisher: Reykjavik : European Language Resources Association (ELRA)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2442 https://ids-pub.bsz-bw.de/files/2442/Schmidt_Database%20of%20spoken%20language_2014.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-24425

The Database for Spoken German (Datenbank für Gesprochenes Deutsch, DGD2, dgd.ids-mannheim.de) is the central platform for publishing and disseminating spoken language corpora from the Archive of Spoken German (Archiv für Gesprochenes Deutsch, AGD, agd.ids-mannheim.de) at the Institute for the German Language in Mannheim. The corpora contained in the DGD2 come from a variety of sources, some of them in-house projects, some of them external projects. Most of the corpora were originally intended either for research into the (dialectal) variation of German or for studies in conversation analysis and related fields. The AGD has taken over the task of permanently archiving these resources and making them available for reuse to the research community. To date, the DGD2 offers access to 19 different corpora, totalling around 9000 speech events, 2500 hours of audio recordings or 8 million transcribed words. This paper gives an overview of the data made available via the DGD2, of the technical basis for its implementation, and of the most important functionalities it offers. The paper concludes with information about the users of the database and future plans for its development.

Source:	BASE (German studies subset)
Language:	English
Media type:	Conference publication
Format:	Online
Keywords:	spoken language; corpus
License:	creativecommons.org/licenses/by-nc/3.0/de/deed.de ; info:eu-repo/semantics/openAccess

New and future developments in EXMARaLDA

Author: Schmidt, Thomas ; Wörner, Kai ; Hedeland, Hanna ; Lehmberg, Timm

Published: 2014

Publisher: Hamburg : Universität

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2228 https://ids-pub.bsz-bw.de/files/2228/Schmidt_New%20and%20future%20developments%20in%20EXMARaLDA_2011.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22288

We present some recent and planned future developments in EXMARaLDA, a system for creating, managing, analysing and publishing spoken language corpora. The new functionality concerns the areas of transcription and annotation, corpus management, query mechanisms, interoperability and corpus deployment. Future work is planned in the areas of automatic annotation, standardisation and workflow management.

Source:	BASE (German studies subset)
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	spoken language; corpus
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Multilingual Corpora at the Hamburg Centre for Language Corpora

Author: Hedeland, Hanna ; Lehmberg, Timm ; Schmidt, Thomas ; Wörner, Kai

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2230 https://ids-pub.bsz-bw.de/files/2230/Schmidt_Multilingual%20Corpora_2011.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22305

We give an overview of the content and the technical background of a number of corpora which were developed in various projects of the Research Centre on Multilingualism (SFB 538) between 1999 and 2011 and which are now made available to the scientific community via the Hamburg Centre for Language Corpora.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	multilingualism; corpus; spoken language
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Linguistic tool development between community practices and technology standards

Author: Schmidt, Thomas

Published: 2014

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2231 https://ids-pub.bsz-bw.de/files/2231/Schmidt_Linguistic%20tool%20development_2010.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22314

This contribution addresses the workshop topic of “standardising policies within eHumanities infrastructures”. It relates 10 years of experience with language resource standards, gained in the development of EXMARaLDA, a system for the construction and exploitation of spoken language corpora. Section 2 gives an overview of the EXMARaLDA system, focussing on its relationship with existing and evolving standards for language resources. Section 3 presents the HIAT system as an example of an established community practice. Section 4 then addresses several issues that were encountered when trying to bring together HIAT, EXMARaLDA and the wider world of standards.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	spoken language; corpus; transcription; computational linguistics; standardisation
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

FOLKER : an annotation tool for efficient transcription of natural, multi-party interaction

Authors: Schmidt, Thomas ; Schütte, Wilfried

Published: 2014

Publisher: Valletta, Malta : European Language Resources Association (ELRA)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/2232 https://ids-pub.bsz-bw.de/files/2232/Schmidt_Schuette_FOLKER_2010_Paper.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-22323

This paper presents FOLKER, an annotation tool developed for the efficient transcription of natural, multi-party interaction in a conversation analysis framework. FOLKER is being developed at the Institute for German Language in and for the FOLK project, whose aim is the construction of a large corpus of spoken present-day German, to be used for research and teaching purposes. FOLKER builds on the experience gained with multi-purpose annotation tools like ELAN and EXMARaLDA, but attempts to improve transcription efficiency by restricting and optimizing both data model and tool functionality to a single, well-defined purpose. This paper starts with a description of the GAT transcription conventions and the data model underlying the tool. It then gives an overview of the tool functionality and compares this functionality to that of other widely used tools.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	spoken language; corpus; transcription; computational linguistics
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess
