Results for *

8 results found.

Showing results 1 to 8 of 8.

  1. KorAP architecture – diving in the deep sea of corpus data
    Author: Diewald, Nils
    Published: 2016
    Publisher: Institut für Deutsche Sprache, Bibliothek, Mannheim

    Source: DNB subject group German Language and Literature
    Contributors: Hanl, Michael (author); Margaretha, Eliza (author); Bingel, Joachim (author); Kupietz, Marc (author); Bański, Piotr (author); Witt, Andreas (author); Calzolari, Nicoletta (editor); Choukri, Khalid (editor); Declerck, Thierry (editor); Goggi, Sara (editor); Grobelnik, Marko (editor); Maegaard, Bente (editor); Mariani, Joseph (editor); Mazo, Helene (editor); Moreno, Asunción (editor); Odijk, Jan (editor); Piperidis, Stelios (editor)
    Language: English
    Media type: Unspecified
    Format: Online
    Keywords: Corpus <linguistics>
    Additional keywords: Korpusanalyseplattform (KorAP); Institut für Deutsche Sprache <Mannheim>; Text linguistics; microservices; large corpus data
    Extent: Online resource
    Note(s):

    In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia. - Paris : European Language Resources Association (ELRA), 2016, pp. 3586-3591, ISBN 978-2-9517408-9-1

  2. Forschungsinfrastrukturen in außeruniversitären Forschungseinrichtungen
    Forschungsbericht
    Published: 2014
    Publisher: Inst. für Dt. Sprache, Mannheim

    Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek
    No interlibrary loan
    Full text (free of charge)
    Source: Leibniz-Institut für Deutsche Sprache, Bibliothek
    Contributors: Fiedler, Norman; Werthmann, Antonina; Stührenberg, Maik; Schonefeld, Oliver; Bingel, Joachim; Witt, Andreas
    Language: German
    Media type: Book (monograph)
    Format: Online
    Keywords: Research process; Infrastructure; Computational linguistics; Humanities; Social sciences
    Extent: Online resource
  3. Forschungsinfrastrukturen in außeruniversitären Forschungseinrichtungen. Forschungsbericht
  4. Instantiation and implementation of a corpus query lingua franca
    Published: 2015


    The present thesis introduces KoralQuery, a protocol for the generic representation of queries to linguistic corpora. KoralQuery defines a set of types and operations which serve as abstract representations of linguistic entities and configurations. By combining these types and operations in a nested structure, the protocol may express linguistic structures of arbitrary complexity. It achieves a high degree of neutrality with regard to linguistic theory, as it provides flexible structures that allow for the setting of certain parameters to access several complementing and concurrent sources and layers of annotation on the same textual data. JSON-LD is used as a serialisation format for KoralQuery, which allows for the well-defined and normalised exchange of linguistic queries between query engines to promote their interoperability. The automatic translation of queries issued in any of three supported query languages to such KoralQuery serialisations is the second main contribution of this thesis. By employing the introduced translation module, query engines may also work independently of particular query languages, as their backend technology may rely entirely on the abstract KoralQuery representations of the queries. Thus, query engines may provide support for several query languages at once without any additional overhead. The original idea of a general format for the representation of linguistic queries comes from an initiative called Corpus Query Lingua Franca (CQLF), whose theoretical backbone and practical considerations are outlined in the first part of this thesis. This part also includes a brief survey of three typologically different corpus query languages, thus demonstrating their wide variety of features and defining the minimal target space of linguistic types and operations to be covered by KoralQuery.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Master's thesis
    Format: Online
    DDC classification: Linguistics (410)
    Keywords: Corpus; Computational linguistics; Text linguistics; SQL
    License:

    creativecommons.org/licenses/by-nd/4.0/ ; info:eu-repo/semantics/openAccess

  5. KoralQuery - a General Corpus Query Protocol
    Published: 2015
    Publisher: Linköping University Electronic Press, Linköpings universitet


    The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment for the interoperability of corpus analysis systems, which lack a common protocol. In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized format and illustrate use cases in the KorAP project.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Linguistics (410)
    Keywords: Corpus; Computational linguistics; Natural language processing
    License:

    creativecommons.org/licenses/by-nd/4.0/ ; info:eu-repo/semantics/openAccess

  6. Named entity tagging a very large unbalanced corpus: training and evaluating NE classifiers
    Published: 2014
    Publisher: Reykjavik : European Language Resources Association (ELRA)


    We describe a systematic and application-oriented approach to training and evaluating named entity recognition and classification (NERC) systems, the purpose of which is to identify an optimal system and to train an optimal model for named entity tagging DeReKo, a very large general-purpose corpus of contemporary German (Kupietz et al., 2010). DeReKo's strong dispersion with respect to genre, register and time forces us to base our decision for a specific NERC system on an evaluation performed on a representative sample of DeReKo instead of performance figures that have been reported for the individual NERC systems when evaluated on more uniform and less diverse data. We create and manually annotate such a representative sample as evaluation data for three different NERC systems, for each of which various models are learnt on multiple training data. The proposed sampling method can be viewed as a generally applicable method for sampling evaluation data from an unbalanced target corpus for any sort of natural language processing.

     

    Source: BASE, German studies subset
    Language: German
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Deutsches Referenzkorpus (DeReKo); Corpus; Text corpus; Identity management
    License:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  7. KorAP: the new corpus analysis platform at IDS Mannheim
    Published: 2014
    Publisher: Poznań : Uniwersytet im. Adama Mickiewicza w Poznaniu


    The KorAP project (“Korpusanalyseplattform der nächsten Generation”, “Corpus-analysis platform of the next generation”), carried out at the Institut für Deutsche Sprache (IDS) in Mannheim, Germany, has as its goal the development of a modern, state-of-the-art corpus-analysis platform, capable of handling very large corpora and opening the perspectives for innovative linguistic research. The platform will facilitate new linguistic findings by making it possible to manage and analyse extremely large amounts of primary data and annotations, while at the same time allowing an undistorted view of the primary un-annotated text, and thus fully satisfying expectations associated with a scientific tool. The project started in July 2011 and is funded till June 2014. The demo presentation in December will be the first version following a preliminary feature freeze, and will open the alpha testing phase of the project.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Corpus
    License:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  8. KorAP architecture – diving in the deep sea of corpus data
    Published: 2016
    Publisher: Paris : European Language Resources Association (ELRA)


    KorAP is a corpus search and analysis platform, developed at the Institute for the German Language (IDS). It supports very large corpora with multiple annotation layers, multiple query languages, and complex licensing scenarios. KorAP’s design aims to be scalable, flexible, and sustainable to serve the German Reference Corpus DEREKO for at least the next decade. To meet these requirements, we have adopted a highly modular microservice-based architecture. This paper outlines our approach: An architecture consisting of small components that are easy to extend, replace, and maintain. The components include a search backend, a user and corpus license management system, and a web-based user frontend. We also describe a general corpus query protocol used by all microservices for internal communications. KorAP is open source, licensed under BSD-2, and available on GitHub.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Corpus
    License:

    creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess