EUSKORPORA, the Linguistic Data Center for Basque Digital Technologies, a new association based in Donostia/San Sebastián, is seeking young professionals at the beginning of their careers to support key tasks related to the creation of linguistic resources and language technologies for the Basque language.

Selected individuals will join an interdisciplinary team and participate in projects involving the collection, annotation, and analysis of linguistic data, as well as the development of open-source foundational language models (ASR, TTS, MT, NLP) oriented to Basque, in a research and development context closely connected to industry.

Main Responsibilities:

  • Assist in the collection, cleaning, and annotation of linguistic corpora (text and audio). 
  • Collaborate in the training and evaluation of language models for text and speech. 
  • Participate in the description, documentation, cataloging, and maintenance of linguistic resources. 
  • Contribute to the integration of open-source NLP tools and libraries. 
  • Work in coordination with technical, linguistic, and management staff.
  • Assist in preparing technical reports and publications and in disseminating results.

Requirements: 

  • Bachelor’s or Master’s degree in Computational Linguistics, Computer Engineering, Data Science, Translation with a focus on language technologies, or related fields. 
  • Basic knowledge of NLP, language models, or speech technologies. 
  • Previous participation in related research or development projects is valued. 

Technical Knowledge: 

  • Python programming (basic/intermediate level). 
  • Familiarity with tools for linguistic annotation or text processing. 
  • Experience in corpus collection and resource creation is valued. 
  • Experience with libraries like Hugging Face, spaCy, or similar is valued. 
  • Basic use of version control tools (Git). 

Languages: 

  • Basque: high level (B2 or higher) 
  • Spanish: fluent 
  • English: high level (B2 or higher) 

We Offer: 

  • Integration into a newly created, dynamic and innovative center specialized in Basque language technologies, based in San Sebastián. 
  • Attractive national and international development projects to position Basque in the digital world. 
  • Ongoing training in cutting-edge language technologies. 
  • Flexible working arrangements and a collaborative environment.
  • Real opportunities for professional growth within the team. 
  • Experience in an interdisciplinary environment with social and cultural impact. 
  • Competitive salary aligned with training and experience.