Opening of 1 Research Position
Corpus Linguistics Researcher - ILTEC
The ILTEC institute (Institute for Theoretical and Computational Linguistics) is looking for a post-doc fellow for a period of five years, to start before the end of 2008. The position is part of the Ciência 2008 program of the Foundation for Science and Technology (FCT).
The ILTEC Institute
ILTEC is a small research center in Lisbon, Portugal, founded in 1988. It has autonomous status but maintains strong ties to the two major universities in Lisbon (University of Lisbon and New University of Lisbon).
ILTEC is dedicated to research on linguistics and language education in the areas of language technology, human-computer interaction, computational modelling, linguistic diversity, literacy, professional communication, and discourse studies. ILTEC is divided into three research groups: Language and Linguistic Diversity, Lexicon and Computational Modeling, and Discourse and Literacy.
ILTEC is committed not only to fundamental research but also to research whose findings are of practical use to the community, as is the case with the Portal da Língua Portuguesa project (http://www.portaldalinguaportuguesa.org). The Portal da Língua Portuguesa is a website created to make the data in ILTEC's MorDebe database available to the general public in a user-friendly manner. As such, it provides an online vocabulary, a dictionary of inflectional morphology, dictionaries of gentilic names and of deverbal nouns, etc.
More information about the ILTEC institute can be found on its website (http://www.iltec.pt).
The main purpose of the fellowship is the development of projects in corpus linguistics, specifically geared to attaining standards of high international quality in the field.
ILTEC has been developing different corpora with multiple purposes in the last few years – Termináutica, Judo, Cinema, Comércio Electrónico, Nanotecnologia – all corpora of written specialized texts; REDIP – a corpus of media productions, both written and spoken; CORPORAL – a large spoken corpus of informal speech; as well as several newspaper corpora collected for the purpose of tracking neologisms. Other corpora are also being assembled for individual research purposes, primarily as part of several PhD projects.
Having all these corpora present at the institute, ILTEC is looking to build an infrastructure for corpus data which will permit the integration of the existing corpora, as well as the extension of the corpus material with new parts to be compiled in the future. The infrastructure should allow different information retrieval tasks, as well as knowledge extraction tasks. The infrastructure should also enable the corpora to be used for pedagogical purposes, for education in the fields of Lexicology, Lexicography, Terminology, and Translation Studies. Furthermore, the objective is to make these resources available not only to all linguists from ILTEC, but also to the scientific community at large.
One of the main objectives of the focus on corpus linguistics at ILTEC is the improvement of Terminological research. The institute has a long tradition and much experience in this field, and is often approached by third parties for the construction and improvement of terminologies. Therefore, the corpus-linguistic approach at ILTEC will have a strong focus on specialized language, which entails the easy constitution of new, specialized corpora, and tools for term extraction, definition extraction, and ontology extraction.
The researcher is expected to improve our capacities in gathering and managing large-scale corpora of different types of discourse (oral / written, general / specialized languages, etc.) and in performing advanced searches in these. One of the main applications of the researcher’s work at ILTEC will be the development of corpus-based terminological work, both mono- and multilingual. His/her specific tasks will include: developing an integrated infrastructure for large-scale resources and other corpora and lexicons; assuring their availability; developing resources to be used to foster terminological work; establishing automatic processes of information extraction; applying statistical NLP methods; and developing corpus annotation systems.
The researcher will be integrated into the Lexicon and Computational Modeling group, but is expected to develop cooperative work with other groups as well.
Candidates should have a PhD in the area of corpus linguistics, computational linguistics, or a related field, and (preferably) have at least three years of research experience since the completion of their PhD.
Apart from a strong background in corpus linguistics, computational semantics, natural language processing, statistical NLP and/or machine learning, candidates are expected to have some working experience and knowledge of the use of database systems, and some experience in programming (Perl, C/C++, PHP).
Candidates should also have a good command of English, and should have, or be willing to develop, active knowledge of spoken and written Portuguese, which is the target language of the majority of projects developed at ILTEC.
The position involves a contract for a fixed period of 5 years, and consists of a full research position with no teaching obligations.
The selected candidate will start with a one-year contract, concluded by an evaluation, after which it will be renewed annually, in accordance with nº 1 of Article 139 of the Código do Trabalho (Labour Regulations). At the end of the third year the contract will be renewed a single time for a period of two years, in accordance with nº 2 of Article 139 of the Código do Trabalho (Labour Regulations).
The annual salary for the position before tax is just over €43,000, equal to that of an assistant professor.
The contract is subject to prior agreement from the Foundation for Science and Technology, the national funding agency.
Candidates should send their application to the ILTEC institute, either by e-mail or by surface mail to the address below. The application should contain a curriculum vitae, a letter of motivation (stating the candidate’s reasons for applying and the potential benefits to ILTEC of the candidate's expertise), and two letters of recommendation (or the names and addresses of two qualified specialists who are in a position to recommend the candidate).
ILTEC-Instituto de Linguística Teórica e Computacional
Rua Conde de Redondo, 74, 5º andar