Educational and Scientific Center for Computational Linguistics

Russians in France. Impressions about the student internship.

May internships for FILR students at the Center for Applied Linguistics at the University of Franche-Comté in Besançon (France) are becoming a good tradition. For the first two weeks of the month, our students join a multinational community of trainees that numbers more than three thousand people a year.

One of the oldest foreign-language teaching centers, it teaches not only French as a foreign language but also a dozen other languages. The Center provides professional development for teachers in higher and secondary education, and a variety of research is carried out there. At the request of ministries of foreign affairs, embassies, ministries of education and other educational and cultural institutions, the Center’s specialists travel on assignment and work on site. Just over a year ago, on March 28, 2007, a ministerial commission made up of representatives of the Ministry of National Education, the Ministry of Culture and Communication, and the Ministry of Foreign Affairs awarded the Center for Applied Linguistics a special quality mark (label FLE), with the highest score in five categories, for its achievements in teaching French as a foreign language.

This year our partner celebrates its 50th anniversary. It is celebrating on a grand scale, bringing together the Center’s staff, trainees and partners, including the city, departmental and regional administrations, the television channel TV 5, and the magazine “French Language in the World”. As one of the slogans says, “...after all, languages connect and unite People, Cultures, Countries, Projects, Hopes... It is this idea that will be our common thread.” All forms of expression are welcome at the festivities: speech, writing, poetry, graphics, sound, video. Our group had the opportunity to attend one of the events opening the anniversary marathon: a vernissage of works by lycée students dedicated to the Center for Applied Linguistics.

Our stay was full of public holidays and weekends, but those days were far from wasted. There were trips to Paris, Lyon, Dijon and Beaune; the annual international festival of university music groups in Belfort; a visit to the medieval Château de Joux; and a boat ride to the waterfall on the Doubs River, with a hearty lunch of local dishes in a village restaurant: cheese fondue made from several varieties of Comté, assorted smoked meats (the pride of the region), the unique yellow wine and a delicious homemade pie. Hervé Lesch, head of the extracurricular activities department, put together a lively and full program. Nor were the sights of Besançon neglected: its museums, fortress, streets, houses and fountains. Thanks to our guide, the Center's teacher F. Ouabian, we saw the city from the inside.

The working days were no less intense: 40 hours of productive work under the guidance of Center for Applied Linguistics professionals Nicole Poirier and Jean-Louis Cordonier, using multimedia, audio and video materials, texts of various genres and, most importantly, an authentic language environment. The meeting with Ch. Dufaitre, a representative of the city administration from the department of tourism, culture and sports, was informative.

The support of the group's curators, A.-M. Stimpfling, M. Lakaya and J.-P. Beshat, was especially valuable. Thanks to their work, our students' stay was perfectly organized, the atmosphere around us was wonderful, and the impressions were the most pleasant.

Accompanying the group: O. Kryukov

Alieva Aysel

At the end of winter we were offered an internship in Besançon. After some deliberation, everyone agreed. So, on the first of May we flew to Paris, from where we then went to the capital of the Franche-Comté region.

The first thing that struck me about Besançon was its extraordinary natural beauty. Flowers, trees, birds: this is what the city can be proud of. The bright rays of the sun and the singing of birds served as my alarm clock in the morning. We were also very lucky with the weather: during our 14 days in France it rained only once. As it happens, it was on that rainy day that we took a boat excursion to the waterfalls. The people of Besançon are very friendly. Everyone has a bright smile and is ready to help if needed.

As for our studies, they were organized very well. First of all, the wishes of the students themselves were taken into account, so the program focused on developing speaking skills and on presenting texts and articles we had read. It was immediately obvious that the teachers tried to make classes as interesting and useful as possible. In addition to studying, we visited the Château de Joux and the music festival in Belfort, and went to Dijon and Beaune, so the sightseeing side of our program was intense too. On the last day of classes, we were invited to the culture and sports department of the city administration, where we were told in detail about various aspects of the city’s development: administrative structure, economy, transport, culture, tourism, sports...

That is how our internship went in Besançon, one of the nicest cities in France. There were many interesting moments, and it is impossible to describe them all on paper. But I can say for sure that each of us has vivid memories of Besançon. We will all remember our internship with great pleasure, and our cameras will help us with that.

Dina Svirtsova

I had only positive impressions from the trip.

Firstly, we were able to communicate in French in everyday situations, which would have been impossible in classes at the university. The trip was therefore very useful for language practice, and thanks to it, speaking French became much easier for me.

Secondly, we visited many interesting places, such as Dijon, an extremely beautiful city with many sights and so much greenery...

I also remember the trip to the Château de Joux, a real medieval castle with thick walls, a huge spiral staircase and a cold dungeon.

Then we went to a restaurant where we tried the specialties of the region: fondue, cold meats and, of course, yellow wine.

A walk along the Doubs River, literally a hundred meters from Switzerland, made a strong impression on me...

As for Besançon itself, it is a charming and beautiful town where you can walk around endlessly, admiring the views and the sights...

In general, the trip left a lot of good impressions and positive emotions.

Kuznetsova Marina

When we learned that we had the chance to go on an internship in France, our group lived on nothing but dreams and hopes. In February, the idea that a trip awaited us in May seemed unreal and distant, but dreams come true, and voilà, we are already in Besançon. My first impression of the city: “God, what nature there is here, what trees and flowers. Everything is in bloom!” From the very first day, despite being tired, I wanted to go for a walk. In my opinion, it is impossible to describe the feelings we experienced during these two weeks: endless delight and the desire to forget that you are a foreigner and a tourist, to blend into the crowd of townspeople and feel part of the city; to try to behave like a French student: buy bus and train tickets, go to bakeries and enjoy the smell of freshly baked bread, be polite to the people around you.

The days flew by at the speed of light, and now it is time for us to pack our bags and leave for Russia. It is a mixed feeling: of course we all missed our loved ones, but I am sure that each of us would gladly have stayed. Nobody wants to leave a place where the only problem is the lack of time to see and visit everything, especially when everyday problems and the summer exam session await us in Moscow. Besançon is a very calm and quiet city; I felt as if I were at a resort, especially in the early morning when the sun lit up the streets and trees.

Still, we were there for an internship, so we had not only to relax and have fun but also to study seriously. Our teachers were very attentive, always ready to help us, and keen to teach us something new. Our vocabulary also grew, because we often found ourselves in tricky situations that were hard to get out of without knowing certain words, but we always managed!

It is very nice! You get incredible pleasure when you say something in French and see that people understand you. Immediately there is a desire to speak French with everyone.

In conclusion, I want to say that the internship in Besançon helped me learn many new expressions, feel involved in the everyday life of the city, overcome my shyness and start communicating. And of course, when we get back to Moscow we will all show our photos to our parents and friends and tell them about the trip. There are so many memories that we really do not want to forget!

I hope that next year we will have another opportunity to visit France. And now, in Moscow, we should study diligently and improve our knowledge.

I want to say a huge thank you to those who made our dreams come true and organized this wonderful two-week trip to Besançon, in my opinion one of the most beautiful and cozy cities in France.

Kryukova Nina

For me, this is my first trip to France - the fulfillment of a childhood dream, a dream that has finally come true.

Ancient paved cities, the sun in the spray of the fountains, of which there are so many in Besançon, the tiled roofs of the houses, green hills, an ancient (and eternal, as the booklets say) citadel built by the great Vauban and preserving the memory of past centuries and wars: all this is so beautiful that it seems like a dream. You walk around the city and think you are about to wake up, because there cannot be such beauty and such magic in the world. It turns out there can.

In this city, on the same square, in the same house, the Lumière brothers and Charles Nodier were born, and in the house opposite, Victor Hugo. The citadel rises above this square, and the sun sparkles on the multicolored roof of the cathedral. Here history is intertwined with modernity, and it seems that Besançon has a special relationship with eternity, because the Museum of Time is located here.

In addition to lessons at the Center for Applied Linguistics (CLA), where time flew by, we were able to travel around France, which is exactly what we did during the many May holidays. We were in Dijon, Lyon and Belfort, and three volumes of enthusiastic impressions could be written about each of these cities.

Ancient Dijon, with its narrow medieval streets, the spires of Gothic cathedrals soaring into the sky, the colorful roofs of houses that still remember the knights and the Duchy of Burgundy, the Notre-Dame de Bourgogne Cathedral, which houses a statue of the Virgin Mary created in the 11th century, and the Tower of Philip the Good, from which you can see the whole city as if in the palm of your hand... Oh, one could talk about Dijon endlessly.

Lyon, a city of contrasts, standing on two rivers.

Belfort, home to an annual music festival that brings everyone together, regardless of age or native language.

The medieval castle of Joux, where Mirabeau lived freely in friendship with the owner of the castle and in love with his wife, where Toussaint Louverture died, where every stone has its own history, its own legends.

It is impossible to tell or list everything. But infinitely beautiful France, amazing teachers in whose lessons an hour flies by like a minute, an expressive lecture by the Algerian writer A. Djemaï on the work of Camus: we will never forget the two weeks spent here.

I’ll go pack my bags and dream of returning to Besançon more than once.

Shpilenok Evgenia

Besançon is a small, hospitable city in eastern France. This was not my first trip to Besançon, but, nevertheless, this city opened up for me from a new side.

Besançon has a surprisingly harmonious combination of untouched nature and urban landscape: city residents walk along the Doubs River embankment on weekends, and lead an urban lifestyle on weekdays.

The authorities of Besançon and its residents take a very responsible approach to protecting the environment. For example, most public buses run on natural gas (“Pour le bien-être de tous, cet autobus roule en gaz naturel”: “For everyone's well-being, this bus runs on natural gas”).

During our stay in France, we visited not only Besançon, but also Dijon, which is an hour's drive away. We also spent a wonderful day at the Château de Joux. After a tour of the castle, we enjoyed fondue, and then took a pleasure boat to Le Saut du Doubs waterfall, which is located on the border with Switzerland. So, we can say that, without making much effort, we visited Switzerland.

Finally, we attended a student music festival in the small town of Belfort. There were many talented groups of different musical styles and directions presented there.

As for the classes at the Center for Applied Linguistics (CLA), we should note the professionalism of the teachers, their willingness to explain incomprehensible things and provide new knowledge. We analyzed poems, watched and dissected a film, practiced listening and had debates.

On this trip, we got to know each other better, and also met new interesting people with whom we will probably want to communicate in the future.

France greeted us with great weather. We enjoyed the warmth and sun, sunbathed, walked and breathed the clean air of Besançon.

The only thing that puzzles me about France is that shops and other service establishments flatly refuse to open on Sundays and public holidays. This was especially noticeable for us because there were so many holidays during our stay.

Overall, I think that our trip was a success, especially since it was most likely the last one of my student life.

Kashitsyn Igor

Our suitcases are packed, our documents have not been forgotten, and that miracle of French technology, the TGV, is rushing us at great speed from hospitable Besançon, which has almost become our own, to enormous Paris. Two weeks in France flew by with the same lightning speed as that very high-speed train, the Train à grande vitesse. It is difficult to put all the impressions in order, but the most vivid ones stand out.

The first thing that strikes you is that in the catchphrase “Greece has everything,” you can, in the 21st century, safely change the name of the country to France. Everywhere you can feel that France, for all its cosmopolitanism, devoutly honors its own traditions and distinctiveness. Local residents drive comfortable French cars, are deservedly proud of French food, which is truly of the highest quality, and are ready to admire their language and their nature all day long.

Besançon itself turned out to be almost as I had imagined it, but this does not mean that I did not like it.

On the contrary, the city has preserved the spirit of medieval France, with its churches, cobbled streets and small courtyards where the French spend their evenings over a glass of wine.

For me, Besançon is names and titles: the Lumière brothers, Hugo, “The Red and the Black” (written by Stendhal, not Hugo, as I once believed). Red is the majestic citadel of Vauban, which offers an extraordinary view of the Doubs River valley, with its numerous bastions and its bridges, which were raised in wartime to make the city almost impregnable. Even the name of one of the city's quarters, Battant, hints at the city's great past. Black is the cassock of a Catholic priest. If you remember, Julien Sorel was going to enter the seminary in Besançon. It is also the stunning Church of Sainte-Madeleine, at whose foot Besançon students meet and have lunch.

Besançon is unusually conveniently located, and its good transport links allowed me, in a fairly short time, to see not only the city itself and its surroundings but also Burgundy and its very beautiful capital Dijon, the dream city of Lyon, whose entire center is part of a World Heritage Site, and even to visit Switzerland for 5 minutes after an excursion to the Jura Mountains.

The people of eastern France turned out to be unexpectedly open and hospitable, which combined wonderfully with their innate intelligence and impeccable manners. I would like to sincerely thank our hosts, the entire Center for Applied Linguistics (CLA) and especially our teachers. Thank you all very much!

P.S. Special thanks to the department and the faculty, and a small wish: can we go west now?! =) To Bordeaux, Nantes or Toulouse =)

Budulchieva Natalya

Besançon has become for me a city in which you forget about the existence of Moscow and Paris - megacities that overwhelm a person with their gigantic size and sky, distant and unreal. Someone will say that everything that happened to us in Besançon was a dream or a fairy tale, but this city seemed to me much more real than any other, because the water in it is really water, and the stars are close and bright.

Almost half of the days we spent in France were holidays. Thanks to this, we had more time to wander around Besançon on our own and to visit neighboring cities. The latter delighted me the most: just one or two hours by high-speed train, and you are in Dijon, or Lyon, or Belfort. Anywhere at all! France is indeed a small country, but always amazingly varied. It was fun to go to the Château de Joux, a real medieval castle-fortress, one day, and to be in Belfort at the university music festival the next.

Classes at the Center for Applied Linguistics were a revelation for me. It was really exciting: studying modern song lyrics and films, reading poets who are popular in France. And it was interesting simply to chat with the teachers, to find out where the best pastry shops in Besançon are or, for example, where and how local young people like to relax.

Yana Shcherbina

For me, this was my first acquaintance with Besançon and France, a long-awaited and pleasant one. The first thing you notice in Besançon is the absence of the traffic jams so characteristic of Moscow. There are few cars in Besançon: local residents prefer bicycles, which is good for both health and the environment. No cars means no noise, and the air is noticeably cleaner. The city has many trees and flowers, and a large number of fountains.

The French know how to preserve their history, so houses from the 18th and 19th centuries are in excellent condition and suitable for habitation, but modern buildings are rare in this city.

The same can’t be said about the city’s residents. In my opinion, these are the most polite and kindest people. In all the time I spent in Besançon, I never saw anyone make a scene or even raise their voice. The charming French habit of greeting everyone, acquaintances and strangers alike, is surprising.

All in all, Besançon is worth seeing with your own eyes, and I hope that I will return there more than once.

Kirdyashev Ivan

There is no feeling that we are in another country. It’s just that people speak a different, almost native, language. The meaning of what is happening is still unclear, but it seems that we are already in France.

... A few hours later. Gare de Lyon area. Place de la Bastille.

Paris at night turned out to be a dirty, slightly marginal, dangerous, but at the same time friendly and green city. Having been kicked out of the station along with our suitcases, in the cold night we tried to find a warm fireplace and a comfortable chair. As a result, we settled down on the veranda of a cafe with the quite formidable name “Bastilia”. The poor waiter who served us simply did not know what to do at half past four in the morning with twelve cold and hungry Russians. That night we felt like real travelers...

5 hours later we settled into a fairly comfortable seat (and most importantly warm!) in the TGV and headed off to our destination.

A fragrant French province. A little “à côté”, as the locals say, but without losing any of its merits, namely:

  1. The city is clean and green. “Besançon est une ville propre” (“Besançon is a clean city”), declare the local litter bins.
  2. The politeness of the French is amazing! I'm blown away!
  3. 6 hours of language a day. You can go crazy! From happiness. I experience catharsis every evening.

May 9, Friday. 5 days before departure.

Leshchenko, the Russian anthem and a company of 10 people in a cramped room. Seventeen times “hurray!” and the very next morning the whole hostel knows about us. In short, this is a special story about intercultural communication.

There was a music festival in the town of Belfort, in the north of the region, and we went to it. On the way we passed the Peugeot factories in Montbéliard: an unforgettable spectacle of the power of French industry. Belfort, unfortunately, turned out to be not “bel” (beautiful) but “sale” (dirty). Apparently, several days of festival and the presence of a large number of people (especially young people!) can ruin any place.

However, the performances of groups from different countries, musicians of different qualifications - from students of the Prague Conservatory to a traveling student orchestra from Portugal - created a special, hardly forgettable flavor of the action.

Truly, this is where the centrifugal forces of Mother Europe are most exposed. She is so different, but still united!

The last lesson with our favorite teacher. A symbolic exchange of gifts and the reading of a poem struck the final, slightly sad chord of our studies. Lyrical. Sad. Goodbye, Besançon.

Sans doute je reviens. J'espère donc je suis.

An early train and a day in Paris. We miraculously managed to get into Notre Dame de Paris. There, crowds of half-dressed tourists with vacant looks light up the stained glass windows with the flashes of their point-and-shoot cameras. This place inspires both hostility and reverent adoration at the same time.

My first encounter with the Parisian metro was not a success: I had to sneak through the turnstile. That's right, Russo Turisto! Then we went to that same de Gaulle airport, which I now consider the best I have ever seen.

Merci à toi, chère République Française.

FIN

Pyltsina Maria

Besançon strikes you at first glance with its high roofs and narrow cobbled streets, and captivates you with its comfort and serenity. I first visited Besançon two years ago, and even then the city won my heart. It seems like nothing special, yet there is something that made me return to the capital of Franche-Comté. After my first trip, I read Stendhal’s novel “The Red and the Black,” part of which takes place in Besançon. So on my second visit I was very interested in Julien Sorel's Besançon. During a tour of the city with Monsieur Ouabian, we visited the seminary that Julien was going to enter.

The weather was very pleasant. It is nice to remember the many picnics on the grass, in the shade of the plane trees and the fortress wall, right next to the calm water of the Doubs.

The classes at the Center for Applied Linguistics (CLA) amazed me with their variety every time. We managed to read and analyze several poems by French authors, watch the film “Amélie,” and even pick up the slang of modern French youth.

At the end of the trip I really didn’t want to leave. But let’s hope that this is not our last time in Besançon, and we will definitely visit this calm, cozy and captivating city and its kind, open and hospitable inhabitants again.

“The opening of a department at MIPT allows us not only to help its students.

Our goal is to make Computer Science teaching at FIVT the best in Russia.”
Svetlana Luzgina, corporate communications service.


Head of the department: Vladimir Pavlovich Selegey, Director of Linguistic Research at ABBYY

The Department of Computational Linguistics at FIVT was founded in 2011 by the Russian company ABBYY, one of the leading software developers in the field of artificial intelligence, in particular document recognition and natural language processing. The department trains specialists who can work effectively on the development of innovative language technologies, in particular ABBYY Compreno technology for syntactic and semantic text analysis.

Over the last decade, computational linguistics has been developing actively throughout the world, driven by the growing influence of the Internet and the emergence of many new devices with natural language interfaces. Technologies such as multilingual information retrieval, machine translation, knowledge extraction and speech recognition are developing especially rapidly. In Russia, computational linguistics has so far received insufficient attention in the education system. As a result, the Russian language is underrepresented in global research on computational linguistics.

The specialization “Computational Linguistics” at MIPT builds on the deep technical education that the institute provides. Classes at the base department are held at the ABBYY office, where company employees teach courses in automatic language processing, general and computational lexicography, and corpus linguistics, as well as core Computer Science disciplines in software development.

One of the goals of the department is to actively involve students in scientific life. It is important not only to know about modern world “trends” in computational linguistics, but also to be part of the global process. Students of the department take an active part in the development of ABBYY Compreno technology and a joint research project with the Russian State University for the Humanities to create a General Internet Corpus of the Russian Language (GIKRY) based on Russian-language Internet resources.

Admission to the department is competitive, both for undergraduates and for first-year master's students. Graduates with bachelor's degrees from all faculties of MIPT, as well as from other higher educational institutions, are accepted into the master's program. Admission is based on the results of solving logical and algorithmic problems and an interview with the leadership of the department.

If you would like to have an interview at the department or ask a question, please write to [email protected]. See you at ABBYY!

Head of the UC


General information

The Educational and Scientific Center for Computational Linguistics (UC) was opened at the Institute of Linguistics of the Russian State University for the Humanities in 2011 with the participation of ABBYY and the support of the Russian branch of IBM. The UC trains professional linguists who can work effectively on the development of innovative language technologies. From 2012, the UC will train master's students in the Computational Linguistics program within the Fundamental and Applied Linguistics field of study.

Computational linguistics is a relatively new field of scientific and engineering activity. The relevance of creating this master's program is determined by the fact that in the last 10-15 years there has been rapid development in this area, associated with the growing influence of the Internet and the emergence of a huge number of new technical devices, the most important part of which are natural language interfaces. In addition, in modern linguistics there is a rapid transition from traditional methods of obtaining language data to corpus methods, which require serious development of computer technologies.

The obvious and growing need for specialists capable of taking part in developing these technologies is, unfortunately, not yet backed by an adequate educational standard in the Russian education system. The proposed program is one of the first attempts to define what kind of specialists the industry requires.

The field concerned with automatic natural language (NL) processing, known as computational linguistics, requires training two fundamentally different kinds of specialists: linguists and engineers. These two tracks rest on two completely different education systems:

  • “Computational linguistics for engineers” is part of Computer Science. This track trains engineers who can effectively solve NL processing problems by relying on the existing linguistic resources and models needed for a specific task. The UC promotes the emergence of such specialists and cooperation with technical universities. In particular, with the participation of the UC at the Russian State University for the Humanities, a “parallel” master’s program in computational linguistics for engineers is being created at MIPT.
  • “Computational linguistics for linguists” is a branch of theoretical and applied linguistics. This track trains linguists who can build formal language models, and linguistic resources based on them, with the properties needed for use in automatic NL processing. It is this track that the master’s program “Fundamental and Computational Linguistics,” created by the UC, implements.

The most important circumstance is that specialists trained in these two areas are necessary participants in any serious projects in the field of automatic processing of NL. Although they perform significantly different functions, the ability to communicate effectively with each other is a key factor in the success of such projects. The foundations of such interaction are laid in programs through serious engineering and mathematical training of linguists (and corresponding linguistic training of engineers).

Thus, the training of master's students in computational linguistics under this program rests on an in-depth study of the fundamental principles of linguistics, with an emphasis on methods for building working formal models of the language system that match the complexity of natural language processing tasks such as speech recognition and synthesis, machine translation, semantic analysis and text understanding, and intelligent search.

The specifics of the UC are reflected in the following sections:

1. Formal language models (with an emphasis on the prospects for applied use);

2. Instrumental direction: specialized languages and packages for linguists (such as NLTK, R, etc.), available resources (from grammars and parsers to ontologies);

3. Applied direction (certain important NLP tasks, how they are solved, how linguistics is used);

4. Mathematical and engineering training. Statistics, formal grammars, introduction to machine learning methods.

The UC offers the following courses to master's students in the field of Computer Linguistics:

  • Mathematical foundations of computational linguistics. Review course of basic mathematical methods used in computational linguistics: mathematical logic; probability theory and statistics; formal grammars; theory of algorithms, in particular - the concept of algorithm complexity; machine learning;
  • Linguistic task programming (NLTK and R). The objective of the course is to train students to work with available interpreters based on the Python language. A brief introduction to programming techniques in general;
  • General and computer lexicography (according to the Lexicom program). The course introduces students to the principles of modern systematic lexicography; with new methods of lexicographic work, including corpus methods. Modern computer systems for creating dictionaries are considered, new trends in lexicography are analyzed (wiki projects, expert methods for assessing affiliation, etc.);
  • Models and methods of automatic text processing (NLP/AOT). An overview course consisting of two parts (matrix, with different lecturers): basic linguistic models + main tasks to be solved. The course is methodologically connected with the course “Mathematical Foundations of Linguistic Research”. The first part of the course is of a summary nature and is based on the systemic knowledge of language acquired by masters during undergraduate studies in linguistic specialties (this knowledge is necessary to pass the entrance exam);
  • Linguistic and ontological models. Ideologically, a very important course that builds a bridge between linguistic and extralinguistic models. The course examines the interface between lexical-semantic and ontological descriptions (in particular, the project of Igor Boguslavsky). Modern linguistic-ontological resources (*net), modern “mapping” projects between them (Martha Palmer and Co.) are analyzed;
  • Corpus linguistics. The problems of creating and evaluating corpora are considered. The Internet as a corpus. Methods for automatically creating corpora. Analysis of methods for using corpora in linguistic research (assessing the significance of the obtained statistical results);
  • Linguistic annotation and markup. Markup languages and methods, starting with XML. Ideologically close to Hovey's course;
  • Machine translation;
  • Methods for evaluating NLP applications;
  • Formal models and resources of the world's major languages (non-Indo-European);
  • Information retrieval;
  • Question-answering systems (IBM special course);
  • Specialized linguistic databases.

The UC offers the following courses to students of the Institute of Linguistics (specialty, bachelor's, master's):

  • Introduction to Computational Linguistics;
  • Computational linguistics. Main tasks and technologies;
  • Modern methods of sociolinguistics;
  • Automatic translation;
  • Linguistic foundations of machine translation;
  • Fundamentals of Computer Science;
  • The main directions of linguistic support for new information technologies (computer analysis of texts);
  • Computer science and information technology in linguistics;
  • Automatic natural language processing;
  • Automatic text processing, Automatic generation of text descriptions of images;
  • Computer support for translation activities;
  • Corpus linguistics.

Students undertake internships at ABBYY.

See also the UC Computational Linguistics page on the ABBYY website.

List of employees of the UC Computational Linguistics

Vladimir Pavlovich Selegey – Director of Linguistic Research at ABBYY, Head of the National Research Center for Computational Linguistics
"Introduction to Computational Linguistics"

The cultural and educational center "Arhe" invites you to a course of lectures by Alexander Chedovich Piperski, "Computational Linguistics".

Topic of the first lecture: “The main problems of computational linguistics and approaches to their solution.”

Machine translation, spell checking, text classification, speech recognition and much more: all these are tasks of computational linguistics. You can solve them in different ways: either by trying to imitate how a person works with language, or by hoping that everything can be handled with big data. But natural language is not easy to process automatically, and there are many difficulties along the way. The problems include homonymy (when the same word names different things), synonymy (when, on the contrary, the same thing is called by different words) and other properties of human languages that we don’t even notice in ordinary life.

About the lecturer:
Alexander Piperski, candidate of philological sciences, associate professor at the Institute of Linguistics of the Russian State University for the Humanities, researcher at the School of Philology at the National Research University Higher School of Economics, author of the book “Constructing Languages” (Alpina Non-Fiction, 2017).

About the course of lectures “Computational Linguistics”:

Computational linguistics is one of the most dynamically developing areas at the intersection of theory and practice. We come across the achievements of computational linguistics every day: machine translation, Internet search, voice assistants, and much more. Behind each such product is the serious work of linguists and programmers. During the course, we will talk about the history of computational linguistics and its most popular methods, and also look at how they allow us to solve important practical problems: for example, checking spelling or classifying news by topic.
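As an illustration of the spell-checking task mentioned above, here is a minimal sketch in Python. It assumes one classic approach, edit (Levenshtein) distance against a word list; nothing here is taken from the lecture course itself, and the names `edit_distance`, `suggest` and the toy `DICTIONARY` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


# Hypothetical toy dictionary; a real checker would use a large word list.
DICTIONARY = ["language", "linguistics", "translation", "corpus", "grammar"]


def suggest(word, dictionary=DICTIONARY, max_distance=2):
    """Return the closest dictionary word, or None if nothing is close enough."""
    if word in dictionary:
        return word
    best = min(dictionary, key=lambda entry: edit_distance(word, entry))
    return best if edit_distance(word, best) <= max_distance else None
```

For example, `suggest("gramar")` looks for the dictionary entry that needs the fewest edits. Real spell checkers usually also weigh word frequency and context, which this sketch leaves out.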

The Faculty of Philology of the Higher School of Economics is launching a new master's program dedicated to computational linguistics: it welcomes applicants with a basic education in the humanities or mathematics and anyone who is interested in solving problems in one of the most promising branches of science. Its director, Anastasia Bonch-Osmolovskaya, told Theories and Practices what computational linguistics is, why robots will not replace humans, and what will be taught in the HSE master’s program in computational linguistics.

This program is almost the only one of its kind in Russia. Where did you study?

I studied at Moscow State University in the department of theoretical and applied linguistics of the philological faculty. I didn’t get there right away: first I entered the Russian department, but then I became seriously interested in linguistics, and I was attracted by the atmosphere that remains in the department to this day. The most important thing there is good contact between teachers and students and their mutual interest.

When I had children and needed to earn a living, I went into the field of commercial linguistics. In 2005, it was not very clear what this area of activity actually was. I worked in different linguistic companies. I started with a small company at the site Public.ru, a kind of media library, where I began working on linguistic technologies. Then I worked for a year at Rosnanotech, where there was an idea to create an analytical portal whose data would be structured automatically. Then I headed the linguistic department at the Avicomp company, which is already serious production work in the field of computational linguistics and semantic technologies. At the same time, I taught a course on computational linguistics at Moscow State University and tried to make it more modern.

Two resources for a linguist. The National Corpus of the Russian Language is a site created by linguists for scientific and applied research related to the Russian language. It is a model of the Russian language, represented by a huge array of texts from different genres and periods. The texts carry linguistic markup, with the help of which you can obtain information about the frequency of particular linguistic phenomena. WordNet is a huge lexical database of the English language; the main idea of WordNet is to connect not words, but their meanings into one large network. WordNet can be downloaded and used for your own projects.
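The idea of linking meanings rather than words can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not WordNet's actual data or API: the `SENSES` inventory, the sense identifiers and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical miniature sense inventory. Each sense has a gloss, the words
# that can express it, and a link to a more general sense (or None).
SENSES = {
    "bank.n.01": {"gloss": "financial institution",
                  "words": ["bank"], "hypernym": "institution.n.01"},
    "bank.n.02": {"gloss": "sloping land beside a river",
                  "words": ["bank", "riverbank"], "hypernym": "slope.n.01"},
    "car.n.01": {"gloss": "four-wheeled motor vehicle",
                 "words": ["car", "automobile"], "hypernym": "vehicle.n.01"},
    "institution.n.01": {"gloss": "organization",
                         "words": ["institution"], "hypernym": None},
    "slope.n.01": {"gloss": "inclined ground",
                   "words": ["slope"], "hypernym": None},
    "vehicle.n.01": {"gloss": "means of transport",
                     "words": ["vehicle"], "hypernym": None},
}

# Index from word to all of its senses: one word, several meanings (homonymy).
WORD_INDEX = defaultdict(list)
for sense_id, entry in SENSES.items():
    for word in entry["words"]:
        WORD_INDEX[word].append(sense_id)


def senses_of(word):
    """All sense ids a word can express."""
    return list(WORD_INDEX.get(word, []))


def synonyms(word):
    """Other words sharing at least one sense with `word` (synonymy)."""
    result = set()
    for sense_id in senses_of(word):
        result.update(SENSES[sense_id]["words"])
    result.discard(word)
    return result


def hypernym_chain(sense_id):
    """Follow 'is a kind of' links from a sense up to the most general one."""
    chain = []
    parent = SENSES[sense_id]["hypernym"]
    while parent is not None:
        chain.append(parent)
        parent = SENSES[parent]["hypernym"]
    return chain
```

In this sketch `senses_of("bank")` returns two separate senses, while `synonyms("car")` finds "automobile" because both words point to the same sense. The network links sense identifiers, never bare strings.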

What does computational linguistics do?

This is an extremely interdisciplinary field. The most important thing here is to understand what is going on in the electronic world and who will help you do specific things.

We are surrounded by a very large amount of digital information. There are many business projects whose success depends on processing information, and these projects can relate to marketing, politics, economics or anything else. It is very important to be able to handle this information effectively: what matters is not only the speed of processing, but also the ease with which you can filter out the noise, get the data you need and build a complete picture from it.

Previously, computational linguistics was associated with some global ideas: for example, people thought that machine translation would replace human translation and that robots would work instead of people. But now this seems like a utopia, and machine translation is used in search engines to search quickly in an unfamiliar language. That is, linguistics now rarely deals with abstract problems; mostly it deals with small things that can be built into a large product to make money.

One of the big tasks of modern linguistics is the semantic web, in which search works not just by matching words but by meaning, and all sites are semantically annotated in one way or another. This can be useful, for example, for police or medical reports that are written every day. Analyzing internal connections yields a lot of necessary information, but reading and processing it manually is incredibly time-consuming.

In a nutshell: we have a thousand texts, and we need to sort them into groups, represent each text as a structure, and get a table that we can then work with. This is called unstructured information processing. On the other hand, computational linguistics also deals, for example, with generating artificial texts. There is a company that has come up with a mechanism for generating texts on topics that are boring for a person to write about: changes in real estate prices, weather forecasts, reports on football matches. Commissioning these texts from a person is much more expensive, and the computer-generated texts on such topics are written in coherent human language.
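The pipeline described above (sort texts into groups, turn each text into a structure, collect a table) can be sketched as follows. This is a deliberately naive illustration, assuming keyword matching instead of the trained models a production system would use; `TOPIC_KEYWORDS` and all function names are hypothetical.

```python
import csv
import io
import re

# Hypothetical topic keywords; a real system would learn these from data.
TOPIC_KEYWORDS = {
    "real_estate": {"apartment", "rent", "mortgage", "price"},
    "weather": {"rain", "snow", "temperature", "forecast"},
    "football": {"match", "goal", "team", "league"},
}
FIELDS = ["id", "topic", "numbers", "length"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text):
    """Pick the topic with the most keyword hits, or 'other' if none match.
    Ties go to the topic listed first in TOPIC_KEYWORDS."""
    tokens = set(tokenize(text))
    scores = {topic: len(tokens & words) for topic, words in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def to_record(doc_id, text):
    """Represent one text as a structure: topic, numbers found, token count."""
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    return {
        "id": doc_id,
        "topic": classify(text),
        "numbers": " ".join(numbers),
        "length": len(tokenize(text)),
    }


def build_table(texts):
    """Turn a list of raw texts into a list of rows, one per text."""
    return [to_record(i, text) for i, text in enumerate(texts, start=1)]


def to_csv(rows):
    """Serialize the rows as CSV so they can be opened as a table."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `build_table` on a list of short news texts and passing the result to `to_csv` gives one row per text. The point of the sketch is only the shape of the result: free text goes in, and a table with fixed columns comes out.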

Yandex is actively involved in developments in the field of searching unstructured information in Russia; Kaspersky Lab hires research groups that study machine learning. Is anyone on the market trying to come up with something new in the field of computational linguistics?

**Books on computational linguistics:**

Daniel Jurafsky, “Speech and Language Processing”

Christopher Manning, Prabhakar Raghavan, Hinrich Schütze, “Introduction to Information Retrieval”

Yakov Testelets, “Introduction to General Syntax”

Most linguistic developments are the property of large companies; almost nothing can be found in the public domain. This slows down the development of the industry; we do not have a free linguistic market or packaged solutions.

In addition, there is a lack of comprehensive information resources. There is such a project as the National Corpus of the Russian Language. It is one of the best national corpora in the world; it is developing rapidly and opens up incredible opportunities for scientific and applied research. The difference is about the same as in biology before and after DNA research.

But many resources do not exist for Russian. Thus, there is no analogue of such a wonderful English-language resource as FrameNet, a conceptual network in which all possible connections of a particular word with other words are formally represented. For example, take the word “fly”: who can fly, where, with what preposition the word is used, what words it combines with, and so on. This resource helps to connect language with real life, that is, to trace how a specific word behaves at the level of morphology and syntax. It is very useful.

The Avicomp company is currently developing a plugin for searching for articles with similar content. That is, if you are interested in an article, you can quickly look at the history of the story: when the topic arose, what was written about it, and when interest in the problem peaked. For example, with the help of this plugin it will be possible, starting from an article devoted to events in Syria, to see very quickly how events there have developed over the past year.

How will the learning process in the master's program be structured?

Education at HSE is organized in separate modules, just like in Western universities. Students will be divided into small teams, mini-startups - that is, at the end we should receive several finished projects. We want to get real products, which we will then open to people and leave in the public domain.

In addition to the students' immediate project managers, we want to find them curators from among their potential employers, from Yandex, for example, who will also play this game and give the students some advice.

I hope that people from a variety of fields will come to the master's program: programmers, linguists, sociologists, marketers. We will have several adaptation courses in linguistics, mathematics and programming. Then we will have two serious courses in linguistics, and they will be related to the most current linguistic theories; we want our graduates to be able to read and understand modern linguistic articles. It's the same with mathematics. We will have a course called “Mathematical Foundations of Computational Linguistics,” which will outline those branches of mathematics on which modern computational linguistics is based.

In order to enroll in a master's program, you need to pass an entrance exam in language and pass a portfolio competition.

In addition to the main courses, there will be a line of elective subjects. We have planned several cycles: two of them focus on a more in-depth study of individual topics, which include, for example, machine translation and corpus linguistics, and one, on the contrary, deals with related areas such as social networks, machine learning or Digital Humanities, a course that we hope will be taught in English.