{"id":53055,"date":"2024-11-25T10:13:39","date_gmt":"2024-11-25T08:13:39","guid":{"rendered":"https:\/\/www.bas.bg\/?p=53055"},"modified":"2024-11-28T10:14:54","modified_gmt":"2024-11-28T08:14:54","slug":"significant-advances-in-language-modelling-by-scientists-from-iict-bas","status":"publish","type":"post","link":"https:\/\/www.bas.bg\/?p=53055&lang=en","title":{"rendered":"Significant advances in language modelling by scientists from IICT-BAS"},"content":{"rendered":"<div id=\"attachment_52947\" style=\"width: 252px\" class=\"wp-caption alignright\"><a href=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442.jpg\"><img decoding=\"async\" aria-describedby=\"caption-attachment-52947\" class=\"wp-image-52947 size-medium\" src=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-242x300.jpg\" alt=\"\" width=\"242\" height=\"300\" srcset=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-121x150.jpg 121w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-200x248.jpg 200w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-242x300.jpg 242w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-400x497.jpg 400w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-600x745.jpg 600w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-768x954.jpg 768w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-800x994.jpg 800w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-824x1024.jpg 824w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-1200x1491.jpg 1200w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442-1237x1536.jpg 1237w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/\u0438\u0438\u043a\u0442.jpg 1557w\" sizes=\"(max-width: 242px) 100vw, 
242px\" \/><\/a><p id=\"caption-attachment-52947\" class=\"wp-caption-text\">Georgi Shopov<\/p><\/div>\n<p style=\"text-align: justify;\"><a href=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1.png\"><img decoding=\"async\" class=\"size-medium wp-image-52919 alignleft\" src=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-300x169.png\" alt=\"\" width=\"300\" height=\"169\" srcset=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-150x84.png 150w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-200x113.png 200w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-300x169.png 300w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-400x225.png 400w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-600x338.png 600w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-768x432.png 768w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-800x450.png 800w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-1024x576.png 1024w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-1200x675.png 1200w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/1-1536x864.png 1536w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>PhD student Georgi Shopov from the Institute of Information and Communication Technologies of the Bulgarian Academy of Sciences (IICT-BAS) took part in the world&#8217;s leading conference in the field of natural language processing, &#8220;Empirical Methods in Natural Language Processing&#8221;, which was held from 12 to 16 November in Miami, USA. At the conference, Georgi Shopov presented new scientific results in language modeling that were achieved at IICT-BAS and form the main part of his dissertation. Of the 6105 papers submitted, 1271 were selected for the main conference. Georgi Shopov&#8217;s paper, co-authored with Associate Professor Stefan Gerdjikov from IICT-BAS and FMI of SU \u201cSt. 
Kliment Ohridski\u201d, was among the 168 selected for oral presentation.<\/p>\n<p>In recent years, language models have established themselves as a fundamental approach in Artificial Intelligence. They have demonstrated remarkable abilities in solving problems related to natural language processing, programming, protein modeling, and basic linguistic and mathematical reasoning. However, the widely used modern language models (ChatGPT, Llama, Gemini, Claude) are <em>unidirectional<\/em>: they process and generate text strictly from left to right. The fixed directionality of this type of language model severely limits its expressiveness.<\/p>\n<p>In their work, the scientists from IICT-BAS present a new theoretical view of language modeling based on well-known formalisms from automata theory. Thanks to this formal connection, they have introduced a new class of <em>bidirectional<\/em> language models that are strictly more expressive than unidirectional ones and can solve significantly more complex problems. Another advantage of bidirectional language models is their higher efficiency compared to unidirectional ones. 
In other words, bidirectional language models allow text generation to be performed faster, on lower-performance computing devices, and at lower power consumption, which greatly increases their applicability.<\/p>\n<p>In the future, researchers at IICT-BAS plan to develop bidirectional language models further in order to allow effective control of the generated text, to determine the depth of inference, and to avoid the so-called hallucinations of language models.<\/p>\n<p>Link to publication:<\/p>\n<p><a href=\"https:\/\/aclanthology.org\/2024.emnlp-main.328.pdf\">https:\/\/aclanthology.org\/2024.emnlp-main.328.pdf<\/a><\/p>\n<p><a href=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2.png\"><img decoding=\"async\" class=\"alignnone 
wp-image-52921\" src=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-300x70.png\" alt=\"\" width=\"356\" height=\"83\" srcset=\"https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-150x35.png 150w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-200x47.png 200w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-300x70.png 300w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-400x93.png 400w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-600x140.png 600w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-768x179.png 768w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-800x187.png 800w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-1024x239.png 1024w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-1200x280.png 1200w, https:\/\/www.bas.bg\/wp-content\/uploads\/2024\/11\/2-1536x358.png 1536w\" sizes=\"(max-width: 356px) 100vw, 356px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>PhD student Georgi Shopov from the Institute of Information and Communication Technologies of the Bulgarian Academy of Sciences (IICT-BAS) took part in the world&#8217;s leading conference in the field of natural language processing, &#8220;Empirical Methods in Natural Language Processing&#8221;, which was held from 12 to 16 November in Miami, USA. At the conference, Georgi Shopov presented new scientific results in language modeling that were achieved at IICT-BAS and form the main part of his dissertation. Of the 6105 papers submitted, 1271 were selected for the main conference. Georgi Shopov&#8217;s paper, co-authored with Associate Professor Stefan Gerdjikov from IICT-BAS and FMI of SU \u201cSt. Kliment Ohridski\u201d, was among the 168 selected for oral presentation. 
In recent years, language models have established themselves as a fundamental <a href=\"https:\/\/www.bas.bg\/?p=53055&#038;lang=en\"> [&#8230;]<\/a><\/p>\n","protected":false},"author":15,"featured_media":52920,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1499],"tags":[],"_links":{"self":[{"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/posts\/53055"}],"collection":[{"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bas.bg\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=53055"}],"version-history":[{"count":2,"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/posts\/53055\/revisions"}],"predecessor-version":[{"id":53057,"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/posts\/53055\/revisions\/53057"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.bas.bg\/index.php?rest_route=\/wp\/v2\/media\/52920"}],"wp:attachment":[{"href":"https:\/\/www.bas.bg\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=53055"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bas.bg\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=53055"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bas.bg\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=53055"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}