General Information

ISSN: 1793-8236 (Online)
Abbreviated Title: Int. J. Eng. Technol.
Frequency: Quarterly
DOI: 10.7763/IJET
APC: 500 USD
Managing Editor: Ms. Isa Yuan
Abstracting/Indexing: CNKI, Google Scholar, Crossref, EBSCO, etc.
E-mail: ijet_Editor@126.com

IJET 2026 Vol.18(2): 49-52
DOI: 10.7763/IJET.2026.V18.1342

Research on Core Issues and Mainstream Algorithms of Chinese Word Segmentation

Yuemeng Ren

School of Intelligent Science and Engineering, Chengdu Neusoft University, Sichuan, China
Email: 594251028@qq.com

Manuscript received May 1, 2026; accepted May 13, 2026; published June 15, 2026.

Abstract: With the rapid development of large language models and the deep integration of artificial intelligence into various industries, the requirements for accuracy and generalization in Natural Language Processing (NLP) are constantly rising. Large models built on the Transformer architecture are propelling NLP into a new stage. Chinese lacks natural word boundaries, which makes word segmentation a fundamental step in Chinese NLP, and its accuracy directly determines the effectiveness of downstream tasks. Owing to the linguistic characteristics of Chinese, word segmentation has long faced three core challenges: inconsistencies between general vocabularies and segmentation standards, difficulty in handling ambiguous segments, and poor identification of out-of-vocabulary words. The paper highlights the advantages of deep learning for word segmentation, detailing classic neural network models such as CNN, RNN, LSTM, and BiLSTM-CRF, as well as the application of the pre-trained models BERT and RoBERTa and of lightweight real-time models. The research shows that deep learning-based word segmentation methods offer the best overall performance and effectively address ambiguous segmentation and out-of-vocabulary word recognition. Different algorithms and systems can meet the diverse needs of scientific research, industry, and vertical fields. This study clarifies the evolution of Chinese word segmentation technology and provides a reference for the selection, engineering implementation, and optimization of word segmentation algorithms in the large-model era, and it is of value for advancing the high-quality development of Chinese natural language processing.

Keywords: large language model, Chinese word segmentation, deep learning, natural language processing

Cite: Yuemeng Ren, "Research on Core Issues and Mainstream Algorithms of Chinese Word Segmentation," International Journal of Engineering and Technology, vol. 18, no. 2, pp. 49-52, 2026.

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
