Table 1

Measurement methods of Chinese linguistic features

| Linguistic feature | Formula | Operation steps | Criteria |
| --- | --- | --- | --- |
| ANS | (formula image not reproduced) | 1. No of sentences: split the text into sentences at Chinese punctuation marks (eg, ‘。’, ‘?’, ‘!’, ‘;’, ‘……’) and count them. 2. No of words: the total number of words in the text, excluding Arabic numerals and English letters. | >10 for simple materials; 6–10 for primary materials; <6 for intermediate and difficult materials. |
| ASL | (formula image not reproduced) | | |
| ANDW | (formula image not reproduced) | Difficult words: the words in the third, fourth and superclass grades of <Vocabulary and Characters of Different HSK Levels>.⁶¹ 1. Segment the text into words with the NLPIR Chinese lexical analysis system. 2. Build a word database based on <Vocabulary and Characters of Different HSK Levels> in SQL Server software (Microsoft, Redmond, Washington, USA). 3. Calculate the word frequency of each grade in the text with SQL Server software. | |
| CLDC | ASL + ANDW | | 20–30 for intermediate materials; >30 for difficult materials. |
  • ANDW, average number of difficult words per hundred words; ANS, average number of sentences per hundred words; ASL, average sentence length; CLDC, Chinese language difficulty coefficient; NLPIR, Natural Language Processing and Information Retrieval Sharing Platform.
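
As an illustration, the measures in table 1 could be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation. The formulas for ANS, ASL and ANDW are inferred from the abbreviation definitions above (sentences per hundred words, words per sentence, difficult words per hundred words). The text is assumed to be already segmented into words (the study used NLPIR; here a token list is passed in directly). The `difficult_words` set is a hypothetical stand-in for the third, fourth and superclass grades of the HSK vocabulary list. A token is counted as a word only if it contains a Chinese character, which is one way to exclude Arabic numerals, English letters and punctuation. The way the ANS and CLDC criteria are combined in `classify` is also an assumption, because the table does not state a criterion for every combination.

```python
import re

# Sentence-ending marks listed in table 1; the fullwidth variants are an added assumption.
SENTENCE_END = re.compile(r"(?:……|[。?!;？！；])+")
# Assumption: a countable word contains at least one CJK unified ideograph.
CJK = re.compile(r"[\u4e00-\u9fff]")


def count_sentences(text: str) -> int:
    """Split at sentence-ending punctuation and count the non-empty pieces."""
    return sum(1 for part in SENTENCE_END.split(text) if part.strip())


def text_metrics(text: str, tokens: list[str], difficult_words: set[str]) -> dict:
    """Compute ANS, ASL, ANDW and CLDC from raw text and its segmented tokens."""
    n_sentences = count_sentences(text)
    words = [t for t in tokens if CJK.search(t)]  # drops numerals, Latin letters, punctuation
    n_words = len(words)
    if n_sentences == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    n_difficult = sum(1 for w in words if w in difficult_words)

    ans = 100 * n_sentences / n_words   # sentences per hundred words (inferred formula)
    asl = n_words / n_sentences         # words per sentence (inferred formula)
    andw = 100 * n_difficult / n_words  # difficult words per hundred words (inferred formula)
    return {
        "sentences": n_sentences,
        "words": n_words,
        "ANS": ans,
        "ASL": asl,
        "ANDW": andw,
        "CLDC": asl + andw,             # CLDC = ASL + ANDW, as given in table 1
    }


def classify(ans: float, cldc: float) -> str:
    """Apply the criteria of table 1; the order of the checks is an assumption."""
    if ans > 10:
        return "simple"
    if ans >= 6:
        return "primary"
    if cldc > 30:
        return "difficult"
    if cldc >= 20:
        return "intermediate"
    return "unclassified"  # ANS < 6 and CLDC < 20 is not covered by the table


if __name__ == "__main__":
    text = "我喜欢苹果。你喜欢什么?我们去医院检查身体!"
    tokens = ["我", "喜欢", "苹果", "。", "你", "喜欢", "什么", "?",
              "我们", "去", "医院", "检查", "身体", "!"]
    metrics = text_metrics(text, tokens, difficult_words={"检查"})  # hypothetical difficult word
    print(metrics, classify(metrics["ANS"], metrics["CLDC"]))
```

Passing the token list separately keeps the sketch independent of any particular segmenter, so output from NLPIR or another tool can be substituted without changing the calculations.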