Research article: 'CHEMICAL CHARACTERIZATION AND CLASSIFICATION OF GEOGRAPHICAL ORIGIN OF VIETNAMESE GREEN TEAS BASED ON 1H-NMR DATA COMBINED WITH MACHINE LEARNING'

Research article in the field of chemical sciences

CC BY
Keywords
PLS-DA and sPLS-DA / green teas / chemical characterization / Centella asiatica / 1H-NMR




DOI: 10.6060/ivkkt.20236612.6874


For citation:

Pham Quang Trung, Hoang Bich Ngoc, Nguyen Van Thuc, Tran Thi Hue, Pham Gia Bach, Ta Thi Thao. Chemical characterization and classification of geographical origin of Vietnamese green teas based on 1H-NMR data combined with machine learning. ChemChemTech [Izv. Vyssh. Uchebn. Zaved. Khim. Khim. Tekhnol.]. 2023. V. 66. N 12. P. 56-64. DOI: 10.6060/ivkkt.20236612.6874.

CHEMICAL CHARACTERIZATION AND CLASSIFICATION OF GEOGRAPHICAL ORIGIN OF VIETNAMESE GREEN TEAS BASED ON 1H-NMR DATA COMBINED WITH MACHINE LEARNING

Pham Quang Trung, Hoang Bich Ngoc, Nguyen Van Thuc, Tran Thi Hue, Pham Gia Bach, Ta Thi Thao

Pham Quang Trung, Hoang Bich Ngoc, Nguyen Van Thuc*, Pham Gia Bach, Ta Thi Thao

Faculty of Chemistry, VNU University of Science, Hanoi, 19 Le Thanh Tong Street, Hoan Kiem District, 110401, Hanoi, Vietnam

E-mail: Trungpham781@hus.edu.vn, bichngoc.hoang1999@gmail.com, nguyenvanthuc@hus.edu.vn*, phamgia-bach@hus.edu.vn, tathithao@hus.edu.vn

Tran Thi Hue

Faculty of Chemistry, Thai Nguyen University of Education, Tan Thinh Ward, Thai Nguyen City, 250000, Thai Nguyen, Vietnam

E-mail: huekhoahoand@gmail.com

Nuclear magnetic resonance (NMR) spectroscopy is widely used for analyzing samples of biological origin such as coffee, honey and fruit juice. In this study, the chemical composition of 34 samples of Vietnamese green tea was identified by 1H-NMR spectroscopy. The samples, collected in three provinces (Bac Kan, Thai Nguyen and Lao Cai), were classified according to the age of the tea leaves and tea trees, including ancient green tea and regular green tea, as well as their geographical origin. Chemical constituents such as catechins, caffeine and some amino acids were identified in the 1H-NMR spectra of both ancient tea and young green tea. Based on the spectral patterns, the tea samples were classified with partial least squares-discriminant analysis (PLS-DA) and sparse partial least squares-discriminant analysis (sPLS-DA) models using the MetaboAnalyst 5.0 software. The discrimination results showed that the age and biological origin of the green teas were classified by PLS-DA and sPLS-DA with accuracies of 82.6% and 81.2%, respectively. Furthermore, the classification results revealed a significant difference between Lao Cai and Bac Kan green teas, while Thai Nguyen green teas exhibited characteristics of both regions. The supervised learning-based classification models were applied to build a database and to classify and identify green tea patterns based on 1H-NMR spectroscopic data.

Key words: PLS-DA and sPLS-DA, green teas, chemical characterization, Centella asiatica, 1H-NMR

INTRODUCTION

Tea (Camellia sinensis) has been familiar to humans for thousands of years. It is widely used not only as a beverage but also in numerous food industries across the world. Tea leaves contain up to several thousand chemical compounds, of which polyphenols make up about 30-45% and caffeine about 2-5% of the solid green tea extract [1]. These components are of particular interest because of their valuable biological activities: antioxidant properties; stimulation of the central nervous system, heart and respiratory system; and prevention of cardiovascular disease, depression and even cancer [2-4]. Many studies have shown that climate and soil conditions, in addition to factors such as variety, cultivation method and manufacturing technique [5, 6], greatly influence the chemical composition of tea [7]. The composition is also affected by the age of the tea tree [8], especially in the case of ancient tea trees. Therefore, teas from different geographical regions have different economic values [7]. Currently, many tea shops sell famous brands without standard labels. In view of this, quality control and product traceability have become essential and bring practical benefits to manufacturers and consumers.

The ancient tea tree (Camellia taliensis) is an evergreen tree that typically grows 2-8 m high [9]. It is endemic to subtropical mountain evergreen forests in southern China (western Yunnan province) and northern Myanmar and Thailand, at altitudes of 1300-2000 m, and is sometimes found as high as 2700 m. In Vietnam, ancient tea trees are found in the northern high-mountain regions of Ha Giang, Son La, Lao Cai, Bac Kan and Dien Bien provinces and are commonly referred to as "wild" tea trees. Local people believe that ancient tea is healthier; therefore, ancient teas such as Snow Shan tea of Ta Xua (Son La province) or Suoi Giang (Thai Nguyen province) are priced 4-7 times higher than ordinary teas.

In Vietnam, apart from TCVN 9740:2013 (ISO 11287:2011), the classification of tea quality and geographical origin has traditionally relied on sensory evaluation by professional tea tasters. However, this evaluation method is limited in consistency and accuracy when differentiating tea qualities. Recently, teas have been identified and their geographical origin recognized using chemometrics combined with specific chemical composition data, including metal content profiles [10], the contents of the main catechins, polyphenols and caffeine [2, 11, 12], and even stable isotope fingerprinting [13]. To analyze the chemical composition of teas, instrumental methods such as gas chromatography-mass spectrometry (GC-MS) [14] and high performance liquid chromatography (HPLC) [2, 4, 15] have been used. However, these techniques require complicated sample preparation, are time-consuming and consume large amounts of solvent. Another approach uses spectroscopic methods such as near-infrared spectroscopy to classify the geographical origin of certain black and oolong teas [16] or Chinese green tea [17], but the typically broad and heavily overlapped signals lead to complex spectra and make it very difficult to assign specific features to a specific component [7]. Nuclear magnetic resonance (NMR) spectroscopy, on the other hand, has also been applied [5, 6, 8, 18, 19]. It is a fast technique that nevertheless covers a wide range of metabolites and gives a comprehensive view of the composition of tea without requiring complicated sample preparation. It can provide a chemical fingerprint that is useful in determining the origin and quality of tea.

In this study, untargeted 1H-NMR-based metabolomics was combined with pattern recognition by multivariate statistical models to identify Vietnamese teas from three distinct regions based on three factors: plant age, leaf age and geography. The 1H-NMR spectra provide useful information about the chemical composition of tea and also serve as input data for partial least squares-discriminant analysis (PLS-DA) and sparse partial least squares-discriminant analysis (sPLS-DA).

EXPERIMENTS

Materials

A total of 34 tea leaf samples collected in Lao Cai, Bac Kan and Thai Nguyen provinces were dried until the leaves were crisp, then dried continuously in dry air for a week and finally ground into fine powder.

The dried tea powder was stored in zip bags at room temperature. Ground tea samples should be kept in a cool place and used within 6-8 months of grinding.

NMR measurement

A 0.5 g portion of ground tea leaves was placed in a test tube and 10 mL of boiled deionized water was added. The mixture was ultrasonicated at 50 °C for 15 min and then left to cool to room temperature before the supernatant was filtered. This process was repeated a second time, and the two filtrates were combined. A 0.9 mL aliquot of the filtrate was transferred into a 5 mm NMR tube, and 0.1 mL of D2O was added. All 1H-NMR spectra were acquired at 298 K on a Bruker AVANCE III 500 MHz spectrometer equipped with a 5 mm BBFO multinuclear probe with z-gradient and automatic field locking, tuning and matching functions. For solvent suppression, a ZGPR pulse sequence followed by a NOESYGPPR1D pulse sequence was used to find the exact frequency of water and then suppress its signal. For each sample, 32 transients were collected into a time domain (TD) of 65 K complex data points using a 10245 Hz spectral width. The relaxation delay (RD) and acquisition time (AQ) were set to 4 s and 3.27 s, respectively, and four dummy scans (DS) were applied just before the acquisition. The 90° pulse width was calculated using the pulsecal command in the Topspin 3.2 package (Bruker Biospin, Germany), and the receiver gain was optimized by the Topspin AU program xaua.
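The acquisition parameters above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the authors' workflow: it converts the spectral width to ppm and gives a rough lower bound on the measurement time per sample. The nominal proton frequency of exactly 500 MHz is an assumption taken from the spectrometer label, and pulse lengths, mixing time and other overheads are ignored.

```python
# Acquisition parameters quoted in the text
SPECTRAL_WIDTH_HZ = 10245.0   # spectral width
RELAXATION_DELAY_S = 4.0      # RD
ACQUISITION_TIME_S = 3.27     # AQ
N_TRANSIENTS = 32             # recorded scans
N_DUMMY_SCANS = 4             # DS

# Assumption: nominal 1H frequency taken from the "500 MHz" label
PROTON_FREQ_MHZ = 500.0


def spectral_width_ppm(width_hz, freq_mhz):
    """Convert a spectral width from Hz to ppm (Hz divided by MHz)."""
    return width_hz / freq_mhz


def approx_experiment_time_s(rd_s, aq_s, n_scans, n_dummy):
    """Rough lower bound: every scan, dummy or recorded, costs RD + AQ."""
    return (n_scans + n_dummy) * (rd_s + aq_s)


sw_ppm = spectral_width_ppm(SPECTRAL_WIDTH_HZ, PROTON_FREQ_MHZ)
total_s = approx_experiment_time_s(
    RELAXATION_DELAY_S, ACQUISITION_TIME_S, N_TRANSIENTS, N_DUMMY_SCANS
)
print(f"spectral width: {sw_ppm:.2f} ppm")
print(f"approx. time per sample: {total_s / 60:.1f} min")
```

Under these assumptions the window is about 20.5 ppm wide and one sample takes a little over four minutes, which is consistent with the description of NMR as a fast technique.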

Data processing and multivariate analysis of 1H-NMR spectra

The FID signals were multiplied by an exponential function with 0.3 Hz line broadening prior to Fourier transformation (FT). All 1H-NMR spectra were manually phase- and baseline-corrected in Topspin 3.2. The data were then converted to ASCII format with the parameters sample name, frequency, peak intensity and chemical shift. These data were grouped according to sample origin and then uploaded to MetaboAnalyst, an online software tool.
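The exponential multiplication step can be illustrated as follows. This is a minimal sketch, assuming the usual form of the window, exp(-pi * LB * t), where LB is the line broadening in Hz; the toy FID and the dwell time are hypothetical and only serve to make the weighting visible.

```python
import math


def apply_line_broadening(fid, dwell_time_s, lb_hz):
    """Multiply each FID point by exp(-pi * LB * t), with t = index * dwell time."""
    return [
        point * math.exp(-math.pi * lb_hz * index * dwell_time_s)
        for index, point in enumerate(fid)
    ]


# Hypothetical toy FID: constant unit amplitude, so only the window shows
dwell = 1.0 / 10245.0  # assumption: one complex point every 1/SW seconds
toy_fid = [complex(1.0, 0.0)] * 8
weighted = apply_line_broadening(toy_fid, dwell, lb_hz=0.3)

for index, point in enumerate(weighted):
    print(index, f"{abs(point):.8f}")
```

The first point is left unchanged and later points are damped progressively, which suppresses noise in the tail of the FID at the cost of slightly broader lines after Fourier transformation.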

Pattern Recognition

Partial least squares discriminant analysis (PLS-DA) is a supervised alternative to principal component analysis (PCA); it is a fast linear method that often performs well. The appropriate number of latent variables of the model must be selected, for example by maximizing the classification accuracy in cross-validation. Interquartile range filtering, mean centering and median normalization were used as pre-processing for all data sets prior to analysis.

The sparse method is an extension of the classical methods in which the parameter vectors of a model are forced to contain many zeros by adding a penalty term to the objective function of the method under consideration. The algorithms used in this work for sPLS-DA apply the least absolute shrinkage and selection operator (Lasso) approach to generate sparsity in the coefficients of the model. Standard deviation filtering, normalization to sum, cube root transformation and autoscaling were used as pre-processing for all data sets prior to analysis.
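One common way a Lasso-type penalty produces exact zeros is soft-thresholding: each coefficient is shrunk toward zero by a fixed amount, and anything smaller than that amount becomes zero. The sketch below illustrates only this operator; it is not the MetaboAnalyst implementation, and the loading values and the penalty of 0.15 are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if |value| <= penalty."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def sparsify(loadings, penalty):
    """Apply soft-thresholding to every entry of a loading vector."""
    return [soft_threshold(weight, penalty) for weight in loadings]


# Hypothetical loading vector over five spectral variables
toy_loadings = [0.80, -0.05, 0.10, -0.60, 0.02]
sparse_loadings = sparsify(toy_loadings, penalty=0.15)
n_selected = sum(1 for weight in sparse_loadings if weight != 0.0)
print(sparse_loadings, n_selected)
```

In this toy case only the two large loadings survive, so the component is built from two variables instead of five. A larger penalty selects fewer variables.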

To classify the teas, two multivariate models, PLS-DA and sPLS-DA, were tested. Data preprocessing and classification were both carried out in the online software MetaboAnalyst 5.0. Combinations of data normalization options were tested to optimize the number of components (PCs) and the model accuracy.
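The sum normalization, cube transform and autoscaling steps named above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the transform is taken to be a cube root, autoscaling is taken to mean centering each variable and dividing by its sample standard deviation, and the intensity table is hypothetical. The filtering step is omitted.

```python
import math
import statistics


def normalize_to_sum(row):
    """Scale one sample so that its intensities sum to 1."""
    total = sum(row)
    return [value / total for value in row]


def cube_root(value):
    """Sign-preserving cube root."""
    return math.copysign(abs(value) ** (1.0 / 3.0), value)


def autoscale(columns):
    """Mean-center each variable and divide by its sample standard deviation."""
    scaled = []
    for column in columns:
        mean = statistics.mean(column)
        sd = statistics.stdev(column)
        scaled.append([(value - mean) / sd for value in column])
    return scaled


def preprocess(samples):
    """Rows are samples, columns are spectral variables."""
    rows = [[cube_root(v) for v in normalize_to_sum(row)] for row in samples]
    columns = [list(column) for column in zip(*rows)]
    return [list(row) for row in zip(*autoscale(columns))]


# Hypothetical table: 3 samples x 3 spectral variables
toy_samples = [[10.0, 30.0, 60.0], [20.0, 30.0, 50.0], [40.0, 35.0, 25.0]]
processed = preprocess(toy_samples)
for row in processed:
    print([round(value, 3) for value in row])
```

After these steps every variable has zero mean and unit variance across samples, so intense signals such as caffeine do not dominate the model merely because of their scale.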

Fig. 1. 1H-NMR spectra of a Snow Shan tea. (A) 1H-NMR spectrum; (B) 1H-NMR with solvent suppression using the ZGPR pulse sequence; (C) 1H-NMR with solvent suppression using the NOESYGPPR1D pulse sequence. Chemical shift axis from 8.0 to 0.5 ppm

RESULTS AND DISCUSSION

Identification of Chemical Constituents in Dried Tea Leaves

An example of the 1H-NMR spectra of tea extracts from Bac Kan province is shown in Fig. 2. The main metabolites were assigned by comparison with literature data [5, 6, 18, 20], revealing approximately 30 compounds in more than 50 signals or groups of signals. The whole spectrum can be divided into three main regions.

The first region, 0.5-3.0 ppm (Fig. 2), contained significant signals of theanine (1.02, 1.99, 2.31, 3.11 ppm), which is commonly found in tea, quinic acid (1.80, 1.90, 1.96 ppm) and theogallin (2.02, 2.05 ppm). Fatty acids and α-amino acids such as leucine, valine and alanine were also detected in this region. In the second region, from 3.0 to 5.0 ppm (Fig. 2), small signals corresponding to sugars were recognized, with sucrose giving the most obvious signal. Caffeine was the main xanthine observed in the spectrum, with significant signals at 3.22, 3.39 and 3.81 ppm. The common catechins (-)-epigallocatechin gallate (EGCG), (-)-epigallocatechin (EGC), (-)-epicatechin gallate (ECG) and (-)-epicatechin (EC), as well as other catechins such as (-)-gallocatechin-3-gallate (GCG), (-)-gallocatechin (GC), (-)-catechin-3-gallate (CG), (-)-epigallocatechin-3-(3''-O-methyl)-gallate, (-)-epigallocatechin-3,5-digallate, (-)-epicatechin-3,5-digallate and epiafzelechin, were mainly detected in the third region, at 2.3-3.0 ppm and 5.0-8.0 ppm. EGC and EC signals were insignificant in this region. This family of compounds can be easily identified by the characteristic signals arising from H-3 (3.8-5.2 ppm) and H-4 (2.5-3.1 ppm) of the heterocyclic ring. The small signals in the region of 5.0-8.0 ppm are mainly those of kaempferol and quercetin glycosides (flavonols), along with signals of gallic acid, theogallin and possibly p-coumaroyl quinic acid. Theobromine was detected at 7.84 ppm, while the theophylline signal was too low to be detected.

Fig. 2. 1H-NMR spectra of an ancient green tea sample from Bac Kan province. Regions 0-3 ppm (top), 3-4.5 ppm (middle) and 5-8.5 ppm (bottom)

Tea classification according to cultivation region

In general, the chemical compositions of the 34 tea samples were broadly similar, but the signal intensities differed between regions, between green and ancient tea, and even between leaves of different ages from the same plant. Therefore, multivariate statistical analysis was employed to recognize patterns in the entire 1H-NMR dataset and to visualize the global differences in tea leaf metabolites according to cultivation region.

The partial least squares-discriminant analysis (PLS-DA) model was first used to analyze the dependence of tea leaf metabolites on cultivation region (Fig. 3). The first component (PC1) explained the largest share of the total variance (90.3%), but the model's accuracy was not high: the best value, 58.8%, was reached at PC8 (99.2% of the total variance). The scores plot shows that the regions are not clearly separated. Therefore, the PLS-DA model is not the optimal choice for this classification. By comparison, the sparse partial least squares-discriminant analysis (sPLS-DA) model reached its highest accuracy of 55.9% at PC2, accounting for 55.5% of the total variance.

This model separated the samples from the Bac Kan and Lao Cai regions more clearly (Fig. 3). However, the samples from Thai Nguyen were scattered and were often misassigned to the Bac Kan group, which could significantly lower the accuracy.

Fig. 3. Tea classification results for the three regions Bac Kan, Lao Cai and Thai Nguyen according to the PLS-DA and sPLS-DA models (scores plots)

A further pairwise classification study was carried out with the tea samples of Bac Kan - Lao Cai, Lao Cai - Thai Nguyen and Thai Nguyen - Bac Kan using the sPLS-DA model (Fig. 4). The results demonstrated that the Bac Kan - Lao Cai tea samples were differentiated best, with the best result at PC8 (84% accuracy, 72.4% of the total variance). However, the classifications of Lao Cai - Thai Nguyen (highest accuracy 64% at PC6, 84.2% of the total variance) and Thai Nguyen - Bac Kan (highest accuracy 61.1% at PC2, 77.5% of the total variance) were not as clear, as several tea samples were misassigned to the Thai Nguyen area. It can therefore be concluded that the metabolite profiles of Thai Nguyen tea samples are close to those of both the Bac Kan and Lao Cai regions, while tea samples from Bac Kan and Lao Cai exhibit distinctive characteristics. The 1H-NMR spectra of tea samples from the three locations (Fig. 5) show that the primary constituents of Thai Nguyen tea, such as theanine and caffeine, are similar to those of Lao Cai tea, whereas its other constituents are comparable to those of Bac Kan tea.

Fig. 4. Tea classification results for each pair of regions with the sPLS-DA model (scores plots: Bac Kan - Thai Nguyen, Lao Cai - Thai Nguyen, Bac Kan - Lao Cai)

Fig. 5. 1H-NMR spectra of four tea samples from Thai Nguyen (A and D), Bac Kan (B) and Lao Cai (C), region 0.5-4.4 ppm (horizontal axis: f1, ppm)

Fig. 6. Classification results for old leaves versus young leaves according to the PLS-DA and sPLS-DA models (scores plots)

Classification of tea leaf metabolites according to the age of tea leaf

The differences in metabolite profiles between young and old leaves were first examined with the PLS-DA model. The tea samples were divided into two groups based on leaf age. Their 1H-NMR data were used as input, and the data processing was the same as before. The PLS-DA model reached its highest accuracy of 86.96% at PC6, accounting for 98.3% of the total variance, and only one sample of each leaf type was misidentified. However, the PLS-DA scores plot (Fig. 6) did not completely separate old and young leaves. To improve the differentiation between the two types of tea, the sPLS-DA model was applied. This model reached its highest accuracy of 82.6% at PC7 (85.1% of the total variance) and showed better separation (Fig. 6b).

Metabolic differentiations of ancient tea leaves and green tea leaves

Two groups of tea were analyzed. The first group consisted of the ancient teas, including Snow Shan tea from Lao Cai and Bac Kan as well as midland tea from Thai Nguyen. The second group included red bud tea and F1 hybrid tea from Bac Kan. The results showed that the PLS-DA model achieved its highest accuracy of 73.68% at PC4 (95.2% of the total variance), while the sPLS-DA model reached its highest accuracy of 81.6% at PC7 (74.2% of the total variance). As the classification results in Fig. 7 show, both models achieved relatively high accuracy at PCs accounting for a very high percentage of the total variance. However, in the PLS-DA model, the green tea samples were located entirely within the region of the ancient tea samples. This is understandable because ancient tea contains the same constituents as green tea, but in different amounts [9, 21, 22]. Therefore, the sPLS-DA model is the more reasonable model for showing the difference between green tea and ancient tea, as the separation is quite clear.

CONCLUSION

The chemical characterization and classification of Vietnamese tea samples were carried out based on 1H-NMR spectroscopic data from 34 samples of known origin. These samples were successfully classified by geographical origin, tree age and leaf age using the PLS-DA and sPLS-DA models, with the latter model achieving more accurate separation. The supervised learning models PLS-DA and sPLS-DA can replace principal component analysis (PCA) to obtain more accurate prediction results. These models provide a reliable and fast tool that does not require traditional determination of the chemical composition of the whole sample.

Fig. 7. Classification results for ancient tea and green tea according to the PLS-DA and sPLS-DA models

ACKNOWLEDGEMENT

This study was supported by Vietnam National University, Hanoi, under project number QG.20.23.

CONFLICT OF INTEREST

The authors declare no conflict of interest warranting disclosure in this article.

REFERENCES

1. Mandel S.A., Avramovich-Tirosh Y., Reznichenko L., Zheng H., Weinreb O., Amit T., Youdim M.B. // Neurosignals. 2005. V. 14(1-2). P. 46-60. DOI: 10.1159/000085385.

2. Chen Q., Guo Z., Zhao J. // J. Pharmaceut. Biomed. Anal. 2008. V. 48(5). P. 1321-1325. DOI: 10.1016/j.jpba.2008.09.016.

3. Cimpoiu C., Cristea V.-M., Hosu A., Sandru M., Seserman L. // Food Chem. 2011. V. 127(3). P. 1323-1328. DOI: 10.1016/j.foodchem.2011.01.091.

4. El-Shahawi M.S., Hamza A., Bahaffi S.O., Al-Sibaai A.A., Abduljabbar T.N. // Food Chem. 2012. V. 134(4). P. 2268-2275. DOI: 10.1016/j.foodchem.2012.03.039.

5. Lee J.-E., Lee B.-J., Chung J.-O., Shin H.-J., Lee S.-J., Lee C.-H., Hong Y.-S. // Food Res. Int. 2011. V. 44(2). P. 597-604. DOI: 10.1016/j.foodres.2010.12.004.

6. Tarachiwin L., Ute K., Kobayashi A., Fukusaki E. // J. Agricult. Food Chem. 2007. V. 55(23). P. 9330-9336. DOI: 10.1021/jf071956x.

7. Chen Q., Zhang D., Pan W., Ouyang Q., Li H., Urmila K., Zhao J. // Trends Food Sci. Technol. 2015. V. 43(1). P. 63-82. DOI: 10.1016/j.tifs.2015.01.009.

8. Mozumder N., Lee Y.-R., Hwang K., Lee M.-S., Kim E.H., Hong Y.-S. // Appl. Biolog. Chem. 2020. V. 63. DOI: 10.1186/s13765-020-0492-7.

9. Gao D.-F., Zhang Y.-J., Yang C.-R., Chen K.-K., Jiang H.-J. // J. Agricult. Food Chem. 2008. V. 56(16). P. 7517-7521. DOI: 10.1021/jf800878m.

10. Meng L., Chen X., Chen X., Yuan L., Shi W., Cai Q., Huang G. // Microchem. J. 2020. V. 153. P. 104512. DOI: 10.1016/j.microc.2019.104512.

11. Guo Z., Barimah A.O., Yin L., Chen Q., Shi J., El-Seedi H.R., Zou X. // Food Chem. 2021. V. 353. P. 129372. DOI: 10.1016/j.foodchem.2021.129372.

12. Wang J., Wang Y., Cheng J., Wang J., Sun X., Sun S., Zhang Z. // LWT - Food Sci. Technol. 2018. V. 96. P. 90-97. DOI: 10.1016/j.lwt.2018.05.012.

13. Cengiz M.F., Turan O., Ozdemir D., Albayrak Y., Perincek F., Kocabas H. // Int. J. Food Prop. 2017. V. 20(12). P. 3234-3243. DOI: 10.1080/10942912.2017.1283327.

14. Del Rio D., Stewart A.J., Mullen W., Burns J., Lean M.E., Brighenti F., Crozier A. // J. Agricult. Food Chem. 2004. V. 52(10). P. 2807-2815. DOI: 10.1021/jf0354848.

15. Li Y.-F., Ouyang S.-H., Chang Y.-Q., Wang T.-M., Li W.-X., Tian H.-Y., Cao H., Kurihara H., He R.-R. // Food Chem. 2017. V. 216. P. 282-288. DOI: 10.1016/j.foodchem.2016.08.017.

16. Chen Q., Zhao J., Fang C.H., Wang D. // Spectrochim. Acta Part A: Molec. Biomolec. Spectrosc. 2007. V. 66(3). P. 568-574. DOI: 10.1016/j.saa.2006.03.038.

17. Chen Q., Zhao J., Huang X., Zhang H., Liu M. // Microchem. J. 2006. V. 83(1). P. 42-47. DOI: 10.1016/j.microc.2006.01.023.

18. Ohno A., Oka K., Sakuma C., Okuda H., Fukuhara K. // J. Agricult. Food Chem. 2011. V. 59(10). P. 5181-5187. DOI: 10.1021/jf200204y.

19. Ta Thi Thao, Nguyen Thi Ngan, Vu Anh Phuong, Ha Tran Hung, Nguyen Van Thuc, Pham Quang Trung // ChemChemTech [Izv. Vyssh. Uchebn. Zaved. Khim. Khim. Tekhnol.]. 2021. V. 64. N 2. P. 41-48. DOI: 10.6060/ivkkt.20216402.6294.

20. Song Y. // Analyt. methods. 2014. V. 6. P. 907-914. DOI: 10.1039/c3ay41369a.

21. Mishra P., Nordon A., Tschannerl J., Lian G., Redfern S., Marshall S. // J. Food Eng. 2018. V. 238. P. 70-77. DOI: 10.1016/j.jfoodeng.2018.06.015.

22. Horzic D., Komes D., Belscak A., Game K.K., Ivekovic D., Karlovic D. // Food Chem. 2009. V. 115(2). P. 441-448. DOI: 10.1016/j.foodchem.2008.12.022.

Received 19.04.2023 Accepted 07.06.2023
