Information and Library Center, Nguyễn Tất Thành University

Record information

Science and technology article
Classification/shelf number: 004
Title: English sentiment classification using a Gower-2 coefficient and a genetic algorithm with a fitness-proportionate selection in a parallel network environment

ISSN	1992-8645
DDC	004
Personal author	Vo, Ngoc Phu
Title	English sentiment classification using a Gower-2 coefficient and a genetic algorithm with a fitness-proportionate selection in a parallel network environment / Vo Ngoc Phu, Vo Thi Ngoc Tran
Publication information	Pakistan : Little Lion Scientific, 2018
Physical description	50 p.
Abstract	We have studied data mining and natural language processing for many years, and there are many significant relationships between the two fields. Sentiment classification has made many crucial contributions to different fields of everyday life, such as political activities, commodity production, and commercial activities. A new model using a Gower-2 Coefficient (HA) and a Genetic Algorithm (GA) with a fitness function (FF) which is a Fitness-proportionate Selection (FPS) has been proposed for sentiment classification. It can be applied to big data. The GA can process many bit arrays, so it saves a lot of storage space: we do not need much storage to hold a big data set. First, we create many sentiment lexicons of our basis English sentiment dictionary (bESD) by using the HA through a Google search engine with the AND and OR operators. Next, according to the sentiment lexicons of the bESD, we encode the 7,000,000 English sentences of our training data set, comprising 3,500,000 negative and 3,500,000 positive sentences, into bit arrays in a small storage space. We also encode all sentences of the 8,000,000 English documents of our testing data set, comprising 4,000,000 positive and 4,000,000 negative documents, into bit arrays in the same small storage space. We use the GA with the FPS to cluster one bit array (corresponding to one sentence) of one document of the testing data set into either the bit arrays of the negative sentences or the bit arrays of the positive sentences of the training data set. The sentiment classification of a document is based on the sentiment classification results of its sentences. We tested the proposed model in both a sequential environment and a distributed network system, and achieved 88.12% accuracy on the testing data set. The model runs faster in the parallel network environment than in the sequential system. The results of this work can be widely used in applications and research on English sentiment classification.
Subject term	Bigdata-English sentiment-Algorithm
Uncontrolled keyword	English sentiment classification
Uncontrolled keyword	Distributed system
Uncontrolled keyword	Cloudera
Uncontrolled keyword	Gower-2
Uncontrolled keyword	Similarity coefficient
Uncontrolled keyword	Hadoop map and hadoop reduce
Uncontrolled keyword	Fitness-proportionate selection
Uncontrolled keyword	Genetic algorithm
Faculty	Faculty of Information Technology
Added personal author	Vo, Thi Ngoc Tran
Source	Journal of Theoretical and Applied Information Technology. Issue: Vol. 96 (2018), No. 4, pp. 887-936
Location	Nguyễn Tất Thành University Library
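
Note on the method named in the abstract: the record does not say how the authors implement fitness-proportionate selection, so the following is only a minimal generic sketch of the standard roulette-wheel technique, in which an individual is picked with probability equal to its fitness divided by the total fitness. The function names, the tuple-of-bits representation, and the `matching_bits` fitness are illustrative assumptions, not taken from the paper.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def matching_bits(a, b):
    """Hypothetical fitness: number of positions where two bit arrays agree."""
    return sum(x == y for x, y in zip(a, b))


def fitness_proportionate_select(population, fitnesses, rng):
    """Pick one individual with probability fitness / sum(fitnesses)."""
    if not population or len(population) != len(fitnesses):
        raise ValueError("population and fitnesses must be non-empty and equal in length")
    if any(f < 0 for f in fitnesses):
        raise ValueError("fitness values must be non-negative")
    total = sum(fitnesses)
    if total == 0:
        # No individual is preferred, so fall back to a uniform choice.
        return rng.choice(population)
    cumulative = list(accumulate(fitnesses))
    spin = rng.random() * total  # a point on the "wheel", in [0, total)
    index = bisect_right(cumulative, spin)
    # Guard against floating-point rounding at the upper edge of the wheel.
    return population[min(index, len(population) - 1)]


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    query = (1, 0, 1, 1)  # assumed bit array for one test sentence
    population = [(1, 0, 1, 0), (0, 1, 0, 0), (1, 0, 1, 1)]
    fitnesses = [matching_bits(ind, query) for ind in population]
    print(fitness_proportionate_select(population, fitnesses, rng))
```

Individuals with zero fitness are never selected unless every fitness is zero, which is the usual behavior of roulette-wheel selection.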


000	00000nam#a2200000u##4500
001	19614
002	12
004	62EC1205-FA8D-46AA-936C-0CD285FE0524
005	202003090056
008	200302s2018 pk eng
009	1 0
022	\|a1992-8645
039	\|a20200309005618\|bphucvh\|c20200302140113\|dphucvh\|y20200302135714\|zphucvh
040	\|aNTT
041	\|aeng
044	\|apk
082	\|a004\|223
100	\|aVo, Ngoc Phu\|cDr.
245	\|aEnglish sentiment classification using a Gower-2 coefficient and a genetic algorithm with a fitness-proportionate selection in a parallel network environment / \|cVo Ngoc Phu, Vo Thi Ngoc Tran
260	\|aPakistan : \|bLittle Lion Scientific, \|c2018
300	\|a50 p.
520	\|aWe have studied data mining and natural language processing for many years, and there are many significant relationships between the two fields. Sentiment classification has made many crucial contributions to different fields of everyday life, such as political activities, commodity production, and commercial activities. A new model using a Gower-2 Coefficient (HA) and a Genetic Algorithm (GA) with a fitness function (FF) which is a Fitness-proportionate Selection (FPS) has been proposed for sentiment classification. It can be applied to big data. The GA can process many bit arrays, so it saves a lot of storage space: we do not need much storage to hold a big data set. First, we create many sentiment lexicons of our basis English sentiment dictionary (bESD) by using the HA through a Google search engine with the AND and OR operators. Next, according to the sentiment lexicons of the bESD, we encode the 7,000,000 English sentences of our training data set, comprising 3,500,000 negative and 3,500,000 positive sentences, into bit arrays in a small storage space. We also encode all sentences of the 8,000,000 English documents of our testing data set, comprising 4,000,000 positive and 4,000,000 negative documents, into bit arrays in the same small storage space. We use the GA with the FPS to cluster one bit array (corresponding to one sentence) of one document of the testing data set into either the bit arrays of the negative sentences or the bit arrays of the positive sentences of the training data set. The sentiment classification of a document is based on the sentiment classification results of its sentences. We tested the proposed model in both a sequential environment and a distributed network system, and achieved 88.12% accuracy on the testing data set. The model runs faster in the parallel network environment than in the sequential system. The results of this work can be widely used in applications and research on English sentiment classification.
650	\|aBigdata\|vEnglish sentiment\|xAlgorithm
653	\|aEnglish sentiment classification
653	\|aDistributed system
653	\|aCloudera
653	\|aGower-2
653	\|aSimilarity coefficient
653	\|aHadoop map and hadoop reduce
653	\|aFitness-proportionate selection
653	\|aGenetic algorithm
690	\|aFaculty of Information Technology
700	\|aVo, Thi Ngoc Tran\|cDr.
773	\|tJournal of Theoretical and Applied Information Technology\|gVol. 96 (2018), No. 4, P.887-936
852	\|aNguyễn Tất Thành University Library
890	\|c1\|a0\|b0\|d1
