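Looking through user reviews, we noticed that attribute words tend to appear together with sentiment words: users usually express a sentiment while describing an attribute, so the attribute is the target of the sentiment. We also noticed that attribute words and domain-specific sentiment words are almost always nouns or adjectives (including predicative adjectives).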
[Figure: algorithm flowchart]
The review data looks like this:
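As a rough illustration of the input this approach works on, the sketch below assumes each review is one line of space-separated "word/tag" tokens, which is the layout implied by the way the article's code splits on spaces and slashes. The sample line is invented for illustration and is not taken from the original data; the tag "副詞" and the helper name `parse_tagged_line` are likewise hypothetical.

```python
# Hypothetical sample line (NOT from the original data set). The tags
# "名詞", "形容詞" and "標點" mirror the string literals in the article's
# code; "副詞" is an assumed tag added for illustration.
sample = "屏幕/名詞 很/副詞 清晰/形容詞 ,/標點"


def parse_tagged_line(line):
    """Split 'word/tag word/tag ...' into a list of (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST slash, so the tag is whatever
        # follows it and the word is everything before it.
        word, sep, tag = token.rpartition("/")
        if sep:  # skip tokens that carry no tag at all
            pairs.append((word, tag))
    return pairs


print(parse_tagged_line(sample))
```

Working with (word, tag) pairs instead of one flat list avoids having to remember that odd positions hold tags and even positions hold words.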
The code is as follows:
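# encoding=utf-8
#############################
#
# Purpose: given a set of Chinese product reviews, find the evaluation
# targets (attribute words) and the opinion words that describe them.
#
# @author: licl
#
# Ported to Python 3. The input is expected to hold one review per line,
# already segmented and POS-tagged as "word/tag" tokens separated by
# spaces. The tag strings below must match the tagger's output exactly.
# Both files are assumed to be UTF-8 encoded.
#
##############################

NEGATIONS = ('不', '不怎么樣', '不怎么', '不太')

try:
    with open('jd_dfb_comments_out.txt', 'r', encoding='utf-8') as fdata, \
         open('pattern_result.txt', 'a', encoding='utf-8') as output:
        for line in fdata:
            # Flatten "word/tag word/tag ..." into [word, tag, word, tag, ...].
            # strip() removes the trailing newline so the last tag compares cleanly.
            tokens = line.strip().replace(" ", "/").split("/")
            i = 1  # odd indices hold tags, even indices hold words
            while i < len(tokens):
                if tokens[i] != "名詞":
                    i += 2
                    continue
                # Found a noun: treat it as a candidate attribute word.
                result = [tokens[i - 1], " ", " "]
                a = i - 1
                i += 2
                # Scan forward for the first adjective before any punctuation.
                while i < len(tokens):
                    if tokens[i] == "標點":
                        i += 2
                        break
                    # Remember a negation word so it is prefixed to the adjective.
                    if tokens[i - 1] in NEGATIONS:
                        result[1] = tokens[i - 1]
                    if tokens[i] in ("形容詞", "形謂詞"):
                        result[1] += tokens[i - 1]
                        b = i - 1
                        # Distance between noun and adjective, counted in words.
                        result[2] = str((b - a) // 2)
                        # Output columns: attribute, opinion, distance.
                        output.write(" ".join(result) + " \n")
                        break
                    i += 2
except IOError:
    print("File does not exist or cannot be opened")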
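To try the matching rule without any files, here is a minimal sketch of the same idea as a pure function: for each noun, scan forward to the first adjective before the next punctuation mark, prefixing a negation word if one was seen on the way. The function name `extract_pairs`, the constants, the (word, tag) pair input and the sample review are all assumptions made for illustration. It is not a drop-in replacement for the script above, and it handles one edge case differently: a negation word that is itself tagged as an adjective is emitted once rather than doubled.

```python
# Tag strings copied from the article's code; they must match the tagger.
NOUN = "名詞"
PUNCT = "標點"
ADJ_TAGS = {"形容詞", "形謂詞"}
NEGATIONS = {"不", "不怎么樣", "不怎么", "不太"}


def extract_pairs(pairs):
    """Return (attribute, opinion, distance) triples from (word, tag) pairs."""
    results = []
    i = 0
    while i < len(pairs):
        word, tag = pairs[i]
        if tag != NOUN:
            i += 1
            continue
        negation = ""
        j = i + 1
        while j < len(pairs):
            w, t = pairs[j]
            if t == PUNCT:
                break  # the clause ended without an adjective
            if t in ADJ_TAGS:
                # distance is counted in words between noun and adjective
                results.append((word, negation + w, j - i))
                break
            if w in NEGATIONS:
                negation = w
            j += 1
        # Resume after the token that stopped the scan, as the script does.
        i = j + 1
    return results


# Invented example review, not taken from the original data.
review = [("屏幕", "名詞"), ("不", "副詞"), ("清晰", "形容詞"), (",", "標點"),
          ("電池", "名詞"), (",", "標點"), ("好", "形容詞")]
print(extract_pairs(review))
```

Note that punctuation acts as a hard clause boundary here: in the example, "電池" is followed by a punctuation mark before any adjective, so no pair is produced for it.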
Original article: https://blog.csdn.net/m53931422/article/details/41042791