Machine Learning (Hung-yi Lee) HW

HW0

Q1

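# Read the whole file; the with-block closes it automatically.
with open('data.txt') as f:
    text = f.read()

# Replace punctuation with spaces. Note that this also splits
# numbers such as 25,000 into two tokens.
text = text.replace(',', ' ').replace('.', ' ')

# split() with no argument splits on any run of whitespace, so the
# double spaces left by ", " and ". " do not produce empty words.
allWord = text.split()

# dict.fromkeys removes duplicates and keeps first-appearance order.
wordDict = dict.fromkeys(allWord)

# Print "word index count" for every distinct word.
for cnt, word in enumerate(wordDict):
    wordDict[word] = allWord.count(word)
    print(word + ' ' + str(cnt) + ' ' + str(wordDict[word]))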

Sample input: data.txt
data.txt:
A small sample of texts from Project Gutenberg appears in the corpus collection. However, you may be interested in analyzing other texts from Project Gutenberg. You can browse the catalog of 25,000 free online books at xxx, and obtain a URL to an ASCII text file. Although 90% of the texts in Project Gutenberg are in English, it includes material in over 50 other languages.
Sample output
A 0 1
small 1 1
sample 2 1
of 3 3
texts 4 3
from 5 2
Project 6 3
Gutenberg 7 3
appears 8 1
in 9 5
the 10 3
...

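After opening the file, read all of its text, use replace to strip the punctuation, and then use split to break the string into a list.

Then deduplicate the words with a dictionary. wordDict records each word and how often it occurs (the key is the word, the value is the number of occurrences).

Finally, print the results.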
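
The same steps can also be sketched with collections.Counter from the standard library, which counts all words in one pass instead of calling list.count once per word. This is only an alternative sketch, not the approach used above; the function name count_words and the sample string are made up for illustration.

```python
from collections import Counter

def count_words(text):
    """Return (word, index, count) tuples in first-appearance order."""
    # Same cleanup as the script: commas and periods become spaces.
    words = text.replace(',', ' ').replace('.', ' ').split()
    # Counter tallies every word in a single pass over the list.
    counts = Counter(words)
    # Counter is a dict subclass, so it keeps insertion order (Python 3.7+).
    return [(word, i, n) for i, (word, n) in enumerate(counts.items())]

# Hypothetical sample text, not the contents of data.txt.
sample = "A small sample of texts. A sample of words, and texts."
for word, i, n in count_words(sample):
    print(word, i, n)
```

For long inputs this avoids the quadratic cost of scanning the whole list once for each distinct word.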


Q2 Image fading

from PIL import Image

# convert("RGB") makes sure every pixel is an (r, g, b) tuple.
oriImg = Image.open("westbrook.jpg").convert("RGB")
pixel = oriImg.load()

(width, height) = oriImg.size
print((width, height))

# Visit every pixel and halve each color channel.
for x in range(width):
    for y in range(height):
        (r, g, b) = pixel[x, y]
        oriImg.putpixel((x, y), (r // 2, g // 2, b // 2))

oriImg.save("new.jpg")

Original image:

After fading:

Iterate over every pixel of the image and write the halved values back with putpixel.
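
The per-pixel arithmetic can be sketched on its own, without an image file. The helpers below are hypothetical (scale_rgb and fade are not part of PIL) and work on plain nested lists of (r, g, b) tuples, assuming 8-bit channels. They generalize the halving to an arbitrary factor and clamp the result to 0..255.

```python
def scale_rgb(rgb, factor=0.5):
    """Scale each channel of an (r, g, b) tuple, clamped to 0..255."""
    return tuple(max(0, min(255, int(c * factor))) for c in rgb)

def fade(pixels, factor=0.5):
    """Return a new grid (list of rows) with every pixel scaled."""
    return [[scale_rgb(p, factor) for p in row] for row in pixels]

# A made-up 2x2 "image" used only to exercise the helpers.
grid = [[(200, 100, 50), (255, 255, 255)],
        [(0, 0, 0), (31, 64, 129)]]
print(fade(grid))
```

Inside the loop over the image, the same helper could be used as oriImg.putpixel((x, y), scale_rgb(pixel[x, y])).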