Preface: this web crawler walkthrough assumes basic Python and HTML.

Inspect the site's HTML

Go to the CPC (中油) website: https://www.cpc.com.tw/
In the browser, click the three-dot menu at the top right -> More tools -> Developer tools.

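Click the tool marked with the small red box (the element-selection tool), then left-click the part of the page you want to pull data from. The developer tools jump to the corresponding piece of code.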

Next, expand that piece of code. It looks like this:

<div class="in"><article class="today_price">
<article class="today_price">
<div class="today_price_hd"><span class="since"><span id="t_PriceUpdate">10月30日</span>零時起實施</span><span class="unit">(單位:元/公升)</span></div>
<span id="t_UpOrDown"><div class="ups_and_downs downs"><b class="name">本週汽油價格</b><b class="sys">調降</b><b class="rate"><i>0.2</i></b></div></span>
<ul class="today_price_ct">
<li><span class="today_price_img"><img alt="92無鉛圖示" src="/images/icon-pic/today_price_img1.png"> </span>
<div class="today_price_info"><b class="name">92無鉛</b> <b class="price" id="t_sPrice1">29.7</b></div>
</li>
<li><span class="today_price_img"><img alt="95無鉛圖示" src="/images/icon-pic/today_price_img2.png"> </span>
<div class="today_price_info"><b class="name">95無鉛</b> <b class="price" id="t_sPrice2">31.2</b></div>
</li>
<li><span class="today_price_img"><img alt="98無鉛圖示" src="/images/icon-pic/today_price_img3.png"> </span>
<div class="today_price_info"><b class="name">98無鉛</b> <b class="price" id="t_sPrice3">33.2</b></div>
</li>
<li><span class="today_price_img"><img alt="酒精汽油圖示" src="/images/icon-pic/today_price_img4.png"> </span>
<div class="today_price_info"><b class="name">酒精汽油</b> <b class="price" id="t_sPrice4">31.2</b></div>
</li>
<li><span class="today_price_img"><img alt="超級柴油圖示" src="/images/icon-pic/today_price_img5.png"> </span>
<div class="today_price_info"><b class="name">超級柴油</b> <b class="price" id="t_sPrice5">27.3</b></div>
</li>
<li><span class="today_price_img"><img alt="液化石油氣圖示" src="/images/icon-pic/today_price_img6.png"> </span>
<div class="today_price_info"><b class="name">液化石油氣</b> <b class="price" id="t_sPrice6">14.6</b></div>
</li>
</ul>
<ul class="today_price_ft" id="t_sPicCount">
<li class="price_since">(液化石油氣價格為車用,自<font id="t_LPGdate">109年12月2日</font>零時起實施)</li>
<li class="subsidy">(酒精汽油未扣除能源局優惠補助)</li>
</ul>
<div class="more"><a href="historyprice.aspx?n=2890" target="_self" title="歷史價格(另開新視窗)"><b>歷史價格</b></a></div>
</article>
</article>
<script>
$(function () {
index_today_price();
function index_today_price() {
$.ajax(
{
type: "GET",
url: "/GetOilPriceJson.aspx?type=TodayOilPriceString",
success: function (data) {
if (data != "") {
var oil= $.parseJSON(data);
if(oil['sPrice1'] != ""){
$('#t_UpOrDown').html(oil['UpOrDown_Html']);
$('#t_PriceUpdate').html(oil['PriceUpdate']);
$('#t_sPrice1').html(oil['sPrice1']);
$('#t_sPrice2').html(oil['sPrice2']);
$('#t_sPrice3').html(oil['sPrice3']);
$('#t_sPrice4').html(oil['sPrice4']);
$('#t_sPrice5').html(oil['sPrice5']);
$('#t_sPrice6').html(oil['sPrice6']);
$('#t_LPGdate').html(oil['LPGdate']);
}
}
}
});
}
});
</script>
</div>

We will use 92 unleaded (92無鉛) as the example:

<li><span class="today_price_img"><img alt="92無鉛圖示" src="/images/icon-pic/today_price_img1.png"> </span>
<div class="today_price_info"><b class="name">92無鉛</b> <b class="price" id="t_sPrice1">29.7</b></div>
</li>

The value we want is the number "29.7".
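For comparison, the static HTML above can be read with BeautifulSoup. This is a minimal sketch that parses the snippet copied from this post as a string, not the live page; the helper name `price_from_html` is made up for illustration. Because the page's script overwrites these elements after loading, the number in the raw HTML may not be the current one.

```python
from bs4 import BeautifulSoup

# HTML copied from the snippet in this post (not fetched from the live site).
SNIPPET = """
<li><span class="today_price_img"><img alt="92無鉛圖示" src="/images/icon-pic/today_price_img1.png"> </span>
<div class="today_price_info"><b class="name">92無鉛</b> <b class="price" id="t_sPrice1">29.7</b></div>
</li>
"""


def price_from_html(html, element_id):
    """Return the text of the element with the given id, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id=element_id)
    return tag.get_text(strip=True) if tag is not None else None


print(price_from_html(SNIPPET, "t_sPrice1"))
```

Looking the element up by its id is a choice made here because each price in the snippet carries a unique id such as t_sPrice1.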

Start writing the code

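Import the required modules

First, import the two modules requests and BeautifulSoup (make sure both are installed beforehand). Note that the code in this post only calls requests, since the data comes back as JSON; BeautifulSoup is imported but not used.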

import requests
from bs4 import BeautifulSoup

Send the request and get the data

Use requests.get(url) to send the request, where url is the address to fetch the data from (here, the CPC website).

url = "https://www.cpc.com.tw/GetOilPriceJson.aspx?type=TodayOilPriceString"
response = requests.get(url)
data = response.json()
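The three lines above assume the request always succeeds. A slightly more defensive version might look like the sketch below. The function name `fetch_prices`, the 10-second timeout, and the `get` parameter (included only so the function can be exercised without a network connection) are assumptions for illustration, not part of the original code.

```python
import requests

URL = "https://www.cpc.com.tw/GetOilPriceJson.aspx?type=TodayOilPriceString"


def fetch_prices(url=URL, get=requests.get, timeout=10):
    """Fetch the endpoint and return the decoded JSON, or None on failure."""
    try:
        response = get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        return response.json()
    except (requests.RequestException, ValueError):
        # RequestException: network or HTTP errors.
        # ValueError: the response body was not valid JSON.
        return None


# data = fetch_prices()  # uncomment to perform a real network request
```

Returning None on failure is one possible design; raising the exception to the caller would work just as well.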

Because the prices are filled in dynamically by JavaScript, the site URL needs the following path appended to it:

/GetOilPriceJson.aspx?type=TodayOilPriceString

That path is the url value passed to $.ajax in the page's script:

$(function () {
index_today_price();
function index_today_price() {
$.ajax(
{
type: "GET",
url: "/GetOilPriceJson.aspx?type=TodayOilPriceString",
success: function (data) {
if (data != "") {
var oil= $.parseJSON(data);
if(oil['sPrice1'] != ""){
$('#t_UpOrDown').html(oil['UpOrDown_Html']);
$('#t_PriceUpdate').html(oil['PriceUpdate']);
$('#t_sPrice1').html(oil['sPrice1']);
$('#t_sPrice2').html(oil['sPrice2']);
$('#t_sPrice3').html(oil['sPrice3']);
$('#t_sPrice4').html(oil['sPrice4']);
$('#t_sPrice5').html(oil['sPrice5']);
$('#t_sPrice6').html(oil['sPrice6']);
$('#t_LPGdate').html(oil['LPGdate']);
}
}
}
});
}
});
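To make the link between the script's relative path and the full address used in the Python code explicit, the two can be joined with the standard library. This is a small sketch; the constant names `BASE_URL` and `AJAX_PATH` are chosen here for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.cpc.com.tw/"
# Relative path taken from the url field of the $.ajax call in the page's script.
AJAX_PATH = "/GetOilPriceJson.aspx?type=TodayOilPriceString"

# Resolve the relative path against the site address, as a browser would.
endpoint = urljoin(BASE_URL, AJAX_PATH)
print(endpoint)
```

The result is the same address that the post assigns to url by hand.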

Pick out the data we want

The 92 unleaded price is stored under the key sPrice1, so we read that key from data:

oilPrice92 = data.get('sPrice1')

Output

The last step is to print the data:

print(f"92無鉛價格:{oilPrice92}")

That is all it takes to get the 92 unleaded price! The other prices work the same way.
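As a sketch of how the other prices follow the same pattern, the keys can be mapped to fuel names and read in one pass. The mapping is read off the element ids in the HTML shown earlier (t_sPrice1 through t_sPrice6). The names `FUEL_KEYS` and `extract_prices` are made up for illustration, and the sample dictionary holds values copied from that HTML snippet, not live data. That the endpoint returns the prices as strings is an assumption.

```python
# Key-to-name mapping read off the element ids in the HTML shown earlier.
FUEL_KEYS = {
    "sPrice1": "92 unleaded",
    "sPrice2": "95 unleaded",
    "sPrice3": "98 unleaded",
    "sPrice4": "gasohol",
    "sPrice5": "super diesel",
    "sPrice6": "LPG",
}


def extract_prices(data):
    """Return {fuel name: price} for every known key present in data."""
    return {name: data[key] for key, name in FUEL_KEYS.items() if key in data}


# Sample values copied from the HTML snippet in this post, not live data.
sample = {
    "sPrice1": "29.7", "sPrice2": "31.2", "sPrice3": "33.2",
    "sPrice4": "31.2", "sPrice5": "27.3", "sPrice6": "14.6",
}

for name, price in extract_prices(sample).items():
    print(f"{name}: {price}")
```

In real use, the dictionary returned by response.json() would take the place of sample.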

Full code

import requests
from bs4 import BeautifulSoup

url = "https://www.cpc.com.tw/GetOilPriceJson.aspx?type=TodayOilPriceString"
response = requests.get(url)
data = response.json()

oilPrice92 = data.get('sPrice1')

print(f"92無鉛價格:{oilPrice92}")

Code for all prices

Covers 92, 95, and 98 unleaded, super diesel, gasohol, and LPG.
Link: https://github.com/omd0118/WebCrawler/blob/main/oil.py