Thursday, January 26, 2017

Installing cURL and Setting Up Crontab on the Raspberry Pi

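Yesterday I sorted out uploading my PHP project to the Raspberry Pi, so I immediately tested whether the web crawler scripts would run properly over cURL on the local network. The scripts did execute, but they did not produce the expected results. Checking the phpinfo table showed that PHP had no cURL module by default, so I had to install it myself.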

Installing cURL takes just one command:

$ sudo apt-get install php5-curl  

References:

How do I install PHP cURL on Linux Debian?
How to enable curl, installed Ubuntu LAMP stack?
ubuntu开启crontab日志记录及解决No MTA installed, discarding output问题

The output looks like this:

pi@raspberrypi:~ $ sudo apt-get install php5-curl
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libasn1-8-heimdal libgssapi3-heimdal libhcrypto4-heimdal libheimbase1-heimdal libheimntlm0-heimdal libhx509-5-heimdal
  libkrb5-26-heimdal libroken18-heimdal libwind0-heimdal libxfce4ui-1-0 xfce-keyboard-shortcuts
Use 'apt-get autoremove' to remove them.
The following NEW packages will be installed:
  php5-curl
0 upgraded, 1 newly installed, 0 to remove and 17 not upgraded.
Need to get 24.5 kB of archives.
After this operation, 79.9 kB of additional disk space will be used.
Get:1 http://mirrordirector.raspbian.org/raspbian/ jessie/main php5-curl armhf 5.6.29+dfsg-0+deb8u1 [24.5 kB]
Fetched 24.5 kB in 4s (5,884 B/s)
Selecting previously unselected package php5-curl.
(Reading database ... 129241 files and directories currently installed.)
Preparing to unpack .../php5-curl_5.6.29+dfsg-0+deb8u1_armhf.deb ...
Unpacking php5-curl (5.6.29+dfsg-0+deb8u1) ...
Processing triggers for libapache2-mod-php5 (5.6.29+dfsg-0+deb8u1) ...
Setting up php5-curl (5.6.29+dfsg-0+deb8u1) ...

Creating config file /etc/php5/mods-available/curl.ini with new version
php5_invoke: Enable module curl for cli SAPI
php5_invoke: Enable module curl for apache2 SAPI
Processing triggers for libapache2-mod-php5 (5.6.29+dfsg-0+deb8u1) ...
pi@raspberrypi:/var/www/html/tony1966 $ sudo service apache2 restart
Warning: Unit file of apache2.service changed on disk, 'systemctl daemon-reload' recommended.
pi@raspberrypi:~ $

After installing, restart the web server with either of the following commands:

$ sudo /etc/init.d/apache2 restart
$ sudo service apache2 restart

Loading phpinfo.php again then shows the cURL section.
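The same check can be done from the terminal without a browser. The sketch below assumes the php command-line binary is installed; has_module is a hypothetical helper name of mine, not a PHP tool. Note that php -m reports the CLI configuration, which is separate from Apache's, although the install log above shows the module being enabled for both.

```shell
#!/bin/sh
# has_module is a hypothetical helper: it reads a module list on stdin
# and succeeds if the named module appears on a line of its own.
has_module() {
    # -q: quiet, -i: ignore case, -x: match the whole line only
    grep -qix "$1"
}

# On the Pi this would be: php -m | has_module curl && echo "curl is loaded"
# Stand-in module list so the sketch runs without PHP installed:
printf '%s\n' '[PHP Modules]' 'Core' 'curl' 'date' | has_module curl && echo "curl is loaded"
```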


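Running the crawler again worked. The catch is that fetch_twse_daily_close.php took 110 seconds, whereas on Hostinger's hosting it finished within roughly 60 seconds. A Raspberry Pi B+ obviously cannot compete with blade servers, so this is acceptable. All my crawler scripts set a maximum execution time of 120 seconds, so the scripts themselves are fine, but PHP under Apache2 defaults to only 30 seconds, as shown by max_execution_time in the Core section of phpinfo().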


I used nano to edit /etc/php5/apache2/php.ini and raised the max_execution_time setting under "Resource Limits" from 30 to 180 seconds:

$ sudo nano /etc/php5/apache2/php.ini

;;;;;;;;;;;;;;;;;;;
; Resource Limits ;
;;;;;;;;;;;;;;;;;;;

; Maximum execution time of each script, in seconds
; http://php.net/max-execution-time
; Note: This directive is hardcoded to 0 for the CLI SAPI
max_execution_time = 30    (change to 180)

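This keeps the server from killing a script on timeout before it has finished. Restart Apache afterward so the new value takes effect.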
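For anyone who prefers not to open an editor, here is a minimal sketch of the same change done with sed. The helper name set_max_execution_time is hypothetical (my own, not a PHP or Debian tool), and the demo works on a scratch file rather than the live /etc/php5/apache2/php.ini, which would need sudo.

```shell
#!/bin/sh
# Sketch: rewrite the max_execution_time line in a php.ini-style file.
# set_max_execution_time is a hypothetical helper name.
set_max_execution_time() {
    ini_file=$1
    seconds=$2
    # Write to a temporary file first so this works with any POSIX sed
    tmp_file=$(mktemp) || return 1
    sed "s/^max_execution_time[[:space:]]*=.*/max_execution_time = ${seconds}/" "$ini_file" > "$tmp_file" &&
        cat "$tmp_file" > "$ini_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Demo on a scratch file instead of the real php.ini
demo_ini=$(mktemp)
printf '%s\n' '; Resource Limits' 'max_execution_time = 30' 'memory_limit = 128M' > "$demo_ini"
set_max_execution_time "$demo_ini" 180
grep '^max_execution_time' "$demo_ini"
```

Copying the result back with cat rather than mv keeps the original file's owner and permissions intact.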

References:

修改PHP的執行時間上限,避免程式執行過久被終止
[php參數修改]允許PHP執行的時間變長(max_execution_time)

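With cURL sorted out, the next step is setting up the crontab (cron table) to run the PHP scripts automatically. Cron is the UNIX facility for running shell commands or scripts at set times or at regular intervals, and the crontab is its schedule table. Every user has their own crontab, including the root account. For the crontab format, see: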

SCHEDULING TASKS WITH CRON
Cron job not working in Raspberry
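As a quick illustration of that format: a crontab entry has five time fields (minute, hour, day of month, month, day of week) followed by the command to run. The sketch below simply labels the fields of one entry; explain_cron_line is a hypothetical helper of mine, not part of cron, and the sample entry is the weekly backup example from the stock Debian crontab header.

```shell
#!/bin/sh
# Sketch: label the fields of one crontab entry.
# explain_cron_line is a hypothetical helper, not part of cron.
explain_cron_line() {
    # read splits on whitespace; the last variable takes the rest of the line
    printf '%s\n' "$1" | {
        read -r minute hour dom month dow command
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow"
        printf 'command=%s\n' "$command"
    }
}

# 05:00 every Monday (day-of-week 1)
explain_cron_line '0 5 * * 1 tar -zcf /var/backups/home.tgz /home/'
```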

The command for editing the crontab is:

$ crontab -e

The first time you run this command, a menu asks which editor to use for the crontab; nano is recommended:

pi@raspberrypi:~ $ crontab -e
no crontab for pi - using an empty one

Select an editor.  To change later, run 'select-editor'.
  1. /bin/ed
  2. /bin/nano        <---- easiest
  3. /usr/bin/vim.tiny

Choose 1-3 [2]: 2

(This opens the nano editor. When finished, press Ctrl+O to save, press Enter, then press Ctrl+X to exit.)

crontab: installing new crontab
pi@raspberrypi:~ $


At the bottom of the crontab I added three PHP scripts, each run every 5 minutes.

To display the crontab, use the -l option:

$ crontab -l

It prints the following:

# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
*/5 * * * * /var/www/html/tony1966/cron/fetch_yahoo_earning.php
*/5 * * * * /var/www/html/tony1966/cron/fetch_yahoo_major.php
*/5 * * * * /var/www/html/tony1966/cron/fetch_yahoo_usa_stocks.php

Then give the three scripts execute (x) permission (see: Cron job not working in Raspberry):

$ sudo chmod +x  /var/www/html/tony1966/cron/fetch_yahoo_earning.php
$ sudo chmod +x  /var/www/html/tony1966/cron/fetch_yahoo_major.php
$ sudo chmod +x  /var/www/html/tony1966/cron/fetch_yahoo_usa_stocks.php

To edit the root account's crontab (which is separate from that of the default pi account), prefix crontab with sudo:

$ sudo crontab -e

A PHP script can also be run directly from the command line:

$ php fetch_yahoo_usa_stocks.php

pi@raspberrypi:/var/www/html/tony1966/cron $ php fetch_yahoo_usa_stocks.php
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>擷取 Yahoo 美股收盤報告</title>
</head>
<body>
<P>擷取 Yahoo 美股收盤報告</P>
Asia/Taipei<br><a href='https://tw.finance.yahoo.com/us/worldidx.php' target='_blank'>原始網頁</a><br>交易日期=01/27/2017<br>交易日期=2017-01-27<br>道瓊工業指數收盤=20100.91<br>道瓊工業指數漲跌=▲32.40(+0.16%)<br>那斯達克指數收盤=5655.18<br>那斯達克指數漲跌=▼1.16(-0.02%)<br>史坦普指數收盤=2296.68<br>史坦普指數漲跌=▼1.69(-0.07%)<br>費城半導體指數收盤=950.32<br>費城半導體指數漲跌=▼5.74(-0.60%)<br><br>處理時間 :1 秒</body>
pi@raspberrypi:/var/www/html/tony1966/cron $

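Note that because the PHP script loads its libraries through relative paths, the command must be run from the directory containing the script. Otherwise it fails with "failed to open stream: No such file or directory" warnings: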

Asia/Taipei
PHP Warning:  include(../db.php): failed to open stream: No such file or directory in /var/www/html/tony1966/cron/fetch_yahoo_usa_stocks.php on line 43
PHP Warning:  include(): Failed opening '../db.php' for inclusion (include_path='.:/usr/share/php:/usr/share/pear') in /var/www/html/tony1966/cron/fetch_yahoo_usa_stocks.php on line 43
PHP Warning:  include(../lib/mysqli.php): failed to open stream: No such file or directory in /var/www/html/tony1966/cron/fetch_yahoo_usa_stocks.php on line 44

However, after setting up the crontab, the cron_log table in the database showed no entries at five-minute intervals, only the runs I had triggered by hand.


Looking at the cron entries in syslog, I found the message "No MTA installed, discarding output":

pi@raspberrypi:~ $ grep CRON /var/log/syslog
Jan 27 06:30:01 raspberrypi CRON[26496]: (pi) CMD (/var/www/html/tony1966/cron/fetch_yahoo_major.php)
Jan 27 06:30:01 raspberrypi CRON[26497]: (pi) CMD (/var/www/html/tony1966/cron/fetch_yahoo_earning.php)
Jan 27 06:30:01 raspberrypi CRON[26500]: (pi) CMD (/var/www/html/tony1966/cron/fetch_yahoo_usa_stocks.php)
Jan 27 06:30:01 raspberrypi CRON[26477]: (CRON) info (No MTA installed, discarding output)
Jan 27 06:30:01 raspberrypi CRON[26476]: (CRON) info (No MTA installed, discarding output)
Jan 27 06:30:01 raspberrypi CRON[26475]: (CRON) info (No MTA installed, discarding output)

I consulted the following article:

“(CRON) info (No MTA installed, discarding output)” error in the syslog

It says this MTA (Mail Transfer Agent) message can be resolved by installing the Postfix mail server with the following command:

$ sudo apt-get install postfix

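It also says, however, that the MTA message does not affect the execution of cron jobs and can be ignored, although installing an MTA lets us read the mailed output for clues about what went wrong: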

"Or you can ignore it. I don't think the inability of cron to send messages has anything to do with the CPU spike (that's linked to the underlying job that cron is running). It might be safest to install an MTA and then read through the messages"

For more on Postfix, see:

Raspberry Pi Email Server Part 1: Postfix

Here is the log from installing the Postfix mail server:

pi@raspberrypi:~/cronjobs $ sudo apt-get install postfix
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libasn1-8-heimdal libgssapi3-heimdal libhcrypto4-heimdal libheimbase1-heimdal libheimntlm0-heimdal libhx509-5-heimdal
  libkrb5-26-heimdal libroken18-heimdal libwind0-heimdal libxfce4ui-1-0 xfce-keyboard-shortcuts
Use 'apt-get autoremove' to remove them.
Suggested packages:
  procmail postfix-mysql postfix-pgsql postfix-ldap postfix-pcre sasl2-bin dovecot-common postfix-cdb ufw postfix-doc
The following NEW packages will be installed:
  postfix
0 upgraded, 1 newly installed, 0 to remove and 17 not upgraded.
Need to get 1,293 kB of archives.
After this operation, 3,095 kB of additional disk space will be used.
Get:1 http://mirrordirector.raspbian.org/raspbian/ jessie/main postfix armhf 2.11.3-1 [1,293 kB]
Fetched 1,293 kB in 4s (310 kB/s)
Preconfiguring packages ...
Selecting previously unselected package postfix.
(Reading database ... 129248 files and directories currently installed.)
Preparing to unpack .../postfix_2.11.3-1_armhf.deb ...
Unpacking postfix (2.11.3-1) ...
Processing triggers for systemd (215-17+deb8u6) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up postfix (2.11.3-1) ...
Adding group `postfix' (GID 119) ...
Done.
Adding system user `postfix' (UID 112) ...
Adding new user `postfix' (UID 112) with group `postfix' ...
Not creating home directory `/var/spool/postfix'.
Creating /etc/postfix/dynamicmaps.cf
Adding tcp map entry to /etc/postfix/dynamicmaps.cf
Adding sqlite map entry to /etc/postfix/dynamicmaps.cf
Adding group `postdrop' (GID 120) ...
Done.
setting myhostname: raspberrypi
setting alias maps
setting alias database
mailname is not a fully qualified domain name.  Not changing /etc/mailname.
setting destinations: raspberrypi, localhost.localdomain, , localhost
setting relayhost:
setting mynetworks: 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
setting mailbox_size_limit: 0
setting recipient_delimiter: +
setting inet_interfaces: all
/etc/aliases does not exist, creating it.
WARNING: /etc/aliases exists, but does not have a root alias.

Postfix is now set up with a default configuration.  If you need to make
changes, edit
/etc/postfix/main.cf (and others) as needed.  To view Postfix configuration
values, see postconf(1).

After modifying main.cf, be sure to run '/etc/init.d/postfix reload'.

Running newaliases
Processing triggers for systemd (215-17+deb8u6) ...
Processing triggers for libc-bin (2.19-18+deb8u7) ...
pi@raspberrypi:~/cronjobs $ g

Partway through the installation, a configuration dialog appears.



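Just accept the defaults, "Internet Site" and "raspberrypi". With the mail server installed, grep CRON /var/log/syslog confirmed that the MTA message was gone. Crontab output is now delivered to /var/mail/<account>; for example, when logged in as pi, the output of pi's crontab goes to the file /var/mail/pi, which can be displayed with cat.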
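Since /var/mail/pi is a plain mbox file that grows with every cron run, it can help to count the messages before dumping the whole thing with cat. In mbox format each message starts with a line beginning "From " (with a trailing space), so a rough count is one grep away. count_mbox_messages is a hypothetical helper name, and the sample mailbox below is made up for the demo.

```shell
#!/bin/sh
# Sketch: count messages in an mbox file by counting "From " separator lines.
# count_mbox_messages is a hypothetical helper name.
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Made-up two-message mailbox so the sketch runs anywhere;
# on the Pi the argument would be /var/mail/pi
demo_mbox=$(mktemp)
printf '%s\n' \
    'From pi@raspberrypi  Wed Feb  1 14:55:09 2017' \
    'Subject: Cron <pi@raspberrypi> job one' \
    '' \
    'first body' \
    '' \
    'From pi@raspberrypi  Wed Feb  1 15:00:09 2017' \
    'Subject: Cron <pi@raspberrypi> job two' \
    '' \
    'second body' > "$demo_mbox"

count_mbox_messages "$demo_mbox"
# List just the subjects to see which jobs reported
grep '^Subject: ' "$demo_mbox"
```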

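But even with the mail server installed, the cron jobs still did not run properly; indeed, the MTA has nothing to do with it. The article also mentions that getting rid of the MTA message does not require a mail server at all: appending the following redirection to each cron command is enough:

>/dev/null 2>&1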
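The usual form of this redirection is >/dev/null 2>&1: standard output is sent to /dev/null first, then standard error is pointed at the same place. The sketch below shows the effect, along with an alternative I find handier while debugging, which is appending both streams to a log file. noisy_job is a hypothetical stand-in for a cron command, and the log file is a temporary one.

```shell
#!/bin/sh
# noisy_job is a hypothetical stand-in for a cron command that writes to both streams.
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

# Discard everything: stdout goes to /dev/null, then stderr follows stdout
noisy_job >/dev/null 2>&1

# Alternative: keep both streams in a log file instead of discarding them
log_file=$(mktemp)
noisy_job >>"$log_file" 2>&1
cat "$log_file"
```

With nothing left to mail, cron no longer logs the MTA message, but errors are silenced too, so the log-file variant is the safer choice until the job is known to work.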

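Back to the crontab jobs not running properly. Earlier, running the php command directly from within the /var/www/html/tony1966/ project directory worked fine. I suspected the PHP scripts' use of relative paths, so I followed section 10.3 of 柯博文's book "Raspberry Pi 超炫專案與完全實戰第二版": first create a shell script for each PHP crawler, then have the crontab run that shell script. These shell scripts all live in a /home/pi/cronjobs directory I created: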

pi@raspberrypi:~ $ mkdir cronjobs
pi@raspberrypi:~ $ cd cronjobs
pi@raspberrypi:~/cronjobs $ nano fetch_yahoo_usa_stocks.sh

Taking fetch_yahoo_usa_stocks.php as an example, use nano to create a shell script named fetch_yahoo_usa_stocks.sh with the following contents:

cd /var/www/html/tony1966/cron
/usr/bin/php fetch_yahoo_usa_stocks.php

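The first line uses cd to change the working directory to the cron folder, which holds the PHP crawler scripts under the tony1966 project folder. The second line calls the php binary in /usr/bin to run the PHP script in that folder.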
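One weakness of a two-line wrapper like this is that if the cd fails, the php line still runs from the wrong directory. Below is a slightly more defensive sketch of the same idea. run_in_dir is a hypothetical helper of mine (not from the book), and the demo uses pwd so it runs without PHP.

```shell
#!/bin/sh
# Sketch of a more defensive cron wrapper. run_in_dir is a hypothetical helper.
run_in_dir() {
    work_dir=$1
    shift
    # The subshell keeps the caller's working directory unchanged, and the
    # command only runs if cd succeeded
    ( cd "$work_dir" && "$@" )
}

# In the wrapper from this post the call would be (assuming the same paths):
#   run_in_dir /var/www/html/tony1966/cron /usr/bin/php fetch_yahoo_usa_stocks.php
# Harmless demonstration that the command really runs inside the directory:
run_in_dir / pwd
```

If the directory is missing, run_in_dir returns a non-zero status and the command is never started, which cron can then report through mail or a log.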

Then use chmod to set the shell script's permissions to 755:

pi@raspberrypi:~/cronjobs $ sudo chmod 755 fetch_yahoo_usa_stocks.sh
  
Finally, edit the crontab to run fetch_yahoo_usa_stocks.sh every 5 minutes:

*/5 * * * * /home/pi/cronjobs/fetch_yahoo_usa_stocks.sh

With this setup, the crontab job finally ran as expected.


Because the mail server was installed earlier, crontab output lands in /var/mail/pi and can be shown with cat:

You have new mail in /var/mail/pi
pi@raspberrypi:~/cronjobs $ cat /var/mail/pi

From pi@raspberrypi  Wed Feb  1 14:55:09 2017
Return-Path: <pi@raspberrypi>
X-Original-To: pi
Delivered-To: pi@raspberrypi
Received: by raspberrypi (Postfix, from userid 1000)
        id EBECA2AC32; Wed,  1 Feb 2017 14:55:08 +0800 (CST)
From: root@raspberrypi (Cron Daemon)
To: pi@raspberrypi
Subject: Cron <pi@raspberrypi> /home/pi/cronjobs/fetch_yahoo_usa_stocks.sh
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/home/pi>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=pi>
Message-Id: <20170201065508.EBECA2AC32@raspberrypi>
Date: Wed,  1 Feb 2017 14:55:08 +0800 (CST)

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>擷取 Yahoo 美股收盤報告</title>
</head>
<body>
<P>擷取 Yahoo 美股收盤報告</P>
Asia/Taipei<br><a href='https://tw.finance.yahoo.com/us/worldidx.php' target='_blank'>原始網頁</a><br>交易日期=02/01/2017<br>交易日期=2017-02-01<br>道瓊工業指數收盤=19864.09<br>道瓊工業指數漲跌=▼107.04(-0.54%)<br>那斯達克指數收盤=5614.79<br>那斯達克指數漲跌=▲1.07(+0.02%)<br>史坦普指數收盤=2278.87<br>史坦普指數漲跌=▼2.03(-0.09%)<br>費城半導體指數收盤=944.28<br>費城半導體指數漲跌=▼12.56(-1.31%)<br>PHP Notice:  Undefined variable: RS in /var/www/html/tony1966/lib/mysqli.php on line 335
<br>處理時間 :7 秒</body>
</html>

The other PHP crawler scripts can be handled the same way.

References:

Where is the cron / crontab log?
Cron job not working in Raspberry
Running Things Regularly - cron (@reboot)
Raspberry Pi Simple Cron Jobs Explanation

Other:

Top 8 IDEs for Programmers, Coders and Beginners on the Raspberry Pi
Tinkering with the Raspberry Pi A+


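Update 2017-02-01:

On the fifth day of the Lunar New Year, as the Spring Festival holiday ends, I finally finished porting my PHP project to the Raspberry Pi, confirming an idea I have had in the nearly three years since I first got into the Raspberry Pi.

樹莓派通過郵件上報實時IP,隨時隨地遠程登錄樹莓派
編程篇Python 實現SMTP發送郵件、Web伺服器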

Update 2017-02-08:

How to uninstall Postfix:

$ sudo aptitude remove postfix* --purge   

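Note that uninstalling requires root privileges, so sudo is needed. In the log below, however, aptitude reports that nothing matched "postfix*", so this run only purged the leftover auto-installed libraries and Postfix itself was not removed; naming the package directly, as in sudo apt-get purge postfix, avoids the pattern issue.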

pi@raspberrypi:~ $ aptitude remove postfix* --purge
E: Could not open lock file /var/lib/dpkg/lock - open (13: Permission denied)
E: Unable to lock the administration directory (/var/lib/dpkg/), are you root?
pi@raspberrypi:~ $ sudo aptitude remove postfix* --purge
Couldn't find any package whose name or description matched "postfix*"
Couldn't find any package whose name or description matched "postfix*"
The following packages will be REMOVED:
  libasn1-8-heimdal{pu} libgssapi3-heimdal{pu} libhcrypto4-heimdal{pu}
  libheimbase1-heimdal{pu} libheimntlm0-heimdal{pu} libhx509-5-heimdal{pu}
  libkrb5-26-heimdal{pu} libroken18-heimdal{pu} libwind0-heimdal{pu}
  libxfce4ui-1-0{pu} xfce-keyboard-shortcuts{pu}
0 packages upgraded, 0 newly installed, 11 to remove and 17 not upgraded.
Need to get 0 B of archives. After unpacking 3,368 kB will be freed.
Do you want to continue? [Y/n/?] y
(Reading database ... 129461 files and directories currently installed.)
Removing libgssapi3-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libgssapi3-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libheimntlm0-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libheimntlm0-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libkrb5-26-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libkrb5-26-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libhx509-5-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libhx509-5-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libhcrypto4-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libhcrypto4-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libheimbase1-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libheimbase1-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libwind0-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libwind0-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libxfce4ui-1-0 (4.10.0-6) ...
Purging configuration files for libxfce4ui-1-0 (4.10.0-6) ...
Removing xfce-keyboard-shortcuts (4.10.0-6) ...
Purging configuration files for xfce-keyboard-shortcuts (4.10.0-6) ...
Removing libasn1-8-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libasn1-8-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Removing libroken18-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Purging configuration files for libroken18-heimdal:armhf (1.6~rc2+dfsg-9+rpi1) ...
Processing triggers for libc-bin (2.19-18+deb8u7) ...


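2 comments:

Unknown said...

Hello Brother Fox,
I have been reading your blog for a while and noticed that the things we study overlap, so let's exchange ideas more often.

Since your post installs curl, you can set things up in your scripts using my method:
https://www.webteach.tw/?p=33
Then you don't have to deal with the execute-permission issue.

Also, for mail, if you only need to send messages you can install sendmail:
https://www.webteach.tw/?s=sendmail
It is simpler and can send mail without extra configuration.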

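小狐狸事務所 said...

Hello Jeff, I'm not familiar with Linux and am still a beginner, so your comment is a godsend. I'll read your articles carefully to spare myself the pain of fumbling around. I look forward to your guidance. Thank you.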