Edit the file /e/class/connect.php and add the following function at the very top of the file, just after the opening <?php tag:
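//Fetch the content of an https URL
function getHTTPS($url) {
    $ch = curl_init();
    //Skip certificate verification so that sites with self-signed or misconfigured certificates can still be fetched (the server is then not authenticated)
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    //Return the body only, without the response headers
    curl_setopt($ch, CURLOPT_HEADER, false);
    //Follow HTTP redirects
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    //Send the target URL itself as the Referer header
    curl_setopt($ch, CURLOPT_REFERER, $url);
    //Return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}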
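The function above disables certificate checking and has no timeout. As a minimal sketch, assuming the server's CA bundle is configured for cURL, a stricter variant might look like the following. The name getHTTPSVerified and the $timeout parameter are hypothetical and not part of Empire CMS; the function returns an empty string when the request fails.

```php
<?php
// Hypothetical variant of getHTTPS(): the function name and the $timeout
// parameter are assumptions made for this illustration.
function getHTTPSVerified($url, $timeout = 15)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Verify the certificate chain and that the host name matches it.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
    // Follow redirects, but cap the number of hops.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
    // Give up on slow or unreachable hosts instead of hanging the collector.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    // Return the body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $result = curl_exec($ch);
    if ($result === false) {
        // Log the cURL error and fall back to an empty string.
        error_log('getHTTPSVerified failed: ' . curl_error($ch));
        $result = '';
    }
    curl_close($ch);
    return $result;
}
```

If a target site uses a self-signed certificate, this variant returns an empty string, which is the case the original function works around by turning verification off.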
Find the ReadFiletext function, which looks like this:
function ReadFiletext($filepath){
    $filepath=trim($filepath);
    $htmlfp=@fopen($filepath,"r");
    //远程 (remote file)
    if(strstr($filepath,"://"))
    {
        while($data=@fread($htmlfp,500000))
        {
            $string.=$data;
        }
    }
    //本地 (local file)
    else
    {
        $string=@fread($htmlfp,@filesize($filepath));
    }
    @fclose($htmlfp);
    return $string;
}
Change it to:
function ReadFiletext($filepath){
    $filepath=trim($filepath);
    //https: fetch with cURL before fopen() is called, so no file handle is left open
    if(stripos($filepath,"https://")===0)
    {
        return getHTTPS($filepath);
    }
    $string='';
    $htmlfp=@fopen($filepath,"r");
    //remote file
    if(strstr($filepath,"://"))
    {
        while($data=@fread($htmlfp,500000))
        {
            $string.=$data;
        }
    }
    //local file
    else
    {
        $string=@fread($htmlfp,@filesize($filepath));
    }
    @fclose($htmlfp);
    return $string;
}
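The https branch depends on recognizing the URL scheme. A substring search such as strstr($filepath, "https://") matches that text anywhere in the string, including inside a query parameter of a plain http URL. As a rough sketch, assuming a hypothetical helper named isHttpsUrl (not part of Empire CMS), the check could instead look only at the scheme via parse_url:

```php
<?php
// Hypothetical helper: returns true only when the URL's scheme is https.
// The name isHttpsUrl is an assumption made for this illustration.
function isHttpsUrl($url)
{
    // parse_url() returns null when the string has no scheme (e.g. a local path).
    $scheme = parse_url(trim($url), PHP_URL_SCHEME);
    return is_string($scheme) && strtolower($scheme) === 'https';
}

var_dump(isHttpsUrl('https://example.com/page.html'));         // bool(true)
var_dump(isHttpsUrl('HTTPS://example.com/'));                   // bool(true)
var_dump(isHttpsUrl('http://example.com/?next=https://x.y'));   // bool(false)
var_dump(isHttpsUrl('/data/local/file.txt'));                   // bool(false)
```

Inside ReadFiletext, such a helper would take the place of the string search in the https condition; everything else in the function stays the same.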
With this change, the collector can fetch pages whose URLs begin with https.