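Scraping with React + Electron
Introduction
I only searched Japanese-language resources, but I could not find any code for the HTTP requests needed to scrape external sites, so I wrote some myself. I am a JavaScript beginner and the source code is messy, so feel free to change it however you like.
The project is structured as follows.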
project/
  src/
    main.js
    util.js
  public/
    index.html
Place util.js inside the src directory.
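The contents of util.js are not shown in this article. As a rough guide, the request part could look like the sketch below. It assumes the interface used in the usage example: `request({ method, url, headers, data }, encoding)` resolving to an object with `headers` and `body`. The helper name `buildOptions` and the `statusCode` field are hypothetical additions, and redirects, timeouts, and compressed responses are not handled.

```javascript
// Hypothetical sketch of the request part of src/util.js; not the original author's code.
const http = require('http');
const https = require('https');

// Convert the { method, url, headers } shape from the usage example
// into the options object accepted by http.request / https.request.
function buildOptions(req) {
  const target = new URL(req.url);
  const defaultPort = target.protocol === 'https:' ? 443 : 80;
  return {
    protocol: target.protocol,
    hostname: target.hostname,
    port: target.port ? Number(target.port) : defaultPort,
    path: target.pathname + target.search,
    method: req.method || 'GET',
    headers: req.headers || {},
  };
}

// Send the request and resolve with { statusCode, headers, body }.
// `encoding` is handed to Buffer#toString, e.g. 'utf8'.
function request(req, encoding = 'utf8') {
  return new Promise((resolve, reject) => {
    const options = buildOptions(req);
    const client = options.protocol === 'https:' ? https : http;
    const clientReq = client.request(options, (res) => {
      const chunks = [];
      res.on('data', (chunk) => chunks.push(chunk));
      res.on('end', () => resolve({
        statusCode: res.statusCode,
        headers: res.headers,
        body: Buffer.concat(chunks).toString(encoding),
      }));
    });
    clientReq.on('error', reject);
    if (req.data) clientReq.write(req.data); // request body, if any
    clientReq.end();
  });
}

module.exports = { request, buildOptions };
```

Because the request is made with Node's own modules rather than from the page, it is not subject to the browser's same-origin restrictions. That is presumably why the helper is loaded in main.js instead of in the React code.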
src/main.js
const electron = require("electron");
const app = electron.app;
const BrowserWindow = electron.BrowserWindow;
const path = require('path');
const url = require('url');
let mainWindow;
function createWindow() {
  mainWindow = new BrowserWindow({
    width: 1366, height: 720,
    minWidth: 1194, minHeight: 720,
    webPreferences: {
      nodeIntegration: true, // lets the page in index.html call require()
    }
  });
  mainWindow.loadURL('http://localhost:3000');
  mainWindow.webContents.openDevTools();
  mainWindow.on("closed", () => (mainWindow = null));
}
app.on("ready", createWindow);
app.on("window-all-closed", () => {
  if (process.platform !== "darwin") {
    app.quit();
  }
});
app.on("activate", () => {
  if (mainWindow === null) {
    createWindow();
  }
});
global.util = require('./util'); // load the util module so the page can get it via getGlobal('util')
public/index.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <!--
      manifest.json provides metadata used when your web app is installed on a
      user's mobile device or desktop. See https://developers.google.com/web/fundamentals/web-app-manifest/
    -->
    <link rel="manifest" href="%PUBLIC_URL%/manifest.json" />
    <!--
      Notice the use of %PUBLIC_URL% in the tags above.
      It will be replaced with the URL of the `public` folder during the build.
      Only files inside the `public` folder can be referenced from the HTML.
      Unlike "/favicon.ico" or "favicon.ico", "%PUBLIC_URL%/favicon.ico" will
      work correctly both with client-side routing and a non-root public URL.
      Learn how to configure a non-root public URL by running `npm run build`.
    -->
    <title>React App</title>
    <script>
      // Requires nodeIntegration and the `remote` module (removed in Electron 14).
      const electron = require('electron').remote,
        util = electron.getGlobal('util'); // the object set as global.util in main.js
      window.electron = electron;
      window.util = util;
    </script>
  </head>
  <body>
    <noscript>You need to enable JavaScript to run this app.</noscript>
    <div id="root"></div>
    <!--
      This HTML file is a template.
      If you open it directly in the browser, you will see an empty page.
      You can add webfonts, meta tags, or analytics to this file.
      The build step will place the bundled scripts into the <body> tag.
      To begin the development, run `npm start` or `yarn start`.
      To create a production bundle, use `npm run build` or `yarn build`.
    -->
  </body>
</html>
Usage example
usage.js
const util = window.util; // exposed by the script in public/index.html
(async () => {
  let url = 'https://qiita.com';
  let cookie = '';
  let ck = util.createCookieStore();
  let headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36',
    'Cookie': cookie,
  };
  // Fetch the page and decode the response body as UTF-8.
  let res = await util.request({
    method: 'GET',
    url: url,
    headers: headers,
    data: '',
  }, 'utf8');
  // Merge the cookies from the response into the store, then rebuild the Cookie header value.
  ck = util.updateCookieStore(ck, cookie, res);
  cookie = ck.getAll();
  console.log(res.headers, res.body, cookie);
})();
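The usage example calls `createCookieStore`, `updateCookieStore`, and `getAll`, but their implementation is not shown in this article. The sketch below is one possible version, assuming that `getAll()` returns a string suitable for the `Cookie` header and that `res.headers['set-cookie']` is an array of header lines, as it is for Node's HTTP responses. The helper `parseCookieHeader` is a hypothetical name, and cookie attributes such as Path, Domain, and Expires are ignored.

```javascript
// Hypothetical sketch of the cookie helpers in src/util.js; not the original author's code.

// Parse a Cookie header string ("a=1; b=2") into a Map of name/value pairs.
function parseCookieHeader(header) {
  const pairs = new Map();
  for (const part of (header || '').split(';')) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // skip empty or malformed segments
    pairs.set(part.slice(0, eq).trim(), part.slice(eq + 1).trim());
  }
  return pairs;
}

// Create an empty store; getAll() serializes it back into a Cookie header value.
function createCookieStore() {
  const cookies = new Map();
  return {
    cookies,
    getAll: () => [...cookies].map(([name, value]) => `${name}=${value}`).join('; '),
  };
}

// Merge the cookies that were sent and any Set-Cookie headers in `res` into the store.
function updateCookieStore(store, cookieHeader, res) {
  for (const [name, value] of parseCookieHeader(cookieHeader)) {
    store.cookies.set(name, value);
  }
  const setCookie = (res.headers && res.headers['set-cookie']) || [];
  for (const line of setCookie) {
    const first = line.split(';')[0]; // "name=value" precedes the attributes
    for (const [name, value] of parseCookieHeader(first)) {
      store.cookies.set(name, value);
    }
  }
  return store;
}

module.exports = { createCookieStore, updateCookieStore };
```

To carry a session across requests, pass the string returned by `getAll()` as the `Cookie` header of the next call.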
Author and Source
Original article (React + Electron でスクレイピングをする): https://qiita.com/rop/items/38e589e3bc6e6332c08b. Author information is available at the original URL. Copyright belongs to the original author.