WebScrapperJS - Get the Content/HTML of Any Website Without Being Blocked by CORS, Even Using JavaScript


WebScrapperjs
Website: https://sh20raj.github.io/WebScrapperJS/
GitHub | Repl.it

Grab the CDN link or download the JavaScript file

<script src="https://cdn.jsdelivr.net/gh/SH20RAJ/WebScrapperJS/WebScrapper.js" ></script>

WebScrapper.get() returns the content of the specified URL as a string.

WebScrapper.gethtml() fetches the HTML at the specified URL and returns it parsed as a DOM object.

WebScrapper.getjson() returns the content of the specified URL as parsed JSON.

Get the HTML/text content of any website as a string

var html = WebScrapper.get('https://webscrapperjs.sh20raj.repl.co/'); // Returns the HTML/text of the web page as a string.
console.log(html);

This returns the HTML/text inside the web page as a string.

To get the HTML content of any website in DOM-parsed form, use WebScrapper.gethtml()

var url = 'https://google.com/';
var html = WebScrapper.gethtml(url); // The HTML at the URL is parsed and stored in this variable.
console.log(html);
console.log(html.title); // Read the title just as you would with document.title.
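
Since gethtml() is described as returning a parsed DOM, the result can presumably be queried like `document`. The sketch below assumes the returned value behaves like a standard Document (exposing `title` and `querySelectorAll`); `summarizeDocument` is a hypothetical helper for illustration and is not part of WebScrapperJS.

```javascript
// Hypothetical helper: summarizes any Document-like object.
// Assumes `doc` exposes `title` and `querySelectorAll`, as a browser Document does.
function summarizeDocument(doc) {
  // Collect the href attribute of every anchor that has one.
  const links = Array.from(doc.querySelectorAll('a[href]'), (a) => a.getAttribute('href'));
  return { title: doc.title, links };
}

// In a browser with the library loaded (assumption: gethtml() returns a Document):
// const summary = summarizeDocument(WebScrapper.gethtml('https://example.com/'));
// console.log(summary.title, summary.links);
```

Keeping the helper independent of the library means it works the same on `document` itself or on any other parsed document.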

Initialize your own scraper with a URL using new scrapper()

let MyWebScrapper = new scrapper('https://example.com/');
//You can now directly call gethtml() instead of passing a url into it.

console.log(MyWebScrapper.gethtml()); //Grab https://example.com/ and print on console

You can still use the newly created MyWebScrapper to grab a different URL, like this:

let MyWebScrapper = new scrapper('https://example.com/');
//You can now directly call gethtml() instead of passing a url into it.

console.log(MyWebScrapper.gethtml()); //Grab https://example.com/ and print on console

console.log(MyWebScrapper.gethtml('https://youtube.com/')); //Grab https://youtube.com/ and print on console
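
The "default URL with optional override" behaviour shown above can be sketched as follows. This is only an illustration of the pattern, not the library's source: `DefaultUrlScraper` and the `loader` callback are hypothetical names, and the loader here is a stand-in that makes no network request.

```javascript
// Hypothetical sketch of the "default URL with optional override" pattern.
// `loader` is an assumed function that takes a URL and returns its content.
class DefaultUrlScraper {
  constructor(defaultUrl, loader) {
    this.defaultUrl = defaultUrl;
    this.loader = loader;
  }

  // Uses the URL given to the constructor unless a different one is passed in.
  gethtml(url = this.defaultUrl) {
    return this.loader(url);
  }
}

// Usage with a stand-in loader that only echoes the URL it was asked for.
const demo = new DefaultUrlScraper('https://example.com/', (url) => `content of ${url}`);
console.log(demo.gethtml());                       // content of https://example.com/
console.log(demo.gethtml('https://youtube.com/')); // content of https://youtube.com/
```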

You can also get JSON using WebScrapperJS

var json = WebScrapper.getjson('https://jsonplaceholder.typicode.com/todos/1'); // Returns the result directly as parsed JSON.
console.log(json);
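
The difference between a string result and a parsed JSON result comes down to a parsing step. The sketch below shows that step with the standard `JSON.parse`; `parseJsonOrNull` is a hypothetical helper, the sample text is made up for illustration, and how WebScrapperJS itself handles invalid JSON is not covered by the article.

```javascript
// Hypothetical helper: parses JSON text, returning null instead of throwing on bad input.
function parseJsonOrNull(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null;
  }
}

const sample = '{"id": 1, "completed": false}'; // made-up sample text, not a real response
console.log(parseJsonOrNull(sample).id);       // 1
console.log(parseJsonOrNull('<html></html>')); // null
```

Returning null is one possible design choice; rethrowing with a clearer message would suit callers that need to distinguish "not JSON" from a legitimate `null` payload.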


Get results faster
Use the code/methods below

If the origin is not blocking you, prefer the fetch() code below over gethtml().
It returns results faster because it does not go through the API; it requests the origin directly using Ajax.
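
The trade-off described here (direct requests are faster, proxied requests survive blocking) suggests a "direct first, proxy as fallback" strategy. The following is a minimal sketch under stated assumptions: `fetchDirectThenProxy`, `directFetch`, and `proxiedFetch` are hypothetical names, and the sketch is promise-based, whereas the library calls in this article are written as if they return values directly.

```javascript
// Hypothetical sketch: try the direct (faster) request first and fall back to a
// proxied one if it fails. Both fetchers are assumed functions returning promises.
async function fetchDirectThenProxy(url, directFetch, proxiedFetch) {
  try {
    return await directFetch(url);
  } catch (err) {
    // A blocked cross-origin request typically surfaces as a rejected promise.
    return proxiedFetch(url);
  }
}

// Possible wiring in a browser (the proxied fetcher is left to the reader):
// const direct = (u) => fetch(u).then((res) => res.text());
// fetchDirectThenProxy('https://webscrapperjs.sh20raj.repl.co/', direct, myProxiedFetch)
//   .then(console.log);
```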

Use WebScrapper.fetch() to get the HTML/text as a string
This example uses the URL https://webscrapperjs.sh20raj.repl.co/ because it is not blocked.

var html = WebScrapper.fetch('https://webscrapperjs.sh20raj.repl.co/'); // Returns the HTML/text of the web page as a string.
console.log(html);

This returns the HTML/text inside the web page as a string.

Use WebScrapper.fetchhtml() to get a parsed HTML/DOM document, as with WebScrapper.gethtml().

var html = WebScrapper.fetchhtml('https://webscrapperjs.sh20raj.repl.co/'); // Returns the parsed HTML of the web page.
console.log(html);
console.log(html.title);

Use WebScrapper.fetchjson() to get parsed JSON

var json = WebScrapper.fetchjson('https://webscrapperjs.sh20raj.repl.co/sample.json'); // Returns the parsed JSON served at the URL.
console.log(json);
console.log(json.id);

Try this on CodePen
CodePen sample code: https://codepen.io/SH20RAJ/pen/VwrwjXJ?editors=1001

<div id="scrappedcontent"></div>

<script src="https://cdn.jsdelivr.net/gh/SH20RAJ/WebScrapperJS/WebScrapper.min.js" ></script> 
<script>
  let MyWebScrapper = new scrapper('https://google.com/');
// You can now call gethtml() directly instead of passing a URL into it.

console.log(MyWebScrapper.gethtml()); // Grab https://google.com/ and print it to the console
var html = MyWebScrapper.gethtml('https://example.com/');

console.log(html); // Print the parsed https://example.com/ document to the console

document.getElementById('scrappedcontent').innerHTML = html;
</script>

See the result in the CodePen linked above.

Other functions
WebScrapper.getparam() gets a URL parameter
Assume the current URL is https://example.com/?id=7.

let id = getparam('id');
console.log(id); // Logs "7".

Use a custom string instead of the current URL

let id = getparam('id','https://example.com/?id=20');
console.log(id); // Logs "20".
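
For readers curious how such a helper can work, here is a minimal sketch built on the standard `URL` and `URLSearchParams` APIs. `getParamSketch` is a hypothetical re-implementation for illustration only; the article does not show how the library's own getparam() is written.

```javascript
// Hypothetical re-implementation for illustration; not the library's source.
function getParamSketch(name, url) {
  // Fall back to the current page URL when no custom string is given (browser only).
  const source = url === undefined ? globalThis.location.href : url;
  return new URL(source).searchParams.get(name);
}

console.log(getParamSketch('id', 'https://example.com/?id=20'));      // 20
console.log(getParamSketch('missing', 'https://example.com/?id=20')); // null
```

Note that `searchParams.get` returns a string (or `null` when the parameter is absent), so numeric IDs need an explicit `Number(...)` conversion.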

Reference

More information on this topic (WebScrapperJS - Get the Content/HTML of Any Website Without Being Blocked by CORS, Even Using JavaScript) is available at https://dev.to/sh20raj/webscrapperjs-get-contenthtml-of-any-website-without-being-blocked-by-cors-even-using-javascript-by-whollyapi-42l7

This text may be freely shared or copied, provided that the URL of this document is kept as a reference.

Collection and Share based on the CC Protocol
