Cara menggunakan htmlparser2

test(nama, fungsi () { var src = fs. readFileSync(inputDir + '/' + nama + '. giok', 'utf8'); . parseDOM(fs. readFileSync(inputDir + '/' + nama + '. html', 'utf8')); . writeFileSync(outputDir + '/' + nama + '. giok', src); . compileFileClient(inputDir + '/' + nama + '. giok', {outputFile. outputDir + '/' + nama + '. js', berdasarkan. inputDir }); . writeFileSync(outputDir + '/' + nama + '. js', js); . mengejek(); . compileFile(inputDir + '/' + nama + '. giok', {outputFile. outputDir + '/' + nama + '. js', berdasarkan. inputDir }); . 'Giok'}); . filter(fungsi(elemen) { mengembalikan elemen. jenis. == 'teks' }). panjangnya. == 1; . anak-anak. sebenarnya; . mengatur ulang();

htmlparser2 adalah , dan mengambil beberapa jalan pintas untuk sampai ke sana. Jika Anda memerlukan kepatuhan spesifikasi HTML yang ketat, lihat parse5

Instalasi

npm install htmlparser2

Demo langsung htmlparser2 tersedia

Ekosistem

NameDescriptionhtmlparser2Cepat & memaafkan HTML/XML parserdomhandlerHandler untuk htmlparser2 yang mengubah dokumen menjadi DOMdomutilsUtilities untuk bekerja dengan mesin pemilih DOMcss-selectCSS domhandler, kompatibel dengan DOMcheerio domhandlerJQuery API untuk DOMdom-serializerSerializer domhandler untuk DOM domhandler

Penggunaan

htmlparser2 sendiri menyediakan antarmuka callback yang memungkinkan konsumsi dokumen dengan alokasi minimal. Untuk pengalaman yang lebih ergonomis, baca di bawah ini

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
_

Keluaran (dengan gabungan beberapa peristiwa teks)

--> Xyz
JS! Hooray!
--> const foo = '<<bar>>';
That's it?!

Contoh ini hanya menunjukkan tiga kemungkinan kejadian. Baca lebih lanjut tentang parser, acara, dan opsinya di wiki

Penggunaan dengan aliran

Sementara antarmuka Parser sangat mirip dengan Node. js, ini bukan 100% cocok. Gunakan antarmuka

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
_0 untuk memproses input streaming

const { WritableStream } = require("htmlparser2/lib/WritableStream");
const parserStream = new WritableStream({
    ontext(text) {
        console.log("Streaming:", text);
    },
});

const htmlStream = fs.createReadStream("./my-file.html");
htmlStream.pipe(parserStream).on("finish", () => console.log("done"));
_

Mendapatkan DOM

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
1 menghasilkan DOM (model objek dokumen) yang dapat dimanipulasi menggunakan helper
const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
2

const htmlparser2 = require("htmlparser2");

const dom = htmlparser2.parseDocument(htmlString);

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
1, meskipun masih dibundel dengan modul ini, dipindahkan ke modulnya sendiri. Lihat itu untuk informasi lebih lanjut

Mengurai Umpan RSS/RDF/Atom

const feed = htmlparser2.parseFeed(content, options);
_

Catatan. Meskipun penangan umpan yang disediakan berfungsi untuk sebagian besar umpan, Anda mungkin ingin menggunakan danmacough/node-feedparser, yang jauh lebih baik diuji dan dipelihara secara aktif

Pertunjukan

Setelah memiliki beberapa tolok ukur buatan selama beberapa waktu, @AndreasMadsen menerbitkan

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
4 miliknya, yang mem-parse HTML tolok ukur berdasarkan situs web dunia nyata

Pada saat penulisan, versi terbaru dari semua parser yang didukung menunjukkan karakteristik kinerja berikut pada GitHub Actions (bersumber dari sini)

htmlparser2        : 2.17215 ms/file ± 3.81587
node-html-parser   : 2.35983 ms/file ± 1.54487
html5parser        : 2.43468 ms/file ± 2.81501
neutron-html5parser: 2.61356 ms/file ± 1.70324
htmlparser2-dom    : 3.09034 ms/file ± 4.77033
html-dom-parser    : 3.56804 ms/file ± 5.15621
libxmljs           : 4.07490 ms/file ± 2.99869
htmljs-parser      : 6.15812 ms/file ± 7.52497
parse5             : 9.70406 ms/file ± 6.74872
htmlparser         : 15.0596 ms/file ± 89.0826
html-parser        : 28.6282 ms/file ± 22.6652
saxes              : 45.7921 ms/file ± 128.691
html5              : 120.844 ms/file ± 153.944

Apa perbedaan modul ini dengan node-htmlparser?

Pada tahun 2011, modul ini dimulai sebagai percabangan dari modul

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
5. htmlparser2_ telah ditulis ulang beberapa kali dan, meskipun mempertahankan API yang sebagian besar kompatibel dengan
const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
5 dalam banyak kasus, proyek tidak lagi membagikan kode apa pun

Parser sekarang menyediakan antarmuka callback yang terinspirasi oleh sax. js (awalnya ditargetkan untuk readabilitySAX). Akibatnya, penangan lama tidak akan berfungsi lagi

const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
8 dan
const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
9 diganti namanya untuk memperjelas tujuan mereka (menjadi
const htmlparser2 = require("htmlparser2");
const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stich together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</ script>"
);
parser.end();
1 dan
--> Xyz
JS! Hooray!
--> const foo = '<<bar>>';
That's it?!
1). Nama lama masih tersedia saat membutuhkan htmlparser2, kode Anda harus berfungsi seperti yang diharapkan

Informasi kontak keamanan

Untuk melaporkan kerentanan keamanan, harap gunakan kontak keamanan Tidelift. Tidelift akan mengoordinasikan perbaikan dan pengungkapan

htmlparser2_ untuk perusahaan

Tersedia sebagai bagian dari Langganan Tidelift

Pemelihara htmlparser2 dan ribuan paket lainnya bekerja dengan Tidelift untuk memberikan dukungan komersial dan pemeliharaan untuk dependensi open source yang Anda gunakan untuk membangun aplikasi Anda. Hemat waktu, kurangi risiko, dan tingkatkan kesehatan kode, sembari membayar pengelola dependensi persis yang Anda gunakan. Belajarlah lagi