In front-end development, code duplication is a common problem. Duplicated code not only raises maintenance costs, it can also make the program run less efficiently. Many tools and techniques exist to detect and eliminate it, and one widely used tool is jscpd.
Introduction to jscpd
jscpd is an open-source JavaScript tool for detecting code duplication, and it is particularly effective at finding copied-and-pasted code. It scans source files, analyzes the code snippets within them, and compares those snippets for similarity. jscpd supports more than 150 source file formats, including HTML, CSS, and JavaScript. Whether a project uses native JavaScript, CSS, and HTML, or TypeScript, SCSS, Vue, React, and so on, jscpd can detect the duplicated code in it.
Open source repository: github.com/kucherenko/jscpd/tree/master
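To make the idea of "comparing snippets for similarity" concrete, here is a deliberately simplified sketch. It is not jscpd's implementation: jscpd compares tokens, while this sketch compares windows of trimmed lines, and the function name findDuplicateWindows is made up for illustration.

```javascript
// Simplified, line-based sketch of copy/paste detection.
// Assumption: NOT how jscpd works internally (jscpd compares tokens);
// this only illustrates the "compare snippets across files" idea.
function findDuplicateWindows(files, minLines) {
  const seen = new Map(); // window text -> first location where it appeared
  const clones = [];
  for (const [name, source] of Object.entries(files)) {
    const lines = source.split("\n").map((line) => line.trim());
    // Slide a window of `minLines` lines over the file.
    for (let start = 0; start + minLines <= lines.length; start++) {
      const key = lines.slice(start, start + minLines).join("\n");
      const first = seen.get(key);
      if (first === undefined) {
        seen.set(key, { file: name, line: start + 1 });
      } else {
        clones.push({ first, second: { file: name, line: start + 1 } });
      }
    }
  }
  return clones;
}
```

Overlapping windows are reported as separate matches in this sketch; a real tool merges them into a single clone and reports its line range.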
How to use
Detecting code duplication with jscpd is simple. First we need to install jscpd, which can be done with either npm or yarn:
npm install -g jscpd
yarn global add jscpd
After the installation is complete, we can run the jscpd command in the terminal and specify the directory or file to check. For example, the following command checks the source files in the current directory:
jscpd .
To check a specific directory:
jscpd /path/to/code
The output of a successful run on the command line looks like this:
The fields in the output are as follows:
- Clone found (javascript): A duplicated code block that was found, here in JavaScript files. The line ranges of the duplicated code in each file are shown so that it is easy to locate.
- Format: The file format, here javascript; it can also be scss, markup, and so on.
- Files analyzed: The number of files that were checked.
- Total lines: The total number of lines in all files.
- Total tokens: The number of all tokens. A line of code generally contains several to dozens of tokens.
- Clones found: Number of duplicate blocks found.
- Duplicated lines: The number and proportion of duplicate lines of code.
- Duplicated tokens: The number and proportion of duplicate tokens.
- Detection time: How long the detection took.
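The "proportion" shown for duplicated lines is simply the duplicated lines as a share of the total lines. A minimal sketch of that arithmetic follows; the function name and the rounding to two decimal places are assumptions for illustration, not taken from jscpd's source.

```javascript
// Hypothetical helper: share of duplicated lines as a percentage.
// Assumption: rounding to two decimals is our choice, not necessarily jscpd's.
function duplicationPercentage(duplicatedLines, totalLines) {
  if (totalLines === 0) return 0; // avoid dividing by zero for an empty scan
  return Math.round((duplicatedLines * 10000) / totalLines) / 100;
}
```

For example, 181 duplicated lines out of 1000 total lines gives 18.1.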
Project configuration
The example above is a simple, direct check of a single file or folder. Most mainstream front-end projects today are generated from scaffolding or contain front-end engineering files. Many of those files are auxiliary, such as dependency packages, build scripts, documents, and configuration files; they do not need to be checked and should be excluded. Projects like this generally use a configuration file to specify jscpd's options.
jscpd's configuration options can be set in either of the following two ways. The content is the same regardless of which front-end framework the project uses.
Create a configuration file named .jscpd.json in the project root directory, then add the configuration options to it:
{
  "threshold": 0,
  "reporters": ["html", "console", "badge"],
  "ignore": ["**/__snapshots__/**"],
  "absolute": true
}
You can also add a jscpd field directly to the package.json file:
{
  ...
  "jscpd": {
    "threshold": 0.1,
    "reporters": ["html", "console", "badge"],
    "ignore": ["**/__snapshots__/**"],
    "absolute": true,
    "gitignore": true
  }
  ...
}
Let’s briefly introduce the meaning of the above configuration fields:
- threshold: The duplication threshold, as a percentage. If the duplication rate exceeds this value, an error is reported. For example, with the threshold set to 10 and a duplication rate of 18.1%, the following error ❌ is shown, although the detection itself still runs to completion.
ERROR: jscpd found too many duplicates (18.1%) over threshold (10%)
- reporters: The reporters used to generate the detection report. Common ones include:
  - console: prints the results to the console
  - consoleFull: prints the duplicated code blocks in full to the console
  - json: outputs a report in json format
  - xml: outputs a report in xml format
  - csv: outputs a report in csv format
  - markdown: outputs a report in markdown format
  - html: generates an html report in the html folder
  - verbose: outputs a large amount of debugging information to the console
- ignore: Files or directories to exclude from detection, used to filter out non-business code such as dependency packages, documents, or static files.
- format: The source code formats to check for duplication. More than 150 are currently supported; commonly used ones include javascript, typescript, and css.
- absolute: use absolute paths in detection reports
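The threshold behaviour described above can be sketched as a small check. This is an illustration only: the function name checkThreshold is made up, and the use of a strict "greater than" comparison is an assumption based on the wording "exceeds". The message format is copied from the error shown earlier.

```javascript
// Hypothetical helper mirroring the threshold rule described above.
// Assumption: "exceeds" means strictly greater than the threshold.
function checkThreshold(percentage, threshold) {
  if (percentage > threshold) {
    return `ERROR: jscpd found too many duplicates (${percentage}%) over threshold (${threshold}%)`;
  }
  return null; // within the threshold, nothing to report
}
```

With a duplication rate of 18.1 and a threshold of 10 this returns the error message; with a rate of 5 it returns null.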
There are many other configuration options as well. If you are interested, you can read about them in detail in the project's documentation. Complete configuration documentation: github.com/kucherenko/jscpd/tree/master/packages/jscpd
Test Report
After completing the jscpd configuration above, run the following command to output the duplication detection report. Once it finishes, jscpd generates a report with information about each duplicated code snippet, including details such as the location of the duplicated code, the similarity percentage, and the number of lines. With this information we can refactor the code in a targeted way.
jscpd ./src -o 'report'
The business code in a project usually lives in the ./src directory, so the files in that directory can be checked directly. If it lives in another directory, adjust the path accordingly. The command-line parameter -o 'report' writes the detection report to the report folder in the project root directory; you can also choose a different directory name here. The output directory structure is as follows:
The generated report page looks like this:
Project overview data:
The specific location and number of lines of repeated code:
The defaults for detecting duplicated code, a minimum of 5 lines and 50 tokens, are fairly low, so a scan may report many duplicated blocks. In practice the detection range can be adjusted. The following parameters are for reference:
- Minimum tokens: --min-tokens, abbreviated -k
- Minimum number of lines: --min-lines, abbreviated -l
- Maximum number of lines: --max-lines, abbreviated -x
jscpd ./src --min-tokens 200 --min-lines 20 -o 'report'
To make this command more convenient to use, it can be added to scripts in package.json, after which you only need to run npm run jscpd to perform the detection:
"scripts": {
  ...
  "jscpd": "jscpd ./src --min-tokens 200 --min-lines 20 -o 'report'",
  ...
}
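If you would rather build the command from a Node script than hard-code it, the options described above can be assembled into an argument list. This is a sketch under assumptions: buildJscpdArgs and its option names are made up for illustration, and only the flags already shown in this article (--min-tokens, --min-lines, --max-lines, -o) are used.

```javascript
// Hypothetical helper: turn an options object into jscpd CLI arguments.
// Only flags mentioned in this article are used; option names are our own.
function buildJscpdArgs({ path, minTokens, minLines, maxLines, output }) {
  const args = [path];
  if (minTokens !== undefined) args.push("--min-tokens", String(minTokens));
  if (minLines !== undefined) args.push("--min-lines", String(minLines));
  if (maxLines !== undefined) args.push("--max-lines", String(maxLines));
  if (output !== undefined) args.push("-o", output);
  return args;
}
```

The resulting array could then be passed to Node's child_process.spawnSync("jscpd", args), assuming jscpd is installed and on the PATH. Passing arguments as an array also avoids shell quoting differences between platforms.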
Ignoring code blocks
As mentioned above, the ignore option can exclude a file or folder. Another way is to ignore a specific block of code within a file. Since some duplicated code is necessary in practice, you can mark it with comments to exclude it from detection: wrap the code with a jscpd:ignore-start comment at the beginning and a jscpd:ignore-end comment at the end.
How to use it in js code:
/* jscpd:ignore-start */
import lodash from 'lodash';
import React from 'react';
import { User } from './models';
import { UserService } from './services';
/* jscpd:ignore-end */
Usage in CSS and the various CSS preprocessors is the same as in js:
/* jscpd:ignore-start */
.style {
  padding: 40px 0;
  font-size: 26px;
  font-weight: 400;
  color: #464646;
  line-height: 26px;
}
/* jscpd:ignore-end */
How to use it in html code:
<!--
// jscpd:ignore-start
-->
<meta data-react-helmet="true" name="theme-color" content="#cb3837" />
<link data-react-helmet="true" rel="stylesheet" href="https://static.npmjs.com/103af5b8a2b3c971cba419755f3a67bc.css" />
<link data-react-helmet="true" rel="apple-touch-icon" sizes="120x120" href="https://static.npmjs.com/58a19602036db1daee0d7863c94673a4.png" />
<link data-react-helmet="true" rel="icon" type="image/png" href="https://static.npmjs.com/b0f1a8318363185cc2ea6a40ac23eeb2.png" sizes="32x32" />
<!--
// jscpd:ignore-end
-->
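To picture what the markers do, here is a small sketch that drops everything between a start marker and an end marker before any comparison would happen. It is an illustration of the effect, not jscpd's own code; the function name stripIgnoredBlocks is made up, and it assumes each marker sits on its own line.

```javascript
// Hypothetical sketch of the effect of the ignore markers.
// Assumption: each marker comment is on its own line; not jscpd's implementation.
function stripIgnoredBlocks(source) {
  const kept = [];
  let ignoring = false;
  for (const line of source.split("\n")) {
    if (line.includes("jscpd:ignore-start")) {
      ignoring = true; // drop the marker line and everything after it
      continue;
    }
    if (line.includes("jscpd:ignore-end")) {
      ignoring = false; // resume keeping lines after the end marker
      continue;
    }
    if (!ignoring) kept.push(line);
  }
  return kept.join("\n");
}
```

Note that in this sketch a misspelled end marker never closes the block, so the rest of the file would be ignored. It is worth checking that both markers are written exactly.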
Summary
jscpd is a powerful tool for detecting code duplication locally in front-end projects. It helps developers find duplication problems quickly, and with simple configuration it produces intuitive duplication data; resolving the duplicates it finds improves code quality and maintainability.
Using jscpd, we can effectively optimize the front-end development process and improve the efficiency and performance of the code. I hope this article helps you understand local front-end code duplication detection with jscpd.