Front-end code duplication detection

In front-end development, code duplication is a common problem. Duplicated code increases maintenance cost and can also make the program less efficient. Many tools and techniques exist to detect and eliminate it, and one of the most widely used is jscpd.

Introduction to jscpd

jscpd is an open-source JavaScript tool library for detecting duplicated code. It is particularly effective at finding copy-and-pasted code: it scans source files, analyzes the code fragments in them, and compares those fragments for similarity. jscpd supports more than 150 source file formats, including HTML, CSS, and JavaScript. Whether a project is written in native JavaScript, CSS, and HTML, or uses TypeScript, SCSS, Vue, React, and so on, jscpd can detect the duplicated code in it well.
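To make the scan, split, and compare idea concrete, here is a minimal sketch. It is a simplified, line-based illustration and not jscpd's implementation: jscpd reports token counts, so it presumably compares token sequences rather than raw lines. The names findDuplicateBlocks, toLines, and minLines are hypothetical.

```javascript
// Simplified illustration only; jscpd's real detector is not line-based.
// All names here (toLines, findDuplicateBlocks, minLines) are hypothetical.

// Turn a source string into trimmed, non-empty lines with 1-based line numbers.
function toLines(source) {
  return source
    .split('\n')
    .map((text, index) => ({ text: text.trim(), line: index + 1 }))
    .filter((entry) => entry.text.length > 0);
}

// Report every window of `minLines` consecutive lines in `b` that also occurs in `a`.
function findDuplicateBlocks(a, b, minLines = 3) {
  const linesA = toLines(a);
  const linesB = toLines(b);
  const seen = new Map(); // window text -> first start line in `a`

  for (let i = 0; i + minLines <= linesA.length; i++) {
    const key = linesA.slice(i, i + minLines).map((l) => l.text).join('\n');
    if (!seen.has(key)) seen.set(key, linesA[i].line);
  }

  const clones = [];
  for (let j = 0; j + minLines <= linesB.length; j++) {
    const key = linesB.slice(j, j + minLines).map((l) => l.text).join('\n');
    if (seen.has(key)) {
      clones.push({ startA: seen.get(key), startB: linesB[j].line, lines: minLines });
    }
  }
  return clones;
}

const fileA = `
function sum(list) {
  let total = 0;
  for (const n of list) total += n;
  return total;
}
`;

const fileB = `
const x = 1;
function add(list) {
  let total = 0;
  for (const n of list) total += n;
  return total;
}
`;

const clones = findDuplicateBlocks(fileA, fileB, 3);
console.log(clones);
```

The sketch reports overlapping windows separately instead of merging them into one longer clone, which a real detector would normally do.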

Open-source repository: github.com/kucherenko/jscpd/tree/master

How to use

Using jscpd for duplication detection is simple. First install jscpd, either with npm or with yarn:

npm install -g jscpd
yarn global add jscpd

After installation, run the jscpd command in the terminal and specify the code directory or file to check. For example, the following command checks the code in the current directory:

jscpd .

To check a specific directory:

jscpd /path/to/code

The output of a successful run on the command line looks like this:

The fields in the output are, briefly:

  • Clone found (javascript):
    Shows each duplicated code block that was found, here in JavaScript files. The line ranges of the duplicated code in each file are shown so it is easy to locate.
  • Format: the file format, here javascript; it can also be scss, markup, and so on.
  • Files analyzed: the number of files that were checked.
  • Total lines: the total number of lines across all files.
  • Total tokens: the total number of tokens. A line of code generally contains from a few to a few dozen tokens.
  • Clones found: the number of duplicated blocks found.
  • Duplicated lines: the number and proportion of duplicated lines of code.
  • Duplicated tokens: the number and proportion of duplicated tokens.
  • Detection time: how long the detection took.

Project configuration

The example above is a simple, direct check of a single file or folder. Most mainstream front-end projects today are generated from scaffolding or contain front-end engineering files. Many of those files are auxiliary, such as dependency packages, build scripts, documentation, and configuration files; they do not need to be checked and should be excluded. Projects like this generally configure jscpd through a configuration file, using jscpd's options.

jscpd's configuration options can be set up in either of the following two ways. The content is the same regardless of which front-end framework the project uses.

Create a configuration file named .jscpd.json in the project root directory, then add the specific options to it:

{
  "threshold": 0,
  "reporters": ["html", "console", "badge"],
  "ignore": ["**/__snapshots__/**"],
  "absolute": true
}

You can also add a jscpd field directly to the package.json file:

{
  ...
  "jscpd": {
    "threshold": 0.1,
    "reporters": ["html", "console", "badge"],
    "ignore": ["**/__snapshots__/**"],
    "absolute": true,
    "gitignore": true
  }
  ...
}

Let’s briefly introduce the meaning of the above configuration fields:

  • threshold: the duplication threshold, as a percentage. If the duplication rate exceeds it, an error is reported. For example, with the threshold set to 10 and a duplication rate of 18.1%, the following error ❌ is printed, although the detection itself still runs to completion:
    ERROR: jscpd found too many duplicates (18.1%) over threshold (10%)
  • reporters: how the detection report is generated. Common values include:
    • console: print the results to the console
    • consoleFull: print the full duplicated code blocks to the console
    • json: output the report in JSON format
    • xml: output the report in XML format
    • csv: output the report in CSV format
    • markdown: output the report in Markdown format
    • html: generate an HTML report in the html folder
    • verbose: print a large amount of debugging information to the console
  • ignore: files or directories to skip during detection, used to filter out non-business code such as dependency packages, documentation, or static files.
  • format: the source code formats to check for duplication. More than 150 are currently supported; commonly used ones include javascript, typescript, and css.
  • absolute: use absolute paths in detection reports.
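The threshold rule can be sketched in a few lines. This is a hypothetical helper, not part of jscpd, and it assumes the percentage is computed from duplicated lines over total lines; the figures 181 and 1000 are made up to reproduce the 18.1% from the example.

```javascript
// Hypothetical helper, not jscpd's API: mirrors the threshold rule described above.
// Assumption: the duplication rate is duplicated lines divided by total lines.
function checkThreshold(duplicatedLines, totalLines, thresholdPercent) {
  const percentage = totalLines === 0 ? 0 : (duplicatedLines / totalLines) * 100;
  // Round to two decimals so the value reads like a report figure.
  const rounded = Math.round(percentage * 100) / 100;
  return {
    percentage: rounded,
    exceeded: rounded > thresholdPercent,
  };
}

// Made-up figures: 181 duplicated lines out of 1000, threshold 10%.
const result = checkThreshold(181, 1000, 10);
if (result.exceeded) {
  console.error(`too many duplicates (${result.percentage}%) over threshold (10%)`);
}
```

A check like this is what lets a threshold act as a quality gate: the scan finishes either way, and only the final comparison decides whether an error is raised.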

There are many other configuration options as well. If you are interested, see the detailed description in the project documentation. Complete configuration documentation: github.com/kucherenko/jscpd/tree/master/packages/jscpd

Detection report

After completing the jscpd configuration above, run the following command to produce the duplication report. jscpd generates a report with information about each duplicated code fragment, including details such as the location of the duplicated code, the similarity percentage, and the number of lines. With this information we can refactor the code in a targeted way.

jscpd ./src -o 'report'

The business code in a project is usually placed in the ./src directory, so the files in that directory can be checked directly; if the code lives in another directory, adjust the path accordingly. The command-line parameter -o 'report' writes the detection report to the report folder in the project root directory. You can use a different directory name in place of report. The output directory structure is as follows:

The generated report page looks like this:

Project overview data:

The specific location and number of lines of repeated code:

The default minimums for a duplicate, 5 lines and 50 tokens, are fairly small, so a large number of duplicated blocks may be reported. In practice you can adjust the detection range. The following parameters are for reference:

  • Minimum tokens: --min-tokens, short form -k
  • Minimum number of lines: --min-lines, short form -l
  • Maximum number of lines: --max-lines, short form -x

jscpd ./src --min-tokens 200 --min-lines 20 -o 'report'

To make this command more convenient to use, add it to the scripts field in package.json. After that, running npm run jscpd is all it takes to perform the detection:

"scripts": {
  ...
  "jscpd": "jscpd ./src --min-tokens 200 --min-lines 20 -o 'report'",
  ...
}
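If the detection is started from a Node script instead of a hard-coded scripts entry, the same command line can be assembled from an options object. The sketch below is a hypothetical helper (buildJscpdArgs is not part of jscpd) and uses only the flags shown in this article; actually running the command assumes jscpd is installed and on the PATH.

```javascript
// Hypothetical helper: builds the argument list for the jscpd CLI using only
// the flags shown in this article (--min-tokens, --min-lines, --max-lines, -o).
function buildJscpdArgs({ target, minTokens, minLines, maxLines, output }) {
  const args = [target];
  if (minTokens !== undefined) args.push('--min-tokens', String(minTokens));
  if (minLines !== undefined) args.push('--min-lines', String(minLines));
  if (maxLines !== undefined) args.push('--max-lines', String(maxLines));
  if (output !== undefined) args.push('-o', output);
  return args;
}

const args = buildJscpdArgs({ target: './src', minTokens: 200, minLines: 20, output: 'report' });
console.log(['jscpd', ...args].join(' '));
// prints "jscpd ./src --min-tokens 200 --min-lines 20 -o report"

// To actually run it (assumes jscpd is installed and on PATH):
// require('child_process').spawnSync('jscpd', args, { stdio: 'inherit' });
```

Passing the arguments as an array avoids shell quoting issues, which is why the quotes around report are not needed here.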

Ignoring code blocks

As mentioned above, the ignore option skips whole files or folders. It is also possible to skip a particular block of code inside a file. Some duplicated code is necessary in practice, so you can mark it with comments to exclude it from detection: wrap the code with a jscpd:ignore-start comment at the beginning and a jscpd:ignore-end comment at the end.

How to use it in js code:

/* jscpd:ignore-start */
import lodash from 'lodash';
import React from 'react';
import { User } from './models';
import { UserService } from './services';
/* jscpd:ignore-end */

Usage in CSS and the various CSS preprocessors is the same as in JS:

/* jscpd:ignore-start */
.style {
  padding: 40px 0;
  font-size: 26px;
  font-weight: 400;
  color: #464646;
  line-height: 26px;
}
/* jscpd:ignore-end */

How to use it in HTML code:

<!--
// jscpd:ignore-start
-->
<meta data-react-helmet="true" name="theme-color" content="#cb3837" />
<link data-react-helmet="true" rel="stylesheet" href="https://static.npmjs.com/103af5b8a2b3c971cba419755f3a67bc.css" />
<link data-react-helmet="true" rel="apple-touch-icon" sizes="120x120" href="https://static.npmjs.com/58a19602036db1daee0d7863c94673a4.png" />
<link data-react-helmet="true" rel="icon" type="image/png" href="https://static.npmjs.com/b0f1a8318363185cc2ea6a40ac23eeb2.png" sizes="32x32" />
<!--
// jscpd:ignore-end
-->
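The effect of the markers can be illustrated with a small sketch. It is a line-based approximation written for this article, not jscpd's own parser, and stripIgnoredBlocks is a hypothetical name: every line from a jscpd:ignore-start marker through the matching jscpd:ignore-end marker is dropped, and the remaining lines are what a detector would go on to compare.

```javascript
// Illustrative only: shows the effect of the ignore markers, not jscpd's parser.
// Drops every line from a jscpd:ignore-start marker line through the next
// jscpd:ignore-end marker line, and keeps everything else.
function stripIgnoredBlocks(source) {
  const kept = [];
  let ignoring = false;
  for (const line of source.split('\n')) {
    if (line.includes('jscpd:ignore-start')) {
      ignoring = true;
      continue;
    }
    if (line.includes('jscpd:ignore-end')) {
      ignoring = false;
      continue;
    }
    if (!ignoring) kept.push(line);
  }
  return kept.join('\n');
}

const sample = [
  '/* jscpd:ignore-start */',
  "import React from 'react';",
  '/* jscpd:ignore-end */',
  'export const answer = 42;',
].join('\n');

const remaining = stripIgnoredBlocks(sample);
console.log(remaining);
// prints "export const answer = 42;"
```

Because the sketch only looks for the marker text on a line, it works the same way for the JS, CSS, and HTML comment styles shown above.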

Summary

jscpd is a powerful tool for detecting duplicated front-end code locally. It helps developers find duplication problems quickly, and with simple configuration it produces clear duplication data. Resolving the duplicates it finds improves code quality and maintainability.

With jscpd we can effectively optimize the front-end development process and improve the efficiency and performance of the code. I hope this article helps you understand local front-end code duplication detection based on jscpd.