19 | > Copy/paste detector for programming source code, supports [150+ formats](../../supported_formats.md).
20 |
Copy/paste is a common source of technical debt in many projects. jscpd finds duplicated blocks in more than 150 programming languages and digital document formats.
The jscpd tool implements the [Rabin-Karp](https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm) algorithm to search for duplications.
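
To give a feel for how Rabin-Karp can find repeated token sequences, here is a minimal sketch of a rolling hash over fixed-size token windows. It is an illustration only, not jscpd's actual implementation; the names `tokenCode`, `sameWindow`, and `findDuplicateWindows`, as well as the constants `BASE` and `MOD`, are assumptions made for this example.

```typescript
// Illustrative Rabin-Karp sketch over tokens; not jscpd's actual implementation.
const BASE = 257;      // polynomial base (arbitrary choice for this sketch)
const MOD = 1000003;   // small prime so intermediate products stay within safe integers

// Map a token string to a number in [0, MOD).
function tokenCode(token: string): number {
  let h = 0;
  for (let i = 0; i < token.length; i++) {
    h = (h * 31 + token.charCodeAt(i)) % MOD;
  }
  return h;
}

// Compare two windows token by token to rule out hash collisions.
function sameWindow(tokens: string[], a: number, b: number, size: number): boolean {
  for (let i = 0; i < size; i++) {
    if (tokens[a + i] !== tokens[b + i]) return false;
  }
  return true;
}

// Return [earlierStart, laterStart] pairs for windows that repeat an earlier window.
function findDuplicateWindows(tokens: string[], size: number): Array<[number, number]> {
  const pairs: Array<[number, number]> = [];
  if (size <= 0 || tokens.length < size) return pairs;

  const codes = tokens.map(tokenCode);

  // BASE^(size - 1) mod MOD, used to remove the outgoing token from the hash.
  let high = 1;
  for (let i = 1; i < size; i++) high = (high * BASE) % MOD;

  // Hash of the first window.
  let hash = 0;
  for (let i = 0; i < size; i++) hash = (hash * BASE + codes[i]) % MOD;

  const seen = new Map<number, number[]>(); // hash -> window start positions
  for (let start = 0; start + size <= tokens.length; start++) {
    const candidates = seen.get(hash) || [];
    for (const prev of candidates) {
      if (sameWindow(tokens, prev, start, size)) {
        pairs.push([prev, start]);
        break;
      }
    }
    candidates.push(start);
    seen.set(hash, candidates);

    // Roll the hash: drop tokens[start], append tokens[start + size].
    if (start + size < tokens.length) {
      hash = (hash - (codes[start] * high) % MOD + MOD) % MOD;
      hash = (hash * BASE + codes[start + size]) % MOD;
    }
  }
  return pairs;
}
```

For example, `findDuplicateWindows("a b c x a b c".split(" "), 3)` returns `[[0, 4]]`: the window starting at token 4 repeats the window starting at token 0. The rolling update keeps each step constant-time, which is why the approach scales to large token streams.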
23 |
45 |
46 |
47 | ## Features
- Detects duplications in source code using the semantics of programming languages; can skip comments, empty lines, etc.
- Detects duplications in embedded blocks of code, such as `<script>` or `<style>` sections in HTML
- Blames authors of duplications
- Generates an XML report in pmd-cpd format, a JSON report, and an [HTML report](http://kucherenko.github.io/jscpd-report.html)
- Integrates with CI systems; supports thresholds for the level of duplication
53 |
54 | ## Getting started
55 |
56 | ### Installation
57 | ```bash
58 | $ npm install -g jscpd
59 | ```
60 | ### Usage
61 | ```bash
62 | $ npx jscpd /path/to/source
63 | ```
64 | or
65 |
66 | ```bash
67 | $ jscpd /path/to/code
68 | ```
69 | or
70 |
71 | ```bash
72 | $ jscpd --pattern "src/**/*.js"
73 | ```
74 | ## Options
75 | ### Pattern
76 |
Glob pattern used to find the files to analyze.
78 |
79 | - Cli options: `--pattern`, `-p`
80 | - Type: **string**
- Default: `**/*`
82 |
83 | Example:
84 | ```bash
85 | $ jscpd --pattern "**/*.js"
86 | ```
87 |
88 | ### Min Tokens
89 |
Minimum size of a code block in tokens. Blocks with fewer than `min-tokens` tokens are skipped.
91 |
92 | - Cli options: `--min-tokens`, `-k`
93 | - Type: **number**
94 | - Default: **50**
95 |
96 | *This option is called ``minTokens`` in the config file.*
97 |
98 | ### Min Lines
99 |
Minimum size of a code block in lines. Blocks with fewer than `min-lines` lines are skipped.
101 |
102 | - Cli options: `--min-lines`, `-l`
103 | - Type: **number**
104 | - Default: **5**
105 | ### Max Lines
106 |
Maximum file size in lines. Files with more than `max-lines` lines are skipped.
108 |
109 | - Cli options: `--max-lines`, `-x`
110 | - Type: **number**
111 | - Default: **1000**
112 | ### Max Size
113 |
Maximum file size in bytes. Files larger than `max-size` are skipped.
115 |
116 | - Cli options: `--max-size`, `-z`
117 | - Type: **string**
118 | - Default: **100kb**
119 | ### Threshold
120 |
The threshold for the duplication level. If the current level of duplication is higher than the threshold, jscpd exits with an error.
122 |
123 | - Cli options: `--threshold`, `-t`
124 | - Type: **number**
125 | - Default: **null**
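
The check can be pictured with the following sketch. It assumes the duplication level is the percentage of duplicated lines (matching the `percentage` field in the JSON report) and that a `null` threshold disables the check; `exceedsThreshold` is a hypothetical helper, not part of jscpd's API.

```typescript
// Hypothetical helper for illustration; not part of jscpd's API.
// Assumption: the duplication level is the percentage of duplicated lines.
function exceedsThreshold(
  duplicatedLines: number,
  totalLines: number,
  threshold: number | null
): boolean {
  // No threshold configured, or nothing to measure: never fail.
  if (threshold === null || totalLines === 0) return false;
  const percentage = (duplicatedLines / totalLines) * 100;
  return percentage > threshold;
}
```

With 26 duplicated lines out of 297, the level is about 8.75%, so a threshold of `5` would be exceeded while a threshold of `10` would not.
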
126 | ### Config
127 |
The path to the configuration file. The config must be in `json` format. The config file supports the same options as the CLI.
129 |
130 | - Cli options: `--config`, `-c`
131 | - Type: **path**
132 | - Default: **null**
133 | ### Ignore
134 |
Glob patterns for files to exclude from analysis. Separate multiple globs with commas.
136 | Example:
137 | ```bash
138 | $ jscpd --ignore "**/*.min.js,**/*.map" /path/to/files
139 | ```
140 | - Cli options: `--ignore`, `-i`
141 | - Type: **string**
142 | - Default: **null**
143 | ### Reporters
The list of reporters. Reporters are used to output information about clones and the duplication detection process.
145 |
146 | Available reporters:
147 | - **console** - report about clones to console;
148 | - **consoleFull** - report about clones to console with blocks of code;
149 | - **json** - output `jscpd-report.json` file with clones report in json format;
150 | - **xml** - output `jscpd-report.xml` file with clones report in xml format;
151 | - **csv** - output `jscpd-report.csv` file with clones report in csv format;
152 | - **markdown** - output `jscpd-report.md` file with clones report in markdown format;
153 | - **html** - generate html report to `html/` folder;
154 | - **sarif** - generate a report in SARIF format (https://github.com/oasis-tcs/sarif-spec), save it to `jscpd-sarif.json` file;
155 | - **verbose** - output a lot of debug information to console;
156 |
> Note: You can develop a custom reporter; see the [@jscpd/finder](../finder) package.
158 |
159 | - Cli options: `--reporters`, `-r`
160 | - Type: **string**
161 | - Default: **console**
162 | ### Output
163 |
The path to the directory for reports. JSON and XML reports are saved there.
165 |
166 | - Cli options: `--output`, `-o`
167 | - Type: **path**
168 | - Default: **./report/**
169 |
170 | ### Mode
The detection quality mode, which controls which tokens are skipped.
- `strict` - uses all types of symbols as tokens; skips only blocks marked as ignored.
- `mild` - skips blocks marked as ignored, new lines, and empty symbols.
- `weak` - skips blocks marked as ignored, new lines, empty symbols, and comments.

> Note: You can develop a custom mode; see the API section.
177 |
178 | - Cli options: `--mode`, `-m`
179 | - Type: **string**
180 | - Default: **mild**
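
One way to think about the three modes is as token filters that build on each other. The sketch below is an illustration under assumptions: the `Token` shape and the type names (`ignore`, `empty`, `new_line`, `comment`) are hypothetical and may differ from the real types in `@jscpd/core`.

```typescript
// Hypothetical token shape for illustration; the real type in @jscpd/core may differ.
interface Token {
  type: string;
  value: string;
}

// A mode decides whether a token takes part in detection.
type Mode = (token: Token) => boolean;

// strict: keep everything except blocks marked as ignored.
const strict: Mode = (t) => t.type !== "ignore";

// mild: additionally skip new lines and empty symbols.
const mild: Mode = (t) => strict(t) && t.type !== "empty" && t.type !== "new_line";

// weak: additionally skip comments.
const weak: Mode = (t) => mild(t) && t.type !== "comment";

const sampleTokens: Token[] = [
  { type: "keyword", value: "const" },
  { type: "new_line", value: "\n" },
  { type: "comment", value: "// note" },
  { type: "ignore", value: "..." },
  { type: "empty", value: " " },
];

// Each stricter filter keeps a subset of the previous one.
const keptByStrict = sampleTokens.filter(strict);
const keptByMild = sampleTokens.filter(mild);
const keptByWeak = sampleTokens.filter(weak);
```

A custom mode would follow the same idea: a predicate that returns `false` for tokens that should not count toward duplication.
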
181 | ### Format
182 |
The list of formats to check for duplications. More than [150 formats](../../supported_formats.md) are available.
184 |
185 | Example:
186 | ```bash
187 | $ jscpd --format "php,javascript,markup,css" /path/to/files
188 | ```
189 |
190 | - Cli options: `--format`, `-f`
191 | - Type: **string**
192 | - Default: **{all formats}**
193 | ### Blame
194 | Get information about authors and dates of duplicated blocks from git.
195 |
196 | - Cli options: `--blame`, `-b`
197 | - Type: **boolean**
198 | - Default: **false**
199 | ### Silent
Do not write detailed information to the console.
201 |
202 | Example:
203 | ```
204 | $ jscpd /path/to/source --silent
205 | Duplications detection: Found 60 exact clones with 3414(46.81%) duplicated lines in 100 (31 formats) files.
206 | Execution Time: 1381.759ms
207 | ```
208 | - Cli options: `--silent`, `-s`
209 | - Type: **boolean**
210 | - Default: **false**
211 | ### Absolute
212 | Use the absolute path in reports.
213 |
214 |
215 | - Cli options: `--absolute`, `-a`
216 | - Type: **boolean**
217 | - Default: **false**
218 | ### Ignore Case
219 | Ignore case of symbols in code (experimental).
220 |
221 |
222 | - Cli options: `--ignoreCase`
223 | - Type: **boolean**
224 | - Default: **false**
225 |
226 | ### No Symlinks
227 | Do not follow symlinks.
228 |
229 | - Cli options: `--noSymlinks`, `-n`
230 | - Type: **boolean**
231 | - Default: **false**
232 |
233 | ### Skip Local
Use this option to detect duplications across different folders only. For `--skipLocal` to work correctly, provide a list of paths with more than one item.
235 |
236 | Example:
237 | ```bash
238 | jscpd --skipLocal /path/to/folder1/ /path/to/folder2/
239 | ```
This detects clones across separate folders only; clones within the same folder are skipped.
241 |
242 |
243 | - Cli options: `--skipLocal`
244 | - Type: **boolean**
245 | - Default: **false**
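
The filtering rule can be sketched as follows. This is an illustration of the idea, not jscpd's code: `rootOf` and `isCrossFolderClone` are hypothetical helpers, and the sketch assumes root folders are passed with a trailing slash so that `/folder1/` does not match `/folder10/`.

```typescript
// Hypothetical helper: find which of the given root folders contains a file.
// Assumption: roots end with "/" so prefixes do not match sibling folders.
function rootOf(file: string, roots: string[]): string | undefined {
  for (const root of roots) {
    if (file.indexOf(root) === 0) return root;
  }
  return undefined;
}

// Keep a clone only when its two files live under different root folders.
function isCrossFolderClone(fileA: string, fileB: string, roots: string[]): boolean {
  const a = rootOf(fileA, roots);
  const b = rootOf(fileB, roots);
  return a !== undefined && b !== undefined && a !== b;
}
```

With roots `/path/to/folder1/` and `/path/to/folder2/`, a clone between `/path/to/folder1/a.js` and `/path/to/folder2/b.js` is kept, while a clone between two files under `/path/to/folder1/` is dropped.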
246 |
247 | ### Formats Extensions
Define the list of formats with their file extensions. More than [150 formats](../../supported_formats.md) are available.

In the following example, jscpd analyzes `*.es` and `*.es6` files as javascript and `*.dt` files as dart:
251 | ```bash
$ jscpd --formats-exts "javascript:es,es6;dart:dt" /path/to/code
253 | ```
> Note: formats defined in this option replace the default configuration. Define all the formats you need manually, or create two configurations for running `jscpd`.
255 |
256 | - Cli options: `--formats-exts`
257 | - Type: **string**
258 | - Default: **null**
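
The value format shown above (`format:ext,ext;format:ext`) can be read as a map from format name to extensions. Here is a minimal parsing sketch; `parseFormatsExts` is a hypothetical function written for this example and is not taken from jscpd.

```typescript
// Hypothetical parser for the "format:ext,ext;format:ext" syntax shown above.
function parseFormatsExts(spec: string): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const entry of spec.split(";")) {
    if (entry.trim() === "") continue; // tolerate a trailing ";"
    const parts = entry.split(":");
    const format = parts[0].trim();
    const exts = (parts[1] || "")
      .split(",")
      .map((ext) => ext.trim())
      .filter((ext) => ext !== "");
    result[format] = exts;
  }
  return result;
}

const parsedFormats = parseFormatsExts("javascript:es,es6;dart:dt");
// parsedFormats is { javascript: ["es", "es6"], dart: ["dt"] }
```

Because `;` separates formats, the value should be quoted in a shell; otherwise the shell treats `;` as a command separator.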
259 |
260 | ### Store
261 |
Stores are used to collect information about code. By default, all information is kept in memory.

Available stores:
- **leveldb** - stores all data in files. Recommended for big repositories. Requires installing `@jscpd/leveldb-store` first;

> Note: You can develop a custom store; see the [@jscpd/finder](../finder) package, and [@jscpd/leveldb-store](../leveldb-store) as an example.
268 |
269 | - Cli options: `--store`
270 | - Type: **string**
271 | - Default: **null**
272 |
273 | ### Ignore Pattern
274 | Ignore code blocks matching the regexp patterns.
275 |
276 | - Cli options: `--ignore-pattern`
277 | - Type: **string**
278 | - Default: **null**
279 |
280 | Example:
281 | ```
282 | $ jscpd /path/to/source --ignore-pattern "import.*from\s*'.*'"
283 | ```
284 | Excludes import statements from the calculation.
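
To see which lines the example pattern would match, it can be tried as a plain regular expression. The sketch below only illustrates the regexp itself; how jscpd applies the pattern internally is not shown here, and the sample lines are made up for the example.

```typescript
// The pattern from the example above, written as a JavaScript RegExp.
const importPattern = new RegExp("import.*from\\s*'.*'");

// Made-up sample lines for illustration.
const sampleLines = [
  "import React from 'react';",
  'import {User} from "./models";',
  "const answer = 42;",
];

// true where the line matches the ignore pattern.
const matchFlags = sampleLines.map((line) => importPattern.test(line));
// matchFlags is [true, false, false]
```

Note that the second line is not matched because it uses double quotes; a pattern meant to cover both quoting styles would need to account for that.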
285 |
286 | ## Config File
287 |
Put a `.jscpd.json` file in the root of the project:
289 | ```json
{
  "threshold": 0,
  "reporters": ["html", "console", "badge"],
  "ignore": ["**/__snapshots__/**"],
  "absolute": true
}
296 | ```
297 |
You can also use a `jscpd` section in `package.json`:
299 |
300 | ```json
{
  ...
  "jscpd": {
    "threshold": 0.1,
    "reporters": ["html", "console", "badge"],
    "ignore": ["**/__snapshots__/**"],
    "absolute": true,
    "gitignore": true
  }
  ...
}
314 | ```
315 |
316 | ### Exit code
317 |
By default, the tool exits with code 0 even if code duplications were
detected. This behaviour can be changed by specifying a custom exit
code for error states.
321 |
322 | Example:
323 | ```bash
324 | jscpd --exitCode 1 .
325 | ```
326 |
327 | - Cli options: `--exitCode`
328 | - Type: **number**
329 | - Default: **0**
330 |
331 |
332 | ## Ignored Blocks
333 |
334 | Mark blocks in code as ignored:
335 | ```javascript
336 | /* jscpd:ignore-start */
337 | import lodash from 'lodash';
338 | import React from 'react';
339 | import {User} from './models';
340 | import {UserService} from './services';
341 | /* jscpd:ignore-end */
342 | ```
343 |
344 | ```html
345 | <!--
346 | // jscpd:ignore-start
347 | -->
348 | <meta data-react-helmet="true" name="theme-color" content="#cb3837"/>
349 | <link data-react-helmet="true" rel="stylesheet" href="https://static.npmjs.com/103af5b8a2b3c971cba419755f3a67bc.css"/>
350 | <link data-react-helmet="true" rel="stylesheet" href="https://static.npmjs.com/cms/flatpages.css"/>
351 | <link data-react-helmet="true" rel="apple-touch-icon" sizes="120x120" href="https://static.npmjs.com/58a19602036db1daee0d7863c94673a4.png"/>
352 | <link data-react-helmet="true" rel="apple-touch-icon" sizes="144x144" href="https://static.npmjs.com/7a7ffabbd910fc60161bc04f2cee4160.png"/>
353 | <link data-react-helmet="true" rel="apple-touch-icon" sizes="152x152" href="https://static.npmjs.com/34110fd7686e2c90a487ca98e7336e99.png"/>
354 | <link data-react-helmet="true" rel="apple-touch-icon" sizes="180x180" href="https://static.npmjs.com/3dc95981de4241b35cd55fe126ab6b2c.png"/>
355 | <link data-react-helmet="true" rel="icon" type="image/png" href="https://static.npmjs.com/b0f1a8318363185cc2ea6a40ac23eeb2.png" sizes="32x32"/>
356 | <!--
357 | // jscpd:ignore-end
358 | -->
359 | ```
360 |
361 | ## Reporters
362 |
363 | ### HTML
364 |
365 | [Demo report](http://kucherenko.github.io/jscpd-report.html)
366 | ### Badge
367 | ![jscpd](../../assets/jscpd-badge.svg)
368 |
More info: [jscpd-badge-reporter](https://github.com/kucherenko/jscpd-badge-reporter)
370 | ### PMD CPD XML
371 | ```xml
<?xml version="1.0" encoding="utf-8"?>
<pmd-cpd>
  <duplication lines="10">
    <file path="/path/to/file" line="1">
      <codefragment><![CDATA[ ...first code fragment... ]]></codefragment>
    </file>
    <file path="/path/to/file" line="5">
      <codefragment><![CDATA[ ...second code fragment... ]]></codefragment>
    </file>
    <codefragment><![CDATA[ ...duplicated fragment... ]]></codefragment>
  </duplication>
</pmd-cpd>
384 | ```
385 | ### JSON reporters
386 | ```json
{
  "duplicates": [{
    "format": "javascript",
    "lines": 27,
    "fragment": "...code fragment... ",
    "tokens": 0,
    "firstFile": {
      "name": "tests/fixtures/javascript/file2.js",
      "start": 1,
      "end": 27,
      "startLoc": {
        "line": 1,
        "column": 1
      },
      "endLoc": {
        "line": 27,
        "column": 2
      }
    },
    "secondFile": {
      "name": "tests/fixtures/javascript/file1.js",
      "start": 1,
      "end": 24,
      "startLoc": {
        "line": 1,
        "column": 1
      },
      "endLoc": {
        "line": 24,
        "column": 2
      }
    }
  }],
  "statistic": {
    "detectionDate": "2018-11-09T15:32:02.397Z",
    "formats": {
      "javascript": {
        "sources": {
          "/path/to/file": {
            "lines": 24,
            "sources": 1,
            "clones": 1,
            "duplicatedLines": 26,
            "percentage": 45.33,
            "newDuplicatedLines": 0,
            "newClones": 0
          }
        },
        "total": {
          "lines": 297,
          "sources": 1,
          "clones": 1,
          "duplicatedLines": 26,
          "percentage": 45.33,
          "newDuplicatedLines": 0,
          "newClones": 0
        }
      }
    },
    "total": {
      "lines": 297,
      "sources": 6,
      "clones": 5,
      "duplicatedLines": 26,
      "percentage": 45.33,
      "newDuplicatedLines": 0,
      "newClones": 0
    }
  }
}
457 | ```
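
A report with this shape can be consumed from a script. The following sketch types only the fields shown in the sample above and turns each duplicate into a one-line summary; the interfaces, the `summarize` function, and the inline sample are assumptions made for this example, and a real report may contain more fields.

```typescript
// Partial types covering only fields visible in the sample report above.
interface FileLocation {
  name: string;
  start: number;
  end: number;
}

interface Duplicate {
  format: string;
  lines: number;
  firstFile: FileLocation;
  secondFile: FileLocation;
}

interface Report {
  duplicates: Duplicate[];
}

// Build one human-readable line per duplicate.
function summarize(report: Report): string[] {
  return report.duplicates.map((d) => {
    const first = d.firstFile.name + ":" + d.firstFile.start + "-" + d.firstFile.end;
    const second = d.secondFile.name + ":" + d.secondFile.start + "-" + d.secondFile.end;
    return d.format + ": " + first + " ~ " + second + " (" + d.lines + " lines)";
  });
}

// Inline sample mirroring the report above; in practice this would be read from jscpd-report.json.
const sampleReport: Report = JSON.parse(`{
  "duplicates": [{
    "format": "javascript",
    "lines": 27,
    "firstFile": { "name": "tests/fixtures/javascript/file2.js", "start": 1, "end": 27 },
    "secondFile": { "name": "tests/fixtures/javascript/file1.js", "start": 1, "end": 24 }
  }]
}`);

const summaryLines = summarize(sampleReport);
```

For the sample, `summaryLines[0]` is `javascript: tests/fixtures/javascript/file2.js:1-27 ~ tests/fixtures/javascript/file1.js:1-24 (27 lines)`.
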
458 | ## API
459 |
460 |
To integrate copy/paste detection into your application, you can use the programmatic API:
462 |
463 | `jscpd` Promise API
464 | ```typescript
465 | import {IClone} from '@jscpd/core';
466 | import {jscpd} from 'jscpd';
467 |
468 | const clones: Promise<IClone[]> = jscpd(process.argv);
469 | ```
470 |
471 | `jscpd` async/await API
472 | ```typescript
473 | import {IClone} from '@jscpd/core';
474 | import {jscpd} from 'jscpd';
(async () => {
  const clones: IClone[] = await jscpd(['', '', __dirname + '/../fixtures', '-m', 'weak', '--silent']);
  console.log(clones);
})();
480 | ```
481 |
482 | `detectClones` API
483 | ```typescript
484 | import {detectClones} from "jscpd";
485 |
(async () => {
  const clones = await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
    silent: true
  });
  console.log(clones);
})()
495 | ```
496 |
`detectClones` with a store persisted between runs
498 | ```typescript
499 | import {detectClones} from "jscpd";
500 | import {IMapFrame, MemoryStore} from "@jscpd/core";
501 |
(async () => {
  const store = new MemoryStore<IMapFrame>();

  await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
  }, store);

  await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
    silent: true
  }, store);
})()
518 | ```
519 |
For deep customisation of the detection process, you can build your own tool:
- To detect clones in the file system, use [@jscpd/finder](../finder) to build a powerful detector.
- To detect clones in the browser or another non-Node.js environment, build your own solution based on [@jscpd/core](../core).
523 |
524 | ## Changelog
525 | [Changelog](CHANGELOG.md)
526 |
527 | ## Who uses jscpd
528 | - [Code-Inspector](https://www.code-inspector.com/) is a code analysis and technical debt management service.
- [Mega-Linter](https://nvuillam.github.io/mega-linter/) is a 100% open-source linter aggregator for CI (GitHub Actions and other CI tools) or for running locally.
530 | - [vscode-jscpd](https://marketplace.visualstudio.com/items?itemName=paulhoughton.vscode-jscpd) VSCode Copy/Paste detector plugin.
531 |
558 | ## License
559 |
560 | [MIT](LICENSE) © Andrey Kucherenko