semgrep

Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.

LGPL-2.1 License

semgrep - Release v0.50.0

Published by github-actions[bot] over 3 years ago

Added

  • JS/TS: Infer global constants even if the const qualifier is missing (#2978)
  • PHP: Resolve names and infer global constants in the same way as for Python

Fixed

  • Empty yaml files do not crash
  • Autofix does not insert newline characters for patterns from semgrep.live (#3045)
  • Autofix printout is grouped with its own finding rather than the one below it (#3046)
  • Do not assign constant values to assigned variables (#2805)
  • A --time flag replaces --json-time. It shows a summary of the
    timing information when invoked with normal output and adds a time field
    to the JSON output when --json is also present

Changed

  • .git/ directories are ignored when scanning
  • External Python API (semgrep_main.invoke_semgrep) now takes an
    optional OutputSettings argument for controlling output
  • OutputSettings.json_time has moved to OutputSettings.output_time;
    this and many other OutputSettings arguments have been made optional
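
The .git/ exclusion above can be pictured with a small directory walker. This is an illustrative sketch only, not Semgrep's implementation; the function name iter_scan_targets is hypothetical.

```python
import os
from pathlib import Path
from typing import Iterator


def iter_scan_targets(root: Path) -> Iterator[Path]:
    """Yield files under root without ever descending into .git/ directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to enter the pruned entries.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in sorted(filenames):
            yield Path(dirpath) / name
```

Pruning the directory list, rather than filtering results afterwards, avoids reading the repository's object store at all.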

Removed

  • --debugging-json flag in favor of --json + --debug
  • --json-time flag in favor of --json + --time

semgrep - Release v0.49.0

Published by github-actions[bot] over 3 years ago

Added

  • Support for matching multiple arguments with a metavariable (#3009)
    This is done with a 'spread metavariable' operator that looks like
    $...ARGS. This used to be available only for JS/TS and is now available
    for the other languages (Python, Java, Go, C, Ruby, PHP, and OCaml).
  • A new --optimizations [STR] command-line flag to turn on/off some
    optimizations. Use 'none' to turn off everything and 'all' to turn on
    everything.
    Just using --optimizations is equivalent to --optimizations all, and
    not using --optimizations is equivalent to --optimizations none.
  • JS/TS: Support '...' inside JSX text to match any text, as in
    <a href="foo">...</a> (#2963)
  • JS/TS: Support metavariables for JSX attribute values, as in
    <a href=$X>some text</a> (#2964)
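
The defaulting rule for --optimizations described above (bare flag means 'all', absent flag means 'none') maps onto an optional-value argument. The following is a minimal sketch using Python's argparse, assuming nothing about how Semgrep itself wires the flag; build_parser and the program name are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--optimizations",
        nargs="?",       # the value after the flag is optional
        const="all",     # used when the flag is given with no value
        default="none",  # used when the flag is not given at all
    )
    return parser


parser = build_parser()
print(parser.parse_args([]).optimizations)                           # none
print(parser.parse_args(["--optimizations"]).optimizations)          # all
print(parser.parse_args(["--optimizations", "none"]).optimizations)  # none
```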

Fixed

  • Python: correctly parse f-strings with multiple colons
  • Ruby: better matching for interpolated strings (#2826 and #2949)
  • Ruby: correctly matching numbers

Changed

  • Add required executionSuccessful attribute to SARIF output (#2983)
    Thanks to Simon Engledew
  • Remove jsx and tsx from languages, just use javascript or typescript (#3000)
  • Limit the maximum number of characters in an output line (#2958) and add
    a flag to control the maximum (defaults to 160).
    Thanks to Ankush Menat

semgrep - Release v0.48.0

Published by github-actions[bot] over 3 years ago

Added

  • Taint mode: Basic cross-function analysis (#2913)
  • Support for the new Java Record extension and Java symbols with accented characters (#2704)

Fixed

  • Capturing functions when used as both expressions and statements in JS (#1007)
  • Literal for ocaml tree sitter (#2885)

Changed

  • The extra lines data is now consistent across scan types
    (e.g. semgrep-core, spacegrep, pattern-regex)

semgrep - Release v0.47.0

Published by github-actions[bot] over 3 years ago

Added

  • Java: support for the for(...) pattern
  • Rust: Semgrep patterns now support top-level statements (#2910)
  • support for UTF-8 code with non-ASCII characters (#2944)

Fixed

  • JSON: fixed single-field patterns, so that a $FLD: { ... } pattern is allowed
  • Config detection in files with many suffix delimiters, like this.that.check.yaml.
    More concretely: configs end with .yaml, YAML language tests end with .test.yaml,
    and everything else is handled by its respective language extension (e.g. .py).
  • Single array field in yaml in a pattern is parsed as a field, not a one element array
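
The suffix rules listed above for config detection can be written as a short classifier. This is a sketch of the stated rules only, not Semgrep's code; the function name classify and the returned labels are hypothetical.

```python
from pathlib import Path


def classify(path: str) -> str:
    """Classify a file by its suffix, following the rules in the release note."""
    name = Path(path).name
    # Check the longer suffix first, since .test.yaml also ends with .yaml.
    if name.endswith(".test.yaml"):
        return "yaml-test"
    if name.endswith(".yaml"):
        return "config"
    # Everything else is handled by its language extension, e.g. ".py".
    suffix = Path(name).suffix
    return f"target:{suffix}" if suffix else "target:unknown"


print(classify("this.that.check.yaml"))  # config
print(classify("rule.test.yaml"))        # yaml-test
print(classify("rule.py"))               # target:.py
```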

semgrep - Release v0.46.0

Published by github-actions[bot] over 3 years ago

Added

  • YAML language support to --test

Fixed

  • SARIF output now nests invocations inside runs.
  • Go: backslashed carets in regexes can be parsed
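
For readers unfamiliar with the SARIF layout, the nesting mentioned above looks roughly like this: invocations is a property of each run, not of the top-level log. The sketch below builds a minimal SARIF 2.1.0 document by hand; the helper name minimal_sarif is hypothetical and the output is far smaller than what Semgrep emits.

```python
import json
from typing import Any, Dict


def minimal_sarif(tool_name: str, execution_successful: bool) -> Dict[str, Any]:
    """Build a bare-bones SARIF log with one run and one invocation."""
    return {
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name}},
                # Invocations live inside the run they describe.
                "invocations": [{"executionSuccessful": execution_successful}],
                "results": [],
            }
        ],
    }


print(json.dumps(minimal_sarif("semgrep", True), indent=2))
```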

Changed

  • Deep expression matches (<... foo ...>) now match within the bodies of anonymous
    functions (a.k.a. lambda-expressions) and arbitrary language-specific
    statements (e.g. the Golang go statement)

semgrep - Release v0.45.0

Published by github-actions[bot] over 3 years ago

Added

  • New --experimental flag for passing rules directly to semgrep-core (#2836)

Fixed

  • Ellipses in template strings don't match string literals (#2780)
  • Go: correctly parse select/switch clauses like in tree-sitter (#2847)
  • Go: correctly parse the 'for ...' header in Go patterns (#2838)

semgrep - Release v0.44.0

Published by github-actions[bot] over 3 years ago

0.44.0 - 2021-03-25

Added

  • Support for YAML! You can now write YAML patterns in rules
    to match over YAML target files (including semgrep YAML rules, inception!)
  • A new Bloom-filter-based optimisation to speed up matching (#2816)
  • Many benchmarks to cover semgrep advertised packs (#2772)
  • A new semgrep-dev docker container useful for benchmarking semgrep (#2800)
  • Titles to rule schema definitions, which can be leveraged in
    the Semgrep playground (#2703)
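
The release note does not say how the Bloom filter is applied. One plausible use, assumed here purely for illustration, is to skip code that definitely lacks an identifier the pattern requires. The sketch below is a generic Bloom filter in Python, not semgrep-core's implementation; BloomFilter and can_skip are hypothetical names.

```python
import hashlib
from typing import Iterable, Iterator


class BloomFilter:
    """Set-membership sketch: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def can_skip(code_identifiers: BloomFilter, required: Iterable[str]) -> bool:
    """True when at least one required identifier is definitely absent."""
    return not all(code_identifiers.might_contain(name) for name in required)


seen = BloomFilter()
for ident in ["os", "system", "cmd"]:
    seen.add(ident)
print(can_skip(seen, ["os", "system"]))  # False: both were added, so no skip
```

Because a Bloom filter never reports a false negative, skipping on "definitely absent" cannot lose a match; a false positive only costs an unnecessary full match attempt.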

Fixed

  • Fixed taint mode and added basic test (#2786)
  • Included formatted errors in SARIF output (#2748)
  • Go: correctly handle the scope of Go's short assignment variables (#2452)
  • Go: fixed the range of matched slices (#2763)
  • PHP: correctly match the PHP superglobal $_COOKIE (#2820)
  • PHP: allow ellipsis inside array ranges (#2819)
  • JSX/TSX: fixed the range of matched JSX elements (#2685)
  • Javascript: allow ellipsis in arrow body (#2802)
  • Generic: correctly match the same metavariable when used in different
    generic patterns

Fixed in semgrep-core only

These features are not yet available via the semgrep CLI,
but have been fixed in the internal semgrep-core binary.

  • Fixed all regressions on semgrep-rules when using -fast
  • Handle pattern-not: and pattern-not-inside: as in semgrep
  • Handle pattern: and pattern-inside: as in semgrep (#2777)

semgrep - Release v0.43.0

Published by github-actions[bot] over 3 years ago

0.43.0 - 2021-03-16

Added

  • Official Python 3.9 support
  • Support for generating patterns that will match multiple given code targets
  • Gitignore for compiled binaries

Fixed

  • Parsing enum class patterns (#2715)
  • Ocaml test metavar_equality_var (#2755)

Changed

  • Pfff java parser and tree-sitter-java parser are now more similar
  • Octal numbers parsed correctly in tree-sitter parsers

semgrep - Release v0.42.0

Published by github-actions[bot] over 3 years ago

0.42.0 - 2021-03-09

Added

  • Added propagation of metavariables to clauses nested under patterns:. Fixes #2548.
  • --json-time flag which reports runtimes for (rule, target file)
  • --vim flag for Syntastic
  • PHP - Support for partial if statements
  • CSharp - Many improvements to parsing

Fixed

  • Rust can be invoked with rs or rust as a language

Changed

  • The timeout for downloading config files from a URL was extended from 10s to 20s

semgrep - Release v0.41.1

Published by github-actions[bot] over 3 years ago

Fixed

  • Statically link pcre in semgrep-core for MacOS releases

semgrep - Release v0.41.0

Published by github-actions[bot] over 3 years ago

Added

  • Added basic typed metavariables for javascript and typescript (#2588)
  • Ability to match integers or floats by values
    e.g., the pattern '8' will now match code like 'x = 0x8'
  • Start converting the tree-sitter CST of R to the generic AST
    Thanks to Ross Nanopoulos!
  • Allow 'nosem' in HTML. (#2574)
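
Matching numbers by value, as in the '8' versus '0x8' example above, amounts to comparing parsed literals instead of their spelling. The sketch below uses Python's own literal rules as a stand-in for each target language's rules. The release note does not say whether an integer pattern should match a float literal, so this sketch keeps the two kinds apart; parse_number and same_value are hypothetical names.

```python
from typing import Optional, Union

Number = Union[int, float]


def parse_number(text: str) -> Optional[Number]:
    """Parse an integer or float literal, or return None if it is neither."""
    try:
        return int(text, 0)  # base 0 accepts the 0x, 0o and 0b prefixes
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None


def same_value(pattern: str, code: str) -> bool:
    """Compare two literals by numeric value rather than by their text."""
    p, c = parse_number(pattern), parse_number(code)
    if p is None or c is None:
        return False
    return type(p) is type(c) and p == c


print(same_value("8", "0x8"))     # True
print(same_value("1.50", "1.5"))  # True
print(same_value("8", "9"))       # False
```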

Added in semgrep-core only

These features are not yet available via the semgrep CLI,
but have been added to the internal semgrep-core binary.

  • ability to process a whole rule in semgrep-core; this will allow
    whole-rule optimisations and avoid some fork and communication with the
    semgrep Python wrapper
  • handling the none (regexp) and generic (spacegrep) patterns in a rule
  • handling the metavariable-regexp, metavariable-comparison
  • correctly handle boolean formula using inclusion checks on metavariables
  • new semgrep-core -test_rules action to test rules; it reports only
    28/2800 mismatches on the semgrep-rules repository

Changed

  • update C# to latest tree-sitter-csharp
    Thanks to Sjord for the huge work adapting to the new C# grammar
  • Improve --generate-config capabilities (#2562)
  • optimise the matching of blocks with ellipsis (#2618)
    e.g., the pattern 'function(...) { ... }' will now be more efficient
  • Change pattern-not-regex to filter when regex overlaps with a match (#2572)

Fixed

  • remove cycle in named AST for Rust 'fn foo(self)' (#2584)
    and also TypeScript, which could cause semgrep to use gigabytes of memory
  • fix missing token location on Go type assertion (#2577)

semgrep - Release v0.40.0

Published by github-actions[bot] over 3 years ago

0.40.0 - 2021-02-17

Added

  • Documentation for contributing new languages.
  • New language Kotlin with experimental support.
  • Work on caching improvements for semgrep-core.
  • Work on bloom filters for matching performance improvement.

Changed

  • Typescript grammar upgraded.
  • Ruby parser updated from the latest tree-sitter-ruby.
  • New Semgrep logo!
  • metavariable_regex now supported with PCRE.
  • Rust macros now parsed. Thanks Ruin0x11!

Fixed

  • Constant propagation support covers := short assignment in Go. (#2440)
  • Functions now match against functions inside classes for PHP. (#2470)
  • Import statements for CommonJS Typescript modules now supported. (#2234)
  • Ellipses behave consistently in nested statements for PHP. (#2453)
  • Go Autofix does not drop closing parenthesis. (#2316)
  • Helpful errors added for Windows installation. (#2533)
  • Helpful suggestions provided on output encoding error. (#2514)
  • Import metavariables now bind to the entire Java path. (#2502)
  • Semgrep matches the short name for a type in Java. (#2400)
  • Interface types explicitly handled in Go patterns. (#2376)
  • TooManyMatches error generated instead of Timeout error when appropriate. (#2411)

semgrep - Release v0.39.1

Published by github-actions[bot] over 3 years ago

0.39.1 - 2021-01-26

No new changes in this version. This is a re-release of 0.39.0 due to an error in the release process.

0.39.0 - 2021-01-26

Added

  • Typed metavariables in C. Patterns like $X == $Y can now match specific types like so: (char *$X) == $Y. (#2431)

Added in semgrep-core only

These features are not yet available via the semgrep CLI, but have been added to the internal semgrep-core binary.

  • semgrep-core supports rules in JSON and Jsonnet format. (#2428)
  • semgrep-core supports a new nested format for combining patterns into a boolean query. (#2430)

Changed

  • When an unknown language is set on a rule, the error message now lists all supported languages. (#2448)
  • When semgrep is executed without a config specified, the error message now includes some suggestions on how to pick a config. (#2449)
  • -c is the new shorthand for --config in the CLI. -f is kept as an alias for backward-compatibility. (#2447)

Fixed

  • Disable timeouts if timeout setting is 0 (#2423).
  • Typed metavariables in go match literal strings (#2401).
  • Fix bug that caused m_compatible_type to only bind the type (#2441).

semgrep - Release v0.38.0

Published by github-actions[bot] over 3 years ago

Added

  • Added a new language: Rust. Support for basic semgrep patterns (#2391)
    thanks to Ruin0x11!
  • Added a new language: R. Just parsing for now (#2407)
    thanks to Ross Nanopoulos!
  • Parse more Rust constructs: Traits, type constraints (#2393, #2413)
    thanks to Ruin0x11!
  • Parse more C# constructs: Linq queries, type parameter constraints (#2378, #2408)
    thanks to Sjord!
  • new experimental semgrep rule (meta)linter (#2420) with semgrep-core -check_rules

Changed

  • new control-flow-sensitive intraprocedural dataflow-based constant propagation
    (#2386)

Fixed

  • matching correctly Ruby functions with rescue block (#2390)
  • semgrep crashing on permission error on a file (#2394)
  • metavariable interpolation for pattern-inside (#2361)
  • managing Lua assignment correctly (#2406) thanks to Ruin0x11!
  • correctly parse metavariables in PHP, and ellipsis in fields (#2419)

semgrep - Release v0.37.0

Published by github-actions[bot] almost 4 years ago

Added

  • pattern-not-regex added so findings can be filtered using regular expression (#2364)
  • Lua support for basic semgrep patterns (#2337, #2312)
  • C# support for basic semgrep patterns (#2336)
  • Parse event access, conditional access, async-await in C# (#2314, #2329, #2358)

Changed

  • Java and Javascript method chaining requires extra "." when using ellipsis (#2354)

Fixed

  • Semgrep crashing due to missing token information in AST (#2380)

semgrep - Release v0.36.0

Published by github-actions[bot] almost 4 years ago

0.36.0 - 2021-01-05

Added

  • Typed metavariables can now match field access when we can propagate
    the type of a field
  • Constant propagation for Java final fields (using this.field syntax)

Changed

  • Packaging and setup.py functionality (.whl and pip install unchanged):
    SEMGREP_SKIP_BIN, SEMGREP_CORE_BIN, and SPACEGREP_BIN now available

Fixed

  • correctly match the same metavariable for a field when used at a definition
    site and use site for Java
  • add classname attribute to junit.xml report

semgrep - Release v0.35.0

Published by github-actions[bot] almost 4 years ago

0.35.0 - 2020-12-16

Added

  • Support for ... in chains of method calls in JS, e.g. $O.foo() ... .bar()
  • Official Ruby GA support

Fixed

  • Separate out test and pattern files with --test (#1796)

semgrep - Release v0.34.0

Published by github-actions[bot] almost 4 years ago

0.34.0 - 2020-12-09

Added

  • Experimental support for matching multiple arguments in JS/TS. This is done with a 'spread metavariable' operator, that looks like $...ARGS.
  • Support for using ... inside a Golang switch statement.
  • Support for matching only the try, the catch, or the finally part of a try { } catch (e) { } finally { } construct in JS/TS.
  • Support for matching only the if () part of an if () { } construct in Java
  • Support for metavariables inside dictionary keys in Ruby. This looks like {..., $KEY: $VAL, ...}.
  • An experimental --json-stats flag. The stats output contains the number of files and lines of code scanned, broken down by language. It also contains profiling data broken down by rule ID. Please note that as this is an experimental flag, the output format is subject to change in later releases.
  • Regex-only rules can now use regex as their language. The previously used language none will keep working as well.

Changed

  • Matches are now truncated to 10 lines in Semgrep's output. This was done to avoid filling the screen with output when a rule captures a whole class or function. If you'd like to adjust this behavior, you can set the new --max-lines-per-finding option.
  • Fans of explicit & verbose code can now ignore findings with a // nosemgrep comment instead of the original // nosem. The two keywords have identical behavior.
  • Generic pattern matching is now 10-20% faster on large codebases.
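
The 10-line truncation described above can be sketched as follows. This is an illustration, not Semgrep's output code: the function name, the wording of the trailing marker, and the treatment of a non-positive limit as "no limit" are all assumptions.

```python
from typing import List

DEFAULT_MAX_LINES = 10  # the default stated in the release note


def truncate_finding(lines: List[str], max_lines: int = DEFAULT_MAX_LINES) -> List[str]:
    """Keep at most max_lines lines of a match and note how many were dropped."""
    # Assumption: a non-positive limit disables truncation.
    if max_lines <= 0 or len(lines) <= max_lines:
        return list(lines)
    hidden = len(lines) - max_lines
    return lines[:max_lines] + [f"... ({hidden} more lines)"]


match = [f"line {i}" for i in range(1, 26)]
shown = truncate_finding(match)
print(len(shown))  # 11: ten lines of code plus the marker
print(shown[-1])   # ... (15 more lines)
```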

Fixed

  • Semgrep would crash when tens of thousands of matches were found for the same rule in one file. A new internally used semgrep-core flag named -max_match_per_file prevents these crashes by forcing a 'timeout' state when 10,000 matches are reached. Semgrep can then gracefully report what combination of rules and paths causes too much work.
  • semgrep --debug works again, and now outputs even more debugging information from semgrep-core. The new debugging output is especially helpful to discover which rules have too many matches.
  • A pattern that looks like $X & $Y will now correctly match bitwise AND operations in Ruby.
  • Metavariables can now capture the name of a class and match its occurrences later in the class definition.
  • Semgrep used to crash when a metavariable matched over text that cannot be read as UTF-8 text. Such matches will now try to recover what they can from apparently broken Unicode text.

semgrep - Release v0.33.0

Published by github-actions[bot] almost 4 years ago

0.33.0 - 2020-12-01

Added

  • Allow selecting rules based on severity with the --severity flag. Thanks @kishorbhat!

Changed

  • In generic mode, shorter matches are now always preferred over
    longer ones. This avoids matches like def bar def foo when the
    pattern is def ... foo, instead matching just def foo
  • In generic mode, leading dots must now match at the beginning of a
    block, allowing patterns like ... foo to match what comes before foo
  • Disabled link following for parity with other Linux tools (e.g. ripgrep)
  • spacegrep timeouts are now reported as timeouts instead of another error
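
The shortest-match preference for generic mode, using the def ... foo example above, can be illustrated with plain regular expressions. This is a simplified sketch: real generic-mode matching works on tokens and indentation blocks, not on regexes, and shortest_match is a hypothetical name. Note that a non-greedy regex alone is not enough, because regex search returns the leftmost match, so each possible start is tried in turn.

```python
import re
from typing import Optional


def shortest_match(text: str, start_word: str, end_word: str) -> Optional[str]:
    """Return the shortest span that starts with start_word and ends with end_word."""
    start_re = re.compile(rf"\b{re.escape(start_word)}\b")
    # Non-greedy: from a fixed start, stop at the first occurrence of end_word.
    tail_re = re.compile(rf".*?\b{re.escape(end_word)}\b", re.DOTALL)
    best: Optional[str] = None
    for start in start_re.finditer(text):
        tail = tail_re.match(text, start.end())
        if tail is None:
            continue
        candidate = text[start.start():tail.end()]
        if best is None or len(candidate) < len(best):
            best = candidate
    return best


print(shortest_match("def bar def foo", "def", "foo"))  # def foo
```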

Fixed

  • Correctly bind a metavariable in an import to the fully-qualified name. Issue
  • Fix invalid match locations on target files containing both CRLF line
    endings and UTF-8 characters (#2111)
  • Fix NoTokenLocation error when parsing Python f-strings
  • [C] Support include $X
  • [Go] Fix wrong order of imports

semgrep - Release v0.32.0

Published by github-actions[bot] almost 4 years ago

Added

  • JSON output now includes an attribute of findings named is_ignored.
    This is false under regular circumstances,
    but if you run with --disable-nosem,
    it will return true for findings
    that normally would've been excluded by a // nosem comment.

Changed

  • Added a default timeout of 30 seconds per file instead of none (#1981).