semgrep

Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.

LGPL-2.1 License


semgrep - Release v1.40.0

Published by github-actions[bot] about 1 year ago

Added

  • Dot files, for example .vscode and .vimrc, are now displayed in the skip report when using --verbose and --develop.
  • Added CLI output for secrets-related findings. (#8666)
  • Rules are now skipped with an [INFO]-level error message if they can't run due to an unavailable plugin such as the Semgrep Pro Engine. This enables Semgrep to finish its scan without failing due to some unavailable plugin. The intended use is for a public rule registry to provide all kinds of rules including some that require particular plugins. (#8668)
  • Allow Semgrep CI users to run a SAST scan through the Semgrep Code product using the --code command-line option. This works the same as --supply-chain. (#8679)
  • Semgrep VSCode Extension: Semgrep Language Server now does not show findings that have been ignored through Semgrep Code.
  • taint-mode: Semgrep now tracks taint via globals or class attributes that are effectively final (as in Java), for example:
    class Test {
      private String x = source();
    
      void test() {
        sink(x); // finding here !
      }
    }
    
    Semgrep recognizes that x must be tainted because it is a private class attribute that is initialized to source(), and it is not re-defined anywhere else. This also works if x is initialized in the constructor (if there is only one constructor), or in a static block. (#8652)
  • Constant propagation: Semgrep can now identify private class attributes as constants that are assigned just once in a class constructor, for example: https://semgrep.dev/playground/s/R1re. (#8624)
  • Added the -dump_contributions flag to semgrep-core; contributions are now included when posting findings to the Scan API.
  • There is a new semgrep show command to display information about Semgrep, for example semgrep show supported-languages. The goal is to clean up semgrep scan, which is currently abused to not only scan but also display Semgrep information, for example, semgrep scan --show-supported-languages. See semgrep show --help for more information.
  • Added exception handling for the dump_contributions core command in pysemgrep.

Changed

  • Further improvements to timeouts and logging for semgrep ci (#8665)

Fixed

  • Semgrep VSCode Extension: Semgrep Language Server no longer duplicates some findings.
  • GitLab SAST output has now been updated to accommodate the new SAST schema as of GitLab 16.x, which means that findings in GitLab will now properly display descriptions of the findings. (#8657)
  • Julia: Ellipses can now properly match when used in conjunction with single statements. This is achieved by preserving a single statement as a block. (#8643) For instance, the pattern

    ...
    foo()

    can now properly match a target of

    foo()

  • Matching: Numeric capture group metavariables of the form $1, $2, and so on, no longer unify implicitly. Previously, these capture group metavariables could fail to match because the first metavariable ($1) failed to match. In the case of numeric capture group metavariables, which can occur implicitly in the rule-writing process, it is not intended that all metavariables must be combined to create a match. For this reason, Semgrep no longer requires that these unnamed metavariables are taken as a whole ("unified") to create a match. (#8644)
  • Ruby: The CFG (control-flow graph) now supports case statements in Ruby, which do not fall through.
  • Constant propagation now handles implicit number-to-string conversions in Java and JS/TS. A Java expression such as "foo" + 123 now matches the string pattern "foo123".
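
As a toy illustration of the constant propagation item above, the following Python sketch folds a constant expression using Java-style + semantics, where a string operand turns the operation into a concatenation. The tuple-based expression encoding and the fold function are hypothetical and only model the idea; they are not Semgrep's implementation.

```python
def fold(expr):
    """Fold a constant expression built from str/int literals and "+".

    An expression is either a literal or a tuple ("+", left, right).
    Mirrors Java-like semantics: if either operand is a string, the
    other operand is converted to a string and the two are concatenated.
    """
    if isinstance(expr, (str, int)):
        return expr
    op, left, right = expr
    if op != "+":
        raise ValueError(f"unsupported operator: {op}")
    lhs, rhs = fold(left), fold(right)
    if isinstance(lhs, str) or isinstance(rhs, str):
        # Implicit number-to-string conversion.
        return str(lhs) + str(rhs)
    return lhs + rhs


# "foo" + 123 folds to the string constant "foo123".
folded = fold(("+", "foo", 123))
```

With the folded value available, a string pattern such as "foo123" can be compared against the constant rather than against the unevaluated expression.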
semgrep - Release v1.39.0

Published by github-actions[bot] about 1 year ago

Added

  • Matching: Qualified names written as patterns can now match valid instances of identifiers which lie underneath a wildcard import (#8514). For instance, in Python, we could write the pattern A.B.C.x, and match the usage in the program:

    from A.B import *
    foo(C.x)

  • Ruby: Replaced old Ruby parser with the latest tree-sitter Ruby parser, meaning that there could be small edge cases of differences in how Semgrep matches Ruby programs. (#8539)

Fixed

  • Previously, requests to Semgrep servers could result in gateway timeout errors (504 errors) without attempting a retry, resulting in a failure. Now, 504 errors are automatically retried first rather than failed. (#8629)
  • The error message for skipped rules due to incompatible min-version or max-version constraints has been improved. (#8634)
  • When metavariable-type cannot be evaluated, it now defaults to "false", that is, it filters out the range. The following rule:
        patterns:
          - pattern: private int $X;
          - metavariable-type:
              metavariable: $Y
              type: int
    
    now produces no matches because $Y is not bound to anything. (#8566)
  • Julia: using and import now match separately. Previously, a pattern such as using $X would also match import statements. (#8567)
  • Semgrep VSCode Extension: Previously, findings could disappear as a user explored or highlighted them. This is now fixed. (#8633)
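
The 504 retry behavior described above (#8629) can be sketched roughly as follows. This is a minimal model, not Semgrep's networking code: the send callable, the attempt limit, and the linear backoff are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> Tuple[int, str]:
    """Call send() and retry while it reports HTTP 504 (gateway timeout).

    send is a hypothetical callable returning (status_code, body).
    The last response is returned once the attempts are exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 504:
            break
        # Wait a little longer before each further attempt.
        time.sleep(backoff_seconds * attempt)
        status, body = send()
    return status, body
```

Any status other than 504 is returned immediately, so only gateway timeouts trigger another attempt.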
semgrep - Release v1.38.3

Published by github-actions[bot] about 1 year ago

1.38.3 - 2023-09-02

No significant changes.

semgrep - Release v1.38.2

Published by github-actions[bot] about 1 year ago

1.38.2 - 2023-09-01

Fixed

  • restore access to the --text option (gh-8610)
semgrep - Release v1.38.1

Published by github-actions[bot] about 1 year ago

1.38.1 - 2023-09-01

Fixed

semgrep - Release v1.38.0

Published by github-actions[bot] about 1 year ago

1.38.0 - 2023-08-31

Added

  • The CLI now returns the commit timestamp when using semgrep ci (cli-timestamp)
  • Add support for min-version and max-version fields for each rule,
    specifying a range of compatible Semgrep versions. If a rule is incompatible
    with the version of Semgrep being used, it is reported in the JSON output at
    the "info" level which doesn't cause an exit failure. (gh-8496)
  • Dependency data is now also sent to the /results endpoint of semgrep app. It is still sent to the /complete endpoint. (sc-async)
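
To make the min-version and max-version item above concrete, here is a small Python sketch of a version-range check. It assumes plain dotted numeric versions and inclusive bounds; both are assumptions for illustration, and the function names are hypothetical rather than taken from Semgrep.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "1.38.0" into (1, 38, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def rule_is_compatible(
    current: str,
    min_version: Optional[str] = None,
    max_version: Optional[str] = None,
) -> bool:
    """Return True if current lies within [min_version, max_version].

    A missing bound means "no restriction" on that side.
    """
    cur = parse_version(current)
    if min_version is not None and cur < parse_version(min_version):
        return False
    if max_version is not None and cur > parse_version(max_version):
        return False
    return True
```

A rule for which this check fails would be skipped and reported at the "info" level instead of causing the scan to fail.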

Changed

  • Adjust the count printed at the conclusion summary to match the top summary
    (only printing the count of rules actually run by semgrep and not just the number of rules received from the server). (counts)
  • The option to omit --config and to look for the presence of a .semgrep.yml
    or .semgrep/.semgrep.yml in the current directory has been removed. You now
    have to explicitly use --config. (dotsemgrep)
  • The deprecated --enable-metrics and --disable-metrics flags have finally been
    removed. Use --metrics=on or --metrics=off instead (or --metrics=auto). (enable_metrics)
  • The semgrep_main.py module has been renamed to run_scan.py and its
    invoke_semgrep() function renamed to run_scan_and_return_json().
    External tools (e.g., semgrep wrappers) that use those functions directly
    should be updated. Note that this function will soon disappear as
    part of a migration effort converting Python code to OCaml. Thus,
    those tools should instead wrap the semgrep CLI and rely on
    semgrep_output_v1.atd for a more stable official API. (internals)
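
A wrapper along the lines recommended in the last item above might look like the sketch below. It assumes semgrep is installed and on the PATH, and it reads only the results, check_id, path, and start.line fields of the JSON output. The helper names are hypothetical.

```python
import json
import subprocess
from typing import Dict, List


def build_command(config: str, target: str) -> List[str]:
    """Build a semgrep invocation that prints findings as JSON on stdout."""
    return ["semgrep", "scan", "--config", config, "--json", target]


def summarize(json_text: str) -> List[Dict[str, object]]:
    """Extract rule id, path, and start line from semgrep's JSON output."""
    report = json.loads(json_text)
    return [
        {"rule": r["check_id"], "path": r["path"], "line": r["start"]["line"]}
        for r in report.get("results", [])
    ]


def run_scan(config: str, target: str) -> List[Dict[str, object]]:
    """Run the CLI and summarize its findings.

    Raises json.JSONDecodeError if semgrep did not print valid JSON.
    """
    proc = subprocess.run(
        build_command(config, target),
        capture_output=True,
        text=True,
        check=False,
    )
    return summarize(proc.stdout)
```

Keeping command construction and output parsing in separate functions lets the parsing be tested without invoking the CLI.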

Fixed

  • Running just semgrep now displays the help message. Semgrep no longer
    tries to look for a .semgrep.yml config file or .semgrep/ in the
    current directory, which used to cause issues when running from your
    home directory, which can contain the .semgrep/settings.yml file (which
    is actually not a Semgrep rule). (gh-4457)

  • Fixed CLI output to display matches from different rules with the same message. (gh-8557)

  • Semgrep PyPI package can now be pip install-ed on aarch64 libmusl platforms (e.g. Alpine) (gh-8565)

  • Updated --max-memory help description to make it clearer and more concise. Saying "Defaults to 0 for all CLI scans." implies a different default for non-CLI scans, whereas in practice the default is 0 for all scans except when using Pro Engine, where the default is 5000. (max_memory_help)

  • Julia: Fixed a bug where let end blocks were not being parsed
    correctly, causing their contents to not strictly match while inside of
    a block.

    For instance, let ... end would not count as being inside of the let,
    and would match everything. (pa-3029)

  • Fixed bug where dependencies in pnpm-lock.yaml files (at version 6.0 or above) were not parsed. (sc-1033)

semgrep - Release v1.37.0

Published by github-actions[bot] about 1 year ago

1.37.0 - 2023-08-25

Added

  • semgrep scan is now more resilient to failures when fetching config from semgrep.dev. If it can't fetch a config from semgrep.dev, it will use backup infrastructure to fetch the most recent successful config for that customer's environment. (gh-8459)
  • C#: Added experimental NuGet ecosystem parser (gh-8484)
  • metavariable-comparison: You can now use "in" and "not in" for strings
    in the same sense as in Python, for substring checking. (pa-2979)
  • Julia: Added the deep expression operator, so now you can write patterns like
    foo(<... 42 ...>) to find instances of calls to foo that contain 42 somewhere
    inside of it. (pa-3018)
  • semgrep ci displays enabled products when scans are created and/or when the scan
    config is generated from Semgrep Cloud Platform. Additionally, if no products are
    enabled then a friendly error is raised. (scp-432)
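
Since the metavariable-comparison item above borrows the meaning of "in" and "not in" from Python, the substring semantics can be shown directly in Python. The helper below is purely illustrative and its name is hypothetical.

```python
def substring_check(needle: str, haystack: str, negate: bool = False) -> bool:
    """Python-style substring test: needle in haystack (or not in)."""
    found = needle in haystack
    return not found if negate else found


# "in" is a contiguous, case-sensitive substring test.
contains_md5 = substring_check("md5", "hashlib.md5")            # "md5" in "hashlib.md5"
lacks_sha = substring_check("sha", "hashlib.md5", negate=True)  # "sha" not in "hashlib.md5"
```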

Changed

  • The --dump-ast flag now requires the additional --experimental flag
    and no longer requires passing a --config flag.
    Example of use: semgrep --experimental --lang python --dump-ast foo.py (dumpast)
  • The 'semgrep shouldafound' command has been removed. It was not really used
    and it might be better to offer such a functionality in the IDE instead of
    in the CLI. (shouldafound)

Fixed

  • Parsing: Some parsing errors involving tree-sitter inserting fake "missing"
    nodes were previously unreported. They are now reported as errors although the
    parse tree is preserved, including the phony node inserted by tree-sitter.
    This should not result in different Semgrep findings. It results only in more
    reports of partial parsing. See the original issue at
    https://github.com/returntocorp/ocaml-tree-sitter-core/issues/8 for technical
    details. (gh-8190)

  • fix(extract): correctly map metavariable locations into source file (gh-8416)

  • fix(julia): correctly parse BitOr and BitAnd (gh-8449)

  • Implement missing pcre-ocaml stub (pcre_get_stringnumber_stub_bc) in JavaScript (gh-8520)

  • Julia: Fixed a bug where parenthesized expressions would sometimes
    not match in constructs like metavariable-comparison. (pa-2991)

  • Fixed a regression introduced three years ago in 0.9.0, when optimizing
    the evaluation of ... (ellipsis) to be faster. We made ... only match
    deeply (inside an if for example) if nothing matched non-deeply. As a
    result, this pattern:

    foo()
    ...
    bar($A)
    

    would produce only one match rather than two on this code:

    foo()
    if cond:
        bar(x)
    bar(y)
    

    Semgrep matched from foo() to bar(y) and because of that it did not
    try to match inside the if, thus there was no match from foo() to bar(x).
    However, if we commented out bar(y), then Semgrep did match bar(x).

    Semgrep now produces the two expected matches. (pa-2992)

  • Julia: Type information from declarations can now be used in
    metavariable-type. For instance, the program:

    x :: Int64 = 2
    

    will now allow uses of x to match to the type Int64. (pa-3001)

  • Julia: Metavariables should now be able to appear anywhere that
    identifiers can.

    For instance, they were not able to appear as the argument to a
    do block. Now, we can write patterns like:

    map($Y) do $X
      ...
    end
    (pa-3007)

  • Java: Fixed naming bug affecting Java and other OO languages that allowed a
    method parameter to shadow a class attribute, e.g. in:

    class Test {
    
        private int x;
    
        public void test2(int x) {
            foo(this.x);
        }
    
    }
    

    Semgrep was considering that this.x referred to the parameter x of test2
    rather than to the class attribute x. (pa-3010)

  • Fixed bug where packages in build.gradle files had their names incorrectly parsed without their group ID (sc-1012)
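
The ellipsis regression described above (pa-2992) can be modeled with a toy matcher for the pattern foo() ... bar($A). The statement encoding and both functions are hypothetical simplifications and do not reflect how Semgrep's engine is written. The legacy flag reproduces the old "only go deep if nothing matched shallowly" behavior.

```python
def bar_args(stmts, deep):
    """Collect the argument of every bar(...) call in stmts.

    Statements are tuples: ("foo",), ("bar", arg) or ("if", [nested]).
    With deep=True, calls nested inside "if" bodies are included.
    """
    found = []
    for stmt in stmts:
        if stmt[0] == "bar":
            found.append(stmt[1])
        elif deep and stmt[0] == "if":
            found.extend(bar_args(stmt[1], deep=True))
    return found


def match_after_foo(block, legacy):
    """Return the bindings of $A for the pattern foo() ... bar($A)."""
    for index, stmt in enumerate(block):
        if stmt[0] == "foo":
            rest = block[index + 1:]
            if legacy:
                # Old behavior: deep matching only as a fallback.
                shallow = bar_args(rest, deep=False)
                return shallow if shallow else bar_args(rest, deep=True)
            # Fixed behavior: always consider nested statements too.
            return bar_args(rest, deep=True)
    return []


program = [("foo",), ("if", [("bar", "x")]), ("bar", "y")]
old_matches = match_after_foo(program, legacy=True)   # only the shallow match
new_matches = match_after_foo(program, legacy=False)  # both matches
```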

semgrep - Release v1.36.0

Published by github-actions[bot] about 1 year ago

1.36.0 - 2023-08-14

Added

  • Added general machinery to support languages with case-insensitive identifiers and generalized PHP to use these case-insensitive identifiers.

    For example, in PHP the pattern MyClass() will now match calls with different capitalization such as myclass() and Myclass(). (gh-8356)
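
A minimal sketch of the comparison this implies, assuming identifiers are compared after lowercasing when the language is flagged as case-insensitive. The function name and the flag are hypothetical.

```python
def identifiers_match(pattern_name: str, target_name: str, case_insensitive: bool) -> bool:
    """Compare two identifiers, ignoring case only when the language does."""
    if case_insensitive:
        return pattern_name.lower() == target_name.lower()
    return pattern_name == target_name


# PHP-like behavior: MyClass matches myclass and Myclass.
php_match = identifiers_match("MyClass", "myclass", case_insensitive=True)
# Case-sensitive languages keep exact comparison.
exact_match = identifiers_match("MyClass", "myclass", case_insensitive=False)
```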

Fixed

  • Convert all '@r2c.dev' email addresses to '@semgrep.com'. (gh-8437)
  • Semgrep LSP is now compiled with TLS and should no longer crash with a "not compiled with tls" error (ls-conduit)
  • Fixed multiprocess testing crash due to new osemgrep entrypoint (pa-2963)
  • Pro: JS/TS: taint-mode: Fix bug introduced in 1.33.1 that had the side-effect of
    hurting performance of taint rules on JS/TS repos that used destructuring in
    function formal parameters. (pro-119)
semgrep - Release v1.35.0

Published by github-actions[bot] about 1 year ago

1.35.0 - 2023-08-09

Added

  • Maven Dep Tree parsing now surfaces children dependencies per package (sc-996)

Fixed

  • fix(promql): make aggregation labels not depend on order

    "sum by (..., b, a, c, ...) (X)" should match "sum by (a,b,c) (X)" (gh-8399)

semgrep - Release v1.34.1

Published by github-actions[bot] about 1 year ago

1.34.1 - 2023-07-28

Added

  • feat(eval): add "parse_promql_duration" function to convert a promql duration into milliseconds. This makes it possible to write comparisons like this:

    - metavariable-comparison:
        metavariable: $RANGE
        comparison: parse_promql_duration(str($RANGE)) > parse_promql_duration("1d")
    (gh-8381)
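
To show what converting a PromQL duration into milliseconds involves, here is a rough Python sketch. It is not the code behind parse_promql_duration: the function name promql_duration_to_ms is hypothetical, and the unit table assumes Prometheus conventions (a day is 24h, a week is 7d, a year is 365d).

```python
import re

# Milliseconds per unit, following Prometheus duration conventions.
_UNIT_MS = {
    "ms": 1,
    "s": 1_000,
    "m": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
    "w": 604_800_000,
    "y": 31_536_000_000,
}
# "ms" must be tried before "m" so that "5ms" is not read as 5 minutes.
_TOKEN = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def promql_duration_to_ms(text: str) -> int:
    """Convert a duration such as "1d" or "1h30m" into milliseconds."""
    if not text:
        raise ValueError("empty duration")
    total, pos = 0, 0
    while pos < len(text):
        token = _TOKEN.match(text, pos)
        if token is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(token.group(1)) * _UNIT_MS[token.group(2)]
        pos = token.end()
    return total
```

Once both sides are plain integers, a comparison such as "is $RANGE longer than one day" reduces to promql_duration_to_ms(range_text) > promql_duration_to_ms("1d").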
    
    
    

Fixed

  • fix(yaml): fix captures for sequences that contain mappings (gh-8388)
semgrep - Release v1.34.0

Published by github-actions[bot] about 1 year ago

1.34.0 - 2023-07-27

Added

  • Added support for naming propagation when the left-hand side (lhs) of a variable definition is an identifier pattern

    In certain languages like Rust, the variable definition is parsed as a pattern assignment, for example:

    let x: SomeType = SomeFunction();
    

    This commit ensures that the annotated type is propagated to the identifier pattern on the left-hand side (lhs) of the assignment, thus ensuring proper naming behavior. (gh-8365)

  • feat(metavar type): Metavariable type support for Julia

    Metavariable type is supported for Julia. (gh-8367)

  • New --legacy flag to force the use of the old Python implementation of
    Semgrep (also known as 'pysemgrep'). Note that by default most semgrep
    commands are still using the Python implementation (except 'semgrep
    interactive'), so in practice you don't need to add this flag, but as
    we port more commands to OCaml, the new --legacy flag might be useful
    if you find some regressions. (legacy)

  • Matching: Added the ability to use metavariables in parameters to match more
    sophisticated kinds of parameters.

    In particular, metavariables should now be able to match self parameters,
    such as in Rust.

    So fn $F($X, ...) { ... } should match fn $F(self) { }. (pa-2937)

  • taint-mode: Added experimental control: true option to pattern-sources,
    e.g.:

        pattern-sources:
          - control: true
            pattern: source(...)
    

    Such sources taint the "control flow" (or the program counter) so that it is
    possible to implement reachability queries that do not require the flow of any
    data. Thus, Semgrep reports a finding in the code below, because after source()
    the flow of control will reach sink(), even if no data is flowing between both:

    def test():
      source()
      foo()
      bar()
      #ruleid: test
      sink()
    (pa-2958)
    
  • taint-mode: Taint sanitizers will be included in matching explanations. (pa-2975)

Changed

  • Started using ATD to define the schema for data sent to the /complete endpoint of semgrep app (app-4255)
  • Targets in a .yarn/ directory are now ignored by the default .semgrepignore patterns. (dotyarn)

Fixed

  • Aliengrep mode: Fix whitespace bug preventing correct matching of parentheses. (gh-7990)
  • yaml: exclude style markers from matched token in block scalars (gh-8348)
  • Fixed stack overflow caused by symbolic propagation. (pa-2933)
  • Rust: Macro calls which involve dereferencing and reference operators
    (such as foo!(&x) and foo!(*x)) now properly transmit taint (pa-2951)
  • Semgrep no longer crashes when running --test (pa-2963)
  • Exceptions raised during parsing of manifest files no longer interrupt general parser execution, which previously prevented lockfile parsing if a manifest failed to parse. (sc-exceptions)
semgrep - Release v1.33.2

Published by github-actions[bot] about 1 year ago

1.33.2 - 2023-07-21

No significant changes.

semgrep - Release v1.33.1

Published by github-actions[bot] about 1 year ago

1.33.1 - 2023-07-21

Added

  • Rust: Added support for ellipsis patterns in attribute argument position. (e.g. #[get(...)]) (gh-8234)
  • Promql: Initial language support (gh-8281)
  • .h files will now run when C or C++ are selected as the language. (pa-123)
  • .cjs and .mjs files will now run when javascript is selected as the language. (pa-124)
  • Tainting: Parameters to functions in languages with pattern matching in function
    arguments, such as Rust and OCaml, now transmit taint when they are sources.
    This works with nested patterns too. For instance, in Rust:
    fn f ((x, (y, z)): t) {
      let x = 2;
    }
    
    tainting the sole argument to this function will result in all of the identifiers
    x, y, and z now being tainted. (pa-2919)
  • Added rule option interfile: true, so this can be set under options: as
    is the norm for rule options. This rule option shall replace setting interfile
    under metadata. Metadata is not meant to have any effect on how a rule is run. (pro-94)

Changed

  • Updated semgrep-interfaces, changed api_scans_findings to ci_scan_results, removed gitlab_token field and added ignores and renamed_paths field to ci_scan_results. (app-4252)

Fixed

  • Dockerfile language support: String matching is now done by contents, treating
    the strings foo, 'foo', or "foo" as equal. (gh-8229)

  • Fixed error where we were not filtering the logging of a new third party library. (gh-8310)

  • Julia: Fixed a bug where try-catch patterns would not match properly.
    Now, you can use an empty try-catch pattern, such as:

    try
      ...
    catch
      ...
    end
    

    to catch only Julia code which does not specify an identifier for the catch.

    Otherwise, if you want to match any kind of try-catch, you can specify an ellipsis
    for the catch identifier instead:

    try
      ...
    catch ...
      ...
    end
    

    and this will match any try-catch, including those that do not specify an
    identifier for the catch. It is strictly more general than the previous pattern. (pa-2918)

  • Rust: Fixed an issue where implicit returns did not allow taint to flow,
    and various other small translation issues that would affect taint. (pa-2936)

  • Fixed bug in gradle.lockfile parser where we would error on empty= with nothing after it (sc-987)
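
The Dockerfile fix above (gh-8229), which compares strings by contents, can be pictured with the following Python sketch. It only strips one pair of matching outer quotes and ignores escape sequences, which is a simplifying assumption. The helper names are hypothetical.

```python
def string_contents(token: str) -> str:
    """Return the token without a matching pair of outer quotes, if any."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in ("'", '"'):
        return token[1:-1]
    return token


def dockerfile_strings_equal(left: str, right: str) -> bool:
    """Treat foo, 'foo', and "foo" as the same string."""
    return string_contents(left) == string_contents(right)
```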

semgrep - Release v1.33.0

Published by github-actions[bot] over 1 year ago

1.33.0 - 2023-07-19

Added

  • Rust: Added support for ellipsis patterns in attribute argument position. (e.g. #[get(...)]) (gh-8234)
  • Promql: Initial language support (gh-8281)
  • .h files will now run when C or C++ are selected as the language. (pa-123)
  • .cjs and .mjs files will now run when javascript is selected as the language. (pa-124)
  • Tainting: Parameters to functions in languages with pattern matching in function
    arguments, such as Rust and OCaml, now transmit taint when they are sources.
    This works with nested patterns too. For instance, in Rust:
    fn f ((x, (y, z)): t) {
      let x = 2;
    }
    
    tainting the sole argument to this function will result in all of the identifiers
    x, y, and z now being tainted. (pa-2919)
  • Added rule option interfile: true, so this can be set under options: as
    is the norm for rule options. This rule option shall replace setting interfile
    under metadata. Metadata is not meant to have any effect on how a rule is run. (pro-94)

Changed

  • Updated semgrep-interfaces, changed api_scans_findings to ci_scan_results, removed gitlab_token field and added ignores and renamed_paths field to ci_scan_results. (app-4252)

Fixed

  • Dockerfile language support: String matching is now done by contents, treating
    the strings foo, 'foo', or "foo" as equal. (gh-8229)

  • Fixed error where we were not filtering the logging of a new third party library. (gh-8310)

  • Julia: Fixed a bug where try-catch patterns would not match properly.
    Now, you can use an empty try-catch pattern, such as:

    try
      ...
    catch
      ...
    end
    

    to catch only Julia code which does not specify an identifier for the catch.

    Otherwise, if you want to match any kind of try-catch, you can specify an ellipsis
    for the catch identifier instead:

    try
      ...
    catch ...
      ...
    end
    

    and this will match any try-catch, including those that do not specify an
    identifier for the catch. It is strictly more general than the previous pattern. (pa-2918)

  • Fixed bug in gradle.lockfile parser where we would error on empty= with nothing after it (sc-987)

semgrep - Release v1.32.0

Published by github-actions[bot] over 1 year ago

1.32.0 - 2023-07-13

Added

  • feat(docker): Create a semgrep user for our docker container so that people can run it as a non-root user (gh-8116)

  • feat(typed metavar): Typed metavariable support for Rust

    Users can create TypedMetavar using Rust's type annotation syntax :.
    For example, the following rule works for matching HttpResponseBuilder
    type of variables:

    rules:
    - id: no-direct-response-write
      patterns:
      - pattern: '($BUILDER : HttpResponseBuilder).body(...)'
      - pattern-not: '($BUILDER : HttpResponseBuilder).body("...".to_string())'
      message: find dangerous codes
      severity: WARNING
      languages: [rust]
    (gh-8200)
    
    
    

Fixed

  • Fixed baseline scans reporting on existing findings (baseline-supply-chain)
  • Fixed an issue leading to incorrect autofix results involving JS/TS async arrow functions (e.g. async () => {}, etc.). (gh-7353)
  • Workaround for rootless containers as git operations may fail due to dubious ownership of /src (gh-8267)
semgrep - Release v1.31.2

Published by github-actions[bot] over 1 year ago

1.31.2 - 2023-07-07

No significant changes.

semgrep - Release v1.31.1

Published by github-actions[bot] over 1 year ago

1.31.1 - 2023-07-07

No significant changes.

semgrep - Release v1.31.0

Published by github-actions[bot] over 1 year ago

1.31.0 - 2023-07-07

Added

  • Make CLI hit the new endpoint for the reliable fixed status on the Semgrep app. (cod-16)

  • feat(rule syntax): Metavariable Type Extension for Semgrep Rule Syntax 2.0

    This PR introduces the changes made in Semgrep rule syntax 1.0 to version 2.0 as well.

    rule syntax 2.0

    rules:
      - id: no-string-eqeq
        message: find errors
        severity: WARNING
        languages:
          - java
        match:
          all:
            - not: null == (String $Y)
            - $X == (String $Y)

    rule syntax 2.0 after proposed change

    rules:
      - id: no-string-eqeq
        message: find errors
        severity: WARNING
        languages:
          - java
        match:
          all:
            - not: null == $Y
            - $X == $Y
          where:
            - metavariable: $Y
              type: String

    (gh-8183)

  • Rust: Added the ability to taint macro calls through its arguments, in macro calls
    with multiple arguments. (pa-2902)

  • Add severity and suggested upgrade versions to Supply Chain findings (sc-772)

  • Added support for pnpm lockfile versions >= 6.0 (sc-824)

  • (sc-866)

Fixed

  • Fixed an issue leading to incorrect autofix results involving JS/TS arrow functions (e.g. () => {}). (gh-7353)
  • Dockerfile support: single-quoted strings are now parsed without an error. (gh-7780)
  • Fixes Go issue with patterns like make(...); make(...,$X); make($A,$B). (gh-8171)
  • Fixed Rust attribute patterns to allow matching on simple attribute syntax. (pa-2903)
  • Rust: Fixed a bug where standalone metavariable patterns
    were not matching as expected (pa-2915)
  • Fixed Python Semgrep pattern parsing to also parse match statements, by chaining in the Python tree-sitter parser, and adding metavariable support to the Python tree-sitter parser. (pa-6442)
  • poetry.lock parsing: handle empty toml tables, quoted table keys, and arbitrarily placed comments (sc-834)
semgrep - Release v1.30.0

Published by github-actions[bot] over 1 year ago

1.30.0 - 2023-06-28

Added

  • feat(rule syntax): Support metavariable-type field for Kotlin, Go, Scala

    metavariable-type field is now supported for Kotlin, Go and Scala. (gh-8147)

  • feat(rule syntax): Support metavariable-type field for csharp, typescript, php, rust

    metavariable-type field is now supported for csharp, typescript, php, rust. (gh-8164)

  • Pattern syntax: You may now introduce metavariables from parts of regular
    expressions using pattern-regex, by using regular expression with
    named capturing groups (see https://www.regular-expressions.info/named.html)

    Now, such capture group metavariables must be explicitly named.
    So for instance, the pattern:

    pattern-regex: "foo-(?P<X>.*)"
    

    binds what is matched by the capture group to the metavariable $X,
    which can be used as normal.

    pattern-regex patterns with capture groups, such
    as

    pattern-regex: "(.*)"
    

    will still introduce metavariables of the form $1, $2, etc, but this
    should be considered deprecated behavior, and that functionality will be
    taken away in a future release. Named capturing groups should be primarily
    used, instead. (pa-2765)

  • Rule syntax: Errors during rule parsing are now better. For instance,
    parsing will now complain if you miss a hyphen in a list of patterns,
    or if you try to give a string to patterns or pattern-either. (pa-2877)

  • JS/TS: Now, patterns of records with ellipses, like:

    { $X: ... }
    

    properly match to records of anonymous functions, like:

    {
      func: () => { return 1; }
    }
    (pa-2878)
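
The pattern-regex item above (pa-2765) binds named capture groups to metavariables. Python's re module accepts the same (?P<X>...) syntax, so the binding step can be sketched as below. The bind_metavariables helper is hypothetical, and Semgrep itself uses a PCRE-based engine rather than Python's re.

```python
import re
from typing import Dict, Optional


def bind_metavariables(regex: str, text: str) -> Optional[Dict[str, Optional[str]]]:
    """Map capture groups of the first match to metavariable names.

    Named groups (?P<X>...) become $X. Unnamed groups become $1, $2, ...
    which models the numbered behavior the release note calls deprecated.
    Returns None when the regex does not match.
    """
    match = re.search(regex, text)
    if match is None:
        return None
    bindings = {f"${name}": value for name, value in match.groupdict().items()}
    named_indexes = set(match.re.groupindex.values())
    for index, value in enumerate(match.groups(), start=1):
        if index not in named_indexes:
            bindings[f"${index}"] = value
    return bindings


named = bind_metavariables(r"foo-(?P<X>.*)", "foo-bar")
numbered = bind_metavariables(r"(.*)", "abc")
```

Here named holds a binding for $X and numbered holds one for $1, which is why named groups make rules easier to read and keep stable.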
    
    
    

Changed

  • engine: Removed matching cache optimization which had been previously disabled by
    default in 1.22.0 (we got no reports of any performance regression during this time). (cleanup-1)

Fixed

  • Language server no longer crashes when a user is logged in and opens a non git repo folder (pa-2886)
  • It is no longer required to have semgrep (and pysemgrep) in the PATH. (pa-2895)
semgrep - Release v1.29.0

Published by github-actions[bot] over 1 year ago

1.29.0 - 2023-06-26

Added

  • feat(rule syntax): Metavariable Type Extension for Semgrep Rule Syntax

    We've added a dedicated field for annotating the type information of
    metavariables. By adopting this approach, instead of relying solely on
    language-specific casting syntax, we provide an additional way to enhance
    the overall usability by eliminating the need to write redundant type cast
    expressions for a single metavariable.

    Moreover, the new syntax brings other benefits, including improved support for
    target languages that lack built-in casting syntax. It also promotes a unified
    approach to expressing type, pattern, and regex constraints for metavariables,
    resulting in improved consistency across rule definitions.

    Current syntax:

    rules:
      - id: no-string-eqeq
        severity: WARNING
        message: find errors
        languages:
          - java
        patterns:
          - pattern-not: null == (String $Y)
          - pattern: $X == (String $Y)
    

    Added syntax:

    rules:
      - id: no-string-eqeq
        severity: WARNING
        message: find errors
        languages:
          - java
        patterns:
          - pattern-not: null == $Y
          - pattern: $X == $Y
          - metavariable-type:
              metavariable: $Y
              type: String
    (gh-8119)
    
  • feat(rule syntax): Support metavariable-type field for Python

    metavariable-type field is now supported for Python too. (gh-8126)

  • New --experimental flag to switch to a new implementation of Semgrep entirely
    written in OCaml with faster startup time, incremental display of matches,
    AST and registry caching, a new interactive mode and more. Not all
    features of the legacy Python Semgrep have been ported though. (osemgrep)

  • Matching: Writing a pattern which is a sequence of statements, such as

    foo();
    ...
    bar();
    

    now allows matching to sequences of statements within objects, classes,
    and related language constructs, in all languages. (pa-2754)

Changed

  • taint-mode: Several improvements to taint_assume_safe_{booleans,numbers} options.
    Most notably, we will now use type info provided by explicit type casts, and we will
    also use const-prop info to infer types. (pa-2777)

Fixed

  • Added support for post-PEP 614 decorators; Semgrep now accepts decorators of
    the form @ named_expr_test NEWLINE. For example, the pattern
    lambda $X:$X($X)
    now matches twice in the following:
    #match 1
    @omega := lambda ha:ha(ha)
    def func():
      return None
    
    #match 2
    @omega[lambda a:a(a)].a.b.c.f("wahoo")
    def fun():
      return None
    (gh-4946)
    
  • Fixed a typing issue with Go, where Semgrep with the pattern
    ($VAR : *tau.rho).$F() wouldn't produce a match in the
    following:
    func f() {
      i_1 := &tau.rho{}
      i_2 := new(tau.rho)
    
      i_1.shift() //miss one
      i_2.left()  //miss two
    
      return 101
    }
    
    but now we don't miss those two findings! (gh-6733)
  • Constant propagation is now applied to stack array declarations in C; so
    a pattern $TYPE $NAME[101]; will now produce two matches in the following snippet:
    int main() {
    
      int bad_len = 101;
      /* match 1 */
      int arr1[101];
      /* match 2 */
      int arr2[bad_len];
      return 0;
    }
    (gh-8037)
    
  • Solidity: allow metavariables for version, as in pragma solidity >= $VER; (gh-8104)
  • Added support for parsing patterns of the form
    #[Attr1]
    #[Attr2]
    
    In code such as
    #[Attr1]
    #[Attr2]
    function test ()
    {
        echo "Test";
    }
    
    Previously, to match against multiple attributes it was required to write
    #[Attr1, Attr2]
    (pa-7398)