Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.
LGPL-2.1 License
Published by github-actions[bot] about 2 years ago
pattern-propagators can be used to propagate taint through forEach in Java. For example:

    pattern-propagators:
      - pattern: $X.forEach(($Y) -> ...)
        from: $X
        to: $Y

(gh-5971)
Published by github-actions[bot] about 2 years ago
[...] rules: key. This change does not [...]
[...] --sca to --supply-chain. [...] --config sca to --config supply-chain (sca-ssc)
[...] new operator, that had been broken since [...] new A().foo() to match a.foo(), with a = new A(). (gh-6161)
[...] this or this.x to be a source of taint. (pa-1929)
[...] if condition or a throw (aka raise) expression/statement. (pa-1933)
Published by github-actions[bot] about 2 years ago
[...] $X match { ... }) (gh-6131)
Published by github-actions[bot] about 2 years ago
[...] focus-metavariable, which allows Semgrep to highlight the [...] metavariable-regex. (gh-5987)
[...] $...ARGS) in arguments (gh-6065)
[...] NamedTemporaryFile objects while their corresponding [...]
Published by github-actions[bot] about 2 years ago
[...] x.a and x.b separately, so that e.g. x.a can be [...] x.b is clean, hence sink(x.a) would produce [...] sink(x.b) would not. It is also possible for x to be tainted [...] x.a is clean. We expect this to have a net positive effect by reducing [...]
[...] import world.Hello, and create a new Hello.internal_class(), you can match that with new world.Hello.internal_class(). (gh-6001)
semgrep --test now fails when encountering a parsing error in target code. (gh-6068)
[...] not in operator. (gh-6072)
[...] TypeError: unbound method set.intersection() needs an argument crash [...] regex or generic). (gh-6093)
Published by github-actions[bot] about 2 years ago
[...] pattern-inside. (gh-6059)
Published by github-actions[bot] about 2 years ago
[...] case 5: ...) (pa-1788)
/.../ can now match any regexp, including regexp templates such as /hello #{name}/. (gh-5147)
[...] public Foo() { } (gh-5558)
[...] match statements (pa-1739)
Published by github-actions[bot] about 2 years ago
Previously, the following error message appeared when metrics were not uploaded within the set timeout:

    Error in send: HTTPSConnectionPool(host='metrics.semgrep.dev', port=443): Read timed out. (read timeout=3)

As this confused users running the CLI, the log level of the message has been reduced so that it appears for development and debugging purposes only. Note that metrics are still successfully uploaded, but the success status is not sent in time for the current timeout. (app-1398)
[...] "some string".concat(x). Previously, when x was tainted, the concat [...]
Published by github-actions[bot] about 2 years ago
When a YAML rule file had a string that contained an ISO timestamp, that would be parsed as a datetime object, which would then be rejected by Semgrep's rule schema validator. This is now fixed by keeping strings that contain an ISO timestamp as strings. (app-2157)
When parsing PHP with tree-sitter, parse $this similar to pfff, as an IdSpecial. This makes it possible to match $this when the pattern is parsed with pfff and the program with tree-sitter. (gh-5594)
Parse die() as exit() in tree-sitter PHP. This makes pfff and tree-sitter parse die() in the same way. (gh-5880)
All: Applied a fix so that qualified identifiers can unify with metavariables. Notably, this
affected Python decorators, among others. (pa-1700)
Fixed a regression in DeepSemgrep after the experimental taint labels feature
was introduced in 0.106.0. This prevented DeepSemgrep from reporting taint
findings when e.g. the sink was wrapped by another function. (pa-1750)
Fixed metavariable unification in JSON when one of the patterns is a single field. (pa-1763)
Changed symbolic propagation such that "redundant" matches are no
longer reported as findings. For instance:

    def foo():
        x = g(5)
        f(x)

If we are looking for the pattern g(5), we should not match on line 3,
since we will match on line 2 anyway, and this is just repeating information that
we already know.
This patch changes it so that we do not match on line 3 anymore. (pa-1772)
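The filtering idea can be pictured with a toy sketch. This is not Semgrep's implementation: Match, origin_line, and drop_redundant are hypothetical names, and the sketch assumes each match found through symbolic propagation records the line of the expression it stands for.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Match:
    line: int                          # line where the match is reported
    origin_line: Optional[int] = None  # line of the propagated expression, if any


def drop_redundant(matches: List[Match]) -> List[Match]:
    """Drop propagated matches whose origin is already matched directly."""
    direct_lines = {m.line for m in matches if m.origin_line is None}
    return [
        m for m in matches
        if m.origin_line is None or m.origin_line not in direct_lines
    ]


# Pattern g(5) against the snippet above: a direct match on line 2 and a
# propagated match on line 3, where x stands for g(5).
matches = [Match(line=2), Match(line=3, origin_line=2)]
kept = drop_redundant(matches)
```

In this model only the direct match on line 2 survives. A propagated match whose origin was not matched directly is kept, since it carries information not reported elsewhere.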
Semgrep now passes -j to the DeepSemgrep engine, so --deep is noticeably faster. (pa-1776)
taint-mode: Due to a mistake in the instantiation of a visitor, named function
definitions were being analyzed twice! This is now fixed and you may observe
significant speed ups in some cases. (pa-1778)
Extract mode: fixed a possible exception in normal usage introduced due to
changes in handling of search/taint rules. (pa-1786)
Changed the fail-open message body (pm-194)
[...] macos-12 is unreliable and has begun failing without [...] macos-11, [...]
Published by github-actions[bot] about 2 years ago
Published by github-actions[bot] about 2 years ago
semgrep ci now defaults to fail open and will always exit with exit code 0, which is equivalent to passing --suppress-errors. [...] --no-suppress-errors and semgrep will behave as it did previously, surfacing any exit codes that may result. (app-1951)
[...] --dataflow-traces) should no longer report "strange" [...] foo was a tainted record and the code accessed some of its fields as in foo.bar.baz. This was related to the use of auxiliary variables in the Dataflow IL. [...] . operator. Now we do not include these variables in the taint trace. (pa-1672)
macos-10.15 is deprecated and will be unsupported by 30AUG2022. We've tested and can upgrade to macos-12 to avoid issues with brownouts or end of support. (devop-586)
Published by github-actions[bot] about 2 years ago
Fixed issue when scan fails due to pending changes in submodule. (cli-272)
Semgrep CI now accepts more formats of git url for metadata provided to semgrep.dev and lets the user provide a fallback for repo name (SEMGREP_REPO_NAME) and repo url (SEMGREP_REPO_URL) if they are undefined by CI. (cli-280)
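As a rough illustration of what accepting several git URL formats with an environment fallback can look like, here is a minimal sketch. It is not Semgrep's code: repo_name_from_url and resolve_repo_name are hypothetical helpers, the two URL shapes handled are assumptions, and only the SEMGREP_REPO_NAME variable name comes from the note above.

```python
import os
import re
from typing import Mapping, Optional

# Assumed URL shapes: scheme://[user@]host/owner/repo[.git] and user@host:owner/repo[.git]
_URL_PATTERNS = [
    re.compile(r"^(?:https?|ssh|git)://(?:[^@/]+@)?[^/]+/(?P<name>.+?)(?:\.git)?/?$"),
    re.compile(r"^[^@/]+@[^:/]+:(?P<name>.+?)(?:\.git)?/?$"),
]


def repo_name_from_url(url: str) -> Optional[str]:
    """Return 'owner/repo' for the URL shapes above, or None if unrecognized."""
    for pattern in _URL_PATTERNS:
        found = pattern.match(url)
        if found:
            return found.group("name")
    return None


def resolve_repo_name(ci_url: Optional[str],
                      env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer the CI-provided URL; fall back to SEMGREP_REPO_NAME if it is missing."""
    name = repo_name_from_url(ci_url) if ci_url else None
    return name or env.get("SEMGREP_REPO_NAME")
```

For example, resolve_repo_name(None, {"SEMGREP_REPO_NAME": "example/project"}) returns "example/project", while a recognized CI URL takes precedence over the variable. That precedence is an assumption of the sketch.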
Fixed a crash that occurred when reporting results when join mode and taint mode were used together (gh-5839)
JS: Allowed decorators to appear in Semgrep patterns for class methods and fields. (pa-1677)
Quick fix for a regression introduced in 0.107.0 (presumably by taint labels)
that could cause some taint rules to crash Semgrep with:
Invalid_argument "output_value: abstract value (Custom)" (pa-1724)
Increase timeout for network calls to semgrep.dev from 30s to 60s (timeout-1)
Published by github-actions[bot] about 2 years ago
[...] obj. ... .bar()) (gh-5819)
[...] semgrep-core so that it can now be run with -rules on .yaml files which do not have a top-level rules: ... key. This means you can now copy paste from the playground editor directly into a .yaml file for use with semgrep-core. (implicit-rules-sc-core)
[...] --dataflow-traces flag, which directs the Semgrep CLI to explain how non-local values lead to a finding. Currently, this only applies to taint mode findings and it will trace the path from the taint source to the taint sink. (pa-1599)
[...] import patterns (gh-5219)
[...] -filter_irrelevant_rules was incorrectly skipping files when the PCRE engine threw [...]
Published by github-actions[bot] about 2 years ago
metavariable-comparison: The metavariable field is now optional, except if strip: true. When strip: false (the default), the metavariable field has no use, so it was pointless to require it. (metavariable-comparison-metavariable)
metavariable-comparison now also works on metavariables that cannot be evaluated
to simple literals. In such cases, we take the string representation of the code
bound by the metavariable. The way to access this string representation is via
str($MVAR). For example:

    - metavariable-comparison:
        metavariable: $X
        comparison: str($X) == str($Y)

Here $X and $Y may bind to two different code variables, and we check whether
these two code variables have the same name (e.g. two different variables but both
named x). (pa-1659)
When running an SCA scan with semgrep ci --sca, SCA findings will no longer be considered blocking if they are unreachable. (sca-128)
Fixed a regression in name resolution that occurred with metavariable patterns (gh-5690)
Rust: Fixed a bug with matching for scoped identifiers.
Scoped identifiers were only compared on their last identifier, so something like A::B::C would be treated as something like C. (gh-5717)
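A small sketch of the symptom, under the assumption that a scoped identifier can be modeled as its "::"-separated segments. Both functions are hypothetical illustrations and do not reflect how Semgrep represents Rust paths internally.

```python
def matches_last_only(pattern: str, target: str) -> bool:
    """Behavior described by the bug: only the final segment is compared."""
    return pattern.split("::")[-1] == target.split("::")[-1]


def matches_full_path(pattern: str, target: str) -> bool:
    """Compare every segment of the scoped identifier."""
    return pattern.split("::") == target.split("::")
```

With the last-segment comparison, the pattern A::B::C also matches an unrelated X::Y::C; comparing the whole path rejects it while still accepting A::B::C.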
Published by github-actions[bot] about 2 years ago
[...] languages value (pa-1648)
C#: Improved error message when function parameters are declared with var (gh-5068)
Scala/others: Added a fix allowing percolation of name information from class parameters.
For example, classes which take in arguments like the following in Scala:

    class ExampleClass(val x: TypeName) {
    }

did not properly enter the context. So in our analysis, we would not know that the identifier x has type TypeName within the body of ExampleClass. (gh-5506)
Fixed the logged message describing the endpoint where rules are fetched from when SEMGREP_URL is set (gh-5753)
Fixed the data used for indexing match results so that it uses match-based ID data. (index)
Published by github-actions[bot] over 2 years ago
semgrep ci will now not block builds on triage ignored issues (cli-162)
Metavariable-pattern now uses the same metavariable context as its parent. This will potentially
cause breaking changes for rules that reuse metavariables in the pattern. For example, consider
the following formula:

    - patterns:
        - pattern-either:
            - pattern-inside: $OBJ.output($RESP)
        - pattern: $RESP
        - metavariable-pattern:
            metavariable: $RESP
            pattern: `...{ $OBJ }...`

Previously, the $OBJ in the metavariable-pattern would be a new metavariable. The formula would
behave the same if that $OBJ was $A instead. Now, $OBJ will try to unify with the value bound
by $OBJ in the pattern-inside. (gh-5060)
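The difference between a fresh context and a shared one can be modeled with a few lines of Python. This is only a toy model: bindings are plain dictionaries from metavariable names to the matched source text, and unify is a hypothetical helper, not a Semgrep API.

```python
from typing import Dict, Optional

Bindings = Dict[str, str]


def unify(parent: Bindings, child: Bindings) -> Optional[Bindings]:
    """Merge child bindings into parent; return None if a shared name disagrees."""
    merged = dict(parent)
    for name, value in child.items():
        if name in merged and merged[name] != value:
            return None
        merged[name] = value
    return merged


# Binding produced by the pattern-inside, e.g. for out.output(...).
outer = {"$OBJ": "out"}
# Binding produced inside the metavariable-pattern for a different object.
inner = {"$OBJ": "other"}

old_behavior = unify({}, inner)     # fresh context: $OBJ is effectively a new metavariable
new_behavior = unify(outer, inner)  # shared context: $OBJ must agree with the parent
```

In this model the old behavior accepts the inner binding regardless of the outer one, while the new behavior rejects it unless both occurrences of $OBJ bind the same code.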
The semgrep test output used to report expected lines and reported lines, which was difficult to read and interpret. This change introduces missed lines and incorrect lines to make it easier for users to pinpoint the differences in output. (gh-5600)
Separator lines are no longer drawn between findings that have no source code snippet. (sca-ui)
Using ellipses in XML/HTML elements is now more permissive of whitespace.
Previously, in order to have an element with an ellipsis, no leading/trailing
whitespace was permitted in the element contents, i.e., <tag>...</tag> was
the only permitted form. Now, leading or trailing whitespace is ignored when
the substantive content of the element is only an ellipsis. (xml-permissive-ellipsis)
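The rule described above amounts to trimming whitespace before checking for the ellipsis. A minimal sketch, assuming the element's text content is available as a string; both function names are hypothetical and this is not the parser code Semgrep uses.

```python
def is_ellipsis_strict(element_text: str) -> bool:
    """Earlier rule: the content must be exactly '...'."""
    return element_text == "..."


def is_ellipsis_permissive(element_text: str) -> bool:
    """Newer rule: leading and trailing whitespace around '...' is ignored."""
    return element_text.strip() == "..."
```

Under the permissive check, an element written across several lines with an indented ... still counts as an ellipsis, while content such as text followed by ... does not.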
[...] os.stat instead of os.access to determine if a file is executable. (gh-5560)
[...] --debug isn't passed [...] --debug is used (pa-1618)
[...] semgrep ci e2e. (cli-253)
[...] towncrier to avoid merge conflicts in changelog on release (cli-77)
Published by github-actions[bot] over 2 years ago
[...] foo();) used to also match when [...] x = foo();). [...]

    options:
      implicit_deep_exprstmt: false

(#5472)
[...] SEMGREP_GIT_COMMAND_TIMEOUT environment variable.
Published by github-actions[bot] over 2 years ago
[...] for (...; $X <- $Y if $COND; ...) { ... } to match nested for loops. (#5650)
--verbose no longer toggles the display of timing information, use --verbose --time to display this information.
Published by github-actions[bot] over 2 years ago
semgrep ci: CI runs in GitHub Actions failed to check out the commit associated with the head branch; this is now fixed.
Published by github-actions[bot] over 2 years ago
semgrep ci: CI runs were failing to check out the PR head in GitHub Actions, which is [...]
pattern-propagators now works correctly when the from or to metavariables match a function call. For example, given sqlBuilder.append(page.getOrderBy()), we can now propagate taint from page.getOrderBy() to sqlBuilder.