Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.
LGPL-2.1 License
Published by github-actions[bot] over 1 year ago
interfile: true when interfile
Published by github-actions[bot] over 1 year ago
--deep (now --pro), --interfile (now --pro), --interproc (now --pro-intrafile). Also removed already deprecated command install-deep-semgrep (now install-semgrep-pro). (pa-2518)
Published by github-actions[bot] over 1 year ago
metavariable-pattern now persist to outside of the metavariable-pattern (pa-2490)
--pro will now enable all Pro features, including Apex, inter-procedural taint
--pro-languages. For intra-file analysis
--pro-intrafile. Flags --interproc and --interfile are now
org.apache.logging.log4j:log4j-core instead of just log4j-core. This change is backwards incompatible, in that any Java Supply Chain rules not taking this into account will stop producing any findings, since the packages parsed from lockfiles will include the org, but the old rules will not. (sc-maven-org)
--dataflow-traces in the CLI, to reduce
install-semgrep-pro to the list of commands in the semgrep --help help text. (pa-2505)
Published by github-actions[bot] over 1 year ago
semgrep ci will add
Published by github-actions[bot] over 1 year ago
"=~/hello/"
now support thePublished by github-actions[bot] over 1 year ago
"=~/hello/"
now support thePublished by github-actions[bot] over 1 year ago
--test to process entire file trees rather than single files (gh-5487)
metavariable-pattern operator on text that may look like (or in fact be)
metavariable-pattern operator, Generic mode
--pro and --interproc. Using --pro you can
--fast-deep you can enable intra-file inter-procedural
--deep has been renamed to --interfile. Note that to use
semgrep install-semgrep-pro while being
* and **, thus both sink(*tainted) and sink(**tainted) will result in findings. (gh-6920)
semgrep-core-proprietary executable. (pa-2417)
Published by github-actions[bot] over 1 year ago
cond and X or Y, True and X and False or X. So e.g. cond and "a" or "b" will
semgrep install-semgrep-pro. This engine is still invoked using the --deep flag, but please expect changes to the CLI in the near future.
$F(x) match eval(x). Previously, eval was special-cased and metavariable function call patterns would not match it. (gh-6877)
--dataflow-traces by default when --deep is specified (pa-2274)
Published by github-actions[bot] almost 2 years ago
mvn dependency:tree -DoutputFile=maven_dep_tree.txt (sc-pom)
Use the GitHub REST API when possible to compute the merge base for semgrep ci, improving performance on shallow clones of large repositories. (gha-mergebase)
YAML: Fixed a bug where metavariables matching YAML double-quoted strings would not capture the entire range of the string, and would not contain the double-quotes. Also added the ability to properly use patterns like "$FOO", which will unpack the contents of the matched string. (pa-2332)
Fixed a race condition related to the parsing cache that could lead to internal errors (pa-2335)
YAML: Fixed a bug where literal or folded blocks would not be parsed properly. So for instance, in:
key: |
  string goes here
A metavariable matching the contents of the string value might not be correct. (pa-2347)
Julia: Greatly improved parsing support (pa-2362)
Published by github-actions[bot] almost 2 years ago
func()) (gh-6715)
foo(cond ? new A() : this.a) (pa-2328)
Published by github-actions[bot] almost 2 years ago
foo("$VAR")) (gh-6311)
super(...) patterns (gh-6638)
new $X.Foo() will now match new a.b.Foo(). (pa-2296)
require calls (require-match)
Published by github-actions[bot] almost 2 years ago
max_memory_bytes field to the semgrep --time output which corresponds to the amount of memory allocated during the OCaml phase of Semgrep. This is useful for telemetry purposes. (pa-2075)
taint-mode: In 0.94.0 we made it so that when a pattern-source (or pattern-sanitizer) matched a variable exactly, that variable was understood to be tainted (sanitized, resp.) by side-effect. For example, given tainted(x) we would taint x by side-effect, and subsequent occurrences of x were also considered tainted. This allowed us to write rules like c.lang.security.use-after-free.use-after-free in a very succinct way, and it also addressed some limitations of the workarounds that were being used to simulate this until then.
This worked well initially, or so we thought, until in 0.113.0 we added field-sensitivity to taint-mode, and in subsequent versions we made sources and sanitizers apply by side-effect to more kinds of l-values than just simple variables. It was then that we started to see regressions that were fairly unintuitive for users. For example, if $_GET['foo'] was a taint source, this would make $_GET itself tainted by side-effect, and a subsequent expression like $_GET['bar'] was also considered tainted.
We now correct the situation by adding the by-side-effect option to sources and sanitizers, and requiring this option to be explicitly enabled (that is, by-side-effect: true) in order to apply the source or the sanitizer by side-effect. Otherwise, the default is that sources and sanitizers matching l-values apply only to the precise occurrences that they match. (pa-1629)
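The difference between the two behaviours can be sketched with a toy model in Python. This only illustrates the semantics described in the entry above and is not how Semgrep is implemented; the function name tainted_occurrences and its parameters are hypothetical.

```python
def tainted_occurrences(occurrences, source_index, by_side_effect=False):
    """Toy model (hypothetical, not Semgrep's implementation).

    occurrences:   l-values in program order, e.g. ["x", "y", "x"]
    source_index:  position of the occurrence matched by the source
    by_side_effect: when True, later occurrences of the same l-value
                    are also treated as tainted
    Returns one boolean per occurrence.
    """
    matched = occurrences[source_index]
    flags = []
    for position, lvalue in enumerate(occurrences):
        if position == source_index:
            # The precise occurrence matched by the source is always tainted.
            flags.append(True)
        elif by_side_effect and position > source_index and lvalue == matched:
            # Side-effect semantics: the l-value stays tainted afterwards.
            flags.append(True)
        else:
            flags.append(False)
    return flags


# Models `tainted(x); use(y); use(x)` as the occurrences x, y, x.
print(tainted_occurrences(["x", "y", "x"], 0))                       # [True, False, False]
print(tainted_occurrences(["x", "y", "x"], 0, by_side_effect=True))  # [True, False, True]
```

With the default setting only the matched occurrence is tainted, which corresponds to the new default described above.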
taint-mode: Fixed matching of pattern-sinks to be more precise, so that e.g. it will no longer report sink(ok1 if tainted else ok2) as a tainted sink, as the expression passed to the sink is actually not tainted. (pa-2142)
CLI: Separated experimental rules from normal rules in semgrep --debug output. (pa-2159)
Taint: Fixed an issue where findings with the same sink would be identified as the same, and cause
only one of them to be reported, even if they had different sources. (pa-2208)
DeepSemgrep: When the "DeepSemgrep" setting is enabled in Semgrep App, semgrep ci will try to run the analysis using the DeepSemgrep engine. But if this engine was not installed, semgrep ci failed. Now semgrep ci will automatically try to install DeepSemgrep if it is not already present. Note that, if DeepSemgrep is already installed, semgrep ci does not attempt to upgrade it to a newer version. (pa-2226)
CLI: Made the number of jobs when using semgrep --deep default to 1. (pa-2231)
Autofix: If multiple autofixes target overlapping ranges, one of them is picked arbitrarily and applied, which prevents autofixes from producing incorrect code. (pa-2276)
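A minimal sketch of this kind of overlap handling, in Python. The entry only says that one fix is picked arbitrarily; keeping the earliest-starting fix is an assumption made here for determinism, and the function name and the (start, end, replacement) tuple shape are hypothetical.

```python
def drop_overlapping_fixes(fixes):
    """Keep at most one fix per overlapping group (hypothetical helper).

    fixes: iterable of (start, end, replacement) with half-open ranges
           [start, end). Assumption: the earliest-starting fix wins; the
           changelog only states that the choice is arbitrary.
    """
    kept = []
    for start, end, replacement in sorted(fixes, key=lambda f: (f[0], f[1])):
        # Overlaps the last kept fix if it starts before that fix ends.
        if kept and start < kept[-1][1]:
            continue
        kept.append((start, end, replacement))
    return kept


fixes = [(3, 8, "b"), (0, 5, "a"), (8, 10, "c")]
print(drop_overlapping_fixes(fixes))  # [(0, 5, 'a'), (8, 10, 'c')]
```

Applying only non-overlapping edits means each replacement can be applied independently without corrupting the text touched by another.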
DeepSemgrep: Time data now outputs properly when running semgrep --deep --time (pa-2280)
DeepSemgrep: Added a message which suggests that users update their version of DeepSemgrep, if the DeepSemgrep binary crashes (pa-2283)
Yarn 2 parse failure on versions like @storybook/react-docgen-typescript-plugin@canary. This is only present as some kind of special version range specifier and never appears as a concrete version. It would only be used to check if the dependency was in the manifest file, so we just parse the version as "canary"
Yarn 2 parse failure on versions like @types/ol-ext@npm:@siedlerchr/[email protected]
Yarn 2 parse failure on versions like resolve@patch:resolve@^1.1.7#~builtin<compat/resolve>. These are now just ignored, as they appear to always come with a non-patch version as well. (sc-406)
Published by github-actions[bot] almost 2 years ago
semgrep ci will automatically run the DeepSemgrep
Published by github-actions[bot] almost 2 years ago
--dataflow-traces (pa-2116)
semgrep ci.
x.a.b[i].c got tainted, Semgrep would track x.a.b as tainted, and thus x.a.b[i].d would be incorrectly considered as tainted too. Now Semgrep will
x.a.b[*].c as tainted, and x.a.b[i].d will
private, singly-assigned class variables now permit constant propagation (pa-2230)
$X(...) match this() and super(). (this-match)
Published by github-actions[bot] almost 2 years ago
Published by github-actions[bot] almost 2 years ago
pattern-not, pattern-inside, and pattern-not-inside to take in arbitrary patterns (such as patterns, pattern-either, and friends) (pa-1723)
Published by github-actions[bot] almost 2 years ago
No significant changes.
Published by github-actions[bot] almost 2 years ago
this.x.
x.a[i] was tainted, then x itself was tainted;
x.a will be considered tainted. (pa-2086)
metavariable-comparison and friends (pa-2088)
Published by github-actions[bot] almost 2 years ago
focus-metavariable. (focus-metavariable-autofix)
$x-> ... ->bar()). (gh-6183)
RUN --mount=type=$TYPE,target=$TARGET .... (gh-6353)
x.a.b was specified as a source/sanitizer/sink. For example, if x had been
sink(x.a.b) where x.a matched a
x.a.b was incorrectly considered
Published by github-actions[bot] about 2 years ago
Taint mode will now track taint coming from the default values of function parameters. For example, given def test(url = "http://example.com"):, if "http://example.com" is a taint source (due to not using TLS), then url will be marked as tainted during the analysis of test. (gh-6298)
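To make the idea concrete, here is a small Python sketch that finds parameters whose default value would count as a source. It uses runtime introspection via inspect.signature rather than static analysis, so it is not how Semgrep works; the helper params_with_source_default, the predicate is_plain_http, and the function fetch (standing in for test from the entry above) are hypothetical.

```python
import inspect


def params_with_source_default(func, is_source):
    """Return names of parameters whose default value satisfies is_source.

    Hypothetical helper: a runtime approximation of "taint coming from the
    default values of function parameters", not Semgrep's analysis.
    """
    names = []
    for name, param in inspect.signature(func).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        if has_default and is_source(param.default):
            names.append(name)
    return names


def is_plain_http(value):
    # Assumed source predicate: a URL literal that does not use TLS.
    return isinstance(value, str) and value.startswith("http://")


def fetch(url="http://example.com", timeout=10):
    # Stand-in for `test` in the changelog entry; the body is irrelevant here.
    return url, timeout


print(params_with_source_default(fetch, is_plain_http))  # ['url']
```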
taint-mode: Added two new rule options that help minimize false positives.
The first is taint_assume_safe_indexes, which makes Semgrep assume that an array-access expression is safe even if the index expression is tainted. Otherwise Semgrep assumes that e.g. a[i] is tainted if i is tainted, even if a is not. Enabling this option is recommended for high-signal rules, whereas disabling it may be preferred for audit rules. Currently, it is disabled by default purely for backwards compatibility reasons, but this may change in the near future after some evaluation.
The other is taint_assume_safe_functions, which makes Semgrep assume that function calls do NOT propagate taint from their arguments to their output. Otherwise, Semgrep always assumes that functions may propagate taint. This is intended to replace not conflicting sanitizers (added in v0.69.0) in the future. This option is still experimental and needs to be complemented by other changes to be made in future releases. (pa-1541)
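The effect of the two options can be summarised with a toy propagation model in Python. It is a sketch of the rules as stated above, not Semgrep's implementation; the function names are hypothetical, and the assumption that a tainted base still taints a[i] when taint_assume_safe_indexes is enabled is ours.

```python
def index_expr_tainted(base_tainted, index_tainted, taint_assume_safe_indexes=False):
    """Is `a[i]` tainted, given whether `a` and `i` are tainted? (toy model)"""
    if taint_assume_safe_indexes:
        # A tainted index alone no longer taints the access.
        # Assumption: a tainted base still does.
        return base_tainted
    # Default: a[i] is tainted if i is tainted, even if a is not.
    return base_tainted or index_tainted


def call_result_tainted(args_tainted, taint_assume_safe_functions=False):
    """Is `f(args...)` tainted, given the taint of each argument? (toy model)"""
    if taint_assume_safe_functions:
        # Calls do not propagate taint from arguments to their output.
        return False
    # Default: functions may propagate taint from any argument.
    return any(args_tainted)


print(index_expr_tainted(False, True))                                  # True
print(index_expr_tainted(False, True, taint_assume_safe_indexes=True))  # False
print(call_result_tainted([False, True]))                               # True
print(call_result_tainted([False, True], taint_assume_safe_functions=True))  # False
```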
--scan-unknown-extensions option is now set to false by default.
--skip-unknown-extensions is the default.
foo("xyz $X"). (autofix-string-metavar)