Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.
LGPL-2.1 License
Published by github-actions[bot] over 2 years ago
pattern-propagators
feature that allows to specify
--config auto
no longer sends the name of the repository being scanned to the Semgrep Registry.
semgrep scan
options: --json-stats, --json-time, --debugging-json, --save-test-output-tar, --synthesize-patterns, --generate-config/-g, --dangerously-allow-arbitrary-code-execution-from-rules, --apply (which was an easter egg for job applications, not the same as --autofix)
with
context expressions where the value is not
SEMGREP_ENABLE_VERSION_CHECK=0
{...foo}
) are now translated into the Dataflow IL
semgrep lsp --config auto
!
semgrep --config auto
run on the semgrep Python package in 14s instead of 16s.
--disable-version-check
would still send a request
/src
without notice.
$X() no longer matches new Foo(), for consistency with other languages (#5510)
($X: C) matches new C(). (#5540)
package $X
, which is useful to bind the package
semgrep ci
should be clear it is exiting with error code 0
yarn.lock
files with no dependencies, and with dependencies that lack URLs, now parse
generic_ellipsis_max_span
for controlling
generic_comment_style
for ignoring
:include
instruction in a .semgrepignore file.
semgrep scan --output=secret.txt
we might send "option/output" but will NOT send "option/output=secret.txt".
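The distinction between option names and option values above can be illustrated with a minimal sketch. The function name `option_feature_tags` and the way arguments are split are assumptions made for illustration; this is not Semgrep's actual metrics code.

```python
def option_feature_tags(argv):
    """Return tags naming the long options used, never their values.

    Hypothetical helper: only the "option/<name>" tag format comes from
    the release note; everything else is an assumption.
    """
    tags = []
    for arg in argv:
        if not arg.startswith("--"):
            continue  # subcommands, positional args and separate values are skipped
        name = arg[2:].split("=", 1)[0]  # drop everything after the first '='
        if name:
            tags.append(f"option/{name}")
    return tags
```

Calling `option_feature_tags(["scan", "--output=secret.txt"])` yields `["option/output"]`, so the file name never appears in the result.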
fixes
section
r2c-internal-project-depends-on: support for poetry and gradle lockfiles
SEMGREP_BASELINE_REF as alias for SEMGREP_BASELINE_COMMIT
r2c-internal-project-depends-on
:
ci
CLI command will now include ignored matches in output formats
$X
in a message to interpolate the variable captured
$X
, but there was no way to access the underlying value.
value($X)
to interpolate the underlying
x = 42
log(x)
Now take a rule to find that log command:
- id: example_log
  message: "Logged $SECRET: value($SECRET)"
  pattern: log($SECRET)
  languages: [python]
Before, this would have given you the message Logged x: value(x). Now, it gives the message Logged x: 42.
return
for taint analysis (#4975)
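To make the value($X) message interpolation concrete, here is a small sketch of how such a substitution could work. The function `render_message`, its arguments, and the fallback to the captured text when no propagated value is known are all assumptions for illustration, not Semgrep's implementation.

```python
import re

# One pass over the template: either value($X) or a bare $X.
_TOKEN = re.compile(r"value\((\$[A-Z_][A-Z_0-9]*)\)|(\$[A-Z_][A-Z_0-9]*)")


def render_message(template, bindings, propagated):
    """Interpolate metavariables into a rule message.

    bindings:   metavariable -> captured source text (e.g. "$SECRET" -> "x")
    propagated: metavariable -> constant value found by propagation, if any
    Both dictionaries are hypothetical inputs for this sketch.
    """
    def substitute(match):
        value_var, plain_var = match.group(1), match.group(2)
        if value_var is not None:
            # Assumption: fall back to the captured text when no value is known.
            return propagated.get(value_var, bindings.get(value_var, value_var))
        return bindings.get(plain_var, plain_var)

    return _TOKEN.sub(substitute, template)
```

With `bindings={"$SECRET": "x"}` and `propagated={"$SECRET": "42"}`, the template `Logged $SECRET: value($SECRET)` renders as `Logged x: 42`.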
metavariable-regex
now supports an optional constant-propagation
key.
true
, information learned from constant propagation
false
ENV
shouldafound
- False Negative reporting via the CLI
taint(x) makes x tainted by side-effect.
x
inside taint(x); ...
was as taint source. If x
was overwritten with
taint(x)
if
block, any occurrence of x
outside that block
focus-metavariable
), the taint engine will handle
sanitize(x) sanitizes x by side-effect.
x
inside sanitize(x); ...
was sanitized. If x
later overwritten with
x
as safe. Now, if you
focus-metavariable
),
semgrep scan --config auto
on the semgrep repo itself
:include .gitignore
and .git/
.semgrepignore
patterns.
override
keyword (#4220, #4798)
(null)(foo)
(#4468)
func foo() (..., error, ...) {}
) (#4896)
with
context expressions
with (open(x) as a, open(y) as b): pass
) (#5092)
semgrep ci
used to incorrectly report the base branch as a CI job's branch
pull_request_target
event in GitHub Actions.
on: pull_request_target
jobs.
PRIVACY.md
had already documented a timestamp field.
class Foo(...) {}
(#5180)
fixed_lines is once again included in JSON output when running with --autofix --dryrun
semgrep scan
is now fully specified using
semgrep scan
now contains a "version": field with the
focus-metavariable
can be used to
let {x} = E
, Semgrep will now infer that x
E
is tainted.
--core-opts
flag to send options to semgrep-core. For internal use: no guarantees made for semgrep-core options (#5111)
rules:
key underneath the join:
key.
echo $...ARGS
(#4887)
({ params }: Request) => { }
with ({$VAR} : $REQ) => {...}
. (#5004)
-> (P) {Q}
where P and Q are sub-patterns. (#4950)
semgrep install-deep-semgrep
command for DeepSemgrep beta (#4993)
lang.json
file not found error while building the docker image
EXPOSE 12345
will now parse 12345 as an int instead of a string,
metavariable-comparison
with integers (#4875)
def f[@an A, @an B](x : A, y : B) = ...
)
r2c-internal-project-depends-on
:
focus-metavariable
operator that lets you focus (or "zoom in") the match
semgrep ci
uses "GITHUB_SERVER_URL" to generate URLs if it is available
NO_COLOR=1
to force-disable colored output
pattern-sinks
, plus the subset of metavariables bound by pattern-sources
pattern-sinks
. We do not expect
pattern-inside
to be unified, thus limiting the usefulness of the feature.
taint_unify_mvars: true
in the rule's options.
r2c-internal-project-depends-on: this is now a rule key, and not part of the pattern language.
depends-on-either
key can be used analogously to pattern-either
r2c-internal-project-depends-on: each rule with this key will now distinguish between
extra
field of semgrep's JSON output, using the dependency_match_only
dependency_matches
fields, respectively.
r2c-internal-project-depends-on: a finding will only be considered reachable if the file
semgrep
as the entrypoint.
semgrep
is no longer prepended automatically to any command you run in the image.
-
is now parsed as a valid identifier in Scala
new $OBJECT(...)
will now work properly as a taint sink (#4858)
...{$X}...
will no longer match str
pattern-inside
are now available to the
SEMGREP_URL
or SEMGREP_APP_URL
--gitlab-sast
and --gitlab-secrets
.
<script>$...JS</script>
)
semgrep ci
subcommand that auto-detects settings from your CI environment
tests
from published python wheel
'xxxxxxxxxxxxxx'
are no longer reported as having high entropy (#4833)
++
and --
as side-effectful
# nosemgrep
is supposed to be the same
semgrep-agent
.
--timeout-threshold
default set to 3 instead of 0
CMD ...
to match both CMD ls and CMD ["ls"] (#4770).
Fixed deep expression matching and metavariables interaction. Semgrep no longer stops at the first match and instead enumerates all possible matchings if a metavariable is used in a deep expression pattern (e.g., <... $X ...>). This can introduce some performance regressions.
JSX: ellipsis in JSX body (e.g., <div>...</div>) now matches any children (#4678 and #4717)
ℹ️ During a --baseline-commit scan, Semgrep temporarily deletes files that were created since the baseline commit, and restores them at the end of the scan.
Previously, when scanning a subdirectory of a git repo with --baseline-commit, Semgrep would delete all newly created files under the repo root, but restore only the ones in the subdirectory.
Now, Semgrep only ever deletes files in the scanned subdirectory.
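The fixed behaviour can be pictured as filtering the newly created files down to the scanned subdirectory before anything is hidden. The sketch below assumes repo-relative POSIX paths; `files_to_hide` is a hypothetical name, not a Semgrep function.

```python
from pathlib import PurePosixPath


def files_to_hide(new_files, scan_root):
    """Select the newly created files that live under the scanned directory.

    new_files: repo-relative paths created since the baseline commit
    scan_root: repo-relative path of the directory being scanned
    Only these files would be temporarily removed and later restored.
    """
    root = PurePosixPath(scan_root)
    selected = []
    for name in new_files:
        path = PurePosixPath(name)
        if path == root or root in path.parents:
            selected.append(name)
    return selected
```

Because the same filtered list drives both the deletion and the restoration step, files outside the scanned subdirectory are never touched.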
Previous releases allowed incompatible versions (21.1.0 & 21.2.0) of the attrs dependency to be installed.
semgrep now correctly requires attrs 21.3.0 at the minimum.
package-lock.json parsing defaults to packages instead of dependencies as the source of dependencies
package-lock.json parsing will ignore dependencies with non-standard versions, and will successfully parse dependencies with no integrity field
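A rough sketch of the lockfile behaviour described above follows. It assumes that "non-standard" means "not a plain semantic version" and that entries under packages are keyed by their node_modules path; the function name and the regular expression are illustrative assumptions, not Semgrep's parser.

```python
import json
import re

# Assumption: a "standard" version looks like 1.2.3 with an optional suffix.
_SEMVER = re.compile(r"^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.+-]+)?$")


def parse_package_lock(text):
    """Return (name, version, integrity) tuples from package-lock.json text."""
    lock = json.loads(text)
    # Prefer "packages"; fall back to "dependencies" only if it is absent.
    entries = lock.get("packages")
    if entries is None:
        entries = lock.get("dependencies", {})
    found = []
    for key, info in entries.items():
        if key == "":
            continue  # the root project entry is not a dependency
        version = info.get("version", "")
        if not _SEMVER.match(version):
            continue  # skip git URLs, file: references, and similar
        name = key.rsplit("node_modules/", 1)[-1]
        # A missing integrity field is tolerated and reported as None.
        found.append((name, version, info.get("integrity")))
    return found
```

Entries without an integrity hash are still returned, while entries whose version is a URL or path are dropped.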
File targeting logic has been mostly rewritten. (#4776)
These inconsistencies were fixed in the process:
ℹ️ "Explicitly targeted file" refers to a file that's directly passed on the command line.
Previously, explicitly targeted files would be unaffected by most global filtering: global include/exclude patterns and the file size limit.
Now .semgrepignore patterns don't affect them either, so they are unaffected by all global filtering.
ℹ️ With --skip-unknown-extensions, Semgrep scans only the explicitly targeted files that are applicable to the language you're scanning.
Previously, --skip-unknown-extensions would skip based only on file extension, even though extensionless shell scripts expose their language via the shebang of the first line.
As a result, explicitly targeted shell files were always skipped when --skip-unknown-extensions was set.
Now, this flag decides if a file is the correct language with the same logic as other parts of Semgrep: taking into account both extensions and shebangs.
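A check that considers both extensions and shebangs could look roughly like the sketch below. The tables `EXTENSIONS` and `SHEBANG_PROGRAMS` and both function names are made up for illustration and cover only two languages; Semgrep's real language tables are not reproduced here.

```python
import os

# Hypothetical, deliberately tiny lookup tables.
EXTENSIONS = {"python": {".py"}, "bash": {".sh", ".bash"}}
SHEBANG_PROGRAMS = {"python": {"python", "python3"}, "bash": {"bash", "sh"}}


def shebang_program(first_line):
    """Return the interpreter named by a shebang line, or None."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    program = os.path.basename(parts[0])
    if program == "env" and len(parts) > 1:
        program = os.path.basename(parts[1])  # e.g. "#!/usr/bin/env bash"
    return program


def matches_language(path, first_line, language):
    """Decide whether a file belongs to a language by extension or shebang."""
    extension = os.path.splitext(path)[1]
    if extension in EXTENSIONS.get(language, set()):
        return True
    return shebang_program(first_line) in SHEBANG_PROGRAMS.get(language, set())
```

Under this sketch an extensionless file named `deploy` that starts with `#!/usr/bin/env bash` counts as a bash target, which an extension-only check would have skipped.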
Semgrep scans with --baseline-commit are now much faster.
These optimizations were added:
ℹ️ When --baseline-commit is set, Semgrep first runs the current scan, then switches to the baseline commit, and runs the baseline scan.
The current scan now excludes files that are unchanged between the baseline and the current commit according to git status output.
The baseline scan now excludes rules and files that had no matches in the current scan.
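The two pruning steps can be summarised in a short sketch. The callback `run_scan` and the `(rule_id, path)` match tuples are hypothetical stand-ins for the real scanner and its results.

```python
def plan_scans(all_files, changed_files, rules, run_scan):
    """Sketch of the two-phase baseline scan with pruning.

    all_files:     candidate target paths
    changed_files: set of paths that differ from the baseline commit
    rules:         rule ids to run in the current scan
    run_scan:      hypothetical callback (rules, targets) -> list of
                   (rule_id, path) matches
    Returns the current-scan targets plus the rules and files that the
    baseline scan still needs to cover.
    """
    # Step 1: unchanged files cannot produce new findings, so skip them.
    current_targets = [path for path in all_files if path in changed_files]
    current_matches = run_scan(rules, current_targets)

    # Step 2: the baseline scan only needs rules and files that matched.
    baseline_rules = sorted({rule_id for rule_id, _ in current_matches})
    baseline_files = sorted({path for _, path in current_matches})
    return current_targets, baseline_rules, baseline_files
```

If the current scan finds nothing, both baseline lists are empty and the baseline scan has no work to do.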
When git ls-files is unavailable or --disable-git-ignore is set, Semgrep walks the file system to find all target files.
Semgrep now walks the file system 30% faster compared to previous versions.
The output format has been updated to visually separate lines
with headings and indentation.
--validate
will check that metavariable-x doesn't use an invalid
<foo />
used to
<foo >some child</foo>
.
options:
xml_singleton_loose_matching: false
(#4730)
xml_attrs_implicit_ellipsis
that allows...
that was added to JSX attributes patterns.
--strict
--config auto
(#4674)
--dry-runs where one fix changes the line numbers in a file that also has a second autofix.
yarn.lock
dependencies that do not specify a hash
project-depends-on
rules with only pattern-inside
at their leaves
~/.semgrep/last.log
-->
, for join mode rules for recursively
paths.scanned
key.
--verbose
, the skipped paths are also listed under the
paths.skipped
key.
metavariable-analysis
feature
redos
analyzer, #4700)
entropy
analyzer, #4672).
semgrep publish
allows users to upload private,
metavariable-regex or metavariable-pattern. Previously, Semgrep had problems analyzing e.g. embedded
escapeshellarg and htmlspecialchars_decode, if these functions are given constant arguments,
SEMGREP_LOGIN_TOKEN
to SEMGREP_APP_TOKEN
--baseline-commit GIT_COMMIT
to only
--verbose
mode will list all skipped paths along with the reason they were skipped
import
module file names, thus
import { $X } from 'foo'
E as T
will be matched correctly. E.g. previously v as $T would match v but not v as any, now it matches v as any but not v. (#4515)