codacy-security/gitlab-sast-rules

Semgrep rules

This is the central Semgrep rule repository that hosts the Semgrep rules for the GitLab semgrep analyzer.

We follow the testing methodology laid out in this blog post.

The repository is structured as illustrated below:

.
├── mappings
│   └── analyzer.yml
├── dist
│   └── pack.yml
├── c
│   ├── buffer
│   │   ├── rule-strcpy.yml
│   │   ├── test-strcpy.c
│   │   ├── rule-memcpy.yml
│   │   └── test-memcpy.c
│   └── ...
├── javascript
│   └── ...
├── python
│   ├── assert
│   │   ├── rule-assert.yml
│   │   └── test-assert.py
│   ├── exec
│   │   ├── rule-exec.yml
│   │   ├── test-exec.yml
│   │   ├── rule-something.yml
│   │   └── test-something.yml
│   ├── permission
│   │   ├── rule-chmod.yml
│   │   └── test-chmod.py
│   └── ...
└── ...

The structure above follows the pattern: <language>/<ruleclass>/{rule-<rulename>.yml, test-<rulename>\..*} where <language> denotes the target programming language, <ruleclass> is a descriptive name for the class of issues the rule aims to detect, and <rulename> is a descriptive name for the actual rule.

A rule can have multiple test cases (all prefixed with test-). Rule files are named rule-<rulename>.yml, and each rule file contains a single Semgrep rule.
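
The naming pattern can be checked mechanically. The following is a minimal sketch, not part of this repository's tooling: the function name rules_without_tests and both regular expressions are assumptions derived only from the pattern described above, and it further assumes that rule names contain no dots.

```python
import re

# <language>/<ruleclass>/rule-<rulename>.yml
RULE_RE = re.compile(r"^(?P<lang>[^/]+)/(?P<cls>[^/]+)/rule-(?P<name>[^/.]+)\.yml$")
# <language>/<ruleclass>/test-<rulename>.<any extension>
TEST_RE = re.compile(r"^(?P<lang>[^/]+)/(?P<cls>[^/]+)/test-(?P<name>[^/.]+)\..+$")


def rules_without_tests(paths):
    """Return sorted language/ruleclass/rulename keys that have a rule file but no test file."""
    rules, tests = set(), set()
    for path in paths:
        # Files outside the <language>/<ruleclass>/ layout match neither pattern.
        for regex, bucket in ((RULE_RE, rules), (TEST_RE, tests)):
            match = regex.match(path)
            if match:
                bucket.add("/".join(match.group("lang", "cls", "name")))
    return sorted(rules - tests)
```

Calling rules_without_tests with a list of repository-relative paths returns the rules that still lack a test file; paths such as mappings/analyzer.yml are ignored because they do not follow the layout.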

The mappings and dist directories contain the rule-pack configuration, which defines the rules that should be included in rule-packs, and the resulting assembled rule-packs.

Updating rules

Please see our update process for more details.

Formatting guidelines

Rules contained in this repository have to adhere to the following format:

  • Use " for strings, otherwise the YAML literal block |
  • No collapsing of array elements
  • max line-length/text-width: 100 characters
  • indentation: 2 spaces
  • every rule has to have a corresponding test-case
  • a comments section, if provided, goes at the top of the rule file
  • every YAML file starts with ---

The script ci/autoformat.rb automatically formats/rewrites all the rule files so that they adhere to the guidelines listed above. To run it, install the gems psych, yaml and fileutils with gem install psych yaml fileutils, then execute ci/autoformat.rb within the sast-rules directory.
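
Two of the guidelines (line length and the leading ---) are easy to verify without a YAML parser. The sketch below is an illustration only and does not reproduce what ci/autoformat.rb does; the function name format_violations is hypothetical, and it assumes that an optional comments section may precede the --- marker.

```python
MAX_WIDTH = 100  # from the guideline "max line-length/text-width: 100 characters"


def format_violations(text):
    """Return a list of human-readable violations for one rule file's content."""
    lines = text.splitlines()
    problems = []
    # Assumption: leading comment lines and blank lines may come before ---.
    body = [line for line in lines if line.strip() and not line.lstrip().startswith("#")]
    if not body or body[0] != "---":
        problems.append("first non-comment line is not ---")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_WIDTH:
            problems.append(f"line {number}: longer than {MAX_WIDTH} characters")
    return problems
```

An empty result means the file passes these two checks; the remaining guidelines (quoting, array layout, indentation) need a YAML-aware tool.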

Mappings

The mappings directory in this repository contains YAML configuration files that map native analyzer ids to the corresponding Semgrep rules. These mappings are digested by the testing framework to perform an automated gap analysis; the goal of this analysis is to check whether there is an unexpected deviation between Semgrep (with the rules in this repository) and a given analyzer.

In addition, mappings are used to automatically assemble rule-packs. The snippet below illustrates an example mapping file for the bandit analyzer. The native_id section includes some information about the native id mappings. The actual rule mappings are defined in the mappings section. Each mapping defines which Semgrep rules in this repository a bandit rule is composed of. Note that the order in which the rules are listed in the file matters at the moment, so new mappings should be appended at the end.

bandit:
  native_id:
    type: "bandit_test_id"
    name: "Bandit Test ID: $ID"
    value: "$ID"
  mappings:
  - id: "B301"
    rules:
      - "python/deserialization/rule-cpickle"
      - "python/deserialization/rule-shelve"
      - "python/deserialization/rule-pickle"
      - "python/deserialization/rule-dill"
  - id: "B101"
  # ...
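
To make the structure of such a mapping concrete, here is a small sketch that walks it in Python. The dictionary mirrors the B301 entry from the snippet above as a YAML parser would load it. The helper names rule_files and native_identifier are hypothetical, and two things are assumptions rather than documented behaviour: that each mapped rule corresponds to a file with a .yml suffix, and that $ID is replaced by the native id.

```python
MAPPING = {
    "bandit": {
        "native_id": {
            "type": "bandit_test_id",
            "name": "Bandit Test ID: $ID",
            "value": "$ID",
        },
        "mappings": [
            {
                "id": "B301",
                "rules": [
                    "python/deserialization/rule-cpickle",
                    "python/deserialization/rule-shelve",
                    "python/deserialization/rule-pickle",
                    "python/deserialization/rule-dill",
                ],
            },
        ],
    }
}


def rule_files(mapping, analyzer):
    """List (native id, rule file) pairs, preserving the order of the mapping file."""
    pairs = []
    for entry in mapping[analyzer]["mappings"]:
        for rule in entry.get("rules", []):
            # Assumption: a mapped rule path names a rule file without its extension.
            pairs.append((entry["id"], rule + ".yml"))
    return pairs


def native_identifier(mapping, analyzer, native_id):
    """Fill the $ID placeholder of the native_id section (assumed substitution)."""
    template = mapping[analyzer]["native_id"]
    return {key: value.replace("$ID", native_id) for key, value in template.items()}
```

Because rule_files keeps the listing order, it reflects the note above that order currently matters when rule-packs are assembled.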

Data sources

The rules and test-cases in this repository are partially sourced from the sources listed below:

  1. https://github.com/returntocorp/semgrep-rules
  2. https://github.com/PyCQA/bandit
  3. https://github.com/nodesecurity/eslint-plugin-security
  4. https://github.com/jsx-eslint/eslint-plugin-react
  5. https://github.com/david-a-wheeler/flawfinder/blob/master/flawfinder.py

The details, including the licensing information and proper attribution, are listed in the headers of all the rule and test files.

Contributing

If you know about a pattern that isn't present in this repo, or about refinements that could be applied to the rules in this repository, you can contribute by opening an issue or by submitting an improvement to the rule files/test cases in this repository.

Contribution instructions

After making changes to rules or mappings, make sure to run ./ci/deploy.sh <semantic version> and commit your updates to the /dist directory, where <semantic version> should correspond to the latest published version in CHANGELOG.md.

Versioning and Changelog

We apply the following semantic versioning scheme to this repository:

  1. patch version increment: for updated/patched/added rules.
  2. minor version increment: backwards-compatible YAML schema changes (e.g., adding/removing optional fields).
  3. major version increment: non-backwards-compatible YAML schema changes (e.g., adding/removing required fields).
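
The scheme can be expressed as a small helper. This is a sketch for illustration; the function name bump and the change labels are assumptions, and it follows the usual semantic-versioning convention of resetting lower components to zero.

```python
def bump(version, change):
    """Return the next MAJOR.MINOR.PATCH version for a 'patch', 'minor' or 'major' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":  # non-backwards-compatible YAML schema change
        return f"{major + 1}.0.0"
    if change == "minor":  # backwards-compatible YAML schema change
        return f"{major}.{minor + 1}.0"
    if change == "patch":  # updated, patched or added rules
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change}")
```

For example, adding a rule to version 1.2.3 gives bump("1.2.3", "patch"), while removing a required field calls for bump("1.2.3", "major").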

Credits

We would like to thank the following authors very much for their valuable contributions.

| Author | MRs/Issues |
| --- | --- |
| @masakura | !99, !107 |
| @gregory.mcdaniel | #32 |
| @niklas.volcz. | !183 |

Rule deployment

Rules that are not covered at the moment

Bandit

Excluded patterns (1)

Adjusted patterns (3)

  • B503: ssl_with_bad_defaults. Our Semgrep pattern captures both B503 and B502 because the two checks are very similar: both flag insecure settings that use outdated versions of encryption algorithms.
  • B110: try_except_pass. The Semgrep rule checks the whole try ... except block, whereas bandit reports every except case. The Semgrep rule approximates the original rule's behaviour by looking at various permutations of except pass cases embedded in a try ... except block.
  • B112: try_except_continue. The Semgrep rule checks the whole try ... except block, whereas bandit reports every except case. The Semgrep rule approximates the original rule's behaviour by looking at various permutations of except continue cases embedded in a try ... except block.
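
As a rough illustration of the code shape that B110 and B112 target, consider the sketch below. The function names are hypothetical, and whether Bandit flags a particular handler also depends on its configuration. With several except clauses on one try, Bandit would report per handler, whereas the Semgrep rule matches the enclosing try ... except block.

```python
def parse_port(raw):
    """Return the port number, or 0 when parsing fails."""
    port = 0
    try:
        port = int(raw)
    except Exception:
        pass  # try_except_pass shape: the error is silently swallowed
    return port


def count_valid(values):
    """Count the entries that parse as integers."""
    valid = 0
    for value in values:
        try:
            int(value)
        except Exception:
            continue  # try_except_continue shape: the error is skipped inside a loop
        valid += 1
    return valid
```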

ESLint

Patterns we were unable to migrate (1)

Gosec

Patterns we were unable to migrate (2)

find-sec-bugs

Java, Scala

Adjusted patterns

| Rule ID | Description | Status | Comment |
| --- | --- | --- | --- |
| HARD_CODE_PASSWORD | Hardcoded Password (Scala) | | The behaviour is not completely on par with find-sec-bugs; we excluded some patterns that are prone to FPs. |

Out of scope patterns (25)

Out of scope patterns w.r.t. https://gitlab.com/gitlab-org/gitlab/-/issues/354762#rules-with-completion-status are all those patterns that are unrelated to Java.

| Rule ID | Description | Status | Comment |
| --- | --- | --- | --- |
| PREDICTABLE_RANDOM_SCALA | Predictable pseudorandom number generator (Scala) | | Scala not supported |
| SCALA_COMMAND_INJECTION | Potential Command Injection (Scala) | | Scala not supported |
| SCALA_PATH_TRAVERSAL_IN | Potential Path Traversal using Scala API (file read) | | Scala not supported |
| SCALA_PLAY_SSRF | Scala Play Server-Side Request Forgery (SSRF) | | Scala not supported |
| SCALA_SENSITIVE_DATA_EXPOSURE | Potential information leakage in Scala Play | | Scala not supported |
| SCALA_SQL_INJECTION_ANORM | Potential Scala Anorm Injection | | Scala not supported |
| SCALA_SQL_INJECTION_SLICK | Potential Scala Slick Injection | | Scala not supported |
| SCALA_XSS_MVC_API | Potential XSS in Scala MVC API engine | | Scala not supported |
| SCALA_XSS_TWIRL | Potential XSS in Scala Twirl template engine | | Scala not supported |
| PLAY_UNVALIDATED_REDIRECT | Unvalidated Redirect (Play Framework) | | Scala not supported |
| ANDROID_BROADCAST | Broadcast (Android) | | Android not supported |
| ANDROID_EXTERNAL_FILE_ACCESS | External file access (Android) | | Android not supported |
| ANDROID_GEOLOCATION | WebView with geolocation activated (Android) | | Android not supported |
| ANDROID_WEB_VIEW_JAVASCRIPT_INTERFACE | WebView with JavaScript interface (Android) | | Android not supported |
| ANDROID_WEB_VIEW_JAVASCRIPT | WebView with JavaScript enabled (Android) | | Android not supported |
| ANDROID_WORLD_WRITABLE | World writable file (Android) | | Android not supported |
| SQL_INJECTION_ANDROID | Potential Android SQL Injection | | Android not supported |
| GROOVY_SHELL | Potential code injection when using GroovyShell | | Groovy not supported |
| JSP_INCLUDE | Dynamic JSP inclusion | | JSP not supported |
| JSP_JSTL_OUT | Escaping of special XML characters is disabled | | JSP not supported |
| JSP_SPRING_EVAL | Dynamic variable in Spring expression | | JSP not supported |
| JSP_XSLT | A malicious XSLT could be provided to the JSP tag | | JSP not supported |
| XSS_JSP_PRINT | Potential XSS in JSP | | JSP not supported |
| XSS_REQUEST_PARAMETER_TO_JSP_WRITER | XSS: Servlet reflected cross site scripting vulnerability | | JSP not supported |
| REQUESTDISPATCHER_FILE_DISCLOSURE | RequestDispatcher File Disclosure | | JSP not supported |

Excluded patterns (6)

We excluded the patterns below because they are overly verbose; they are triggered by existing entry-points and do not indicate any vulnerability.

| Rule ID | Description | Status | Comment |
| --- | --- | --- | --- |
| STRUTS1_ENDPOINT | Found Struts 1 endpoint | 🚫 | the endpoint rules only provide general information about potential security issue which seems noisy -- I think we can skip them |
| STRUTS2_ENDPOINT | Found Struts 2 endpoint | 🚫 | the endpoint rules only provide general information about potential security issue which seems noisy -- I think we can skip them |
| SPRING_ENDPOINT | Found Spring endpoint | 🚫 | We cannot cope with annotations; in addition endpoints should probably not end up in the final security report anyway |
| TAPESTRY_ENDPOINT | Found Tapestry page | 🚫 | We cannot cope with annotations; in addition endpoints should probably not end up in the final security report anyway. |
| JAXRS_ENDPOINT | Found JAX-RS REST endpoint | 🚫 | the endpoint rules only provide general information about potential security issue which seems noisy -- I think we can skip them |
| JAXWS_ENDPOINT | Found JAX-WS SOAP endpoint | 🚫 | the endpoint rules only provide general information about potential security issue which seems noisy -- I think we can skip them |
| HARD_CODE_KEY | Secret detection rule | 🚫 | Secret Detection is taken care of by a dedicated analyzer |

Patterns we were unable to migrate (12)

The patterns below could not be migrated, because they required features not supported by Semgrep. See https://gitlab.com/gitlab-org/gitlab/-/issues/357679 for more information.

| Rule ID | Description | Status | Comment |
| --- | --- | --- | --- |
| SPRING_CSRF_UNRESTRICTED_REQUEST_MAPPING | Spring CSRF unrestricted RequestMapping | 🚫 | No support for parsing annotations |
| SPRING_UNVALIDATED_REDIRECT | Spring Unvalidated Redirect | 🚫 | No support for annotations |
| WICKET_ENDPOINT | Found Wicket WebPage | 🚫 | the endpoint rules only provide general information about potential security issue which seems noisy -- I think we can skip them |
| UNSAFE_HASH_EQUALS | Unsafe hash equals | 🚫 | this rule is highly prone to FPs -- it checks for unsecure hash functions by looking for keywords (e.g., sha) in variable or parameter names. As we are already covered by secret detection, we can probably omit this particular rule. |
| STATIC_IV | Static IV | 🚫 | https://gitlab.com/gitlab-org/gitlab/-/issues/357679#note_905023485 |
| DESERIALIZATION_GADGET | This class could be used as deserialization gadget | 🚫 | Multiple logical flows involved. Cannot be achieved in Semgrep. |
| ENTITY_LEAK | Unexpected property leak | 🚫 | Annotations of classes are processed to determine the result. This cannot be achieved in Semgrep. |
| ENTITY_MASS_ASSIGNMENT | Mass assignment | 🚫 | Annotations of classes are processed to determine the result. This cannot be achieved in Semgrep. |
| ESAPI_ENCRYPTOR | Use of ESAPI Encryptor | 🚫 | Config files related. We currently support only files with .java extensions. |
| JACKSON_UNSAFE_DESERIALIZATION | Unsafe Jackson deserialization configuration | 🚫 | Reason |
| OBJECT_DESERIALIZATION | Object deserialization is used | 🚫 | This problem is solved by determining Interface supersets and Annotation metadata. This cannot be accomplished in Semgrep |
| REDOS | Regex DOS (ReDOS) | 🚫 | This problem is solved by applying set of conditional logic on each character of a target string. This cannot be accomplished in Semgrep |

security-code-scan

Modified patterns (1)

| Rule ID | Description | Comment |
| --- | --- | --- |
| SCS0018 | Path Traversal | We adapted the pattern so that it does not treat arguments passed to Main as sources, because this often led to FPs for CLI apps. |

Excluded patterns (1)

We excluded the patterns below because they are overly verbose.

| Rule ID | Description | Status | Comment |
| --- | --- | --- | --- |
| SCS0015 | Hardcoded Password | 🚫 | This is better served by Secrets Detection as there are a multitude of ways that hardcoded passwords can be specified. |

Patterns we were unable to migrate (5)

The patterns below could not be migrated, because they required features not supported by Semgrep.

| Rule ID | Description | Status | Comment |
| --- | --- | --- | --- |
| SCS0021 | Request Validation Disabled (Configuration File) | 🚫 | XML configuration file. |
| SCS0022 | Event Validation Disabled | 🚫 | XML configuration file. |
| SCS0023 | View State Not Encrypted | 🚫 | XML configuration file. |
| SCS0024 | View State MAC Disabled | 🚫 | XML configuration file. |
| SCS0008 | Cookie Without SSL Flag | 🚫 | The SCS rule also detects vulnerabilities in ASP.NET config files which is not supported by Semgrep. We also haven't been able to detect these with SCS within the gapanalysis job as the HttpCookie class requires .NET Framework. |
| SCS0009 | Cookie Without HttpOnly Flag | 🚫 | The SCS rule also detects vulnerabilities in ASP.NET config files which is not supported by Semgrep. We also haven't been able to detect these with SCS within the gapanalysis job as the HttpCookie class requires .NET Framework. |
| SCS0002 | SQL Injection | 🚫 | The SCS rule also detects vulnerabilities in ASP.NET UI code, which Semgrep does not support. |
| SCS0003 | XPath Injection | 🚫 | The SCS rule also detects vulnerabilities in ASP.NET UI code, which Semgrep does not support. |
| SCS0030 | Request validation is enabled only for pages (Configuration File) | 🚫 | This rule relates to changes in the Configuration File(XML) format. Semgrep does not have GA support for HTML/XML format. |

Rule synchronization from Upstream scanners

Semgrep rules should be kept in-sync with upstream scanners regularly; here's the process:

  • Pull the newly added rules from the analyzer's upstream source (excluding the rules that could not be translated due to Semgrep limitations; see above).
  • Translate the newly identified rules into Semgrep-equivalent rules.
  • Map them against the native analyzer's IDs in this repository.
  • Generate a new ruleset distribution using the instructions described above.
  • Add all untranslatable rules to this file, along with the reason, under the corresponding downstream analyzer.
  • Copy the new ruleset distribution into Semgrep/rules to reflect the rule changes in the analyzer.
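
The first step of this process amounts to a set difference. The sketch below illustrates it under stated assumptions: the function name rules_to_translate is hypothetical, and the three inputs are plain lists of native rule ids that you would collect yourself from the upstream scanner, the mappings directory, and the lists of untranslatable rules in this file.

```python
def rules_to_translate(upstream_ids, mapped_ids, untranslatable_ids):
    """Return upstream rule ids that are neither mapped yet nor known to be untranslatable."""
    return sorted(set(upstream_ids) - set(mapped_ids) - set(untranslatable_ids))
```

The result is the candidate list for translation; each id then either receives a mapping entry or is recorded in this file with the reason it could not be translated.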

For better tracking purposes, create a dedicated issue on rule synchronization cadence and create a sub-task for each semgrep-translated analyzer. The subtask should contain all the new rules that should be synchronized. Here's an example issue that has followed the mentioned process.
