@owen-mc commented Feb 11, 2026

  • Add a Concepts.qll for Java. (I was a bit surprised it didn't already have one.)
  • Add an abstract class called RegexExecution, which is a copy of this concept from Python (and Ruby).
    • Note that Go also has this concept, but it is less flexible because it only works for function calls. I may update it in a follow-up to match the other languages.
    • It's possible other languages have something similar; I haven't checked in detail.
      • The shared libraries have a file of shared concepts (synced using the script at the moment). I think this is a candidate for going in there and being shared between languages in future, but I don't propose to do that as part of this work.
  • Instantiate the abstract class with direct regex executions like String.matches and compiled regex executions like Pattern.compile followed by Pattern.matches, all of which are currently modelled.
  • Update code that uses regex matches to use the concept and check that all tests pass.
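
For readers less familiar with the Java APIs involved, the regex execution shapes named in the list above look roughly like this in analysed code. This is only an illustrative sketch: the class and method names (`RegexExecutionShapes`, `direct`, `staticMatch`, `compiled`) and the pattern are made up for this comment and do not appear in the PR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class, for illustration only; not part of the PR.
public class RegexExecutionShapes {
    // Direct execution: the regex and the input meet in a single call.
    static boolean direct(String input) {
        return input.matches("[a-z]+[0-9]*");
    }

    // Static helper: Pattern.matches(regex, input) compiles and matches in one step.
    static boolean staticMatch(String input) {
        return Pattern.matches("[a-z]+[0-9]*", input);
    }

    // Compiled execution: the regex is supplied at Pattern.compile, and the
    // input is supplied later through Pattern.matcher(...) and Matcher.matches().
    static boolean compiled(String input) {
        Pattern p = Pattern.compile("[a-z]+[0-9]*");
        Matcher m = p.matcher(input);
        return m.matches();
    }
}
```

In the compiled form the regex and the string being matched appear in different calls, which is presumably why a concept that exposes both is more convenient than looking at a single call in isolation.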

I expect some reduction in FPs because the final commit expands the scope of some sanitizers to include more regex execution methods. However, I don't want to run DCA/QA/MRVA to quantify that yet, because I am doing some follow-up work to add the @javax.validation.constraints.Pattern annotation as well.
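
To make the sanitizer point concrete, a regex guard in analysed Java code has roughly the following shape. This is a hypothetical sketch: the class `GuardExample`, the method `safeName` and the allow-list pattern are invented for illustration, and whether a given query treats such a check as a sanitizer depends on that query and on the pattern used.

```java
import java.util.regex.Pattern;

// Hypothetical class, for illustration only; not part of the PR.
public class GuardExample {
    // Assumed allow-list: letters, digits and underscores only.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_]+");

    // Returns the input unchanged when the regex guard passes,
    // and throws IllegalArgumentException otherwise.
    static String safeName(String input) {
        if (!SAFE.matcher(input).matches()) {
            throw new IllegalArgumentException("unexpected characters in name");
        }
        return input;
    }
}
```

Here the guard is written with a compiled pattern and Matcher.matches rather than String.matches, which is one of the additional execution forms the sentence above refers to.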

I welcome discussion about how best to deal with the fact that I want the concept to be the same as in other languages, which use data flow nodes, whereas Java mostly works at the level of expressions. I've kind of done both for now. I think that as we spread the use of shared libraries, this difference in approach will eventually go away, or at least become easier to fix.

Note: this PR currently lacks a change note. If I finish the follow-up work quickly I may end up writing a combined one in that PR, which will build on this one. But it is still worth reviewing this PR, as it breaks the review work up into smaller units.

Copilot AI review requested due to automatic review settings February 11, 2026 13:20
@owen-mc owen-mc requested a review from a team as a code owner February 11, 2026 13:20
@github-actions github-actions bot added the Java label Feb 11, 2026

Copilot AI left a comment

Pull request overview

This PR introduces a shared “RegexExecution” concept for Java CodeQL libraries and migrates existing regex-related models/queries to use it, aiming to unify regex execution modeling across languages and reduce false positives.

Changes:

  • Added semmle.code.java.Concepts with RegexExecution / RegexExecutionExpr concepts and wired it into java.qll.
  • Extended existing Java regex API models (String.matches, Pattern.compile, Pattern.matches, Matcher.matches) to implement the new concept.
  • Updated experimental and security libraries to use the new concept types instead of bespoke regex helpers.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| java/ql/src/experimental/Security/CWE/CWE-625/Regex.qll | Removes the (deprecated) local regex helper module. |
| java/ql/src/experimental/Security/CWE/CWE-625/PermissiveDotRegexQuery.qll | Switches to framework regex models/concepts for sinks and matching logic. |
| java/ql/lib/semmle/code/java/security/regexp/RegexInjection.qll | Updates sanitizer modeling to use the new PatternCompileCall abstraction. |
| java/ql/lib/semmle/code/java/security/Sanitizers.qll | Reworks regexp-guard matching to use RegexExecutionExpr::Range. |
| java/ql/lib/semmle/code/java/security/PathSanitizer.qll | Updates directory-character matching guard logic to use the new regex execution concept. |
| java/ql/lib/semmle/code/java/frameworks/Regex.qll | Adds call/model classes (e.g., PatternCompileCall, PatternMatchesCall, MatcherMatchesCall) implementing RegexExecutionExpr::Range. |
| java/ql/lib/semmle/code/java/JDK.qll | Updates StringMatchesCall to implement RegexExecutionExpr::Range. |
| java/ql/lib/semmle/code/java/Concepts.qll | New Java Concepts library including RegexExecution and related modeling modules. |
| java/ql/lib/java.qll | Exports the new Concepts library by importing it. |

Comments suppressed due to low confidence (1)

java/ql/lib/semmle/code/java/frameworks/Regex.qll:96

  • Doc comment punctuation: this comment is missing a trailing period, unlike most other doc comments in this file.
/** A call to the `matches` method of `java.util.regex.Matcher` */
class MatcherMatchesCall extends MethodCall, RegexExecutionExpr::Range {

/** Gets the expression for the regex being executed by this node. */
abstract Expr getRegex();

/** Gets a expression for the string to be searched or matched against. */

Copilot AI Feb 11, 2026

Grammar: "Gets a expression" should be "Gets an expression".

Suggested change
- /** Gets a expression for the string to be searched or matched against. */
+ /** Gets an expression for the string to be searched or matched against. */

Comment on lines +74 to +85
/** A call to the `compile` method of `java.util.regex.Pattern` */
class PatternCompileCall extends MethodCall {
  PatternCompileCall() { this.getMethod() instanceof PatternCompileMethod }
}

/** A call to the `matcher` method of `java.util.regex.Pattern` */
class PatternMatcherCall extends MethodCall {
  PatternMatcherCall() { this.getMethod() instanceof PatternMatcherMethod }
}

/** A call to the `matches` method of `java.util.regex.Pattern` */
class PatternMatchesCall extends MethodCall, RegexExecutionExpr::Range {

Copilot AI Feb 11, 2026

These doc comments are missing trailing periods, while other doc comments in this file include them. Consider adding periods for consistency.

This issue also appears on line 95 of the same file.
