Bound.qll - Replace utility for range analysis duplicate across java and cs with shared file #21900
BazookaMusic wants to merge 14 commits into
Pull request overview
This PR reduces Java/C# duplication in range analysis by introducing a shared Bound library under codeql/rangeanalysis and switching the language-specific libraries to instantiate it via per-language definition modules.
Changes:
- Added `shared/rangeanalysis/codeql/rangeanalysis/Bound.qll` as the shared implementation.
- Updated Java and C# bound libraries to use the shared implementation via `BoundSpecific::BoundDefs`.
- Updated packaging/config to support the new shared dependency and removed the now-obsolete “identical files” entry.
| File | Description |
|---|---|
| shared/rangeanalysis/codeql/rangeanalysis/Bound.qll | Introduces shared bound abstractions and a parameterized Bound module. |
| java/ql/lib/semmle/code/java/dataflow/internal/rangeanalysis/BoundSpecific.qll | Defines Java bindings (BoundDefs) implementing the shared bound signature. |
| java/ql/lib/semmle/code/java/dataflow/Bound.qll | Replaces the Java-specific bound implementation with an instantiation of the shared module. |
| csharp/ql/lib/semmle/code/csharp/dataflow/internal/rangeanalysis/BoundSpecific.qll | Defines C# bindings (BoundDefs) implementing the shared bound signature. |
| csharp/ql/lib/semmle/code/csharp/dataflow/Bound.qll | Replaces the C#-specific bound implementation with an instantiation of the shared module. |
| csharp/ql/lib/qlpack.yml | Adds the codeql/rangeanalysis dependency required by the new shared import. |
| config/identical-files.json | Removes the Java/C# “Bound” identical-files entry since the code is now shared. |
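The pattern summarized in the table (a shared parameterized module plus per-language definition modules) can be sketched roughly as below. This is an illustration only, not the contents of the actual `Bound.qll`: `ExampleDefsSig`, `ExampleBound`, and the `Expr` class are hypothetical names, the real module also takes a location parameter that is omitted here, and `interestingExprBound` is borrowed from the diff excerpt in this thread.

```ql
/** Hypothetical signature: what each language has to supply. */
signature module ExampleDefsSig {
  /** A language-specific expression type. */
  class Expr {
    string toString();
  }

  /** Holds if `e` should be given its own bound. */
  predicate interestingExprBound(Expr e);
}

/** Hypothetical shared module, parameterized over the language definitions. */
module ExampleBound<ExampleDefsSig Defs> {
  private newtype TBound =
    TBoundZero() or
    TBoundExpr(Defs::Expr e) { Defs::interestingExprBound(e) }

  /** A bound: either the constant zero or an interesting expression. */
  class Bound extends TBound {
    string toString() {
      this = TBoundZero() and result = "0"
      or
      exists(Defs::Expr e | this = TBoundExpr(e) and result = e.toString())
    }
  }
}
```

Under these assumptions, each language would provide one module implementing `ExampleDefsSig` and instantiate `ExampleBound` with it, so the bound logic itself lives in a single place.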
Copilot's findings
- Files reviewed: 7/7 changed files
- Comments generated: 2
michaelnebel left a comment
Very nice! Well done 😄
We should also run DCA after the review process is done.
    interestingExprBound(e) and
    not exists(SsaVariable v | e = v.getAUse())
    }
    private module BoundImpl = SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>;
Suggested change:

    - private module BoundImpl = SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>;
    + import SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>
Actually, there's a reason for naming the resulting module before importing it. It can help provide shorter names in RA/DIL. There's a mechanism in the compiler that makes use of such module aliases to try to shorten names, so we could have BoundImpl::foo as opposed to SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>::foo.
And it's actually not insignificant - before introducing that mechanism we had a brief period of time where the evaluator log associated with certain runs of the data flow library could yield OOM - simply due to very long names.
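For reference, the two forms under discussion look roughly like this. It is a fragment rather than a standalone file: `SharedBound`, `CS`, and `BoundSpecific` are import aliases from the PR, and the trailing `import BoundImpl` is an assumption, since the diff excerpt only shows the alias line.

```ql
// Form A (the suggestion): import the instantiation directly.
// Names may then appear in RA/DIL as
//   SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>::foo
//
// import SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>

// Form B (the original): name the instantiation first, then import the alias.
// The compiler can use the alias to shorten names to BoundImpl::foo.
private module BoundImpl = SharedBound::Bound<CS::Location, BoundSpecific::BoundDefs>;

import BoundImpl // assumed; not shown in the diff excerpt
```

Both forms expose the same members to importers; the difference described above is only in how long the generated names are.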
Interesting that this mechanism doesn't do something like SharedBound[stable hash of generic arguments appended] so we don't have to name it ourselves. But I will add it back.
I guess we'd need to keep a separate mapping from aliases to modules if we did name generation and there's already one for aliases.
Uh, I didn't know. @aschackmull: Since this is not insignificant, do you know whether the foundations team will address the issue (either by somehow shortening the names or by making it impossible to import without first creating an alias)?
michaelnebel left a comment
LGTM 👍
We should run DCA for Java and C# before merging.
We have this doc of identical files across languages: https://github.com/github/codeql/blob/main/config/identical-files.json
As an exercise, I created a shared library and removed the duplication between CS and Java.