[SPARK-57249][SQL] Skip the dead result null assignment in DecimalDivideWithOverflowCheck codegen when overflow cannot produce null #56304
Open
LuciferYang wants to merge 1 commit into
…ideWithOverflowCheck codegen when overflow cannot produce null DecimalDivideWithOverflowCheck.doGenCode always emitted `ev.isNull = ev.value == null;` after the `toPrecision` call in the non-null-left branch. The result can only become null in the nullOnOverflow path: `Decimal.toPrecision` returns null only when `nullOnOverflow` is true and overflow occurs; otherwise it throws on overflow and never returns null. On that branch `ev.isNull` is already initialized from `eval1.isNull` (false, since the branch runs only when the left operand is non-null). So when nullOnOverflow is false the assignment is a dead `ev.isNull = false` write. Gate the assignment on nullOnOverflow. This drops the dead write for the common ANSI path (the expression is produced by decimal Average, with nullOnOverflow = evalMode != ANSI, so false by default). Behavior is unchanged. Also adds a direct unit test, which the expression previously lacked.
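The gating described above can be sketched in isolation. This is a minimal, self-contained model and not the PR's actual code: `NullCheckGating`, `toPrecision`, and `nullCheck` are hypothetical names, and this `toPrecision` only models the overflow contract stated in the description (it does no rounding or scale handling).

```scala
object NullCheckGating {
  // Simplified stand-in for the contract described for Decimal.toPrecision:
  // on overflow it returns null when nullOnOverflow is true, otherwise it throws.
  // (Assumption: "overflow" is modeled as exceeding the target precision.)
  def toPrecision(
      d: java.math.BigDecimal,
      precision: Int,
      nullOnOverflow: Boolean): java.math.BigDecimal = {
    if (d.precision() <= precision) d
    else if (nullOnOverflow) null
    else throw new ArithmeticException(s"$d does not fit in precision $precision")
  }

  // Hypothetical codegen helper: emit the Java null check only when a null
  // result is actually possible, i.e. when nullOnOverflow is true.
  def nullCheck(isNull: String, value: String, nullOnOverflow: Boolean): String =
    if (nullOnOverflow) s"$isNull = $value == null;" else ""
}
```

With `nullOnOverflow = false`, `toPrecision` either returns a non-null value or throws, so the emitted check would always assign `false`; skipping it leaves the already-initialized `isNull` untouched.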
What changes were proposed in this pull request?
This is a sub-task of SPARK-56908 (reduce generated Java size in whole-stage codegen).
`DecimalDivideWithOverflowCheck.doGenCode` always emitted the result null check, `ev.isNull = ev.value == null;`, after the `toPrecision` call in the non-null-left branch.
The result can only become null in the `nullOnOverflow` path: `Decimal.toPrecision` returns null only when `nullOnOverflow` is true and overflow occurs; otherwise it throws on overflow and never returns null. On this branch `ev.isNull` is already initialized from `eval1.isNull` (which is false here, since the branch runs only when the left operand is non-null). So when `nullOnOverflow` is false, `ev.value == null` is always false and the assignment is a dead `ev.isNull = false;` write.
This PR gates the assignment on `nullOnOverflow`.
`nullOnOverflow` is the right predicate: `nullable = nullOnOverflow`, and on this branch the left operand is non-null, so child nullability is irrelevant. When `nullOnOverflow` is true the assignment is kept (it is what turns a division overflow into a null result). The null-left handling and the throw-on-overflow path are unchanged.
Why are the changes needed?
To reduce the size of the generated Java in whole-stage codegen, as tracked by SPARK-56908. The expression is produced by decimal `Average` (`DecimalDivideWithOverflowCheck(sum, count, resultType, ctx, evalMode != ANSI)`), so under ANSI mode (the default) `nullOnOverflow` is false and this dead write was emitted on the decimal `avg` codegen path.
This completes the decimal dead-write cleanup that also covers `MakeDecimal` and `CheckOverflow` (the remaining `== null` checks in `decimalExpressions.scala`, e.g. `CheckOverflowInSum`, are live and not removable).
Does this PR introduce any user-facing change?
No. This is a codegen-only change; `eval`, `nullable`, `dataType`, `toString`, and results are unchanged, so SQL output and plan/golden files are unaffected.
How was this patch tested?
Adds a direct unit test in `DecimalExpressionSuite` (the expression previously had no direct test), covering `nullOnOverflow` false/true with no overflow, with an overflowing result (null when `nullOnOverflow`, error otherwise), and a null left operand (null vs error). `checkEvaluation` runs interpreted and codegen (mutable and unsafe projections). Also ran the decimal `AVG` aggregate tests in `DataFrameAggregateSuite`.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)