Name: Lexega
Author: Lexega

This masking policy compiles, attaches to its columns, and registers in the governance inventory as a control. An existence-check audit confirms it's there:

CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING)
  RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('PII_ADMIN') THEN val
    WHEN CURRENT_ROLE() IN ('SUPPORT_TIER2') THEN val
    ELSE val
  END;

Every branch returns val: both role checks and the ELSE. Every row's actual value for that column gets returned to everyone, not just the privileged roles. The policy is a no-op.

A linter won't catch this, and it shouldn't be expected to. This is the boundary between linting and security analysis. SQLFluff and TSQLLint check how the SQL is written (layout, naming, dialect hygiene), and they're good at it. Keep them. What they can't tell you is whether a masking policy masks, whether dynamic SQL is injectable, whether a nullable join silently changes the result set, or whether a one-line migration just removed a load-bearing control.

A SQL linter mainly answers: is this SQL shaped correctly?

Lexega answers: can this SQL expose data, weaken a control, misstate a result, or destroy records?

Those are different gates. You want both.

Answering the second question is semantic SQL security analysis: following taint into dynamic SQL, evaluating what a policy body returns on each branch, reasoning about how NULLs move through a predicate, working out who a grant actually reaches. That's the gap this post walks through: six classes of failure across three axes (control, confidentiality, integrity), each shown as SQL that lints clean next to the finding that fires on it.

A masking policy that masks nothing

Confidentiality — exposure

Finding the bug above means evaluating what the policy body returns on each path:

[CRITICAL] MASK-ALLOW-ALL — Masking Policy body passes through the original value
without masking. Policy is effectively a no-op and sensitive data may be exposed.

Paste the policy into the Lexega Playground and watch it fail security review. That finding block is real output, like every one in this post (shown at Medium severity and above). No install, no account.

There's a second angle on the same exposure. Columns tagged as PII get traced through lineage to wherever they surface in an output (Q-FLOW-TAINT), so the finding lands at the point data leaves, even when nobody touched a policy at all.

SQL injection through dynamic SQL

Control — injection

CREATE OR REPLACE PROCEDURE purge_user_sessions(user_name VARCHAR)
  RETURNS VARCHAR
  LANGUAGE SQL
AS
$
BEGIN
  EXECUTE IMMEDIATE
    'DELETE FROM user_sessions WHERE user_name = ''' || user_name || '''';
  RETURN 'ok';
END;
$;

The parameter goes straight into the statement text through ||, so whoever controls user_name controls the statement that runs. It's textbook injection, just inside a stored procedure instead of an app. The safe version passes the value through EXECUTE IMMEDIATE ... USING so it's bound at execution rather than spliced into the string.

[CRITICAL] DYNSQL-CONCAT — Dynamic SQL argument is built via string concatenation
or FORMAT(...) interpolation. SQL injection risk — switch to a parameterized
query (USING clause / sp_executesql parameter binding).
[MEDIUM]   DYNSQL-NO-PARAM — Dynamic SQL executed without parameter binding when
the surface supports it (Snowflake USING, sp_executesql @params, EXECUTE … USING).
Bind runtime values rather than interpolating.

Two more signals fire on this statement at High (DYNSQL and PROC-DYNSQL, the statement-level and procedure-level views of the same vector); the two above carry the distinct guidance.

A SAST tool that taint-tracks SQL built up inside application code will catch the app-layer version of this. The blind spot is the SQL that ships as SQL (stored procedures, migrations, dbt models), where there's no host language for it to follow. Lexega tracks the same taint at the SQL level: across a procedure that calls another, and into sp_executesql calls where the parameter binding is there for appearances and the real value is still concatenated in. The Procedure That Looked Parameterized follows one of those chains end to end.

The LEFT JOIN that quietly becomes an INNER JOIN

Integrity — misstatement

SELECT
    t.transaction_id,
    t.amount,
    a.risk_rating
FROM transactions AS t
LEFT JOIN accounts AS a
    ON t.account_id = a.account_id
WHERE a.risk_rating = 'high';

This query keeps every transaction with a LEFT JOIN, then filters on a.risk_rating in the WHERE. For any transaction with no matching account, a.risk_rating is NULL, and NULL = 'high' is never true, so those rows drop and the LEFT JOIN collapses into the INNER JOIN nobody wrote. The rows that disappear are the transactions with no known account, which in a fraud or AML control are usually the ones that matter most. Nothing errors; the result set is just short.

SQLFluff parses this query, finds the style clean, and reports nothing. Lexega flags it because it follows the nullable side of the join: it knows a.risk_rating can be NULL specifically because of the LEFT JOIN, and that an equality test against a nullable column drops those rows. That's a property of what the query computes, not of how it's written.

[CRITICAL] Q-JOIN-LEFT-FILT — LEFT JOIN nullable side filtered in WHERE clause.
This effectively converts the LEFT JOIN to an INNER JOIN, likely a bug.

The other NULL hazard on this axis is the control that fails open:

DELETE FROM api_tokens
WHERE user_id NOT IN (
    SELECT user_id FROM active_employees
);

This DELETE revokes tokens for users no longer in active_employees. If that subquery ever returns a single NULL, NOT IN evaluates to UNKNOWN for every row, and the statement deletes nothing. No error, the run completes. A token-cleanup job written this way stops revoking the day a NULL appears in the employee list, and nothing about the run looks different.

The same shape breaks access reviews: a "who has access but isn't on the approved list" query written with NOT IN returns empty when the list contains a NULL, and empty is indistinguishable from a clean certification.

[HIGH] Q-NULL-NOTIN — NULL-logic hazard: NOT IN with subquery on column
'USER_ID'. If subquery returns any NULL, entire predicate evaluates to UNKNOWN
and filters all rows. Use NOT EXISTS or ensure subquery has WHERE ... IS NOT NULL.

Both findings turn on the analyzer knowing a column can be NULL and what one NULL does to a NOT IN. That's also what keeps it quiet on the safe versions: add WHERE user_id IS NOT NULL to the subquery, or handle the NULL side of the join, and neither signal fires.

GRANT ALL to PUBLIC, and the removal side of privilege

Control — privilege

GRANT ALL PRIVILEGES ON DATABASE analytics TO ROLE PUBLIC;

This hands every user in the account every privilege on the database, in one line. Nothing tests a GRANT, and as a one-line migration diff it reads like plumbing, so it clears review.

[CRITICAL] GRT-TO-PUBLIC — Avoid granting privileges to PUBLIC. The PUBLIC role
includes all users in the account, which may expose data unintentionally.
[HIGH]     GRT-ALL-PRIV — Avoid GRANT ALL PRIVILEGES. Use specific privilege
grants to follow the principle of least privilege.

You could nearly grep for this exact statement. The privilege surface stops being greppable one step out, though: ownership transfers (GRT-OWNER-XFER), broad role-scoped grants (GRT-BROAD-PRIV), and the removal side. DROP MASKING POLICY pii_email_mask; unmasks every column the policy governed, for every role that could already query the table (MASK-DROP, Critical). When an AI Agent Removes a Masking Policy walks through that shape, where an agent removed a policy while trying to fix a broken pipeline.

Cloud credentials hardcoded in DDL

Confidentiality — credentials

CREATE STAGE finance_export
    URL = 's3://acme-finance-exports/'
    CREDENTIALS = (AWS_KEY_ID = 'AKIAIOSFODNN7EXAMPLE'
                   AWS_SECRET_KEY = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY');

This CREATE STAGE hardcodes an AWS key pair in its CREDENTIALS clause. A secret scanner usually catches the AKIA access-key ID by its prefix. The secret key beside it has no prefix to match on, so prefix-based scanning misses it. Lexega has parsed the statement and knows that literal sits in the AWS_SECRET_KEY position of a stage's credentials, so that alone is enough to flag it. The same covers credential slots in COPY INTO, ALTER STAGE, and Redshift COPY/UNLOAD:

[CRITICAL] CRED-AWS-LEAK — Hardcoded AWS access key detected (AKIA.../ASIA...).
Never commit credentials to source code. Use secure parameter passing, secrets
managers, or storage integrations instead.
[CRITICAL] CRED-PWD-LEAK — Hardcoded password detected in SQL. Use secure
parameter passing (e.g., :password_param) or secrets management instead of
literal passwords.

The DELETE with a WHERE clause that still wipes the table

Integrity — destructive writes

DELETE FROM login_audit
WHERE retention_class = 'expired'
   OR retention_class <> 'expired'
   OR retention_class IS NULL;

This DELETE has three OR branches: = 'expired', <> 'expired', and IS NULL. Between them they cover every possible value of retention_class, so the WHERE matches every row and the table is emptied. A bare DELETE with no WHERE is a familiar enough footgun that some linters check for it, and WHERE 1 = 1 is the standard way past that check. But the check is really for whether a WHERE node is present, and this statement has one, built from real column names and three conditions that each look reasonable on their own.

Lexega evaluates what the predicate actually matches:

[CRITICAL] DML-WRITE-UNBOUNDED — Unbounded write operation detected - the
statement has no WHERE clause or its WHERE clause matches every row. This
affects ALL rows in the target table(s).
[MEDIUM]   Q-PRED-TAUTOLOGY — Tautological predicate: condition is always true
for non-NULL values (e.g., WHERE x=1 OR x<>1). The WHERE clause filters at most
NULL rows.

Same NULL discipline as everywhere else here: delete the IS NULL branch and the Critical goes away, because rows where retention_class is NULL would now survive, so the predicate no longer covers every row, leaving just the Medium. It's the same three-valued logic that made the NOT IN delete nothing, pointed the other way.

Unguarded DROP TABLE (TBL-DROP) and TRUNCATE (TBL-TRUNCATE) sit in the same bucket. On an audit or log table, a stray "cleanup" that empties it is what a regulator would call spoliation.

A finding isn't a verdict

Every signal in this post is raised unconditionally. The analyzer reports what's true about the SQL, and reports it the same way every run — it doesn't decide your merge. That decision is a separate layer, and it's yours to set.

Linters come by their false-positive reputation honestly: style is judgment calls all the way down, and the standard escape hatch is an inline -- noqa on the line under review. It works, and it's also invisible — the suppression ships inside the same diff, added by the person who wanted the warning gone.

Lexega keeps those two decisions apart. A policy maps each rule to block, warn, or allow, and can vary by environment — so the same finding can stop a production migration and merely annotate a feature branch:

policies:
  - rule_id: DML-WRITE-UNBOUNDED
    action: block
    envs: ["prod"]
    requires_exception: true   # blocked unless a reviewed exception exists
  - rule_id: GRT-TO-PUBLIC
    action: warn               # surfaced everywhere, blocks nothing

When a block is a genuine false positive — or a real risk that's been reviewed and accepted — the override is an exception, not a comment in the SQL. It carries the rule, a reason, a ticket, who approved it, an optional expiry, and the scope it covers:

exceptions:
  - exception_id: EX-2026-014
    policy_id: prod-data-safety
    rule_id: DML-WRITE-UNBOUNDED
    approved_by: [email protected]
    approved_at: "2026-06-05T00:00:00Z"
    ticket: DATA-1234
    reason: "One-time backfill, reviewed in DATA-1234"
    scope:
      scoped:
        path_prefixes: ["migrations/2026_06_backfill/"]
        expires_at: "2026-07-01T00:00:00Z"

Suppression still exists — every gate needs it — but it lives in a reviewed artifact with an owner, a scope, and an expiry, not a pragma a developer drops into a migration to clear CI. A blanket exception has to be declared blanket, on purpose. And every decision is recorded: which signals fired, which exceptions were used, against which bundle. Waving one through becomes a line in the audit trail instead of a silent edit. The knobs are there; the quiet evasion isn't.

Run both

Run SQLFluff across every example above and it does its job: an unindented line here, an uppercase identifier there, nothing about the risk in any of them, because that was never the job. The risk gate is the semantic analysis this post walked through, and Lexega runs 600+ rules of it across Snowflake, PostgreSQL, BigQuery, MySQL, MSSQL, Databricks, and Redshift. It doesn't replace the linter; run it next to one as a separate gate:

- run: sqlfluff lint models/                                # style gate
- run: lexega-sql review main..HEAD models/ -r --pr-comment # risk gate

For how signals, rules, and policy decisions fit together behind that second gate, see How Lexega Turns SQL Into Signals.

Try it, including the SQL you think is safe

1. Try the examples in the playground. Paste any example above into the Lexega Playground and the signals fire. The better test is making them stop: add WHERE user_id IS NOT NULL to the NOT IN subquery, handle the NULL side of the LEFT JOIN, bind the dynamic SQL through USING. When the SQL is actually safe, the findings go away. Then try one of your own migrations.

2. Get a free trial key. 30 days, full functionality, no credit card, no account. Instant activation.

3. Run it locally and in CI.

curl -fsSL https://lexega.com/install.sh | sh
export LEXEGA_LICENSE_KEY=<your-key>
lexega-sql analyze --dialect snowflake migration.sql --min-severity medium