Simple and out of the box Bigquery SQL Parser for python. Convert your SQL into python object which you can modify programatically.
pip3 install bigquery-sql-parserSince the SQL will be converted to Python object, hence the variable name will be very important to understand.
Please see how we use the variable and terms below.
Below is what we call syntax of query, and Query is the object python name.
# path/query.sql
SELECT
id,
name AS customer_name,
FROM
customer_table AS customerid, customer_name is columns and we have a Column python object for it.
idis theColumn.valueas long as theColumn.name, butnameisColumn.valueandcustomer_nameis theColumn.name
Got the different of value and name of Column object? I hope so.
Python sample:
from bigquery_sql_parser.query import Query
query = Query(your_query_here)
columns = query.columns
for column in query.columns
print(column.value, column.name)
#> id, id
#> name, customer_nameYou can just get the table name by calling Query Object Query.full_table_ids
from bigquery_sql_parser.query import Query
query = Query(your_query_here)
print(query.full_table_ids)Well, if you use CTE like this
WITH
cte1 AS (
xxxx
),
cte2 AS (
xxxx
)
SELECT *
FROM cte2,cte1It can't using Query object directly, you need to use Script instead
from bigquery_sql_parser.script import Script
from bigquery_sql_parser.query import Query
script = Script.from_file('query_path.sql')
cte_list = script.ctes
for cte in cte_list:
print(cte.name)
print(cte.text)
query = Query(cte.text)
query... # any query variable can call hereSince BigQuery support scripting, sometime you might want to scan query that has a lot of blocks and use Semicolon as separator.
CREATE FUNCTION x AS (());
DELETE TABLE x;
INSERT INTO x;
WITH cte1 AS (
SELECT * FROM table
)
SELECT
a AS b,
c AS d
FROM table_1It can't using Query object directly, you need to use Script instead
from bigquery_sql_parser.script import Script
from bigquery_sql_parser.query import Query
script = Script.from_file('query_path.sql')
blocks = script.blocks
for block in blocks:
if block.is_cte_statement and block.is_select_statement:
print(cte.name)
print(cte.text)
cte_query = Query(cte.text)
cte_query...
final_query = Query(block.final_cte_query)
final_query...And keep counting for others to supported...
- Where/Filter statement Object
- Column Functions, Complex Column Composition
- DML Object
- Adding context from BigQuery actual table like its partition, clusters, description, etcs
- Write back from python Object to actual query
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.