These are low-level functions designed to be used by experts. Each is paired with a high-level function that you should use instead:
- bq_perform_copy(): bq_table_copy()
- bq_perform_query(): bq_dataset_query(), bq_project_query()
- bq_perform_upload(): bq_table_upload()
- bq_perform_load(): bq_table_load()
- bq_perform_extract(): bq_table_save()
Usage
bq_perform_extract(
  x,
  destination_uris,
  destination_format = "NEWLINE_DELIMITED_JSON",
  compression = "NONE",
  ...,
  print_header = TRUE,
  billing = x$project
)

bq_perform_upload(
  x,
  values,
  fields = NULL,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  ...,
  billing = x$project
)

bq_perform_load(
  x,
  source_uris,
  billing = x$project,
  source_format = "NEWLINE_DELIMITED_JSON",
  fields = NULL,
  nskip = 0,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  ...
)

bq_perform_query(
  query,
  billing,
  ...,
  parameters = NULL,
  destination_table = NULL,
  default_dataset = NULL,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  use_legacy_sql = FALSE,
  priority = "INTERACTIVE"
)

bq_perform_query_dry_run(
  query,
  billing,
  ...,
  default_dataset = NULL,
  parameters = NULL,
  use_legacy_sql = FALSE
)

bq_perform_copy(
  src,
  dest,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  ...,
  billing = NULL
)
Arguments
- x
A bq_table
- destination_uris
A character vector of fully-qualified Google Cloud Storage URIs where the extracted table should be written. Can export up to 1 GB of data per file. Use a wildcard URI (e.g. gs://[YOUR_BUCKET]/file-name-*.json) to automatically create any number of files. See the Examples section for an extract sketch.
- destination_format
The exported file format. Possible values include "CSV", "NEWLINE_DELIMITED_JSON" and "AVRO". Tables with nested or repeated fields cannot be exported as CSV.
- compression
The compression type to use for exported files. Possible values include "GZIP", "DEFLATE", "SNAPPY", and "NONE". "DEFLATE" and "SNAPPY" are only supported for Avro.
- ...
Additional arguments passed on to the underlying API call. snake_case names are automatically converted to camelCase.
- print_header
Whether to print out a header row in the results.
- billing
Identifier of project to bill.
- values
Data frame of values to insert.
- fields
A bq_fields specification, or something coercible to it (like a data frame). Leave as NULL to allow BigQuery to auto-detect the fields.
- create_disposition
Specifies whether the job is allowed to create new tables.
The following values are supported:
"CREATE_IF_NEEDED": If the table does not exist, BigQuery creates the table.
"CREATE_NEVER": The table must already exist. If it does not, a 'notFound' error is returned in the job result.
- write_disposition
Specifies the action that occurs if the destination table already exists. The following values are supported:
"WRITE_TRUNCATE": If the table already exists, BigQuery overwrites the table data.
"WRITE_APPEND": If the table already exists, BigQuery appends the data to the table.
"WRITE_EMPTY": If the table already exists and contains data, a 'duplicate' error is returned in the job result.
- source_uris
The fully-qualified URIs that point to your data in Google Cloud.
For Google Cloud Storage URIs: each URI can contain one '*' wildcard character, and it must come after the bucket name. Size limits related to load jobs apply to external data sources.
For Google Cloud Bigtable URIs: exactly one URI can be specified, and it has to be a fully specified and valid HTTPS URL for a Google Cloud Bigtable table.
For Google Cloud Datastore backups: exactly one URI can be specified, and the '*' wildcard character is not allowed.
- source_format
The format of the data files:
For CSV files, specify "CSV".
For datastore backups, specify "DATASTORE_BACKUP".
For newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON".
For Avro, specify "AVRO".
For Parquet, specify "PARQUET".
For ORC, specify "ORC".
- nskip
For source_format = "CSV", the number of header rows to skip (see the Examples section for a load sketch).
- query
SQL query string.
- parameters
Named list of parameters matched to query parameters. Parameter x will be matched to placeholder @x. Generally, you can supply R vectors and they will be automatically converted to the correct type. If you need greater control, you can call bq_param_scalar() or bq_param_array() explicitly. See https://cloud.google.com/bigquery/docs/parameterized-queries for more details, and the Examples section below for a parameterised query sketch.
- destination_table
A bq_table where results should be stored. If not supplied, results will be saved to a temporary table that lives in a special dataset. You must supply this parameter for large queries (> 128 MB compressed).
- default_dataset
A bq_dataset used to automatically qualify table names.
- use_legacy_sql
If TRUE, will use BigQuery's legacy SQL format.
- priority
Specifies a priority for the query. Possible values include "INTERACTIVE" and "BATCH". Batch queries do not start immediately, but are not rate-limited in the same way as interactive queries.
Value
A bq_job.
Examples
ds <- bq_test_dataset()
bq_mtcars <- bq_table(ds, "mtcars")
job <- bq_perform_upload(bq_mtcars, mtcars)
bq_table_exists(bq_mtcars)
#> [1] FALSE
bq_job_wait(job)
bq_table_exists(bq_mtcars)
#> [1] TRUE
head(bq_table_download(bq_mtcars))
#> # A tibble: 6 × 11
#> am carb vs qsec wt drat disp hp cyl gear mpg
#> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <dbl>
#> 1 0 2 1 20 3.19 3.69 147. 62 4 4 24.4
#> 2 0 2 1 22.9 3.15 3.92 141. 95 4 4 22.8
#> 3 0 1 1 20.0 2.46 3.7 120. 97 4 3 21.5
#> 4 0 1 1 19.4 3.22 3.08 258 110 6 3 21.4
#> 5 0 1 1 20.2 3.46 2.76 225 105 6 3 18.1
#> 6 0 4 1 18.3 3.44 3.92 168. 123 6 4 19.2
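A sketch of an extract job with a wildcard destination URI; gs://my-bucket is a hypothetical bucket that you would need to create and have write access to.
# Export the uploaded table to Cloud Storage as gzipped JSON;
# the wildcard lets BigQuery create as many files as it needs.
job <- bq_perform_extract(
  bq_mtcars,
  destination_uris = "gs://my-bucket/mtcars-*.json",
  compression = "GZIP"
)
bq_job_wait(job)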
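A sketch of a load job from a CSV file in Cloud Storage; the source URI is again hypothetical. nskip = 1 skips the header row, and fields = NULL (the default) lets BigQuery auto-detect the schema.
# Load a CSV from Cloud Storage into a new table in the test dataset.
bq_mtcars2 <- bq_table(ds, "mtcars2")
job <- bq_perform_load(
  bq_mtcars2,
  source_uris = "gs://my-bucket/mtcars.csv",
  source_format = "CSV",
  nskip = 1
)
bq_job_wait(job)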
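A sketch of a parameterised query against the table created above: the named element cyl is matched to the placeholder @cyl, and default_dataset = ds lets the query refer to mtcars without qualification.
job <- bq_perform_query(
  "SELECT mpg, cyl FROM mtcars WHERE cyl = @cyl",
  billing = ds$project,
  default_dataset = ds,
  parameters = list(cyl = 4L)
)
bq_job_wait(job)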