
Querying Large CSV Datasets

 

 

Use the Filter

From the CSV view page, select Add Filter, choose a column that contains many repeated values, and then download the filtered data.

 

Use the API

Data Source

Any large CSV file with more than one million rows.

  1. Using the datastore URL to query large datasets:

Download chunks defined by the limit and offset parameters from the datastore, using the following URL syntax:

https://data.cnra.ca.gov/datastore/dump/{resource id}?limit=20&offset=20

Replace {resource id} with the ID of your resource, and set the limit and offset numbers as follows:

  • Limit = the number of records to download.
  • Offset = the number of records to skip before the first record downloaded (an offset of 0 starts at the first record).

 

So, to download the first 10 records of the dataset, use the limit parameter: 

https://data.cnra.ca.gov/datastore/dump/bfa9f262-24a1-45bd-8dc8-138bc8107266?limit=10

(The offset parameter is omitted here, so the download starts at the first record.)

 

To download the second 10 records of the dataset, use the limit and offset parameters like this:

https://data.cnra.ca.gov/datastore/dump/bfa9f262-24a1-45bd-8dc8-138bc8107266?limit=10&offset=10
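
The two URLs above can also be assembled in a script. The following is a minimal Python sketch, not an official client: the function name build_dump_url and the constant BASE_URL are hypothetical names chosen for illustration, and only the limit and offset parameters described above are used.

```python
from urllib.parse import urlencode

# Base of the dump endpoint shown in the URLs above.
BASE_URL = "https://data.cnra.ca.gov/datastore/dump"


def build_dump_url(resource_id, limit, offset=0):
    """Return a dump URL for `limit` records, skipping the first `offset` records.

    Hypothetical helper for illustration; it only assembles the URL string
    and does not contact the server.
    """
    params = {"limit": limit}
    if offset:  # leave offset out when starting from the first record
        params["offset"] = offset
    return f"{BASE_URL}/{resource_id}?{urlencode(params)}"


resource_id = "bfa9f262-24a1-45bd-8dc8-138bc8107266"
first_ten = build_dump_url(resource_id, limit=10)
second_ten = build_dump_url(resource_id, limit=10, offset=10)
print(first_ten)
print(second_ten)
```

The two printed lines should match the example URLs for the first and second 10 records.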

 

Downloading the File in Chunks of 500k Records with Three Calls:

  • get first chunk of 500k rows

https://data.cnra.ca.gov/datastore/dump/bfa9f262-24a1-45bd-8dc8-138bc8107266?limit=500000

 

  • get second chunk of 500k rows

https://data.cnra.ca.gov/datastore/dump/bfa9f262-24a1-45bd-8dc8-138bc8107266?limit=500000&offset=500000

 

  • get third chunk of 500k rows

https://data.cnra.ca.gov/datastore/dump/bfa9f262-24a1-45bd-8dc8-138bc8107266?limit=500000&offset=1000000
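
The three calls above follow a simple pattern: the offset grows by the chunk size on each call. As a rough Python sketch of that pattern, the code below generates one URL per chunk and saves each response to its own file. The names chunk_urls and download_chunks, the output file naming, and the assumption that the total row count is known in advance are all illustrative and not part of the datastore itself.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

# Base of the dump endpoint shown in the URLs above.
BASE_URL = "https://data.cnra.ca.gov/datastore/dump"


def chunk_urls(resource_id, total_rows, chunk_size):
    """Yield one dump URL per chunk, enough to cover `total_rows` records."""
    for offset in range(0, total_rows, chunk_size):
        params = {"limit": chunk_size}
        if offset:  # the first chunk needs no offset
            params["offset"] = offset
        yield f"{BASE_URL}/{resource_id}?{urlencode(params)}"


def download_chunks(resource_id, total_rows, chunk_size, prefix="chunk"):
    """Download every chunk to its own CSV file and return the file paths.

    Hypothetical helper: files are named <prefix>_1.csv, <prefix>_2.csv, ...
    """
    paths = []
    urls = chunk_urls(resource_id, total_rows, chunk_size)
    for number, url in enumerate(urls, start=1):
        path = f"{prefix}_{number}.csv"
        # Stream the response to disk instead of holding it in memory.
        with urlopen(url) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)
        paths.append(path)
    return paths


resource_id = "bfa9f262-24a1-45bd-8dc8-138bc8107266"
urls = list(chunk_urls(resource_id, total_rows=1_500_000, chunk_size=500_000))
for url in urls:
    print(url)
```

Calling download_chunks(resource_id, 1_500_000, 500_000) would issue the same three requests listed above and write three files. Streaming each response to disk keeps memory use low for chunks of this size.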

 

CKAN Datastore Documentation:

https://docs.ckan.org/en/2.9/maintaining/datastore.html#downloading-resources