51
votes

I'm new to Cassandra, and I have to export the result of a specific query to a CSV file.

I found the COPY command, but (from what I understand) it only lets you copy an existing table to a CSV file, whereas I want to write the output of my query directly to a CSV file. Is there any way to do this with the COPY command, or some other way?

My query looks like (select column1, column2 from table where condition = xy), and I'm using cqlsh.

14 Answers

64
votes

If you don't mind your data using a pipe ('|') as a delimiter, you can try using the -e flag on cqlsh. The -e flag allows you to send a query to Cassandra from the command prompt, where you could redirect or even perform a grep/awk/whatever on your output.

$ bin/cqlsh -e'SELECT video_id,title FROM stackoverflow.videos' > output.txt
$ cat output.txt

 video_id                             | title
--------------------------------------+---------------------------
 2977b806-df76-4dd7-a57e-11d361e72ce1 |                 Star Wars
 ab696e1f-78c0-45e6-893f-430e88db7f46 | The Witches of Whitewater
 15e6bc0d-6195-4d8b-ad25-771966c780c8 |              Pulp Fiction

(3 rows)

Older versions of cqlsh don't have the -e flag. For older versions of cqlsh, you can put your command into a file, and use the -f flag.

$ echo "SELECT video_id,title FROM stackoverflow.videos;" > select.cql
$ bin/cqlsh -f select.cql > output.txt

From here, doing a cat on output.txt should yield the same rows as above.
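
The table-shaped output above is still not CSV. As a minimal sketch, the filter below turns it into comma-separated rows. It assumes the layout shown in the sample (a header row, a dashed separator, pipe-delimited data rows and a trailing "(N rows)" line) and that no value contains a pipe itself. The function name cqlsh_to_csv is hypothetical, not something cqlsh provides.

```shell
# cqlsh_to_csv: hypothetical helper, not part of cqlsh.
# Reads cqlsh's tabular output on stdin and writes CSV on stdout.
# Assumption: values never contain a '|' character themselves.
cqlsh_to_csv() {
  awk -F'|' '
    /^[ \t]*$/            { next }   # skip blank lines
    /^-+(\+-+)*$/         { next }   # skip the ----+---- separator
    /^\([0-9]+ rows?\)$/  { next }   # skip the "(3 rows)" footer
    {
      line = ""
      for (i = 1; i <= NF; i++) {
        field = $i
        gsub(/^[ \t]+|[ \t]+$/, "", field)   # trim column padding only
        if (field ~ /[",]/) {                # quote fields that need it
          gsub(/"/, "\"\"", field)
          field = "\"" field "\""
        }
        line = line (i > 1 ? "," : "") field
      }
      print line
    }
  '
}

# Usage (requires a reachable cluster):
#   bin/cqlsh -e'SELECT video_id,title FROM stackoverflow.videos' | cqlsh_to_csv > output.csv
```

Only the padding around each column is trimmed, so a title such as "Star Wars" keeps its inner space.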

45
votes
  1. Use the CAPTURE command to write the query result to a file.
cqlsh> CAPTURE
cqlsh> CAPTURE '/home/Desktop/user.csv';
Now capturing query output to '/home/Desktop/user.csv'.
cqlsh> select * from user;

Now view the output of the query in /home/Desktop/user.csv. Note that the captured text is the same pipe-delimited table that cqlsh prints on screen, not comma-separated values.

  2. Use DevCenter and execute the query. Right-click on the output and select "Copy All as CSV", then paste the result into a CSV file.

10
votes

I just wrote a tool that exports the result of a CQL query to CSV or JSON. Give it a try :)

https://github.com/tenmax/cqlkit

5
votes
4
votes

On Windows, double quotes should be used to enclose the CQL statement.

cqlsh -e"SELECT video_id,title FROM stackoverflow.videos" > output.txt

4
votes

You can use the COPY command to create the CSV file, for example to copy a table with selected columns. The column list is optional; if you omit it, every column is exported.

COPY table_name (col1, col2) TO 'filename.csv' WITH HEADER = TRUE;

Use HEADER = FALSE to leave out the header row. For more details, see https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshCopy.html

4
votes

As of 2020, you can use DSBulk to export or import data to/from CSV (the default) or JSON. It can be as simple as:

dsbulk unload -k keyspace -t table -u user -p password -url filename

DSBulk is heavily optimized for fast data export, and it avoids the heavy load on the coordinator node that you get when you just run select * from table.

You can control which columns to export, and even provide your own query. See the following blog posts for examples:

3
votes

If I am understanding correctly, you want to redirect your output to stdout?

Put your CQL command in a file. My file is called select.cql and its contents are:

select id from wiki.solr limit 100;

Then run the following and the result goes to stdout:

cqlsh < select.cql

I hope this helps. From there on you can pipe it and add commas, remove headers etc.

3
votes

Cannot comment... To deal with the "MORE" prompt that appears when there are more than 100 rows, simply add "PAGING OFF" before the CQL query.

Something like

$ bin/cqlsh -e'PAGING OFF;SELECT video_id,title FROM stackoverflow.videos' > output.txt

This leaves a little clutter at the beginning of the output file, but it can easily be removed afterwards.
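
The extra text at the top is the notice cqlsh prints when paging is switched off. A small sketch for dropping it follows. It assumes the notice is the single line "Disabled Query paging.", which is worth checking against your cqlsh version, and the function name strip_paging_notice is made up for illustration.

```shell
# strip_paging_notice: hypothetical helper name.
# Assumption: cqlsh acknowledges PAGING OFF with the single line
# "Disabled Query paging."; adjust the pattern if your version differs.
strip_paging_notice() {
  sed '/^Disabled Query paging\.$/d'
}

# Usage (requires a reachable cluster):
#   bin/cqlsh -e'PAGING OFF;SELECT video_id,title FROM stackoverflow.videos' | strip_paging_notice > output.txt
```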

3
votes

With bash:

If you need to query the data (not possible with COPY TO) and you need the final product to be importable (i.e. with COPY FROM):

cqlsh -e "SELECT * FROM bar WHERE column = 'baz'" > raw_output.txt

Then you can reformat the output with sed

sed 's/\ //g; /^----.*/d; /^(/d; /^\s*$/d;' raw_output.txt | tee clean_output.csv

Which pretty much says

sed 'remove spaces; remove the column border; remove lines beginning with "(" such as the (N rows) footer; and remove blank lines' | write output into clean_output.csv

Note that the first expression removes every space, including spaces inside values. The sed expressions could be cleaned up to better suit your specific case, but that's the general idea.

0
votes

CQL COPY is a good option for importing or exporting data. But if you just want to analyze the output of a small query, you can run the command below and save the output in a file.

cqlsh -e "SELECT * FROM table WHERE column = 'xyz'" > queryoutput.txt

You can also use CAPTURE to save the output of a query for analysis.

0
votes

Follow the steps below to selectively export and import Cassandra data.

Exporting:

  • Write all the select queries in a file named dump.cql, as below

    paging off;

    select * from student where id=10;

    select * from student where id=15;

Note: paging off is mandatory above the queries, to avoid limiting the query results to the default of 100 records

  • Create a dump

cqlsh -u user_name -p 'password' ip_address -k keyspace_name -f dump.cql > dump.csv;

(for remote machine)

or

cqlsh -k keyspace_name -f dump.cql > dump.csv;

(for local machine)

  • Remove whitespace characters from the dump (the expression below leaves whitespace inside double-quoted JSON data untouched)

sed -r 's/(\".*\")|\s*/\1/g' dump.csv > data_without_spaces.csv

Importing:

cqlsh -e "copy keyspace_name.table_name from 'data_without_spaces.csv' with delimiter = '|';"
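
Before running the import, it can help to confirm that every line of the cleaned file has the same number of fields. The sketch below is one way to do that; check_field_counts is a hypothetical helper, and it takes the file and the delimiter as arguments.

```shell
# check_field_counts: hypothetical helper.
# Prints every line of FILE ($1) whose number of DELIM-separated ($2) fields
# differs from the first line's, and returns non-zero if any such line exists.
check_field_counts() {
  awk -F"$2" '
    NR == 1        { expected = NF }
    NF != expected { printf "line %d: %d fields, expected %d\n", NR, NF, expected; bad = 1 }
    END            { exit bad }
  ' "$1"
}

# Usage:
#   check_field_counts data_without_spaces.csv '|'
```

A dump produced this way may still contain blank lines, dashed separators and "(N rows)" footers. The check reports those too, since each has a different field count from a data row.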

0
votes

As others have suggested, export the standard query output using ./cqlsh -e 'SELECT ...' > data.csv.

Once you have this, you can easily replace the pipes ( | ) with commas using Excel (if you have it installed).

  1. First open your file in a text editor (vi/notepad++) and delete the separator that cqlsh puts in (-----+-------+---), as well as the row count at the bottom.
  2. Open a new Excel workbook.
  3. Click on the Data tab.
  4. Click on "From Text/CSV" (top left).
  5. Select your file, specify the pipe symbol as the delimiter, and click Load.
  6. That will create an .xlsx file so you'll have to Save As .csv manually. That will strip Excel's formatting and leave you with commas.
-1
votes

The person asking asked for CSV, not text.

I used this hack to get my results. It worked for me and I moved on with my day.

me:~/MOOSE2# echo "USE ████it; select * from samples_daily_buffer where dog_id=██48;" | cqlsh --cqlversion="3.4.4" cassandra0.stage.███████ | sed -e "s/ | */,/g" | sed -e "s/^ *//g" | tail -n +4 > ./myfile.csv