
I want to bulk-load a CSV file into Cassandra 2.0.3. So far I have successfully converted the CSV into SSTables.

However, when I run sstableloader, I get the error message below. Does this error affect my bulk load? I cannot find the imported data in Cassandra 2.0.3.

VirtualBox:~/apache-cassandra-2.0.3$ ./bin/sstableloader -d localhost airlines/flight/
ERROR 16:08:04,832 Unable to initialize MemoryMeter (jamm not specified as javaagent).  This means Cassandra will be unable to measure object sizes accurately and may consequently OOM.
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of airlines/flight/airlines-flight-jb-1-Data.db to [/127.0.0.1, /127.0.0.2]
progress: [/127.0.0.2 1/1 (100%)] [/127.0.0.1 1/1 (100%)] [total: 100% - 0MB/s (avg: 0MB/s)]
I have seen this error too, but the data streamed in correctly. Roughly how many rows are you loading? (It looks like only a few, since the speed shows 0MB/s.) - Raghuram Onti Srinivasan

1 Answer


I wrapped my sstableloader job in a bash script and initially had the exact same error. After some digging, I found that setting the JAVA_TOOL_OPTIONS environment variable to load jamm as a Java agent fixed the issue.

Here's my script:

#!/bin/bash

# ------------------------
# Cassandra install directory and the jamm agent jar in its lib/ directory
CASSANDRA_HOME="/usr/share/cassandra"
JAVA_AGENT="-javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
export JAVA_TOOL_OPTIONS="$JAVA_AGENT"
# ------------------------

# ------------------------
# Initialize Parameters
SSTLOADER=$(which sstableloader)
SSDATADIR=/usr/share/cassandra/scripts/sstable_load/data/<schema_name>/<column family>

CASSNODE="192.168.2.1"

# ------------------------
log_dir=/usr/share/cassandra/scripts/sstable_load/logs
dt=$(date +'%Y%m%d_%H%M%S')
logdest="$log_dir/sstableloader_$dt.log"
# ------------------------

# From here on, stdout goes to the log file
exec 1>"$logdest"
echo "Job Started: $(date)"
echo "Job Logged To: $logdest"
echo

# ------------------------
# Run the SSTableLoader Command
"$SSTLOADER" -v -d "$CASSNODE" -u <user> -pw <password> "$SSDATADIR"
status=$?   # keep the loader's exit code so a failed load is not reported as success

echo
echo "Job Completed: $(date)"

exit $status

Replace the entries in <> with your own values.
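
The jamm jar version in the script is hard-coded, and it may differ between Cassandra releases. As a minimal sketch, assuming the jar lives under lib/ in the Cassandra install directory and is named jamm-<version>.jar, the path could be resolved at run time instead. The function name jamm_agent_opt is hypothetical and not part of Cassandra:

```shell
# Hypothetical helper: print a -javaagent option for the first jamm jar
# found under <cassandra_home>/lib, or return non-zero if none exists.
jamm_agent_opt() {
    jamm_home="$1"
    for jamm_jar in "$jamm_home"/lib/jamm-*.jar; do
        # If the glob matches nothing it stays literal, so test for a real file.
        if [ -f "$jamm_jar" ]; then
            printf '%s\n' "-javaagent:$jamm_jar"
            return 0
        fi
    done
    echo "no jamm jar found under $jamm_home/lib" >&2
    return 1
}

# Usage (assumes CASSANDRA_HOME is set as in the script above):
#   opt=$(jamm_agent_opt "$CASSANDRA_HOME") || exit 1
#   export JAVA_TOOL_OPTIONS="$opt"
```

The assignment and the export are kept on separate lines in the usage note because `export VAR=$(cmd)` returns the status of export, which would hide a failed lookup.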

Hope this works for you.

Please up vote.