It seems there are different ways to read and write data of files in Java.
I want to read ASCII data from a file. What are the possible ways and their differences?
It seems there are different ways to read and write data of files in Java.
I want to read ASCII data from a file. What are the possible ways and their differences?
ASCII is a TEXT file so you would use Readers
for reading. Java also supports reading from a binary file using InputStreams
. If the files being read are huge then you would want to use a BufferedReader
on top of a FileReader
to improve read performance.
Go through this article on how to use a Reader
I'd also recommend you download and read this wonderful (yet free) book called Thinking In Java
In Java 7:
new String(Files.readAllBytes(...))
(docs) or
Files.readAllLines(...)
In Java 8:
Files.lines(..).forEach(...)
My favorite way to read a small file is to use a BufferedReader and a StringBuilder. It is very simple and to the point (though not particularly effective, but good enough for most cases):
BufferedReader br = new BufferedReader(new FileReader("file.txt"));
try {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String everything = sb.toString();
} finally {
br.close();
}
Some has pointed out that after Java 7 you should use try-with-resources (i.e. auto close) features:
try(BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String everything = sb.toString();
}
When I read strings like this, I usually want to do some string handling per line anyways, so then I go for this implementation.
Though if I want to actually just read a file into a String, I always use Apache Commons IO with the class IOUtils.toString() method. You can have a look at the source here:
http://www.docjar.com/html/api/org/apache/commons/io/IOUtils.java.html
FileInputStream inputStream = new FileInputStream("foo.txt");
try {
String everything = IOUtils.toString(inputStream);
} finally {
inputStream.close();
}
And even simpler with Java 7:
try(FileInputStream inputStream = new FileInputStream("foo.txt")) {
String everything = IOUtils.toString(inputStream);
// do something with everything string
}
The easiest way is to use the Scanner
class in Java and the FileReader object. Simple example:
Scanner in = new Scanner(new FileReader("filename.txt"));
Scanner
has several methods for reading in strings, numbers, etc... You can look for more information on this on the Java documentation page.
For example reading the whole content into a String
:
StringBuilder sb = new StringBuilder();
while(in.hasNext()) {
sb.append(in.next());
}
in.close();
outString = sb.toString();
Also if you need a specific encoding you can use this instead of FileReader
:
new InputStreamReader(new FileInputStream(fileUtf8), StandardCharsets.UTF_8)
Here's another way to do it without using external libraries:
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
public String readFile(String filename)
{
String content = null;
File file = new File(filename); // For example, foo.txt
FileReader reader = null;
try {
reader = new FileReader(file);
char[] chars = new char[(int) file.length()];
reader.read(chars);
content = new String(chars);
reader.close();
} catch (IOException e) {
e.printStackTrace();
} finally {
if(reader != null){
reader.close();
}
}
return content;
}
I had to benchmark the different ways. I shall comment on my findings but, in short, the fastest way is to use a plain old BufferedInputStream over a FileInputStream. If many files must be read then three threads will reduce the total execution time to roughly half, but adding more threads will progressively degrade performance until making it take three times longer to complete with twenty threads than with just one thread.
The assumption is that you must read a file and do something meaningful with its contents. In the examples here is reading lines from a log and count the ones which contain values that exceed a certain threshold. So I am assuming that the one-liner Java 8 Files.lines(Paths.get("/path/to/file.txt")).map(line -> line.split(";"))
is not an option.
I tested on Java 1.8, Windows 7 and both SSD and HDD drives.
I wrote six different implementations:
rawParse: Use BufferedInputStream over a FileInputStream and then cut lines reading byte by byte. This outperformed any other single-thread approach, but it may be very inconvenient for non-ASCII files.
lineReaderParse: Use a BufferedReader over a FileReader, read line by line, split lines by calling String.split(). This is approximatedly 20% slower that rawParse.
lineReaderParseParallel: This is the same as lineReaderParse, but it uses several threads. This is the fastest option overall in all cases.
nioFilesParse: Use java.nio.files.Files.lines()
nioAsyncParse: Use an AsynchronousFileChannel with a completion handler and a thread pool.
nioMemoryMappedParse: Use a memory-mapped file. This is really a bad idea yielding execution times at least three times longer than any other implementation.
These are the average times for reading 204 files of 4 MB each on an quad-core i7 and SSD drive. The files are generated on the fly to avoid disk caching.
rawParse 11.10 sec
lineReaderParse 13.86 sec
lineReaderParseParallel 6.00 sec
nioFilesParse 13.52 sec
nioAsyncParse 16.06 sec
nioMemoryMappedParse 37.68 sec
I found a difference smaller than I expected between running on an SSD or an HDD drive being the SSD approximately 15% faster. This may be because the files are generated on an unfragmented HDD and they are read sequentially, therefore the spinning drive can perform nearly as an SSD.
I was surprised by the low performance of the nioAsyncParse implementation. Either I have implemented something in the wrong way or the multi-thread implementation using NIO and a completion handler performs the same (or even worse) than a single-thread implementation with the java.io API. Moreover the asynchronous parse with a CompletionHandler is much longer in lines of code and tricky to implement correctly than a straight implementation on old streams.
Now the six implementations followed by a class containing them all plus a parametrizable main() method that allows to play with the number of files, file size and concurrency degree. Note that the size of the files varies plus minus 20%. This is to avoid any effect due to all the files being of exactly the same size.
rawParse
public void rawParse(final String targetDir, final int numberOfFiles) throws IOException, ParseException {
overrunCount = 0;
final int dl = (int) ';';
StringBuffer lineBuffer = new StringBuffer(1024);
for (int f=0; f<numberOfFiles; f++) {
File fl = new File(targetDir+filenamePreffix+String.valueOf(f)+".txt");
FileInputStream fin = new FileInputStream(fl);
BufferedInputStream bin = new BufferedInputStream(fin);
int character;
while((character=bin.read())!=-1) {
if (character==dl) {
// Here is where something is done with each line
doSomethingWithRawLine(lineBuffer.toString());
lineBuffer.setLength(0);
}
else {
lineBuffer.append((char) character);
}
}
bin.close();
fin.close();
}
}
public final void doSomethingWithRawLine(String line) throws ParseException {
// What to do for each line
int fieldNumber = 0;
final int len = line.length();
StringBuffer fieldBuffer = new StringBuffer(256);
for (int charPos=0; charPos<len; charPos++) {
char c = line.charAt(charPos);
if (c==DL0) {
String fieldValue = fieldBuffer.toString();
if (fieldValue.length()>0) {
switch (fieldNumber) {
case 0:
Date dt = fmt.parse(fieldValue);
fieldNumber++;
break;
case 1:
double d = Double.parseDouble(fieldValue);
fieldNumber++;
break;
case 2:
int t = Integer.parseInt(fieldValue);
fieldNumber++;
break;
case 3:
if (fieldValue.equals("overrun"))
overrunCount++;
break;
}
}
fieldBuffer.setLength(0);
}
else {
fieldBuffer.append(c);
}
}
}
lineReaderParse
public void lineReaderParse(final String targetDir, final int numberOfFiles) throws IOException, ParseException {
String line;
for (int f=0; f<numberOfFiles; f++) {
File fl = new File(targetDir+filenamePreffix+String.valueOf(f)+".txt");
FileReader frd = new FileReader(fl);
BufferedReader brd = new BufferedReader(frd);
while ((line=brd.readLine())!=null)
doSomethingWithLine(line);
brd.close();
frd.close();
}
}
public final void doSomethingWithLine(String line) throws ParseException {
// Example of what to do for each line
String[] fields = line.split(";");
Date dt = fmt.parse(fields[0]);
double d = Double.parseDouble(fields[1]);
int t = Integer.parseInt(fields[2]);
if (fields[3].equals("overrun"))
overrunCount++;
}
lineReaderParseParallel
public void lineReaderParseParallel(final String targetDir, final int numberOfFiles, final int degreeOfParalelism) throws IOException, ParseException, InterruptedException {
Thread[] pool = new Thread[degreeOfParalelism];
int batchSize = numberOfFiles / degreeOfParalelism;
for (int b=0; b<degreeOfParalelism; b++) {
pool[b] = new LineReaderParseThread(targetDir, b*batchSize, b*batchSize+b*batchSize);
pool[b].start();
}
for (int b=0; b<degreeOfParalelism; b++)
pool[b].join();
}
class LineReaderParseThread extends Thread {
private String targetDir;
private int fileFrom;
private int fileTo;
private DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
private int overrunCounter = 0;
public LineReaderParseThread(String targetDir, int fileFrom, int fileTo) {
this.targetDir = targetDir;
this.fileFrom = fileFrom;
this.fileTo = fileTo;
}
private void doSomethingWithTheLine(String line) throws ParseException {
String[] fields = line.split(DL);
Date dt = fmt.parse(fields[0]);
double d = Double.parseDouble(fields[1]);
int t = Integer.parseInt(fields[2]);
if (fields[3].equals("overrun"))
overrunCounter++;
}
@Override
public void run() {
String line;
for (int f=fileFrom; f<fileTo; f++) {
File fl = new File(targetDir+filenamePreffix+String.valueOf(f)+".txt");
try {
FileReader frd = new FileReader(fl);
BufferedReader brd = new BufferedReader(frd);
while ((line=brd.readLine())!=null) {
doSomethingWithTheLine(line);
}
brd.close();
frd.close();
} catch (IOException | ParseException ioe) { }
}
}
}
nioFilesParse
public void nioFilesParse(final String targetDir, final int numberOfFiles) throws IOException, ParseException {
for (int f=0; f<numberOfFiles; f++) {
Path ph = Paths.get(targetDir+filenamePreffix+String.valueOf(f)+".txt");
Consumer<String> action = new LineConsumer();
Stream<String> lines = Files.lines(ph);
lines.forEach(action);
lines.close();
}
}
class LineConsumer implements Consumer<String> {
@Override
public void accept(String line) {
// What to do for each line
String[] fields = line.split(DL);
if (fields.length>1) {
try {
Date dt = fmt.parse(fields[0]);
}
catch (ParseException e) {
}
double d = Double.parseDouble(fields[1]);
int t = Integer.parseInt(fields[2]);
if (fields[3].equals("overrun"))
overrunCount++;
}
}
}
nioAsyncParse
public void nioAsyncParse(final String targetDir, final int numberOfFiles, final int numberOfThreads, final int bufferSize) throws IOException, ParseException, InterruptedException {
ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(numberOfThreads);
ConcurrentLinkedQueue<ByteBuffer> byteBuffers = new ConcurrentLinkedQueue<ByteBuffer>();
for (int b=0; b<numberOfThreads; b++)
byteBuffers.add(ByteBuffer.allocate(bufferSize));
for (int f=0; f<numberOfFiles; f++) {
consumerThreads.acquire();
String fileName = targetDir+filenamePreffix+String.valueOf(f)+".txt";
AsynchronousFileChannel channel = AsynchronousFileChannel.open(Paths.get(fileName), EnumSet.of(StandardOpenOption.READ), pool);
BufferConsumer consumer = new BufferConsumer(byteBuffers, fileName, bufferSize);
channel.read(consumer.buffer(), 0l, channel, consumer);
}
consumerThreads.acquire(numberOfThreads);
}
class BufferConsumer implements CompletionHandler<Integer, AsynchronousFileChannel> {
private ConcurrentLinkedQueue<ByteBuffer> buffers;
private ByteBuffer bytes;
private String file;
private StringBuffer chars;
private int limit;
private long position;
private DateFormat frmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
public BufferConsumer(ConcurrentLinkedQueue<ByteBuffer> byteBuffers, String fileName, int bufferSize) {
buffers = byteBuffers;
bytes = buffers.poll();
if (bytes==null)
bytes = ByteBuffer.allocate(bufferSize);
file = fileName;
chars = new StringBuffer(bufferSize);
frmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
limit = bufferSize;
position = 0l;
}
public ByteBuffer buffer() {
return bytes;
}
@Override
public synchronized void completed(Integer result, AsynchronousFileChannel channel) {
if (result!=-1) {
bytes.flip();
final int len = bytes.limit();
int i = 0;
try {
for (i = 0; i < len; i++) {
byte by = bytes.get();
if (by=='\n') {
// ***
// The code used to process the line goes here
chars.setLength(0);
}
else {
chars.append((char) by);
}
}
}
catch (Exception x) {
System.out.println(
"Caught exception " + x.getClass().getName() + " " + x.getMessage() +
" i=" + String.valueOf(i) + ", limit=" + String.valueOf(len) +
", position="+String.valueOf(position));
}
if (len==limit) {
bytes.clear();
position += len;
channel.read(bytes, position, channel, this);
}
else {
try {
channel.close();
}
catch (IOException e) {
}
consumerThreads.release();
bytes.clear();
buffers.add(bytes);
}
}
else {
try {
channel.close();
}
catch (IOException e) {
}
consumerThreads.release();
bytes.clear();
buffers.add(bytes);
}
}
@Override
public void failed(Throwable e, AsynchronousFileChannel channel) {
}
};
FULL RUNNABLE IMPLEMENTATION OF ALL CASES
https://github.com/sergiomt/javaiobenchmark/blob/master/FileReadBenchmark.java
Here are the three working and tested methods:
BufferedReader
package io;
import java.io.*;
public class ReadFromFile2 {
public static void main(String[] args)throws Exception {
File file = new File("C:\\Users\\pankaj\\Desktop\\test.java");
BufferedReader br = new BufferedReader(new FileReader(file));
String st;
while((st=br.readLine()) != null){
System.out.println(st);
}
}
}
Scanner
package io;
import java.io.File;
import java.util.Scanner;
public class ReadFromFileUsingScanner {
public static void main(String[] args) throws Exception {
File file = new File("C:\\Users\\pankaj\\Desktop\\test.java");
Scanner sc = new Scanner(file);
while(sc.hasNextLine()){
System.out.println(sc.nextLine());
}
}
}
FileReader
package io;
import java.io.*;
public class ReadingFromFile {
public static void main(String[] args) throws Exception {
FileReader fr = new FileReader("C:\\Users\\pankaj\\Desktop\\test.java");
int i;
while ((i=fr.read()) != -1){
System.out.print((char) i);
}
}
}
Scanner
classpackage io;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class ReadingEntireFileWithoutLoop {
public static void main(String[] args) throws FileNotFoundException {
File file = new File("C:\\Users\\pankaj\\Desktop\\test.java");
Scanner sc = new Scanner(file);
sc.useDelimiter("\\Z");
System.out.println(sc.next());
}
}
The methods within org.apache.commons.io.FileUtils
may also be very handy, e.g.:
/**
* Reads the contents of a file line by line to a List
* of Strings using the default encoding for the VM.
*/
static List readLines(File file)
I documented 15 ways to read a file in Java and then tested them for speed with various file sizes - from 1 KB to 1 GB and here are the top three ways to do this:
java.nio.file.Files.readAllBytes()
Tested to work in Java 7, 8, and 9.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
public class ReadFile_Files_ReadAllBytes {
public static void main(String [] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt";
File file = new File(fileName);
byte [] fileBytes = Files.readAllBytes(file.toPath());
char singleChar;
for(byte b : fileBytes) {
singleChar = (char) b;
System.out.print(singleChar);
}
}
}
java.io.BufferedReader.readLine()
Tested to work in Java 7, 8, 9.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class ReadFile_BufferedReader_ReadLine {
public static void main(String [] args) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt";
FileReader fileReader = new FileReader(fileName);
try (BufferedReader bufferedReader = new BufferedReader(fileReader)) {
String line;
while((line = bufferedReader.readLine()) != null) {
System.out.println(line);
}
}
}
}
java.nio.file.Files.lines()
This was tested to work in Java 8 and 9 but won't work in Java 7 because of the lambda expression requirement.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.stream.Stream;
public class ReadFile_Files_Lines {
public static void main(String[] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt";
File file = new File(fileName);
try (Stream linesStream = Files.lines(file.toPath())) {
linesStream.forEach(line -> {
System.out.println(line);
});
}
}
}
The buffered stream classes are much more performant in practice, so much so that the NIO.2 API includes methods that specifically return these stream classes, in part to encourage you always to use buffered streams in your application.
Here is an example:
Path path = Paths.get("/myfolder/myfile.ext");
try (BufferedReader reader = Files.newBufferedReader(path)) {
// Read from the stream
String currentLine = null;
while ((currentLine = reader.readLine()) != null)
//do your code here
} catch (IOException e) {
// Handle file I/O exception...
}
You can replace this code
BufferedReader reader = Files.newBufferedReader(path);
with
BufferedReader br = new BufferedReader(new FileReader("/myfolder/myfile.ext"));
I recommend this article to learn the main uses of Java NIO and IO.
Using BufferedReader:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
BufferedReader br;
try {
br = new BufferedReader(new FileReader("/fileToRead.txt"));
try {
String x;
while ( (x = br.readLine()) != null ) {
// Printing out each line in the file
System.out.println(x);
}
}
catch (IOException e) {
e.printStackTrace();
}
}
catch (FileNotFoundException e) {
System.out.println(e);
e.printStackTrace();
}
This is basically the exact same as Jesus Ramos' answer, except with File instead of FileReader plus iteration to step through the contents of the file.
Scanner in = new Scanner(new File("filename.txt"));
while (in.hasNext()) { // Iterates each line in the file
String line = in.nextLine();
// Do something with line
}
in.close(); // Don't forget to close resource leaks
... throws FileNotFoundException
The most intuitive method is introduced in Java 11 Files.readString
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String args[]) throws IOException {
String content = Files.readString(Paths.get("D:\\sandbox\\mvn\\my-app\\my-app.iml"));
System.out.print(content);
}
}
PHP has this luxury for decades! ☺
The most simple way to read data from a file in Java is making use of the File class to read the file and the Scanner class to read the content of the file.
public static void main(String args[])throws Exception
{
File f = new File("input.txt");
takeInputIn2DArray(f);
}
public static void takeInputIn2DArray(File f) throws Exception
{
Scanner s = new Scanner(f);
int a[][] = new int[20][20];
for(int i=0; i<20; i++)
{
for(int j=0; j<20; j++)
{
a[i][j] = s.nextInt();
}
}
}
PS: Don't forget to import java.util.*; for Scanner to work.
You can use readAllLines and the join
method to get whole file content in one line:
String str = String.join("\n",Files.readAllLines(Paths.get("e:\\text.txt")));
It uses UTF-8 encoding by default, which reads ASCII data correctly.
Also you can use readAllBytes:
String str = new String(Files.readAllBytes(Paths.get("e:\\text.txt")), StandardCharsets.UTF_8);
I think readAllBytes is faster and more precise, because it does not replace new line with \n
and also new line may be \r\n
. It is depending on your needs which one is suitable.
I don't see it mentioned yet in the other answers so far. But if "Best" means speed, then the new Java I/O (NIO) might provide the fastest preformance, but not always the easiest to figure out for someone learning.
http://download.oracle.com/javase/tutorial/essential/io/file.html
Guava provides a one-liner for this:
import com.google.common.base.Charsets;
import com.google.common.io.Files;
String contents = Files.toString(filePath, Charsets.UTF_8);
This might not be the exact answer to the question. It's just another way of reading a file where you do not explicitly specify the path to your file in your Java code and instead, you read it as a command-line argument.
With the following code,
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
public class InputReader{
public static void main(String[] args)throws IOException{
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
String s="";
while((s=br.readLine())!=null){
System.out.println(s);
}
}
}
just go ahead and run it with:
java InputReader < input.txt
This would read the contents of the input.txt
and print it to the your console.
You can also make your System.out.println()
to write to a specific file through the command line as follows:
java InputReader < input.txt > output.txt
This would read from input.txt
and write to output.txt
.
For JSF-based Maven web applications, just use ClassLoader and the Resources
folder to read in any file you want:
Put the Apache Commons IO dependency into your POM:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-io</artifactId>
<version>1.3.2</version>
</dependency>
Use the code below to read it (e.g. below is reading in a .json file):
String metadata = null;
FileInputStream inputStream;
try {
ClassLoader loader = Thread.currentThread().getContextClassLoader();
inputStream = (FileInputStream) loader
.getResourceAsStream("/metadata.json");
metadata = IOUtils.toString(inputStream);
inputStream.close();
}
catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return metadata;
You can do the same for text files, .properties files, XSD schemas, etc.
Use Java kiss if this is about simplicity of structure:
import static kiss.API.*;
class App {
void run() {
String line;
try (Close in = inOpen("file.dat")) {
while ((line = readLine()) != null) {
println(line);
}
}
}
}
import java.util.stream.Stream;
import java.nio.file.*;
import java.io.*;
class ReadFile {
public static void main(String[] args) {
String filename = "Test.txt";
try(Stream<String> stream = Files.lines(Paths.get(filename))) {
stream.forEach(System.out:: println);
} catch (IOException e) {
e.printStackTrace();
}
}
}
Just use java 8 Stream.
This code I programmed is much faster for very large files:
public String readDoc(File f) {
String text = "";
int read, N = 1024 * 1024;
char[] buffer = new char[N];
try {
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
while(true) {
read = br.read(buffer, 0, N);
text += new String(buffer, 0, read);
if(read < N) {
break;
}
}
} catch(Exception ex) {
ex.printStackTrace();
}
return text;
}
String fileName = 'yourFileFullNameWithPath';
File file = new File(fileName); // Creates a new file object for your file
FileReader fr = new FileReader(file);// Creates a Reader that you can use to read the contents of a file read your file
BufferedReader br = new BufferedReader(fr); //Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.
The above set of line can be written into 1 single line as:
BufferedReader br = new BufferedReader(new FileReader("file.txt")); // Optional
Adding to string builder(If you file is huge, it's advised to use string builder else use normal String object)
try {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String everything = sb.toString();
} finally {
br.close();
}