“Streams in node are one of the rare occasions when doing something the fast way is actually easier. SO USE THEM. not since bash has streaming been introduced into a high level language as nicely as it is in node.” @dominictarr at high level node style guide.
Streams emit events, the native observer pattern of Node.js.
At the moment there are three iterations of the stream implementation, and which one you get depends on your version of node/iojs.
Instead of using the native API (which depends on your node version), it is better to use readable-stream or through2. Both are backward compatible and work fine in browser builds (the latter is more lightweight because it just exposes a Duplex stream).
.pipe() is just a function that takes a readable source stream and hooks its output to a destination writable stream (much like piping UNIX commands):
tweetStream.pipe(process.stdout)
Using .pipe() has other benefits too, like handling backpressure automatically so that node won’t buffer chunks into memory needlessly when the remote client is on a really slow or high-latency connection.
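To make the backpressure point concrete, here is a rough sketch of the bookkeeping that .pipe() takes care of, written by hand. The source and slowSink streams are made up for illustration, and the real .pipe() handles more cases than this.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical source: emits five small string chunks.
const source = Readable.from(['1', '2', '3', '4', '5'])

// Hypothetical slow sink: a tiny buffer (highWaterMark) and a delayed write,
// so write() reports pressure almost immediately.
const received = []
const slowSink = new Writable({
  highWaterMark: 1,
  decodeStrings: false,
  write (chunk, encoding, callback) {
    received.push(chunk)
    setTimeout(callback, 10)
  }
})

// Roughly what source.pipe(slowSink) does behind the scenes.
source.on('data', chunk => {
  const canContinue = slowSink.write(chunk)
  if (!canContinue) {
    source.pause() // stop reading until the sink catches up
    slowSink.once('drain', () => source.resume())
  }
})
source.on('end', () => slowSink.end())
```

With .pipe() all of the above collapses into source.pipe(slowSink).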
There are four types of streams: Duplex, Readable, Transform and Writable.
A good library that collects stream utilities is mississippi.
You can implement a Stream using inheritance or composition.
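As a small sketch of the two approaches, the same countdown stream is written below once with inheritance and once with composition. The names Countdown and countdown are invented for this example.

```javascript
const { Readable } = require('stream')

// Inheritance: subclass Readable and implement _read.
class Countdown extends Readable {
  constructor (from) {
    super({ objectMode: true })
    this.current = from
  }

  _read () {
    // Push one value per call; null signals the end of the stream.
    this.push(this.current > 0 ? this.current-- : null)
  }
}

// Composition: pass the implementation to the constructor instead.
const countdown = from => {
  let current = from
  return new Readable({
    objectMode: true,
    read () {
      this.push(current > 0 ? current-- : null)
    }
  })
}

new Countdown(3).on('data', n => console.log('inheritance', n))
countdown(3).on('data', n => console.log('composition', n))
```

Both versions behave the same from the outside; composition simply avoids declaring a class.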
Streams from which data can be read (e.g. fs.createReadStream()).
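const { Readable } = require('stream')

// Wrap a single value in a Readable stream.
const toReadableStream = input => (
  new Readable({
    read () {
      this.push(input) // emit the value as the only chunk
      this.push(null) // signal that there is no more data
    }
  })
)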
Readable streams produce data that can be fed into a writable, transform, or duplex stream by calling .pipe().
To emit chunks of data, create an object that implements the ._read method.
It emits a data event each time it gets a chunk of data. From the implementation side, this corresponds to calling this.push(data).
It emits end when it has no more data, which the implementation signals with this.push(null). In other words, the end event is triggered once the last chunk of data has been delivered, signifying that there is no more data after it.
When you are using a Readable stream, you can use the resume() and pause() methods to control the data flow of the stream.
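A minimal sketch of these events and methods working together, assuming a small in-memory stream built with Readable.from (the letters and seen names are just for this example):

```javascript
const { Readable } = require('stream')

const letters = Readable.from(['a', 'b', 'c'])
const seen = []

letters.on('data', chunk => {
  seen.push(chunk)
  // Pause after every chunk and resume a little later, throttling the flow.
  letters.pause()
  setTimeout(() => letters.resume(), 10)
})

letters.on('end', () => {
  // Reached only after the last chunk has been delivered.
  console.log('received:', seen.join(''))
})
```

No data events fire while the stream is paused, so the end event also waits until the stream has been resumed for the last time.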
Streams to which data can be written (e.g. fs.createWriteStream()).
A writable stream is a stream you can .pipe() to but not from.
To consume chunks of data, create an object that implements the ._write method.
Call .end to close the stream; you can also pass it the last chunk, which is written before the stream closes.
Provide a callback to .write only if you want to wait until the chunk has been handled; the order of successive calls is guaranteed either way.
The finish event is triggered when all the data has been processed (that is, after .end has been called and every pending write has been handled).
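Here is a small sketch of a writable stream that shows write, end and finish together. The sink stream and the stored array are assumptions made for the example; a real sink would write to a file, a socket or similar.

```javascript
const { Writable } = require('stream')

const stored = []
const sink = new Writable({
  // Keep strings as strings instead of converting them to Buffers.
  decodeStrings: false,
  write (chunk, encoding, callback) {
    stored.push(chunk)
    callback() // signal that this chunk has been handled
  }
})

sink.on('finish', () => console.log('stored:', stored.join(' ')))

sink.write('one')
sink.write('two', () => console.log('"two" has been handled'))
sink.end('three') // the last chunk can be passed to .end
```

The chunks arrive in the order they were written, with or without a callback.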
Streams that are both Readable and Writable. The two sides are independent and each has its own internal buffer (e.g. net.Socket).
                 Duplex Stream
              ------------------|
        Read  <-----               External Source
You           ------------------|
        Write ----->               External Sink
              ------------------|
You don't get back what you write. It is sent on to a separate destination.
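The diagram can be sketched in code roughly as follows. The externalSource and externalSink arrays stand in for whatever sits on the other side (a socket, for instance) and are purely illustrative.

```javascript
const { Duplex } = require('stream')

const externalSource = ['from', 'outside']
const externalSink = []

const duplex = new Duplex({
  objectMode: true,
  read () {
    // Readable side: pull from the external source.
    this.push(externalSource.length ? externalSource.shift() : null)
  },
  write (chunk, encoding, callback) {
    // Writable side: hand the chunk over to the external sink.
    externalSink.push(chunk)
    callback()
  }
})

const readBack = []
duplex.on('data', chunk => readBack.push(chunk))
duplex.on('end', () => console.log({ readBack, externalSink }))

duplex.write('to the sink')
duplex.end()
```

What was written never shows up on the readable side, because the two sides share nothing but the object.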
Duplex streams where the output is in some way related to the input (e.g zlib streams).
                     Transform Stream
               --------------|--------------
You     Write  ---->                   ---->  Read  You
               --------------|--------------
You write something, it is transformed, then you read something.
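As a quick sketch of that write, transform, read cycle, here is an upper-casing transform. The upperCase name is made up for the example.

```javascript
const { Transform } = require('stream')

const upperCase = new Transform({
  transform (chunk, encoding, callback) {
    // chunk arrives as a Buffer; the second argument of the callback
    // is pushed to the readable side.
    callback(null, chunk.toString().toUpperCase())
  }
})

let output = ''
upperCase.on('data', chunk => { output += chunk })
upperCase.on('end', () => console.log(output))

upperCase.write('hello ')
upperCase.end('streams')
```

Unlike a plain Duplex stream, what you read here is derived directly from what you wrote.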
File streams are subclasses of Readable/Writable streams. Because they interact with the filesystem, they emit additional events, such as the open event, that track the file state of the fs.ReadStream/fs.WriteStream streams.
Child processes are a special case too. In particular, they fire an exit event, which is different from close.
A child process uses the stdio option to set up stream communication between the child_process and wherever its output has to be written to or read from (by default stdin, stdout and stderr, which align with the UNIX standard streams).
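A short sketch of a child process and its stdio streams, assuming it is fine to spawn a second Node.js process through process.execPath. The inline script and variable names are invented for the example.

```javascript
const { spawn } = require('child_process')

// Run a second Node.js process that prints one line.
// stdio 'pipe' is the default; it exposes stdin, stdout and stderr as streams.
const child = spawn(process.execPath, ['-e', 'console.log("hi from child")'], {
  stdio: ['pipe', 'pipe', 'pipe']
})

let childOutput = ''
child.stdout.on('data', chunk => { childOutput += chunk })

const events = []
// exit: the process has ended, but its stdio streams may still be open.
child.on('exit', code => events.push(`exit ${code}`))
// close: the process has ended and all of its stdio streams are closed.
child.on('close', code => {
  events.push(`close ${code}`)
  console.log(events.join(', '), '|', childOutput.trim())
})
```

Waiting for close rather than exit is the safer choice when you need the complete output.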
You can convert any stream interface into a callback. See my stream-callback library, which makes this conversion easy.
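To show the general idea with core modules only, here is a minimal sketch. The collect helper is hypothetical and is not the API of stream-callback; it buffers every chunk and calls back once.

```javascript
const { Readable, finished } = require('stream')

// Hypothetical helper: buffer every chunk, then call back a single time.
const collect = (stream, callback) => {
  const chunks = []
  stream.on('data', chunk => chunks.push(chunk))
  // finished() calls back once the stream has ended or failed.
  finished(stream, err => {
    if (err) return callback(err)
    callback(null, chunks)
  })
}

collect(Readable.from(['a', 'b', 'c']), (err, chunks) => {
  if (err) throw err
  console.log(chunks)
})
```

Keep in mind that buffering everything gives up the memory benefits of streaming, so this only suits small payloads.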
It’s also possible to transform an async callback function into a stream interface. You need to make sure you handle the stream's backpressure correctly. In my experience, from2 works well for this. Check fetch-timeline or totalwind-api for examples.
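The same idea can be sketched with the core Readable class instead of from2. Both fetchPage (a stand-in for any paginated callback API) and pageStream are hypothetical names used only for illustration.

```javascript
const { Readable } = require('stream')

// Hypothetical callback-style API: pages 0..2 hold data, later pages are empty.
const fetchPage = (page, callback) => {
  setTimeout(() => callback(null, page < 3 ? [`item-${page}`] : []), 5)
}

const pageStream = () => {
  let page = 0
  return new Readable({
    objectMode: true,
    // read() is only invoked when the consumer wants more data,
    // so doing one fetch per read() call respects backpressure.
    read () {
      fetchPage(page++, (err, items) => {
        if (err) return this.destroy(err)
        if (items.length === 0) return this.push(null) // no more pages
        this.push(items)
      })
    }
  })
}

pageStream().on('data', items => console.log(items))
```

Because the next page is requested only from inside read(), a slow consumer naturally slows down the fetching.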
Interesting libraries to use with streams are:
Written by Kiko Beats