Node.js - tutorial - Node.js modules

revision: July 24, 2023

the Node.js fs module

The fs module provides a lot of very useful functionality to access and interact with the file system. There is no need to install it. Being part of the Node.js core, it can be used by simply requiring it.

Once you do so, you have access to all its methods, which include:

fs.access(): check if the file exists and Node.js can access it with its permissions.
fs.appendFile(): append data to a file. If the file does not exist, it's created.
fs.chmod(): change the permissions of a file specified by the filename passed. Related: fs.lchmod(), fs.fchmod().
fs.chown(): change the owner and group of a file specified by the filename passed. Related: fs.fchown(), fs.lchown().
fs.close(): close a file descriptor.
fs.copyFile(): copy a file.
fs.createReadStream(): create a readable file stream.
fs.createWriteStream(): create a writable file stream.
fs.link(): create a new hard link to a file.
fs.mkdir(): create a new folder.
fs.mkdtemp(): create a temporary directory.
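fs.open(): open a file and get a file descriptor for it. The flags passed set the mode the file is opened in (read, write, append).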
fs.readdir(): read the contents of a directory.
fs.readFile(): read the content of a file. Related: fs.read().
fs.readlink(): read the value of a symbolic link.
fs.realpath(): resolve relative file path pointers (., ..) to the full path.
fs.rename(): rename a file or folder.
fs.rmdir(): remove a folder.
fs.stat(): returns the status of the file identified by the filename passed. Related: fs.fstat(), fs.lstat().
fs.symlink(): create a new symbolic link to a file.
fs.truncate(): truncate to the specified length the file identified by the filename passed. Related: fs.ftruncate().
fs.unlink(): remove a file or a symbolic link.
fs.unwatchFile(): stop watching for changes on a file.
fs.utimes(): change the timestamp of the file identified by the filename passed. Related: fs.futimes().
fs.watchFile(): start watching for changes on a file. Related: fs.watch().
fs.writeFile(): write data to a file. Related: fs.write().

One peculiar thing about the fs module is that its methods are asynchronous by default, and most of them also have a synchronous counterpart whose name ends in Sync.

For example:
- fs.rename()
- fs.renameSync()
- fs.write()
- fs.writeSync()
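
The difference between the two flavors can be sketched as follows. This is a minimal illustration, not part of the original tutorial: the temporary folder prefix and the file name notes.txt are made-up values.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical scratch location: a unique temporary folder
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fs-example-'))
const file = path.join(dir, 'notes.txt')

// Synchronous variants block until done and throw on failure
fs.writeFileSync(file, 'Hello')
fs.appendFileSync(file, ' world')
const content = fs.readFileSync(file, 'utf8')
console.log(content) // Hello world

// Asynchronous variants return immediately and report through a callback
fs.readFile(file, 'utf8', (err, data) => {
  if (err) throw err
  console.log(data) // Hello world
  // Clean up: remove the file, then the now-empty folder
  fs.unlink(file, (err) => {
    if (err) throw err
    fs.rmdir(dir, (err) => {
      if (err) throw err
    })
  })
})
```

The synchronous calls are convenient in short scripts, while the callback versions keep the event loop free while the disk is busy.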

the Node.js path module

The path module provides a lot of very useful functionality to work with file and directory paths. There is no need to install it. Being part of the Node.js core, it can be used by simply requiring it.

This module provides path.sep which provides the path segment separator (\ on Windows, and / on Linux / macOS), and path.delimiter which provides the path delimiter (; on Windows, and : on Linux / macOS).

These are the path methods:

path.basename(): returns the last portion of a path. A second parameter can filter out the file extension.

path.dirname(): returns the directory part of a path.

path.extname(): returns the extension part of a path.

path.format(): returns a path string from an object. This is the opposite of path.parse(). path.format() accepts an object as argument with the following keys:

root: the root.
dir: the folder path starting from the root.
base: the file name + extension.
name: the file name.
ext: the file extension.

root is ignored if dir is provided; ext and name are ignored if base exists.
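
The precedence rules above can be checked with a short sketch. The paths are sample values, and path.posix is used here only so that the output is the same on every platform.

```javascript
// path.posix keeps the output identical on every platform
const path = require('path').posix

// dir and base are enough to build a full path
const a = path.format({ dir: '/home/user/dir', base: 'file.txt' })
console.log(a) // /home/user/dir/file.txt

// root is ignored because dir is provided
const b = path.format({ root: '/ignored', dir: '/home/user/dir', base: 'file.txt' })
console.log(b) // /home/user/dir/file.txt

// ext is ignored because base exists
const c = path.format({ root: '/', base: 'file.txt', ext: 'ignored' })
console.log(c) // /file.txt

// name and ext are used when base is missing
const d = path.format({ root: '/', name: 'file', ext: '.txt' })
console.log(d) // /file.txt
```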

path.isAbsolute(): returns true if it's an absolute path.

path.join(): joins two or more parts of a path.

path.normalize(): tries to calculate the actual path when it contains relative specifiers like . or .., or double slashes.

path.parse(): parses a path to an object with the segments that compose it:

root: the root.
dir: the folder path starting from the root.
base: the file name + extension.
name: the file name.
ext: the file extension.

path.relative(): accepts 2 paths as arguments. Returns the relative path from the first path to the second, based on the current working directory.

path.resolve(): you can get the absolute path calculation of a relative path using path.resolve(). By specifying a second parameter, resolve will use the first as a base for the second. If the first parameter starts with a slash, that means it's an absolute path.
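
A quick tour of the methods listed in this section might look like the sketch below. The paths are invented sample values, and path.posix is an assumption made so the results do not depend on the operating system.

```javascript
// path.posix gives POSIX results regardless of the platform
const path = require('path').posix

const results = {
  basename: path.basename('/test/something.txt'), // 'something.txt'
  basenameNoExt: path.basename('/test/something.txt', '.txt'), // 'something'
  dirname: path.dirname('/test/something/file.txt'), // '/test/something'
  extname: path.extname('/test/something/file.txt'), // '.txt'
  isAbsolute: path.isAbsolute('/test/something'), // true
  isRelative: path.isAbsolute('./test/something'), // false
  join: path.join('/', 'users', 'joe', 'notes.txt'), // '/users/joe/notes.txt'
  normalize: path.normalize('/users/joe/..//test.txt'), // '/users/test.txt'
  relative: path.relative('/Users/joe', '/Users/joe/something/test.txt'), // 'something/test.txt'
  resolve: path.resolve('/etc', 'joe.txt') // '/etc/joe.txt'
}
console.log(results)

// parse() splits a path into the segments described above
const parsed = path.parse('/users/test.txt')
console.log(parsed.root, parsed.dir, parsed.base, parsed.name, parsed.ext)
```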

the Node.js os module

This module provides many functions to retrieve information from the underlying operating system and the computer the program runs on, and interact with it.

There are a few useful properties that tell us some key things related to handling files:

os.EOL gives the line delimiter sequence. It's \n on Linux and macOS, and \r\n on Windows.
os.constants.signals tells us all the constants related to handling process signals, like SIGHUP, SIGKILL and so on.
os.constants.errno lists the constants for error reporting, like EADDRINUSE, EOVERFLOW and more.

The main methods that os provides are:

os.arch(): returns the string that identifies the underlying architecture, like arm, x64, arm64.

os.cpus(): returns information on the CPUs available on your system.

os.endianness(): returns BE or LE depending if Node.js was compiled with Big Endian or Little Endian.

os.freemem(): returns the number of bytes that represent the free memory in the system.

os.homedir(): returns the path to the home directory of the current user.

os.hostname(): returns the host name.

os.loadavg(): returns the calculation made by the operating system on the load average. It only returns a meaningful value on Linux and macOS.

os.networkInterfaces(): returns the details of the network interfaces available on your system.

os.platform(): returns the platform that Node.js was compiled for: darwin, freebsd, linux, openbsd, win32, ...more.

os.release(): returns a string that identifies the operating system release number.

os.tmpdir(): returns the path to the assigned temp folder.

os.totalmem(): returns the number of bytes that represent the total memory available in the system.

os.type(): identifies the operating system: Linux, Darwin on macOS, Windows_NT on Windows.

os.uptime(): returns the number of seconds the computer has been running since it was last rebooted.

os.userInfo(): returns an object that contains the current username, uid, gid, shell, and homedir.
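
As an illustration, several of these methods can be combined into a small system summary. The summary object and its field names are made up for this sketch, and the printed values depend on the machine it runs on.

```javascript
const os = require('os')

// Convert a byte count to whole megabytes for readability
const toMB = (bytes) => Math.round(bytes / 1024 / 1024)

// Hypothetical summary object built from os methods
const summary = {
  platform: os.platform(),
  type: os.type(),
  arch: os.arch(),
  endianness: os.endianness(),
  cpuCount: os.cpus().length,
  freeMemoryMB: toMB(os.freemem()),
  totalMemoryMB: toMB(os.totalmem()),
  uptimeSeconds: os.uptime(),
  home: os.homedir(),
  tmp: os.tmpdir(),
  // JSON.stringify makes the invisible line ending visible
  eol: JSON.stringify(os.EOL)
}

console.log(summary)
```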

the Node.js events module

The events module provides us the EventEmitter class, which is key to working with events in Node.js.

An EventEmitter emits these built-in events: "newListener" when a listener is added; "removeListener" when a listener is removed.

The most useful methods are as follows:

emitter.addListener(): alias for emitter.on().

emitter.emit(): emits an event. It synchronously calls every event listener in the order they were registered.

emitter.eventNames(): returns an array of strings that represent the events registered on the current EventEmitter object.

emitter.getMaxListeners(): get the maximum amount of listeners one can add to an EventEmitter object, which defaults to 10 but can be increased or lowered by using setMaxListeners().

emitter.listenerCount(): get the count of listeners of the event passed as parameter.

emitter.listeners(): gets an array of listeners of the event passed as parameter.

emitter.off(): alias for emitter.removeListener(), added in Node.js 10.

emitter.on(): adds a callback function that's called when an event is emitted.

emitter.once(): adds a callback function that's called when an event is emitted for the first time after registering this. This callback is only going to be called once, never again.

emitter.prependListener(): when you add a listener using on or addListener, it's added last in the queue of listeners, and called last. Using prependListener it's added, and called, before other listeners.

emitter.prependOnceListener(): when you add a listener using once, it's added last in the queue of listeners, and called last. Using prependOnceListener it's added, and called, before other listeners.

emitter.removeAllListeners(): removes all listeners of an EventEmitter object listening to a specific event. If no event name is passed, the listeners of every event are removed.

emitter.removeListener(): removes a specific listener. You can do this by saving the callback function to a variable, when added, so you can reference it later.

emitter.setMaxListeners(): sets the maximum amount of listeners one can add to an EventEmitter object, which defaults to 10 but can be increased or lowered.
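
The ordering rules of on(), once() and prependListener() can be seen in a small sketch. The event name start and the calls array are invented for this example.

```javascript
const EventEmitter = require('events')

const emitter = new EventEmitter()
const calls = []

// Keep a reference to the callback so it can be removed later
const onStart = (n) => calls.push(`on ${n}`)

emitter.on('start', onStart)
emitter.once('start', (n) => calls.push(`once ${n}`))
// Prepended listeners run before the ones added with on()
emitter.prependListener('start', (n) => calls.push(`first ${n}`))

// emit() calls the listeners synchronously
emitter.emit('start', 1)
emitter.emit('start', 2)
console.log(calls) // [ 'first 1', 'on 1', 'once 1', 'first 2', 'on 2' ]

const countBefore = emitter.listenerCount('start')
emitter.off('start', onStart)
const countAfter = emitter.listenerCount('start')
console.log(countBefore, countAfter) // 2 1

console.log(emitter.eventNames()) // [ 'start' ]
console.log(emitter.getMaxListeners()) // 10
```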

the Node.js http module

The HTTP core module is a key module to Node.js networking. It can be included using "const http = require('http')". The module provides some properties and methods, and some classes.

Properties

http.METHODS: this property lists all the HTTP methods supported.

http.STATUS_CODES: this property lists all the HTTP status codes and their description.

http.globalAgent: points to the global instance of the Agent object, which is an instance of the http.Agent class. It's used to manage connections persistence and reuse for HTTP clients, and it's a key component of Node.js HTTP networking.
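
These properties can be inspected directly, as in this small sketch:

```javascript
const http = require('http')

// METHODS is an array of method names
console.log(http.METHODS.includes('GET')) // true

// STATUS_CODES maps each status code to its description
console.log(http.STATUS_CODES[200]) // OK
console.log(http.STATUS_CODES[404]) // Not Found

// globalAgent is an instance of http.Agent
console.log(http.globalAgent instanceof http.Agent) // true
```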

Methods

http.createServer(): returns a new instance of the http.Server class.

http.request(): makes an HTTP request to a server, creating an instance of the http.ClientRequest class.

http.get(): similar to http.request(), but automatically sets the HTTP method to GET, and calls req.end() automatically.

Classes

The HTTP module provides 5 classes: http.Agent, http.ClientRequest, http.Server, http.ServerResponse, http.IncomingMessage.

http.Agent: Node.js creates a global instance of the http.Agent class to manage connections persistence and reuse for HTTP clients, a key component of Node.js HTTP networking. This object makes sure that every request made to a server is queued and a single socket is reused. It also maintains a pool of sockets. This is key for performance reasons.

http.ClientRequest: an http.ClientRequest object is created when http.request() or http.get() is called. When a response is received, the response event is emitted with an http.IncomingMessage instance as argument. The returned data of a response can be read in 2 ways: you can call the response.read() method; or, in the response event handler, you can set up an event listener for the data event, so you can listen for the data as it is streamed in.

http.Server: this class is commonly instantiated and returned when creating a new server using http.createServer(). Once you have a server object, you have access to its methods: close() stops the server from accepting new connections; listen() starts the HTTP server and listens for connections.

http.ServerResponse: created by an http.Server and passed as the second parameter to the request event it fires. Commonly known and used in code as res. The method you'll always call in the handler is end(), which closes the response: the message is complete and the server can send it to the client. It must be called on each response. These methods and properties are used to interact with HTTP headers:

getHeaderNames(): gets the list of the names of the HTTP headers already set;
getHeaders(): gets a copy of the HTTP headers already set;
setHeader('headername', value): sets an HTTP header value;
getHeader('headername'): gets an HTTP header already set;
removeHeader('headername'): removes an HTTP header already set;
hasHeader('headername'): returns true if the response has that header set;
headersSent: a property (not a method) that is true if the headers have already been sent to the client.

After processing the headers you can send them to the client by calling response.writeHead(), which accepts the statusCode as the first parameter, the optional status message, and the headers object.
To send data to the client in the response body, you use write(). It will send buffered data to the HTTP response stream.
If the headers were not sent yet using response.writeHead(), it will send the headers first, with the status code and message that are set on the response, which you can edit by setting the statusCode and statusMessage property values.

http.IncomingMessage: an http.IncomingMessage object is created by: http.Server when listening to the request event; http.ClientRequest when listening to the response event.
It can be used to access the message's:

status using its statusCode and statusMessage properties;
headers using its headers property or rawHeaders;
HTTP method using its method property;
HTTP version using the httpVersion property;
URL using the url property;
underlying socket using the socket property.

The data is accessed using streams, since http.IncomingMessage implements the Readable Stream interface.
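
To see these classes working together, here is a minimal round-trip sketch: a server answers one request made by http.get() and then shuts down. The /hello path, the loopback address and the use of port 0 (which asks the operating system for any free port) are choices made for this example only.

```javascript
const http = require('http')

// req is an http.IncomingMessage, res is an http.ServerResponse
const server = http.createServer((req, res) => {
  res.setHeader('Content-Type', 'text/plain')
  res.statusCode = 200
  // The first write() sends the headers, then the body chunk
  res.write(`${req.method} ${req.url}`)
  res.end()
})

// Port 0 lets the operating system pick a free port
server.listen(0, '127.0.0.1', () => {
  const { port } = server.address()
  // agent: false uses a one-off agent, so no socket is kept alive
  const options = { host: '127.0.0.1', port, path: '/hello', agent: false }

  http.get(options, (response) => {
    let body = ''
    response.setEncoding('utf8')
    response.on('data', (chunk) => {
      body += chunk
    })
    response.on('end', () => {
      console.log(response.statusCode, response.headers['content-type']) // 200 text/plain
      console.log(body) // GET /hello
      server.close()
    })
  })
})
```

Reading the body through the data and end events works because the response is an http.IncomingMessage, which is a readable stream.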

Node.js buffers

A buffer is an area of memory. It represents a fixed-size chunk of memory (can't be resized) allocated outside of the V8 JavaScript engine. You can think of a buffer like an array of integers, which each represent a byte of data. It is implemented by the Node.js Buffer class.

Buffers were introduced to help developers deal with binary data, in an ecosystem that traditionally only dealt with strings rather than binaries. Buffers in Node.js are not related to the concept of buffering data. That is what happens when a stream processor receives data faster than it can digest.

A buffer is created using the Buffer.from(), Buffer.alloc(), and Buffer.allocUnsafe() methods. You can also just initialize the buffer by passing its size in bytes to Buffer.alloc() or Buffer.allocUnsafe().

While both alloc and allocUnsafe allocate a Buffer of the specified size in bytes, the Buffer created by alloc will be initialized with zeroes and the one created by allocUnsafe will be "uninitialized". This means that while allocUnsafe would be quite fast in comparison to alloc, the allocated segment of memory may contain old data which could potentially be sensitive. Older data, if present in the memory, can be accessed or leaked when the Buffer memory is read. This is what really makes allocUnsafe unsafe and extra care must be taken while using it.
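
A small sketch of the two allocation methods follows. The 8-byte size is arbitrary, and overwriting the buffer with fill() is one possible precaution, shown here as an assumption about how you might handle the uninitialized memory.

```javascript
// alloc() returns zero-filled memory
const zeroed = Buffer.alloc(8)
console.log(zeroed.every((byte) => byte === 0)) // true

// allocUnsafe() may expose old memory contents
const raw = Buffer.allocUnsafe(8)
console.log(raw.length) // 8

// Overwrite every byte before the buffer is read or shared
raw.fill(0)
console.log(raw.equals(zeroed)) // true
```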

A buffer, being an array of bytes, can be accessed like an array.

example

JS
      const buf = Buffer.from('Hey!')
      console.log(buf[0]) //72
      console.log(buf[1]) //101
      console.log(buf[2]) //121

Those numbers are the UTF-8 bytes that identify the characters in the buffer (H → 72, e → 101, y → 121). This happens because Buffer.from() uses UTF-8 by default. Keep in mind that some characters may occupy more than one byte in the buffer (é → 195 169).

You can print the full content of the buffer using the toString() method. buf.toString() also uses UTF-8 by default.

Get the length of a buffer: use the length property.
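
Both operations can be tried on the buffer from the earlier example. The second buffer is an extra illustration of a character that takes two bytes.

```javascript
const buf = Buffer.from('Hey!')

// toString() decodes the bytes as UTF-8 by default
console.log(buf.toString()) // Hey!

// length counts bytes, not characters
console.log(buf.length) // 4

// \u00e9 is the single character "e with acute accent"
const accented = Buffer.from('\u00e9')
console.log(accented.length) // 2
console.log(accented[0], accented[1]) // 195 169
```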

Changing the content of a buffer: you can write to a buffer a whole string of data by using the write() method. Just like you can access a buffer with an array syntax, you can also set the contents of the buffer in the same way.
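
Both ways of changing the content can be sketched like this (the strings are sample values):

```javascript
// Start from 4 zeroed bytes
const buf = Buffer.alloc(4)

// write() stores a whole string into the buffer
buf.write('Hey!')
const afterWrite = buf.toString()
console.log(afterWrite) // Hey!

// Array syntax changes a single byte: 111 is the letter o
buf[1] = 111
const afterSet = buf.toString()
console.log(afterSet) // Hoy!
```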

Slice a buffer: if you want to create a partial visualization of a buffer, you can create a slice. A slice is not a copy: the original buffer is still the source of truth. If that changes, your slice changes. Use the subarray() method to create it. The first parameter is the starting position, and you can specify an optional second parameter with the end position.
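
The shared-memory behavior can be verified with a short sketch:

```javascript
const buf = Buffer.from('Hey!')

// The slice covers positions 0 and 1 of the original buffer
const slice = buf.subarray(0, 2)
const before = slice.toString()
console.log(before) // He

// Changing the original also changes the slice
buf[1] = 111 // the letter o
const after = slice.toString()
console.log(after) // Ho
```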

Copy a buffer: copying a buffer is possible using the set() method. By default you copy the whole buffer. If you only want to copy a part of the buffer, you can use .subarray() and the offset argument that specifies an offset to write to.
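
A minimal sketch of a full copy and a partial copy, using sample content:

```javascript
const buf = Buffer.from('Hey!')

// Full copy: the target must be large enough for the source
const copy = Buffer.alloc(4)
copy.set(buf)
console.log(copy.toString()) // Hey!

// Partial copy: take bytes from position 2 on, write them at offset 0
const partial = Buffer.alloc(2)
partial.set(buf.subarray(2), 0)
console.log(partial.toString()) // y!

// Unlike a slice, a copy does not follow later changes
buf[1] = 111
console.log(copy.toString()) // Hey!
```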

Node.js streams

Streams are one of the fundamental concepts that power Node.js applications. They are a way to handle reading/writing files, network communications, or any kind of end-to-end information exchange in an efficient way. Using streams you read a file piece by piece, processing its content without keeping it all in memory.

Streams basically provide two major advantages over using other data handling methods: 1/ memory efficiency: you don't need to load large amounts of data in memory before you are able to process it; 2/ time efficiency: it takes way less time to start processing data, since you can start processing as soon as you have it, rather than waiting till the whole data payload is available.

A typical example is reading files from a disk. Using the Node.js fs module, you can read a file, and serve it over HTTP when a new connection is established to your HTTP server.

example

JS
      const http = require('http')
      const fs = require('fs')
      const server = http.createServer(function(req, res) {
        fs.readFile(__dirname + '/data.txt', (err, data) => {
          if (err) {
            // Report the failure instead of sending an empty body
            res.statusCode = 500
            res.end('Could not read file')
            return
          }
          res.end(data)
        })
      })
      server.listen(3000)

"readFile()" reads the full contents of the file, and invokes the callback function when it's done. "res.end(data)" in the callback will return the file contents to the HTTP client.

If the file is big, the operation will take quite a bit of time, and the whole content is kept in memory until it is sent. Here is the same thing written using streams:

JS
      const http = require('http')
      const fs = require('fs')
      
      const server = http.createServer((req, res) => {
        const stream = fs.createReadStream(__dirname + '/data.txt')
        stream.pipe(res)
      })
      server.listen(3000)

Instead of waiting until the file is fully read, we start streaming it to the HTTP client as soon as we have a chunk of data ready to be sent.

pipe()

The pipe() method takes the source, and pipes it into a destination. You call it on the source stream, so in this case, the file stream is piped to the HTTP response. The return value of the pipe() method is the destination stream, which is a very convenient thing that lets us chain multiple pipe() calls, like this:

JS
      src.pipe(dest1).pipe(dest2)

This construct is the same as doing:

JS
      src.pipe(dest1)
      dest1.pipe(dest2)
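
A runnable sketch of this equivalence is shown below. It uses PassThrough streams from the stream module as stand-ins for src, dest1 and dest2, which is an assumption made only to keep the example self-contained.

```javascript
const { PassThrough } = require('stream')

// PassThrough streams forward whatever is written to them
const src = new PassThrough()
const dest1 = new PassThrough()
const dest2 = new PassThrough()

// pipe() returns its destination, which makes chaining possible
const returned = src.pipe(dest1)
console.log(returned === dest1) // true
returned.pipe(dest2)

// Data written to src travels through dest1 into dest2
dest2.on('data', (chunk) => {
  console.log(chunk.toString()) // hello
})
src.end('hello')
```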

Due to their advantages, many Node.js core modules provide native stream handling capabilities, most notably:

process.stdin returns a stream connected to stdin.
process.stdout returns a stream connected to stdout.
process.stderr returns a stream connected to stderr.
fs.createReadStream() creates a readable stream to a file.
fs.createWriteStream() creates a writable stream to a file.
net.connect() initiates a stream-based connection.
http.request() returns an instance of the http.ClientRequest class, which is a writable stream.
zlib.createGzip() compresses data using gzip (a compression algorithm) into a stream.
zlib.createGunzip() decompresses a gzip stream.
zlib.createDeflate() compresses data using deflate (a compression algorithm) into a stream.
zlib.createInflate() decompresses a deflate stream.

There are four classes of streams:

Readable: a stream you can pipe from, but not pipe into (you can receive data, but not send data to it). When you push data into a readable stream, it is buffered, until a consumer starts to read the data.
Writable: a stream you can pipe into, but not pipe from (you can send data, but not receive from it).
Duplex: a stream you can both pipe into and pipe from, basically a combination of a Readable and Writable stream.
Transform: a Transform stream is similar to a Duplex, but the output is a transform of its input.

Create a readable stream

We get the Readable stream from the stream module, then initialize it and implement the readable._read() method.

First create a stream object, then implement _read. You can also implement _read by passing the read option to the constructor. Now that the stream is initialized, we can send data to it with push().
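
Both ways of implementing _read can be sketched as follows. The no-op _read and the sample strings are assumptions for this illustration; a real stream would usually fetch data inside _read.

```javascript
const { Readable } = require('stream')

// Option 1: create the object, then assign _read
const first = new Readable()
first._read = () => {}

// Option 2: pass the read option to the constructor
const readableStream = new Readable({
  read() {}
})

// push() sends data into the stream, where it is buffered
readableStream.push('hi!')
readableStream.push('ho!')
// Pushing null signals that no more data will come
readableStream.push(null)

// 6 bytes are waiting for a consumer
console.log(readableStream.readableLength) // 6
```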

Create a writable stream

To create a writable stream we extend the base Writable object, and we implement its _write() method.

First create a stream object, then implement _write. You can now pipe a readable stream in.
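
A minimal sketch of these steps is shown below. The received array and the sample strings are made up, and Readable.from() is used here only as a convenient source to pipe in.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical sink: remember every chunk that arrives
const received = []

const writableStream = new Writable({
  // This option becomes the stream's _write implementation
  write(chunk, encoding, next) {
    received.push(chunk.toString())
    // Calling next() tells the stream this chunk is handled
    next()
  }
})

// finish fires once all piped data has been written
writableStream.on('finish', () => {
  console.log(received) // [ 'hi!', 'ho!' ]
})

// Pipe a readable stream in
Readable.from(['hi!', 'ho!']).pipe(writableStream)
```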

Get data from a readable stream

We read data from a readable stream by using a writable stream. You can also consume a readable stream directly, using the readable event.
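
Consuming a stream directly with the readable event might look like this sketch. The chunks array and the sample strings are assumptions for the example.

```javascript
const { Readable } = require('stream')

const readableStream = new Readable({
  read() {}
})

// Collect everything the stream delivers
const chunks = []

readableStream.on('readable', () => {
  let chunk
  // read() returns null once the internal buffer is empty
  while ((chunk = readableStream.read()) !== null) {
    chunks.push(chunk.toString())
  }
})

readableStream.on('end', () => {
  console.log(chunks.join('')) // hi!ho!
})

readableStream.push('hi!')
readableStream.push('ho!')
readableStream.push(null)
```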

Send data to a writable stream

Use the stream write() method.

Signaling a writable stream that you ended writing

Use the end() method.
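
Writing and ending can be combined in one short sketch. The received array is a hypothetical sink that makes the written data visible.

```javascript
const { Writable } = require('stream')

// Hypothetical sink: remember every chunk that arrives
const received = []

const writableStream = new Writable({
  write(chunk, encoding, next) {
    received.push(chunk.toString())
    next()
  }
})

// finish fires after end() once all data is flushed
writableStream.on('finish', () => {
  console.log('done:', received.join(' ')) // done: hey! ho!
})

// Send data with write()
writableStream.write('hey!')
writableStream.write('ho!')

// Signal that nothing more will be written
writableStream.end()
console.log(writableStream.writableEnded) // true
```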

Create a transform stream

We get the Transform stream from the stream module, and we initialize it and implement the transform._transform() method. First create a transform stream object, then implement _transform.
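
As a sketch, here is a transform stream that upper-cases whatever passes through it. The upper-casing logic and the sample input are assumptions chosen for illustration.

```javascript
const { Transform } = require('stream')

const upperCase = new Transform({
  // This option becomes the stream's _transform implementation
  transform(chunk, encoding, callback) {
    // The second callback argument is pushed to the readable side
    callback(null, chunk.toString().toUpperCase())
  }
})

// Write to the writable side
upperCase.write('hello')
upperCase.end()

// Read the transformed data from the readable side
const output = upperCase.read().toString()
console.log(output) // HELLO
```

In a pipeline the same stream would sit in the middle of a pipe() chain, between a readable source and a writable destination.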