
Guide For NodeJS - the Hard Parts

1. Introduction

Key Node.js Features

Some of the critical features of Node.js include:

  • 1 Easy: With tons of tutorials and a large community, Node.js is relatively easy to start with - it's a go-to choice for web development beginners.
  • 2 Scalable: Node.js's single-threaded, event-driven model lets it handle a massive number of simultaneous connections with high throughput, providing vast scalability for applications.
  • 3 Speed: Non-blocking I/O and event-driven execution make Node.js fast and efficient.
  • 4 Packages: A vast set of open source Node.js packages is available to simplify your work. There are more than one million packages in the npm ecosystem today.
  • 5 Strong backend: Node.js is built on C and C++ (the V8 engine and libuv), making it fast at running a server and adding features like networking support.
  • 6 Multi-platform: Cross-platform support allows you to create websites for SaaS products, desktop apps, and even mobile apps.
  • 7 Maintainable: Node.js is an easy choice for developers since both the frontend and backend can use JavaScript.

Node.js is one of the most powerful technologies to emerge within the last 15 years. It allows us to build applications that can handle millions of users at once. Some of the largest companies in the world, therefore, use it:

  • LinkedIn
  • Uber
  • Netflix
  • IBM

All of them use Node. But not only that! Node is going to allow us to build desktop apps, compatible with Windows, Mac, and Linux operating systems. Here is some software that uses Node (packaged up as Electron):
  • Slack
  • Twitch
  • Vs Code
  • Atom

But most important for us full-stack developers, it allows us to build apps end-to-end in just one language - JavaScript. This concept was introduced as Isomorphic JavaScript, which means writing client-side code & server-side code in the same language. Up until some time ago, before Node came to be, the most popular languages for writing server code were:
  • PHP
  • Java
  • Ruby
  • C/C++

And as you can see, JavaScript ain't one of them. JavaScript was a client-side language for so many years. It doesn't have access to our computer's internal features. Our dream, for many years, was to be able to use JavaScript to access all of our computer's internal features. For instance, the networking ability: the ability to receive a message, to look at a received message, inspect it, and decide what to send back to the client. Well, actually, there are a bunch of internal features of our computer we might want to use, such as:
  • Network socket - Receive and send back messages over the internet
  • Filesystem - that's where the html/css/js files are stored in files
  • CPU - for cryptography and optimizing hashing passwords
  • Kernel - I/O management

2. The Process Object

The process object in Node.js is a global object that can be accessed inside any module without requiring it. There are very few global objects or properties provided in Node.js, and process is one of them. It is an essential component of the Node.js ecosystem, as it provides various sets of information about the runtime of a program. The process object is an instance of the EventEmitter class. It has its own pre-defined events, such as exit, which can be used to know when a program in Node.js has completed its execution. process also provides various properties to interact with. Some of them can be used in a Node application as a gateway for communicating between the Node application and any command-line interface. This is very useful if you are building a command-line application or utility using Node.js.

  • process.stdin: a readable stream
  • process.stdout: a writable stream
  • process.stderr: a writable stream used for errors

Using argv you can always access the arguments that are passed in the command line. argv is an array which has the running program as the first element (which is always going to be Node) and the absolute full path of the file as the second element. From the third element onwards it can hold as many arguments as you want.
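For example, here is a minimal sketch of reading those command-line arguments (the file name greet.js and the argument value are made up for illustration):

```javascript
// greet.js - a tiny sketch of process.argv and process.stdout.
// process.argv[0] is the node executable, process.argv[1] is the path
// to this file, and everything from index 2 onwards is our own arguments.
const args = process.argv.slice(2);

process.stdout.write(`Hello, ${args[0] || 'world'}!\n`);
```

Running it as `node greet.js Alice` should print `Hello, Alice!`.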

- process operation 1: "on" or "addListener"

addListener and on are actually aliases. Doing process.on('some-event', cb) is the same as doing process.addListener('some-event', cb).
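A quick way to convince yourself of that (just a two-line sketch; on current Node versions this should print true):

```javascript
// process is an EventEmitter, and 'on' is an alias for 'addListener',
// so both names should point at the very same function.
console.log(process.on === process.addListener); // expected: true
```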

Events Related to process:

• Event 1: 'exit'

Run the below program and you can observe that the result comes up with status code 0. In Node.js (and in any other programming language) this status code means that a program has run successfully.

process.on('exit', code => {
  setTimeout(() => console.log('Will not get displayed'), 0);

  console.log('Exited with status code:', code);
});

console.log('Execution Completed');

• Event 2: 'unhandledRejection'

Run the following code:

process.on('unhandledRejection', reason => console.log('Unhandled rejection:', reason));
Promise.reject(new Error('whoops')); // nothing handles this rejection, so the event fires

• Event 3: 'uncaughtException'

Run the following code:

process.on('uncaughtException', err => console.log('Uncaught exception:', err.message));
setTimeout(() => { throw new Error('boom'); }, 0); // nothing catches this, so the event fires
  • process operation 2: Read / Write
    You have Read / Write operations, just like console.log, on the process object:
    process.stdout.write('Hello World!' + '\n');

  • process operation 3: exit
    Terminate a program, giving it an exit code:
    process.exit(1);

  • process operation 999: other useful things (a short sketch follows below)
    process.pid, process.argv, process.env, process.cwd(), process.memoryUsage()
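A small, purely illustrative sketch that prints a few of these:

```javascript
// Inspect a few useful properties and methods on the process object.
console.log('Process id:       ', process.pid);
console.log('Script arguments: ', process.argv.slice(2));
console.log('Working directory:', process.cwd());
console.log('Heap used (bytes):', process.memoryUsage().heapUsed);
console.log('PATH variable:    ', process.env.PATH);
```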


3. The Mental Model

- Background

In this guide, we're going to learn what it means to "open a web application". We're basically, in the most fundamental way, talking about communication between 2 computers. Just to have a clear distinction from now on, let's call the computers that initiate the communication Clients (and say that we have many of those), and call the computer which holds the application the Server, named that way because it serves something back. A "web application" is, at its core, an application running on a server (a computer) that can receive messages from... other computers, the clients. The "web application" is (or at least should be) always "on", always connected to the internet, and always ready... to receive messages sent out by other people's computers. And what do I do as a developer to tell the computer to look at the message and send something back? I write code!

But there's a problem...

Where does this message, the one being sent to me, arrive? It arrives on the Network Card.
In what languages can I write code that has access to my network card?

  • PHP
  • Java
  • Ruby
  • C/C++

So we've established that a language that wants to write server code needs to have access to the Network Card.
Also, this language will need to have access to files!
Because eventually, where do we write code? In files!
So this language must have the ability to read from a file, and maybe even write to a file.
It needs to have access to that computer's filesystem. We put the Network Card and the FileSystem under a group that from now on we'll call Internal Features. JavaScript doesn't have access to our computer's Internal Features. What language does have the ability to access our computer's Internal Features? C++.

JavaScript is gonna have to work hand in hand with C++, so that we can write JavaScript code to control C++ built-in features, which in turn control our computer's internal features. And these two together are known as NodeJS. Why it's called Node JS when there's so much C++ in it still baffles me, but nevertheless - it is known as Node.

The model we are going to see over and over again is: JavaScript affects Node (which is C++), which affects a computer's internal feature. We're going to spend the rest of this course writing JavaScript, to control indirectly, via C++, the computer features we need in order to get our inbound message and then send back a response (the right data, or the right HTML file).

Does that mean I need to know C++?
It turns out that - no.

We're going to get from JavaScript a TON of labels, built in to JavaScript, that are gonna give us control. Labels like "http", "fs", and not that many more, some of which are not built-in per se - you have to sort of summon them on demand (import them). And those labels will give us access to C++ features, which give us access to the computer's internals. We don't need to know C++ to do so, but we do need to understand the "Mental Models" of how it works, and how these JavaScript labels are gonna trigger C++ features. That's what we're gonna do. We'll get an intimate understanding of how JavaScript labels take command through C++, and get a ton of help from Node's C++ (much of Node is C++ code) in order to take command of the internal features of our computer.

- Our Code Mental Model

We write JavaScript code, or more accurately - we use JavaScript labels that Node has prepared for us, in order to trigger Node's C++ code, which eventually reaches an Internal Feature. These JavaScript labels look like JavaScript functions, but no, they are in fact facades; in reality they are commands to Node's C++ features. The vast majority of the interesting stuff, of Node's hard work, is happening down in C++ world, so we'd better understand what's going on there, and also what's going on over at our computer's internal features - like the socket. But first and foremost, we had better have a good understanding of JavaScript. Important note! C++ isn't gonna go directly to the network card itself. It's actually gonna interact with some abstraction layers of the operating system - things like epoll and kqueue. We'll learn all about those later on, so hold off on that for now. In our mental model, JavaScript is going to have 3 major parts:

  1. Save data - numbers, strings, arrays, objects. And also - functionality, which means code that's gonna run later on.
  2. Run code on data. Run functionality on a piece of data. Run a function (a function = saved code that has not been run yet).
  3. Have a TON of built-in labels that are gonna trigger Node features, written in C++, to use our computer's internals.

JavaScript's main store of data is known as the "Global Memory". JavaScript also has the ability to go through code line by line, and that's called the "Thread of Execution". We mentioned earlier that a "web application" is an application able to receive messages from outside users/clients, and that we use code to look at those messages.
So first, how do we bundle up code? We wrap it in a function! So although it may seem trivial, functions will turn out to be the most important construct in JavaScript. Functions = code that's bundled up, which we're saving to run at a later time. In plain JavaScript, in order to run a function, we, ourselves, the developers, put the parentheses on the end of the function's label manually. I have a sneaking suspicion... that Node might end up being the one who puts the parens on the end of our function's label. And also, the one who inserts the input automatically for us. And that's gonna turn out to be the entire paradigm... of Node. Because, when a request comes in, I don't know when it's gonna come! Node would be the one to know, so therefore Node must be the one to trigger... executing the function.
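Here's a tiny sketch of that idea (the function name and its body are just placeholders): when we run a function ourselves, we add the parentheses; when we hand it to Node, we pass only the label and Node adds the parentheses later.

```javascript
// Save some code for later - nothing runs yet.
function doOnIncoming() {
  console.log('A message arrived!');
}

// Us running it: we put the parentheses on the label ourselves.
doOnIncoming();

// Handing it to Node (sketch): we pass only the label, no parentheses.
// Node will add the parentheses itself when an inbound message arrives:
// http.createServer(doOnIncoming);
```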


4. The http module

(HTTP = HyperText Transfer Protocol) Let's remind ourselves of the final goal first. Our dream is to write JavaScript code that can look at an incoming message off the internet, inspect it, and send back the right response. That's our dream. So, there had better be a label in JavaScript that accesses, or sets up, a Node C++ feature that can access the networking feature of our computer. There better be one! Well actually, there is - it's called "net". But there's a more specialized one, called "http". "http" = a Node feature that's gonna access the network card (effectively) and be able to receive messages in the HTTP format. Later on we're gonna see HTTP in greater detail, but for now know that it's a format by which you send messages (or "requests") from a web browser. We need our network open, and we're gonna discover that what we actually open is a socket, which is an open connection to the internet, an open channel. A two-way open channel. And we're gonna discover that this open connection needs to be set up such that it's ready to receive HTTP-formatted messages. How are we gonna do that from JavaScript? Via labels, that trigger a Node C++ feature, that triggers an internal feature of our computer. A powerful feature of Node C++ is "http". We are now going to see the "http" feature of Node being used to set up and open a socket connection to the internet. Socket is a posh word for saying "an open channel for data to go in and out of a place over the internet". Sounds complicated, right?? All we ever wanted was to build a cool looking app! But folks, if we get this principle down... we'll discover there's nothing else to Node. In this guide we're going to see ALL features of Node besides one - multi-threaded tasks, meaning that, in some way, we could handle multiple JavaScript instances at a time.
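To make the "label" idea concrete before we meet "http", here is a minimal, hedged sketch of the lower-level "net" label mentioned above (the port number and the messages are arbitrary); "http" layers the HTTP-specific behavior on top of this kind of raw socket:

```javascript
// A raw TCP server using the lower-level "net" label.
// Every connection hands us a socket: a two-way open channel.
const net = require('net');

const server = net.createServer(socket => {
  // Data arriving from the client comes in as raw bytes (a Buffer).
  socket.on('data', chunk => {
    console.log('Received:', chunk.toString());
    socket.write('Got your message!\n'); // send something back down the same channel
  });
});

server.listen(8080); // arbitrary port, just for the sketch
```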


5. createServer

It turns out that http has a built-in method called createServer. createServer can also be referred to as a label... for a Node C++ feature that sets up an open channel to the internet. http.createServer is a command for a Node C++ feature. To do what? To set up a network feature of Node, specializing in the http protocol, ready to receive messages. Well, that's not really the interesting part, because what's really interesting is what it's gonna do in the computer's internals. With the help of libuv. libuv is a bunch of pre-written C++ code, technically built separately from Node, BUT! Its most prominent use is in Node. libuv is a bunch of C++ code written to ensure that we can run Node on any operating system, and link up effectively between C++ code written in Node and any computer's internal structure, whether it's a Mac, Linux, or Windows. So, let's repeat that one more time: http.createServer is a command for a Node C++ feature that, with the help of libuv, is going to set up in the computer's internals an open socket, an open channel to the internet, ready to receive messages. That's it! One line! const server = http.createServer(); Our computer is now (almost) ready to receive messages. In one line! In one label! It opened that channel. In and out messages. One issue though. There are about 64,000 numbers (ports) that represent entry points to my computer. That's a bit of an issue, because when a message arrives at my computer, which entry point will it come in at? The default port for ANY http message sent from a browser is (of course) 80. So that message is gonna try and arrive at port 80. Well, one might say: "Damn, I already created a server with that one line, without setting the port number. How the hell do we continue to edit it? Is it even possible?". Luckily, Node realized that we're not gonna do all of the commands for the underlying C++ feature in one line, so what does it do? The other thing that createServer does is immediately, and this is crucial to understand, IN JAVASCRIPT, return an object full of functions (methods), including ones like "listen" and "on", all of which, when run, will allow us to continue to edit the instance of the http feature in Node that we've set up. Let's repeat that one more time: what http.createServer does is divided into 2 parts. One part is related to what it's doing in Node, and the second part relates to what it's doing in JavaScript. In Node, createServer sets up the "http" feature of Node, which is actually, behind the scenes, sending a message to the computer's internals, where it's going to turn on, in the networking portion of our computer, an open socket, which is a fancy word for saying "an open channel to the internet that is two-way", meaning that it can receive data and send data back. Node's output of running createServer is setting up a socket. In JavaScript, createServer returns an object full of methods, which we call "edit functions", since they let us "edit" this particular Node http instance; they are linked directly to the particular socket that has been opened by createServer. This returned object is the JavaScript output of running createServer, and it allows us to keep modifying the server. Here are some of the main functions that are available to us on the returned object:

  • listen
  • on We've also mentioned that a socket needs a port number specified. The listen method is an edit function that lets us edit the port number of the http server instance. The "on" method (function) lets us set up which functions we want to auto-run when a certain event occurs.

6. The "listen" method

The "listen" function is a method on the returned JavaScript object which comes back from createServer. The "listen" method is used to have the HTTP server start listening for connections. The "listen" method has an edit access to the open socket created by createServer. The "listen" method doesn't do anything in JavaScript. That thing that the "listen" method can edit is the port number. It sets the port to whatever you tell it to. We use it like so: server.listen(80); And there we have it, our computer is now ready to receive messages from the internet, in two lines!! A lot of shot is happening behind the scenes, but... it's 2 lines in JavaScript, to trigger a ton of sophisticated stuff, like opening a channel at a specific entry point.


7. Auto-Run a Function

- Auto-Run A Function

const server = http.createServer(); server.listen(80); A recap from earlier - we set up a channel ready to receive data in 2 lines. So, let's first review a scenario: a message comes into our computer, an inbound message, and we want to do something like this: Pseudo-code: if (inboundMsg) --> send back proper response. There's a problem here. When is this code going to run? In fact, I have no idea! A message could arrive at any given time, day or night! Who does know when an inbound message arrives? Node knows! And so, perhaps... we're going to rely on Node to AUTOMATICALLY run this line of code for us. But how could we bundle up the code in order for it to be triggered to auto-run by Node? In a function. And that's what we're gonna do again and again and again. We're gonna bundle up code in a function, that we want to have auto-run by Node to do stuff like - look at an inbound message, and send data back - when Node sees (with the help of libuv) that a message has arrived. We are going to save code, wrap it in a function, give that function a name like doOnIncoming, and give that function to Node. And in return, Node is gonna auto-run that function for us when a request (an inbound message / request for data) arrives from a user. const server = http.createServer(doOnIncoming); server.listen(80); It turns out, createServer does an extra thing! It accepts a function as an argument, which will be auto-run by Node on an incoming message. Whatever function we insert there is what's going to be auto-run when a message comes in. Keep in mind, as we're going to see this again and again: any task that will take a long time will be set up in Node, and then have a function attached to it that will be automatically triggered to run when the background task either completes or has activity. Among these tasks are: talking to a database, talking to the file system.

- Auto-Insert Arguments

We saw that createServer takes in a function as its first argument, and that we gave it doOnIncoming. When relying on Node to auto-run our doOnIncoming function, we give it the function's label. Node will then automatically add the parentheses in order to execute my function when the time comes. However, that creates a problem for me, because if I'm not the one putting on the parentheses, how am I able to pass in arguments? To insert the arguments? It turns out Node has two main jobs: automatically add the parentheses when the time comes, in order to auto-run the function, and also... automatically insert the arguments full of data for me. And wouldn't it be amazing if that data were exactly the inbound message that I need to inspect, in order to determine what to send back. All I have to do in return is prepare my function in such a way that I can catch those arguments when they are auto-inserted. And that's all possible by using placeholders known as - parameters.


8. Request & Response

When an inbound message arrives at my computer, Node takes the function that we told it to run (my doOnIncoming), and it runs it. But it also... passes in 2 objects. Now, we said that we want to see the message and be able to read it. Wouldn't it be amazing if that first argument passed in were actually the message itself? Turns out, it is! Do I get the inbound message as a string? I actually don't. Because Node wants to make my life easier. The next thing Node does, as soon as a message comes in, is immediately set up an http message ready to send back - but both of them (the incoming and the outgoing) are in a format that I don't wanna deal with in JavaScript. So instead, Node is going to automatically package up 2 JavaScript objects for us. Note that they are JAVASCRIPT objects, but they are being set up by Node. Node is going to auto-create them. These two objects are the most important objects in all of Node. The first one packages up for us, in a nice JavaScript object, the important information from the inbound message.

What's the most important information we got from the message?

  • URL
  • headers
  • method
  • . . .

Node is gonna parse the message, grab all the things you might want and need, and put them in the object above, in those nice little properties (like req.url). This object comes without a label, so we need to give it one - we need to put a placeholder (known as a parameter) to capture this object and give it a name. The name can be anything we want. Traditionally, we call it "req", but you can choose your own. Now, what do we want doOnIncoming to do eventually? Reading is not our main purpose, it's not our goal. Our goal is to eventually send back a response message! Reading is just a means to that goal! How are we gonna do that? We had better have, inside this function's code, access to the message that's going back, so we can add stuff to it - some data, maybe some HTML or CSS - and send it back. So the next thing Node does, as soon as a message comes in, is immediately set up an http message ready to send back, but both of them (the incoming & the created outgoing) are in a format that I don't wanna deal with in JavaScript. The second object Node created for me has a bunch of methods - JavaScript labels - for editing/updating the final form of the outgoing message. They are fundamentally 2 differently behaving objects. One has actually got the inbound data on it, which we can access; the other has functions that, when we run them from JavaScript as labels, reach back into Node to add stuff to the outgoing message which gets sent back.

So, a full recap:
In 3 lines (plus one require at the top), we have set up our server.

const http = require('http');

function doOnIncoming(req, res){ res.end('Welcome to LuckyLove!') }
const server = http.createServer(doOnIncoming);
server.listen(80);

What does this code do:
⁃ Save the function doOnIncoming.
⁃ Use a label called createServer to set up a Node background feature, which really is an internal feature - specifically, opening a socket, an open channel to the internet. The talking-to-the-internal-features part is done through a library called "libuv".
⁃ Store a function (doOnIncoming) to run when an inbound message comes in, by passing it as the first argument. A function that will be automatically triggered by Node, and... the most important piece of all: on an inbound message, not only is Node going to auto-run the function, Node is also going to auto-insert into it the 2 most important objects of all - the request & response objects - that were automatically made by Node. One of them has all the information from the inbound message, packaged up in a nice format, where each property holds 1 piece of information related to the message. The second one is an object full of functions, all of which are linked to the auto-created response message, so we can add text or content to it (HTML files, images, JavaScript files) by running some of those functions on that auto-inserted second object. One of them is res.end, which tells Node "Hey, this message is ready to be sent back, let's go!". end can also accept some final content to send.
⁃ Get back an object full of functions, which we call our "edit functions", that when run tap into this instance of the internal feature and update it on the go.
⁃ Use the "listen" method on that returned server object to set the port to 80, and start listening.


9. Editing the soon-to-be-sent-out response

As we've mentioned before, Node gives us an object full of edit functions, which we can then use to edit our outgoing response. The most common ones are "write" & "end", but we will see some more today. The list is:

  • end
  • write
  • headers (set via setHeader / writeHead - see the sketch below)
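Here is a hedged sketch of using these edit functions on the response object (the header value and the body text are just examples):

```javascript
const http = require('http');

function doOnIncoming(req, res) {
  // Edit the soon-to-be-sent response before letting it go:
  res.setHeader('Content-Type', 'text/html'); // set a header on the outgoing message
  res.write('<h1>Welcome to LuckyLove!</h1>'); // add content to the body
  res.end();                                   // tell Node the message is ready to be sent back
}

const server = http.createServer(doOnIncoming);
server.listen(80);
```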

10. How to turn on Node

Where do I write JavaScript? I can write JavaScript anywhere, it's just some text. Typically I write JavaScript in a file, and then give it to the web browser to execute, but I can write it anywhere. Node, however, is an app. It's just an app that I turn on, just like a web browser. A web browser is just an app that has access to my computer's internals - we know that because it can send messages to the internet. Node is no different. How do I open a normal app on my computer? Double click. Unfortunately Node isn't like that. Because developers hate double-clicking. They are pathologically opposed to using their mouse. So instead, we have a different way of turning on apps, of interfacing with the computer's features. And that's by using the terminal/command line. We can even turn on VS Code from the terminal. And we can also turn on Node, by writing "node" and pressing Enter. When we do so, we want it to start running some JavaScript code. Where do we tell it to run the code from? Aha! We're gonna save some code in a file, and then give the name of the file, along with its path, to the node command:

node ./server.js

If Node is installed on your computer, it will start loading. It will turn on Node, which turns on the JavaScript engine, which allows you to turn on Node features by writing JavaScript(-ish) code. We write all that JavaScript code in the saved file whose path we provided. Before the days of "nodemon", if we had made any changes in our JavaScript file (server.js), we would have had to turn off Node, turn it back on, run through all the JavaScript code, and set up all of those internal features all over again.


11. Error Handling - The "server.on" method

We get errors in server-side development. We're bound to! Because eventually we're dealing with someone else's computer trying to send us messages. There are a thousand things that could go wrong in the process. We need to be able to handle errors. We need to understand better how our background http feature is working. Right now, it only auto-triggers doOnIncoming when a message comes in. But what if we get a corrupt request? Do we want to look at it? Investigate it? And send something back? No. We wanna look at it, maybe log it, and see what error is at hand. Wouldn't it be nice if we could set up another function that will be the one to run when a client error shows up? Turns out, there's a little piece of Node we haven't discussed. When the inbound message arrives, Node isn't automatically running the function doOnIncoming directly. It actually sends out a loud shout! Within Node. A message - they call it an "event" - just a word/string that is emitted in Node. And that word is "request". That event is what triggers the call to (the execution of) doOnIncoming. How do we tell Node that we want THAT function to trigger on THAT word when it's broadcast? Well, we actually did that implicitly, here:

const server = http.createServer(doOnIncoming);

The first parameter passed to createServer, is actually saying "Hey Node, when the word/event request gets emitted, run doOnIncoming". But we can actually do it manually ourselves:

const http = require('http');

function doOnIncoming(req, res){ res.end('Welcome to LuckyLove!') }

function doOnError(infoOnError){ console.log(infoOnError); }

const server = http.createServer();
server.listen(80);

server.on('request', doOnIncoming);
server.on('clientError', doOnError);

Our returned server object has another edit function called "on". The "on" function doesn't do anything in JavaScript, only in Node. With the help of "on", we can match an event name built into Node with a JavaScript function we defined earlier. When that event is then triggered by Node, Node will trigger the running of the function we gave it as a match. We are now seeing that behind the scenes we actually have 2 built-in events, called 'request' & 'clientError'. Previously, createServer was using one of them, even though we weren't aware of it, when we ran http.createServer like this:

http.createServer(doOnIncoming);

What it actually did behind the scenes is this:

server.on('request', doOnIncoming);

Basically, this was saying: "When a good message comes along, i.e. when the request event is emitted, invoke the function called doOnIncoming". But what if a bad request comes along? For that we have an event name known as "clientError". So together, we have two:

server.on('request', doOnIncoming);
server.on('clientError', doOnError);

And these are the explicit way to do so, to attach a function to a certain event that's emitted.


12. The File System

We have much of our app now set up, comprised of handling and inspecting the inbound messages (which are in http format) known as requests. It's the core of our app! People, it's the core of Node. And it's the core of servers. This IS Node. Everything else is ancillary. It's really additional to this core model of: inbound message, send back a response. Because that's all we really care about - the heart of it is: a user/client wants stuff, we look at what they want, we have a message that's ready to be sent out, we add stuff to that message, and we send it back. However, Node can do even more. With Node, we can use a feature of the computer known as the File System, or simply the hard drive. How? Well Node, or rather C++, has access to that. It has a built-in feature that speaks to the File System. It's known as fs, both in Node and in JavaScript - meaning the label in JavaScript for Node's fs feature is also called fs. We are now gonna see Node using another internal feature of our computer - the File Storage/System. Let's now, for the sake of learning, assume we have a very large file called "tweets.json". Where? Let's say it's in some folder called "root". We will consider "root" to be the folder our Node application is running from, because all Node apps must run from some folder, and Node knows which one it is.

Alright, question! Do we have access to that file through JavaScript? Definitely not! In the web browser we definitely don't either! For security reasons! So from the start, Node had no one to copy from when it came to reading/writing a file. Thank god we had someone write a ton of C++ code to give us access - and of course the wonderful team at libuv, who worked in conjunction with Node and are a big part of the interface between these two portions, especially when it comes to the file system. Because actually, we're gonna see that libuv is gonna spin up a thread of execution in the background to handle the pulling of that data into Node. And all that because the creators discovered there was too much complexity, and too many risks, in relying on the operating system's own ability to spin up a background thread and handle things going on in the background, and said: "ok, we're just gonna always make sure we have an open, focused channel to pull this data into Node". We will talk more about that a little bit later.

So, we're just about ready to start getting some data from our file system, where we saved a file called tweets.json. Now we want to use a JavaScript label for a Node feature, written in C++, that does have access to our file system. What is the JavaScript label that's gonna give us access to Node's C++ feature that gives us access to the file storage? fs is the label, and fs.readFile is the specific label built into fs that gives us access to read a file. fs.readFile takes in a string, which will later be interpreted as a "path" - a position of a file in our file system - and I'm guessing we're gonna rely on Node to know how to go and look into the file system, grab that file, and start bringing it back in.

Let's see some code:

const fs = require('fs');

function cleanTweets(tweetsJson){
  // code that removes bad tweets.
  return tweetsJson;
}

function useImportedTweets(errorData, data){
  const cleanTweetsJson = cleanTweets(data);
  const tweetsObj = JSON.parse(cleanTweetsJson);
  console.log(tweetsObj.tweet2);
}

fs.readFile('./tweets.json', useImportedTweets);

The important line is of course the last one, fs.readFile. We see that readFile accepts a string as its first argument. To JavaScript, this is just a string of characters - it has no idea what it means. Node, however, is going to look at it and say "oh ok, it's asking for this specific location on the computer". The second argument is a function that we want to have AUTO-RUN! That's our keyword! Auto-run! When do you think that function is going to auto-run? When the file has been completely read. So JavaScript triggers a Node feature using a label called fs.readFile, passing it a string and a function to auto-run. Node then takes the two arguments and says: "ok, let's first have a look at that string and try to parse it as a path". It sees that it starts with a dot, "./", which tells it to look in the current running folder. For what file? For "tweets.json". Node is doing that with the help of libuv. Now... without going into too much detail... just so you know, this is unlike when we speak to the network: Node and libuv do not handle the actual opening of the socket and the focusing of a thread on awaiting an inbound message. A thread is just the processing power of the computer to focus on a single task at a time - in this case, to listen for an inbound message. We are not responsible, in either Node or libuv, for having a thread dedicated to awaiting an inbound message. That is handled by the computer's internals, the operating system itself.

However, because there's (and this is to my understanding) too much variety in how different computers implement access to file storage and the file system, libuv said: we're gonna handle the setting up of a persistent thread that's going to focus on pulling that data into Node, and we're gonna be in charge of that, on any computer operating system you use.

This is one of the big senior questions in Node:

  • "Name an I/O in Node that sets up a dedicated thread, which is handled by libuv for doing that task"

  • "Name an I/O in Node that sets up a dedicated thread, which is NOT handled by libuv for doing that task"

The answers are: the File System sets up a thread in libuv; I/O through a network/socket relies on the computer (the operating system) to do the focusing and awaiting on the inbound message. So, we're reading that file, reading reading reading, and when the time comes, Node is going to auto-run the function and pass it arguments! I have a feeling the auto-created, auto-inserted data (the arguments) might just be the data of the file (tweets.json) that has just been read. Notice how the error is the first argument being passed! This is known as the "Error-first" pattern, which Node embraced. When there's no error, the error argument defaults to null - not undefined, but null.
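Here is a small sketch of the error-first pattern in practice (reusing the file from above; the handling logic is just illustrative):

```javascript
const fs = require('fs');

fs.readFile('./tweets.json', (err, data) => {
  // Error-first: the first parameter is the error, the second is the data.
  if (err !== null) {
    console.log('Something went wrong:', err.message);
    return;
  }
  // No error: err is null, and data holds the file contents (a Buffer).
  console.log('Got', data.length, 'bytes of tweets');
});
```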


13. The Event System

Here we're gonna do pretty much the same thing we did before, but this time we're gonna be a bit more sophisticated and add a key pattern - a pattern known as the "Event System". The "Event System" is a system in which, behind the scenes, when things happen in the computer's internal features, Node is going to broadcast a message. The two events we saw were "request" & "clientError". By default, "request" & "clientError" have no functions attached to them, but they still get triggered when an inbound message arrives, each under its own condition. Our job is to attach a function to the "hook" they each have, using the "on" method. Wishful thinking here, but we want doOnIncoming to auto-run when "request" is emitted, and doOnError to auto-run when Node shouts "clientError".


14. The Event Loop

- from the official docs

  • What is the Event Loop?
    The event loop is what allows Node.js to perform non-blocking I/O operations — despite the fact that JavaScript is single-threaded — by offloading operations to the system kernel whenever possible. Since most modern kernels are multi-threaded, they can handle multiple operations executing in the background. When one of these operations completes, the kernel tells Node.js so that the appropriate callback may be added to the poll queue to eventually be executed. We'll explain this in further detail later in this topic.

  • Event Loop Explained When Node.js starts, it initializes the event loop, processes the provided input script (which may make async API calls, schedule timers, or call process.nextTick() ), then begins processing the event loop. The following diagram shows a simplified overview of the event loop's order of operations.

       ┌───────────────────────────┐
    ┌─>│           timers          │
    │  └─────────────┬─────────────┘
    │  ┌─────────────┴─────────────┐
    │  │     pending callbacks     │
    │  └─────────────┬─────────────┘
    │  ┌─────────────┴─────────────┐
    │  │       idle, prepare       │
    │  └─────────────┬─────────────┘      ┌───────────────┐
    │  ┌─────────────┴─────────────┐      │   incoming:   │
    │  │           poll            │<─────┤  connections, │
    │  └─────────────┬─────────────┘      │   data, etc.  │
    │  ┌─────────────┴─────────────┐      └───────────────┘
    │  │           check           │
    │  └─────────────┬─────────────┘
    │  ┌─────────────┴─────────────┐
    └──┤      close callbacks      │
       └───────────────────────────┘

Each box will be referred to as a "phase" of the event loop.
Each phase has a FIFO queue of callbacks to execute. While each phase is special in its own way, generally, when the event loop enters a given phase, it will perform any operations specific to that phase, then execute callbacks in that phase's queue until the queue has been exhausted or the maximum number of callbacks has executed. When the queue has been exhausted or the callback limit is reached, the event loop will move to the next phase, and so on.
Since any of these operations may schedule more operations and new events processed in the poll phase are queued by the kernel, poll events can be queued while polling events are being processed. As a result, long running callbacks can allow the poll phase to run much longer than a timer's threshold. See the timers and poll sections for more details.
NOTE!!!
There is a slight discrepancy between the Windows and the Unix/Linux implementation, but that's not important for this demonstration. The most important parts are here. There are actually seven or eight steps, but the ones we care about — ones that Node.js actually uses - are those above.

- Phases Overview
• timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
• pending callbacks: executes I/O callbacks deferred to the next loop iteration.
• idle, prepare: only used internally.
• poll: retrieve new I/O events; execute I/O related callbacks (almost all with the exception of close callbacks, the ones scheduled by timers, and setImmediate()); node will block here when appropriate.
• check: setImmediate() callbacks are invoked here.
• close callbacks: some close callbacks, e.g. socket.on('close', ...).
- Phases in Detail

_Timers_

A timer specifies the threshold after which a provided callback may be executed, rather than the exact time a person wants it to be executed. Timer callbacks will run as early as they can be scheduled after the specified amount of time has passed; however, operating system scheduling or the running of other callbacks may delay them.
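A quick sketch of that (the ~200 ms of blocking work below is simulated with a busy loop, purely for demonstration): the timer is scheduled for 100 ms, but it cannot fire until the blocking code has finished, so it runs late.

```javascript
const start = Date.now();

// Ask for this callback after (at least) 100 ms - a threshold, not a guarantee.
setTimeout(() => {
  console.log(`Timer fired after ~${Date.now() - start} ms`);
}, 100);

// Block the single JavaScript thread for roughly 200 ms.
while (Date.now() - start < 200) { /* busy-wait, just to delay the timer */ }
```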

15. Asynchronicity

...
setTimeout
The Timers module in Node.js contains functions that execute code after a set period of time. Timers do not need to be imported via require(), since all the methods are available globally to emulate the browser JavaScript API. To fully understand when timer functions will be executed, it's a good idea to read up on the Node.js Event Loop.
setTimeout is not JavaScript. There's no Timer in JavaScript. Even in the web browser, it's something that happens in the background features of the browser, known as Browser APIs. In Node, those features had to be built from scratch. And one of them is the ability to set up a Timer. Now, how exactly are timers run in Node? Well, for starters, what are timers? In general, a timer just records a time from the computer's clock and keeps comparing it against the current time, to see whether enough milliseconds have passed to run the thing you delayed by X milliseconds. So we'll just call it a timer, but technically, in the background, it records the start time at which setTimeout ran, and then checks... well, actually it's libuv, and actually, technically, it's the event loop that checks each time: "has enough time passed such that the timer is complete, and the associated attached function wants to run?".

...
The event-loop is very strict.
What rules does it set for what code to run next and when it may run?
...

There's actually a whole bunch of queues. And different functions set to auto-run by Node will be put in different queues.
The event-loop is really restrictive on what has to have finished in synchronous regular JavaScript code before anything from those queues is allowed on to the callstack.
Node is most powerful because of the automated JS function execution triggered by Node at just the right moment.
This means we don't have to wait in JS for the right moment to run code and block any other code running, but it also means we better know immediately how Node decides what to automatically execute at what moment.
And so we're now gonna set up a scenario here, and see 9 lines of code, with 4 functions and we're gonna delay 3 of them by using Node C++ features to attach these functions and have them auto-run. And we're gonna see that each of the functions will behave differently. In other words, when it's triggered and ready to run, it'll be added to a different queue, and the event-loop will prioritize certain queues over other queues.

```javascript
const fs = require('fs');

function useImportedTweets( errorData, data ){
  const tweets = JSON.parse(data);
  console.log(tweets.tweet1);
}

function immediately(){ console.log('run me last! _crying_') }

function printHello(){ console.log('Hello') }

function blockFor500(){
  // Block the JS thread DIRECTLY for ~500ms (not a timer!) -
  // a stand-in for e.g. a for loop over 5 million elements.
  const start = Date.now();
  while (Date.now() - start < 500) { /* busy-wait */ }
}

setTimeout(printHello, 0);

fs.readFile( './tweets.json', useImportedTweets );

blockFor500();

console.log('Me first!');

setImmediate(immediately);
```

Up here we're saving 4 functions, and then gonna set a printHello function to run after 0 milliseconds (though I already have a feeling it won't run after exactly 0 milliseconds), and I'm gonna set up a function useImportedTweets to run after a bunch of tweet data is imported - the whole file, using readFile. I'm then gonna run the blockFor500 function - whose exact contents don't matter, but note that when it runs, it is not a timer! It is going to do some task in JS (say, loop over 5 million elements) that ends up taking 500ms in JavaScript - not in Node, but in JavaScript. Then we're gonna run a console.log, and then we're gonna run a funny little thing called setImmediate - which is another Node feature that gives us control over putting stuff into another, totally separate queue. We'll see. We're gonna see 3 of the queues today. There are 2 more which we're not gonna cover in depth - I'll tell you what they are though.

The event-loop is very strict. What rules does it set for what code to run next and when those functions are allowed back in?

setTimeout is our first interesting line. When it executes, Node sets up a timer, gives it 0 milliseconds (as per the above code), so it immediately gets a "status" of "done" - and where does the function go? Does it go back onto our call stack? No! It goes into our "Timer Queue". Then comes the line with fs.readFile(), which sets up an instance of the fs feature of Node to access the file system, with the help of libuv, which actually sets up a background thread to handle the passing of the data and focus on the data coming in. readFile needs to know the path to the file, and the function to auto-run when it's complete. Now, in JavaScript, we hit the line calling blockFor500. It gets pushed onto the call stack, and is gonna sit there for 500 ms. Let's say that around 200ms in, fs.readFile finishes its job. The file comes back. I shall remind you that the attached function to readFile, the auto-run function, was useImportedTweets. What data is going to be auto-inserted into it? 2 arguments! The error data, which we hope will be null, and the file data in the Buffer format. Is the function useImportedTweets allowed onto the call stack? Definitely not. So, instead, we have a second queue, which is called the "I/O queue", or the "I/O callback queue". Into it are queued up functions triggered to run by... well, honestly, most of Node. Like, 95% of the functions you'll have set to auto-run will end up in this queue. Any that involve data coming from the file system, from a network socket, any of those - all the associated auto-run functions go into the I/O callback queue. So useImportedTweets is stored within the "I/O callback queue". And so it sits there, waiting for the event loop to call it. After 500ms have passed since the initiation of blockFor500, does the event loop feel free to go look inside one of the queues? No! Because it still has some code left to run/execute within the script! What is the next thing it runs?

console.log("Me first!");

So, "Me first!" Is really the thing that's gonna get printed first. But even now, there's still some more code we need to run! And this next one is really really an intriguing one.

setImmediate(immediately);

This is a feature of Node to ensure that we can add a function to run after all I/O functions have finished running. And that is called the "Check Queue". How do I get a function into the check queue? I use setImmediate. Whatever function I pass into setImmediate will be the absolute opposite of running immediately. It is the worst-named function in all of history, and it's been at least 10 years since it was released, so bear that in mind. It is the worst name, because it goes in the LAST queue that's gonna be checked. But put the name aside for a moment: there are times where we want to make sure that all I/O work at that point has been done - all completed input/output auto-run functions have run. And the way we can do that is by using setImmediate, which will put the associated function in that last queue. So, we ran setImmediate and passed it the function we called immediately. This setImmediate function also speaks to Node, which is going to set up a C++ background feature that instantly grabs the associated function and pushes it into the "Check queue". Now, is there any global code left? Is there anything on our call stack? No! So here kicks in the event loop, and says "We're good to check the queues!".

What is the status of each of our queues?

(Partial sketch)

       │                           │
       │                           │
       │───────────────────────────│
   ┌───│          global           │
   │   └───────────────────────────┘
   │            Call Stack
   │
   │   Timer Queue
   │   ┌───────────────────────────┐
   ├───│ (1) printHello            │
   │   └───────────────────────────┘
   │
   │   I/O Callback Queue
   │   ┌───────────────────────────┐
   ├───│ (1) useImportedTweets     │
   │   └───────────────────────────┘
   │
   │   Check Queue
   │   ┌───────────────────────────┐
   └───│ (1) immediately           │
       └───────────────────────────┘

So the first queue we're checking is the Timer Queue. Well, it's technically what's called a timer min-heap, data-structure-wise, but we can metaphorically think of it as a queue. So we pop printHello out of the Timer Queue, and put it onto the call stack. It gets executed, and prints "Hello" in our console. Next we check the I/O callback queue, which is where 95% of our delayed, to-be-auto-run functions will be - where the most interesting stuff is happening, because... timers are not that interesting. Everything that interacts with I/O goes in the I/O callback queue. So Node checks the I/O callback queue and finds useImportedTweets there, which was added long ago, and oh! It's got auto-inserted data! useImportedTweets is added to the call stack, gets executed, and console.logs tweet number 1, which let's assume contained the word "Hello". useImportedTweets is then popped off the call stack, and the event loop checks its final queue - knowing that ALL I/O callbacks have been run automatically, handled one by one - and then, and only then, does Node check its final queue: the Check queue. Node checks the "Check queue" and finds the function immediately there, dequeues it, adds it to the call stack, executes it, and console.logs "run me last".

Now, we are not going to talk about Promises today, but if you've watched "Async Hard Parts" you would have heard about Promises. These are an alternative way of setting up work to happen in Node, with C++, where rather than a function being auto-run in JavaScript, data will be auto-updated in JavaScript, which will trigger a function to run on that data. That function is not added to any of these queues. It's added to something called the "Micro-task queue", which takes precedence over each of these queues. It's like priority number 0. Not only that - the event loop actually goes back and checks that queue in between each of the checks of the other queues. And ACTUALLY! There are TWO of those! Let's call them (a) and (b). (a) The first one: if we run a function which we're not really meant to use anymore, called process.nextTick(), and pass it a function, that function gets stuck in this very first queue. (b) Any function delayed using Promises gets inserted into the second queue. And in between the event loop checking any of the 3 queues we've mentioned before, it will always go back and first check these 2 micro-task queues before it moves on to check the next queue. There's one final queue, the "Close queue", for "close" events: any functions set to auto-run on, for example, the closing of a stream will be added to this "Close queue". So that's 6 queues in total that the event loop prioritizes: one, two, three and four, plus a zero queue - which actually splits into (a) and (b) - that gets checked in between any of the other four queues. That is now all the queues of Node's event loop.

(Full sketch)

       │                           │
       │                           │
       │───────────────────────────│
   ┌───│          global           │
   │   └───────────────────────────┘
   │            Call Stack
   │
   │   Queue 0: the Micro-Task Queue
   │   ┌──────────────────────────────────────┐
   ├───│ (a) process.nextTick()               │
   │   │ (b) Functions delayed using Promises │
   │   └──────────────────────────────────────┘
   │
   │   Queue 1: the Timer Queue
   │   ┌──────────────────────────────────────┐
   ├───│ setTimeout, setInterval              │
   │   └──────────────────────────────────────┘
   │
   │   Queue 2: the I/O Callback Queue
   │   ┌──────────────────────────────────────┐
   ├───│ fs, network and other I/O callbacks  │
   │   └──────────────────────────────────────┘
   │
   │   Queue 3: the Check Queue
   │   ┌──────────────────────────────────────┐
   ├───│ setImmediate                         │
   │   └──────────────────────────────────────┘
   │
   │   Queue 4: the Close Queue
   │   ┌──────────────────────────────────────┐
   └───│ readStream.on('close'),              │
       │ socket.on('close'), ...              │
       └──────────────────────────────────────┘

This entire model is built in Node, with the help of libuv. All of these queues, and the event loop triggering, it is all in Node. Not JavaScript. It is all built in C++.

Summary - Rules for the automatic execution of JS code by Node:
1. Hold each deferred function in one of the task queues when the Node background API "completes".
2. Add the function to the call stack (i.e. execute the function) ONLY when the call stack is totally empty (have the Event Loop check for this condition).
3. Prioritize functions in the "Timer" queue over the ones in the "I/O callback" queue, over the ones in the "Check" queue, over the ones in the "Close" queue. AND!!! Prioritize the "Micro-task Queue" over any of these 4 queues.

A deferred function - when I say deferred, I'm talking about one that we didn't run ourselves and that's gonna be auto-run later: deferred, delayed.

           (Full sketch)

       │                           │
       │                           │
       │───────────────────────────│
   ┌───│          global           │
   │   └───────────────────────────┘
   │            Call Stack
   │
   │   Queue 0: the Micro-Task Queue
   │   ┌──────────────────────────────────────────────────┐
   ├───│ (a) process.nextTick()                           │
   │   │ (b) Functions delayed using Promises             │
   │   └──────────────────────────────────────────────────┘
   │
   │   Queue 1: the Timer Queue
   │   ┌──────────────────────────────────────────────────┐
   ├───│ setTimeout, setInterval                          │
   │   └──────────────────────────────────────────────────┘
   │
   │   Queue 2: the Pending Callbacks Queue
   │   ┌──────────────────────────────────────────────────┐
   ├───│ I/O callbacks deferred to the next loop          │
   │   │ iteration; callbacks for some system operations  │
   │   └──────────────────────────────────────────────────┘
   │
   │   Queue 3: the Poll Queue
   │   ┌──────────────────────────────────────────────────┐
   ├───│ incoming: connections, requests, data, etc.      │
   │   └──────────────────────────────────────────────────┘
   │
   │   Queue 4: the Check Queue
   │   ┌──────────────────────────────────────────────────┐
   ├───│ setImmediate                                     │
   │   └──────────────────────────────────────────────────┘
   │
   │   Queue 5: the Close Queue
   │   ┌──────────────────────────────────────────────────┐
   └───│ readStream.on('close'), socket.on('close'), ...  │
       └──────────────────────────────────────────────────┘

• Queue 2: The Pending Callbacks Queue

This phase executes callbacks for some system operations, such as types of TCP errors. For example, if a TCP socket receives ECONNREFUSED when attempting to connect, some unix systems want to wait to report the error. This will be queued to execute in the pending callbacks phase.

• Queue 3: The Poll Queue

The poll phase has two main functions: 1. calculating how long it should block and poll for I/O, then 2. processing events in the poll queue. When the event loop enters the poll phase and there are no timers scheduled, one of two things will happen:

• If the poll queue is not empty, the event loop will iterate through its queue of callbacks executing them synchronously until either the queue has been exhausted, or the system-dependent hard limit is reached.

• If the poll queue is empty, one of two more things will happen:

⁃ If scripts have been scheduled by setImmediate(), the event loop will end the poll phase and continue to the check phase to execute those scheduled scripts.

⁃ If scripts have not been scheduled by setImmediate(), the event loop will wait for callbacks to be added to the queue, then execute them immediately.

Once the poll queue is empty the event loop will check for timers whose time thresholds have been reached. If one or more timers are ready, the event loop will wrap back to the timers phase to execute those timers' callbacks.

• Queue 4: The Check Queue

This phase allows a person to execute callbacks immediately after the poll phase has completed. If the poll phase becomes idle and scripts have been queued with setImmediate(), the event loop may continue to the check phase rather than waiting. setImmediate() is actually a special timer that runs in a separate phase of the event loop. It uses a libuv API that schedules callbacks to execute after the poll phase has completed. Generally, as the code is executed, the event loop will eventually hit the poll phase, where it will wait for an incoming connection, request, etc. However, if a callback has been scheduled with setImmediate() and the poll phase becomes idle, it will end and continue to the check phase rather than waiting for poll events. The main advantage to using setImmediate() over setTimeout() is that setImmediate() will always be executed before any timers if scheduled within an I/O cycle, independently of how many timers are present. (Note: outside of an I/O cycle - e.g. at the top level of the main module - the order of setTimeout(fn, 0) versus setImmediate(fn) is not guaranteed.)
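A small sketch of what "within an I/O cycle" means, reusing the file from earlier examples: inside an I/O callback, the setImmediate callback reliably runs before the 0 ms timer.

```javascript
const fs = require('fs');

// Inside an I/O callback (i.e. within an I/O cycle),
// setImmediate always beats a 0 ms timer.
fs.readFile('./tweets.json', () => {
  setTimeout(() => console.log('timeout'), 0);
  setImmediate(() => console.log('immediate')); // logged first
});
```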

• Queue 5: The Close Queue

If a socket or handle is closed abruptly (e.g. socket.destroy()), the 'close' event will be emitted in this phase. Otherwise it will be emitted via process.nextTick().

• process.nextTick()

You may have noticed that process.nextTick() was not displayed in the diagram, even though it's a part of the asynchronous API. This is because process.nextTick() is not technically part of the event loop. Instead, the nextTickQueue will be processed after the current operation is completed, regardless of the current phase of the event loop. Here, an operation is defined as a transition from the underlying C/C++ handler, and handling the JavaScript that needs to be executed.

Looking back at our diagram, any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() will be resolved before the event loop continues. This can create some bad situations because it allows you to "starve" your I/O by making recursive process.nextTick() calls, which prevents the event loop from reaching the poll phase.
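A tiny sketch of that ordering; note that the nextTick queue runs even before the promise microtask queue:

console.log('start');

process.nextTick(() => console.log('nextTick callback'));
Promise.resolve().then(() => console.log('promise callback'));
setTimeout(() => console.log('timeout callback'), 0);

console.log('end');

// Prints: start, end, nextTick callback, promise callback, timeout callback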


16. List Of All Auto Emitted Events

All objects that emit events are instances of the EventEmitter class.

• Event 1: connection

A TCP server emits a 'connection' event every time somebody connects.

• Event 2: data

If someone is doing an HTTP upload, the request object (a readable stream) emits a 'data' event each time a chunk of the body arrives. So if somebody is uploading a stream, a movie or so, to your server, then you get: 'data'... 'data'... 'data'...
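A small sketch of both events on one HTTP server (the port number 3000 is arbitrary); the underlying TCP server emits 'connection', and the request stream emits 'data' per chunk:

import { createServer } from 'node:http';

const server = createServer((req, res) => {
  let received = 0;

  // Each chunk of an uploaded body arrives as a 'data' event:
  req.on('data', (chunk) => {
    received += chunk.length;
  });

  req.on('end', () => res.end(`received ${received} bytes\n`));
});

// The underlying TCP server emits 'connection' every time somebody connects:
server.on('connection', () => console.log('new TCP connection'));

server.listen(3000);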


17. Node vs. Deno

Ryan Dahl, on what he regrets about Node:

  • Regret 1: the extension-less require
    Also, this whole thing where you don't include the extension in the require call. Why? It's like needlessly non-explicit. You now have to probe the file system for different things: "Did you mean .js? .ts? Did you mean dot bluh bluh bluh?". And like "NO! Just write the fu... the extension in there". And a lot of people disagree with me, there is some debate over this issue. People like the extension-less thing. It's "cleaner"! Or whatever. And I'm like.... "No".

  • Regret 2: index.js
    Also.... index.js. I'm sorry. I thought it was cute. There's index.html, so I thought it would be cute if, you know, when you include a directory it would look up the index.js file. Sorry about that. This was like.... needlessly introduced. I regret those.

  • Regret 3: security
    Unfortunately, in Node you just have access to everything, and there's zero security. You run a Node program, and you have access to all sorts of system calls, and that was really a missed opportunity to be able to make a server-side runtime that could potentially be secure in certain situations. Obviously, if you wanted to give access to the disk, then people are going to be able to exploit the disk, but there are certain situations where you wanna run a program outside of the web browser, but you don't necessarily want it to be able to write to the disk, or access the network. Right? For example, a linter. It would be nice for me to be able to download the massive codebase that eslint is and run that without having to worry about it taking over my computer - which it could.

  • Regret 4: the build system
    Probably the biggest regret is the build system. Such a pain. Build systems are very very difficult, and very very important to building projects. Node uses this thing called "gyp". If you're writing a module that links to a C library, you use gyp to compile that C library and link it into Node. Right? Gyp is this thing that Chrome used to use, but Chrome abandoned gyp for another tool called GN several years later. We couldn't have predicted that, but that's what happened, and now it's been many many years, and Node is the sole user of gyp. It's a very funky interface. Like, it's a JSON file, but it's in Python, it's... it's very terrible. To top it off, Node has several wrappers around this, one of them is called "node-gyp", you might have heard about it. It's just layers upon layers of unnecessary complexity. V8 doesn't build with gyp anymore, it has a gyp wrapper to support Node, but... it's just.... There's just so much unnecessary complexity there, and yeah, I frankly think this is one of the biggest failures of Node.

  • Regret 5: package.json
    I mistakenly made package.json popular by allowing Node's "require" semantics to look into package.json and look through files. This made package.json necessary for Node programs where it was not before. And then I ultimately included npm in Node, which made npm the standard Node distribution service. The problem I have with package.json is that it gives rise to the concept of "modules" as like a directory of files, where that wasn't really a concept before. Like, we just had JavaScript files. Like, on the web you just have JavaScript files, and you can just script-tag and include them all over the place. There isn't like a... you know, a thing! The second problem I have with package.json is that it has all this unnecessary noise in it. Like: license, repository, ... Like, why am I filling this out? I feel like a book-keeper or something. This is just unnecessary stuff to do, when all I'm trying to do is "link to a library".

  • Regret 6: node_modules
    This whole algorithm for resolving module names is just WILDLY COMPLEX! It's kind of been added to over time, in ways that are regrettable. It deviates greatly from how browsers do stuff, and it's my fault, and I'm very sorry, and unfortunately, it is impossible to undo now.


Deno is a runtime for JavaScript, TypeScript, and WebAssembly that is based on the V8 JavaScript engine and the Rust programming language. Deno was co-created by Ryan Dahl, who also created Node.js.

Deno explicitly takes on the role of both runtime and package manager within a single executable, rather than requiring a separate package-management program.

• Comparison with Node.js

Deno and Node.js are both runtimes built on Google's V8 JavaScript engine, the same engine used in Google Chrome. They both have internal event loops and provide command-line interfaces for running scripts and a wide range of system utilities.

Deno mainly deviates from Node.js in the following aspects:

1. Supports only ES Modules, like browsers, where Node.js supports both ES Modules and CommonJS. (CommonJS support in Deno is possible via a compatibility layer.)
2. Supports only URLs for loading local or remote dependencies, similar to browsers. Node.js supports both URLs and modules.
3. Does not require a package manager for resource fetching, thus no need for a registry like npm.
4. Supports TypeScript out of the box, using a snapshotted TypeScript compiler or the swc compiler, with caching mechanisms.
5. Aims for better compatibility with browsers, with a wide range of Web APIs.
6. Restricts file system and network access by default in order to run sandboxed code.
7. Supports a single API to utilize promises, ES6 and TypeScript features, whereas Node.js supports both promise and callback APIs.
8. Minimizes core API size, while providing a large standard library with no external dependencies.
9. Uses message passing channels for invoking privileged system APIs and using bindings.
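A tiny Deno sketch to make a few of those points concrete (the std version pin and the port are illustrative); save it as hello.ts and run it with: deno run --allow-net hello.ts

// hello.ts - TypeScript runs out of the box, no separate compile step.

// Dependencies are loaded by URL, not from node_modules (the version pin is illustrative):
import { assertEquals } from "https://deno.land/std@0.224.0/assert/mod.ts";

assertEquals(1 + 1, 2);

// Web-style APIs are built in; the network is reachable only because
// --allow-net was granted explicitly on the command line:
Deno.serve({ port: 8000 }, () => new Response("hello from Deno"));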


18. Spawn Child Process

If you want to harness the full power of your processor, you have what's known in Node as "child processes".

There are 4 ways to instantiate a child process:

  • exec
  • execFile
  • spawn
  • fork

Node has a module that exports these 4 methods, and it's called "child_process".

• Method 1: exec

Using exec we can run a command inside our shell and get the output back. It is pretty useful for running SMALL commands - "small" meaning commands which have a small stdout. exec spawns a shell, then executes the command within that shell, buffering any generated output. The command string passed to the exec function is processed directly by the shell, and special characters (which vary based on the shell) need to be dealt with accordingly.

The command:

import { exec } from 'node:child_process';

exec('ls -la', (error, stdout, stderr) => {
  if (error) return console.error(error.message);
  if (stderr) return console.error(stderr);

  console.log(stdout);
});

As you can see from the example above, exec takes in two arguments here: the command as a string, and a callback function.

The callback function itself takes in 3 arguments:

  • error: set when the command could not be executed or exited with an error, e.g. the command is not found or has missing arguments.

  • stderr: the command has been executed, but it wrote something to its standard error stream.

  • stdout: the response/output of a successful execution of the command.

    Problems with this command: the problem with exec is that the entire stdout is buffered in memory and only handed to the callback (and printed to the console) once the command finishes. So if we run a command which has a huge stdout, we won't be able to use this method. You can verify that with the "find" command on your root directory:

    import { exec } from 'node:child_process';

    exec('find /', (error, stdout, stderr) => {
      if (error) return console.error(error.message);
      if (stderr) return console.error(stderr);

      console.log(stdout);
    });

    If you try to run this, you'll get an error saying: maxBuffer length exceeded. If we want to execute commands which have a large stdout, we need to use the "spawn" method (or raise exec's maxBuffer option, as shown below).
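    Since exec also accepts an optional options object between the command and the callback, another way out is to raise its maxBuffer limit (which defaults to roughly 1 MiB in current Node versions). A minimal sketch, with an arbitrary 64 MiB cap:

    import { exec } from 'node:child_process';

    // Raise the buffer cap to 64 MiB so a large stdout doesn't hit the limit:
    exec('find /', { maxBuffer: 1024 * 1024 * 64 }, (error, stdout, stderr) => {
      if (error) return console.error(error.message);
      if (stderr) return console.error(stderr);

      console.log(stdout);
    });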

• Method 2: spawn

Example of running ls -lh /usr, capturing stdout, stderr, and the exit code:

import { spawn } from 'node:child_process';

const child = spawn('ls', ['-lh', '/usr']);

// Receives chunks of the command's stdout as they arrive:
child.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

// Receives chunks the command wrote to its stderr stream:
child.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

// Fires when the command itself could not be spawned, e.g. the command was not found:
child.on('error', (error) => {
  console.log(`There was an error with the command: ${error}`);
});

// Fires once the process has ended AND its stdio streams have closed:
child.on('close', (code) => {
  console.log(`child process closed with code ${code}`);
});

// Fires as soon as the process ends; its stdio streams may still be open:
child.on('exit', (code) => {
  console.log(`child process exited with code ${code}`);
});

The child_process.spawn() method spawns a new process using the given command, with command-line arguments in args.

The spawn method accepts 3 arguments as input:

  • the command itself as a string.

  • args: an array of flags & sub-commands. An optional parameter. Defaults to an empty array.

  • options: used to specify additional options. An optional parameter. Defaults to this object:

      const defaults = {
        cwd: undefined,
        env: process.env,
      };
  • cwd: Use cwd to specify the working directory from which the process is spawned. If not given, the default is to inherit the current working directory. If given, but the path does not exist, the child process emits an ENOENT error and exits immediately. ENOENT is also emitted when the command does not exist.

  • env: Use env to specify environment variables that will be visible to the new process; the default is process.env. undefined values in env will be ignored. (A short sketch showing cwd and env in use follows the flags example below.)

    IMPORTANT TO KNOW!!! You cannot pass flags inside the main command string. If you need flags, the only way to pass them is by using the optional args argument:

    // Good: works, because flags go in the args array
    const child = spawn('pnpm', ['-v']);

    // Bad: emits an ENOENT 'error' event, because spawn looks for an
    // executable literally named 'pnpm -v' (no shell parses the string)
    const broken = spawn('pnpm -v');
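To illustrate the cwd and env options described above, here is a minimal sketch (the /tmp directory and the MY_FLAG variable are made up for the example):

import { spawn } from 'node:child_process';

const child = spawn('ls', ['-lh'], {
  cwd: '/tmp',                                  // run the command inside /tmp
  env: { ...process.env, MY_FLAG: 'enabled' },  // extend the inherited environment
});

child.stdout.pipe(process.stdout);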

Exit Event vs. Close Event: the short version is that 'exit' emits when the child exits but its stdio streams are not necessarily closed yet, while 'close' emits when the child has exited AND its stdio streams are closed. Besides that, they share the same signature.
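A short sketch that makes the ordering visible; 'exit' is normally logged before 'close', because the stdio streams can still be flushing when the process ends:

import { spawn } from 'node:child_process';

const child = spawn('ls', ['-lh', '/usr']);

child.stdout.on('data', (data) => console.log(`stdout: ${data}`));

child.on('exit', (code) => console.log(`exit with code ${code}`));   // fires first
child.on('close', (code) => console.log(`close with code ${code}`)); // fires once stdio has closed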