I have a function that recursively gets all the files in a directory using fs.readdirSync. It worked fine on the small directory I used as a test, but now I'm running it on a directory that's over 100GB and it takes a long time to complete. Any ideas on how to speed this up, or whether there's a better way? I'll eventually have to run this on directories containing terabytes of data.
const fs = require('fs');

// Recursively collect the full paths of all files under dir
function getFiles(dir, files = []) {
  // Get an array of all file and directory names in the passed directory
  const fileList = fs.readdirSync(dir);
  for (const file of fileList) {
    // Build the full path by joining the parent directory and the entry name
    const name = `${dir}/${file}`;
    // Check whether the current entry is a directory
    if (fs.statSync(name).isDirectory()) {
      // If it is, recurse into it, accumulating into the same files array
      getFiles(name, files);
    } else {
      // Otherwise push the full file path onto the array
      files.push(name);
    }
  }
  return files;
}
Unfortunately, async is slower here, so we need to optimize your synchronous code instead. You can do this with the { withFileTypes: true } option, which is about 2x faster: the returned Dirent objects already know whether they are directories, so the extra fs.statSync call per entry goes away.
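A minimal sketch of that change, keeping your getFiles signature (everything except the withFileTypes option and the Dirent check is unchanged from your code):

const fs = require('fs');

// Same recursive walk, but withFileTypes avoids one statSync per entry
function getFiles(dir, files = []) {
  const entries = fs.readdirSync(dir, { withFileTypes: true });
  for (const entry of entries) {
    const name = `${dir}/${entry.name}`;
    // Dirent.isDirectory() answers from the readdir result itself,
    // so no separate stat call is needed
    if (entry.isDirectory()) {
      getFiles(name, files);
    } else {
      files.push(name);
    }
  }
  return files;
}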
I also tried the { recursive: true } option added in Node v20, but it was even slower than your solution, and in my testing it did not work together with withFileTypes.
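For reference, a sketch of that built-in recursive call (assuming Node v20.1+; the directory path is a placeholder, the returned paths are relative to it, and the list includes directory entries as well as files):

const fs = require('fs');

// Node v20.1+ built-in recursive walk; '/some/dir' is a placeholder path
// Returned paths are relative to the root and include subdirectories too
const entries = fs.readdirSync('/some/dir', { recursive: true });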
Maybe a better SSD with high read speeds would help. That said, I'd guess the file entries are read from the file system's index rather than the files themselves, so I'm not sure how much the hardware affects this.