Commit 6a7aa45

Added an NPM package for downloading, building and providing the node.js
addon from source files refs #309
1 parent 7aa8818 commit 6a7aa45

6 files changed

Lines changed: 1737 additions & 0 deletions

bindings/node.js/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
/node_modules/
2+
/whisper.cpp

bindings/node.js/README.md

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
1+
# NPM package wrapping the whisper.cpp Node.js addon
2+
3+
This package targets Node.js environments (Node, Electron, etc.), where it provides access to the native C++ implementation of whisper.cpp, which should be the fastest way to run it there.
4+
5+
## Install
6+
To install, you need [cmake v4+](https://cmake.org/download/) (and [Visual Studio](https://github.com/nodejs/node-gyp#on-windows) on Windows) already available on your machine; it is used to build the addon for your environment.
7+
8+
```shell
9+
npm install whisper.cpp.node
10+
npx whisper.cpp.node install
11+
```
12+
13+
This will download the whisper.cpp repository and build the addon. A few commands are available:
14+
15+
- "install" - downloads the latest released tag of the whisper.cpp repository at the time of publishing or does nothing if the repo is already downloaded;
16+
- "install latest" - downloads the latest master;
17+
- "reinstall" - downloads the latest released tag of the whisper.cpp repository at the time of publishing even if the repo is already downloaded;
18+
- "reinstall latest" - will produce the same result as "install latest";
19+
- "rebuild" - will not download anything but will simply try to rebuild the addon (if for example you change the version of node).
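
As a rough illustration of the list above, each command can be read as a choice of download source plus a build step. `planCommand` is a hypothetical helper written only for this explanation; it is not exported by the package:

```javascript
// Hypothetical helper: summarizes what each command does, following the list above.
// `alreadyDownloaded` stands for "the whisper.cpp directory already exists".
function planCommand(command, variant, alreadyDownloaded) {
  const latest = variant === 'latest';
  switch (command) {
    case 'install':
      // A plain "install" does nothing when the repository is already there.
      if (alreadyDownloaded && !latest) return { download: null, build: false };
      return { download: latest ? 'master' : 'tag', build: true };
    case 'reinstall':
      // Always downloads again; "reinstall latest" behaves like "install latest".
      return { download: latest ? 'master' : 'tag', build: true };
    case 'rebuild':
      // Never downloads; only rebuilds the addon.
      return { download: null, build: true };
    default:
      throw new Error(`Unknown command: ${command}`);
  }
}

console.log(planCommand('install', 'latest', true));
```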
20+
21+
The addon will then be available as the package's main export for use like:
22+
23+
```javascript
24+
const whisper = require("whisper.cpp.node");
25+
26+
const transcription = await whisper({
27+
language: 'en',
28+
model: './models/ggml-base.en.bin',
29+
fname_inp: './your/file/here'
30+
});
31+
32+
console.log(transcription);
33+
```
34+
35+
Check the [Supported Parameters section](#supported-parameters) for more parameter information.
36+
37+
## What the package will not provide for you
38+
39+
It will not download any models for inference. As noted in many other packages, there are models ready for download and use in the [Hugging Face repo of whisper.cpp](https://huggingface.co/ggerganov/whisper.cpp/tree/main).
40+
41+
It will also not work if you try to bundle it for the browser (you should use the [whisper.cpp](https://www.npmjs.com/package/whisper.cpp) package instead which provides the WASM version).
42+
43+
## Links
44+
45+
- [GitHub Repository](https://github.com/gkostov/whisper.cpp.node)
46+
- [NPM Package](https://www.npmjs.com/package/whisper.cpp.node)
47+
48+
49+
The original README of the addon follows, with details on how to use it.
50+
______
51+
52+
# whisper.cpp Node.js addon
53+
54+
This is an addon demo that can **perform whisper model inference in `node` and `electron` environments**, based on [cmake-js](https://github.com/cmake-js/cmake-js).
55+
It can be used as a reference for using the whisper.cpp project in other node projects.
56+
57+
This addon now supports **Voice Activity Detection (VAD)** for improved transcription performance.
58+
59+
## Install
60+
61+
```shell
62+
npm install
63+
```
64+
65+
## Compile
66+
67+
Make sure you are in the project root directory, then compile with cmake-js.
68+
69+
```shell
70+
npx cmake-js compile -T addon.node -B Release
71+
```
72+
73+
For Electron builds and other cmake-js options, see [cmake-js](https://github.com/cmake-js/cmake-js); only a few configuration changes are needed.
74+
75+
> For example, to specify a custom cmake path:
76+
> ```shell
77+
> npx cmake-js compile -c 'xxx/cmake' -T addon.node -B Release
78+
> ```
79+
80+
## Run
81+
82+
### Basic Usage
83+
84+
```shell
85+
cd examples/addon.node
86+
87+
node index.js --language='language' --model='model-path' --fname_inp='file-path'
88+
```
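
The `--key=value` arguments in the command above map naturally onto the parameter object the addon expects. The following is a minimal sketch of such a mapping; `parseArgs` is a hypothetical name, and the example script may parse its arguments differently:

```javascript
// Hypothetical parser for `--key=value` arguments (illustration only).
// The shell strips the quotes, so the script receives e.g. `--language=en`.
function parseArgs(argv) {
  const params = {};
  for (const arg of argv) {
    if (!arg.startsWith('--')) continue;
    const eq = arg.indexOf('=');
    if (eq === -1) continue; // ignore flags that carry no value
    params[arg.slice(2, eq)] = arg.slice(eq + 1);
  }
  return params;
}

console.log(parseArgs(['--language=en', '--model=./models/ggml-base.en.bin']));
```

In a real script the input would be `process.argv.slice(2)`. All values stay strings here, so numeric or boolean parameters would still need converting.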
89+
90+
### VAD (Voice Activity Detection) Usage
91+
92+
Run the VAD example with performance comparison:
93+
94+
```shell
95+
node vad-example.js
96+
```
97+
98+
## Voice Activity Detection (VAD) Support
99+
100+
VAD can significantly improve transcription performance by only processing speech segments, which is especially beneficial for audio files with long periods of silence.
101+
102+
### VAD Model Setup
103+
104+
Before using VAD, download a VAD model:
105+
106+
```shell
107+
# From the whisper.cpp root directory
108+
./models/download-vad-model.sh silero-v6.2.0
109+
```
110+
111+
### VAD Parameters
112+
113+
All VAD parameters are optional and have sensible defaults, except `vad_model`, which is required when VAD is enabled:
114+
115+
- `vad`: Enable VAD (default: false)
116+
- `vad_model`: Path to VAD model file (required when VAD enabled)
117+
- `vad_threshold`: Speech detection threshold 0.0-1.0 (default: 0.5)
118+
- `vad_min_speech_duration_ms`: Min speech duration in ms (default: 250)
119+
- `vad_min_silence_duration_ms`: Min silence duration in ms (default: 100)
120+
- `vad_max_speech_duration_s`: Max speech duration in seconds (default: FLT_MAX)
121+
- `vad_speech_pad_ms`: Speech padding in ms (default: 30)
122+
- `vad_samples_overlap`: Sample overlap 0.0-1.0 (default: 0.1)
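
The defaults above can be sketched as a plain object merged with caller options. `VAD_DEFAULTS` and `withVadDefaults` are hypothetical names used only for illustration; the addon applies its own defaults internally, and writing `FLT_MAX` as a JavaScript number is an assumption:

```javascript
// Numeric value of C's FLT_MAX; expressing it this way in JS is an assumption.
const FLT_MAX = 3.4028234663852886e38;

// Defaults copied from the parameter list above.
const VAD_DEFAULTS = {
  vad: false,
  vad_threshold: 0.5,
  vad_min_speech_duration_ms: 250,
  vad_min_silence_duration_ms: 100,
  vad_max_speech_duration_s: FLT_MAX,
  vad_speech_pad_ms: 30,
  vad_samples_overlap: 0.1,
};

// Hypothetical helper: fills in defaults and checks the documented constraints.
function withVadDefaults(options) {
  const merged = { ...VAD_DEFAULTS, ...options };
  if (merged.vad && !merged.vad_model) {
    throw new Error('vad_model is required when vad is enabled');
  }
  if (merged.vad_threshold < 0 || merged.vad_threshold > 1) {
    throw new RangeError('vad_threshold must be between 0.0 and 1.0');
  }
  return merged;
}

console.log(withVadDefaults({ vad: true, vad_model: './models/ggml-silero-v6.2.0.bin' }));
```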
123+
124+
### JavaScript API Example
125+
126+
```javascript
127+
const path = require("path");
128+
const { whisper } = require(path.join(__dirname, "../../build/Release/addon.node"));
129+
const { promisify } = require("util");
130+
131+
const whisperAsync = promisify(whisper);
132+
133+
// With VAD enabled
134+
const vadParams = {
135+
language: "en",
136+
model: path.join(__dirname, "../../models/ggml-base.en.bin"),
137+
fname_inp: path.join(__dirname, "../../samples/jfk.wav"),
138+
vad: true,
139+
vad_model: path.join(__dirname, "../../models/ggml-silero-v6.2.0.bin"),
140+
vad_threshold: 0.5,
141+
progress_callback: (progress) => console.log(`Progress: ${progress}%`)
142+
};
143+
144+
whisperAsync(vadParams).then(result => console.log(result));
145+
```
146+
147+
## Supported Parameters
148+
149+
Both traditional whisper.cpp parameters and new VAD parameters are supported:
150+
151+
- `language`: Language code (e.g., "en", "es", "fr")
152+
- `model`: Path to whisper model file
153+
- `fname_inp`: Path to input audio file
154+
- `use_gpu`: Enable GPU acceleration (default: true)
155+
- `flash_attn`: Enable flash attention (default: false)
156+
- `no_prints`: Disable console output (default: false)
157+
- `no_timestamps`: Disable timestamps (default: false)
158+
- `detect_language`: Auto-detect language (default: false)
159+
- `audio_ctx`: Audio context size (default: 0)
160+
- `max_len`: Maximum segment length (default: 0)
161+
- `max_context`: Maximum context size (default: -1)
162+
- `prompt`: Initial prompt for decoder
163+
- `comma_in_time`: Use comma in timestamps (default: true)
164+
- `print_progress`: Print progress info (default: false)
165+
- `progress_callback`: Progress callback function
166+
- VAD parameters (see above section)

bindings/node.js/bin/build.js

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
#!/usr/bin/env node
2+
3+
const { execSync } = require('child_process');
4+
const fs = require('fs');
5+
const path = require('path');
6+
const https = require('https');
7+
const { createWriteStream } = require('fs');
8+
9+
async function downloadAndUnzip(url, filename, output) {
10+
const file = createWriteStream(filename);
11+
12+
console.log(`Downloading from ${url} ...`);
13+
await new Promise((resolve, reject) => {
14+
https.get(url, (response) => {
15+
if (response.statusCode === 302 && response.headers.location) {
16+
https.get(response.headers.location, (redirectResponse) => {
17+
redirectResponse.pipe(file);
18+
file.on('finish', resolve);
19+
file.on('error', reject);
20+
}).on('error', reject);
21+
} else {
22+
response.pipe(file);
23+
file.on('finish', resolve);
24+
file.on('error', reject);
25+
}
26+
}).on('error', reject);
27+
});
28+
console.log(`Download complete. Extracting...`);
29+
30+
const unzipCmd = process.platform === 'win32' ?
31+
`powershell -command "Expand-Archive -Path '${filename}' -DestinationPath '${output}' -Force"` :
32+
`unzip -o "${filename}" -d "${output}"`;
33+
execSync(unzipCmd, { stdio: 'inherit' });
34+
fs.unlinkSync(filename);
35+
}
36+
37+
const isLatestRequested = process.argv[3] == 'latest';
38+
const releaseUrl = isLatestRequested ? 'https://github.com/ggml-org/whisper.cpp/archive/refs/heads/master.zip' : 'https://github.com/ggml-org/whisper.cpp/archive/refs/tags/v1.8.3.zip';
39+
const whisperCppDir = path.resolve(__dirname, '..', 'whisper.cpp');
40+
const buildCmd = `cd ${whisperCppDir} && npx cmake-js compile -T addon.node -B Release`;
41+
42+
async function install() {
43+
console.log(
44+
'whisper.cpp will now be downloaded from the github repo and the addon will be built.'
45+
+ '\nIt is going to take a while ...'
46+
);
47+
// extract to a .tmp directory because it's not yet known what directory the files will be extracted to
48+
const whisperCppTmpDir = whisperCppDir + '.tmp';
49+
await downloadAndUnzip(releaseUrl, path.resolve(__dirname, '..', releaseUrl.split('/').pop()), whisperCppTmpDir);
50+
// move the extracted files out from the first (and only) directory inside to "whisper.cpp"
51+
fs.renameSync(path.resolve(whisperCppTmpDir, fs.readdirSync(whisperCppTmpDir)[0]), whisperCppDir);
52+
fs.rmdirSync(whisperCppTmpDir);
53+
console.log(`Building ...`);
54+
execSync(buildCmd, { stdio: 'inherit' });
55+
console.log('whisper.cpp.node should now be ready for use.');
56+
}
57+
58+
if (process.argv[2] == 'install') {
59+
if (isLatestRequested) {
60+
if (fs.existsSync(whisperCppDir)){
61+
console.log('Replacing with the latest one from master.');
62+
fs.rmSync(whisperCppDir, { recursive: true, force: true });
63+
}
64+
install();
65+
} else if (fs.existsSync(whisperCppDir)) {
66+
console.log(
67+
'The directory with whisper.cpp exists so the addon should be already available for use.'
68+
+ '\nIf you think there\'s something wrong with the installation and would like to install afresh, then run again with the "reinstall" option.'
69+
);
70+
} else
71+
install();
72+
} else if (process.argv[2] == 'reinstall') {
73+
// remove if already there and install
74+
fs.rmSync(whisperCppDir, { recursive: true, force: true });
75+
install();
76+
} else if (process.argv[2] == 'rebuild') {
77+
console.log(
78+
'whisper.cpp.node will now be rebuilt.'
79+
+ '\nIt is going to take a while ...'
80+
);
81+
execSync(buildCmd, { stdio: 'inherit' });
82+
console.log('whisper.cpp.node should now be ready for use.');
83+
} else
84+
console.log('Not sure what you want to do right now. Please pick one of "install", "reinstall" or "rebuild" commands.');
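
The download helper in this file follows a single `302` redirect. As a hedged sketch of a more general approach, the function below decides which URL to request next for the common redirect status codes. `nextLocation` and `REDIRECT_CODES` are hypothetical names and are not part of `build.js`:

```javascript
// Hypothetical helper (not part of build.js): given the current URL plus a
// response's status code and headers, returns the URL to follow next, or
// null when the response is not a redirect.
const REDIRECT_CODES = new Set([301, 302, 303, 307, 308]);

function nextLocation(currentUrl, statusCode, headers) {
  if (!REDIRECT_CODES.has(statusCode) || !headers.location) return null;
  // The Location header may be relative, so resolve it against the current URL.
  return new URL(headers.location, currentUrl).toString();
}

console.log(nextLocation('https://example.com/a/file.zip', 301, { location: '/b/file.zip' }));
```

A caller could invoke this in a loop with a cap on the number of redirects, and reject the promise when the final status code is not a success, so that an error page is never written to disk and passed to the unzip step.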
