2 changes: 1 addition & 1 deletion .github/workflows/codeql.yml
@@ -2,7 +2,7 @@ name: CodeQL Analysis

 on:
   push:
-    branches: [ main ]
+    branches: [ main, dev ]
   pull_request:
     branches: [ main ]
   schedule:
184 changes: 184 additions & 0 deletions AUDIO_FX.md
@@ -0,0 +1,184 @@
# FRecorder Audio FX & Noise Reduction

All audio processing in FRecorder is applied to **WAV recordings only**. Effects are applied either in real time during recording (gain boost, filters, noise gate) or as post-processing after the recording stops (noise reduction).

---

## Signal Chain (recording order)

```
Mic → Gain Boost → High-Pass Filter → Low-Pass Filter → Noise Gate → WAV file → Noise Reduction
```

Real-time effects are applied sample-by-sample as audio is captured. Noise reduction runs on the finished WAV file.

---

## 1. Input Gain Boost

Amplifies the raw microphone signal before any other processing.

| Setting | Multiplier | Use case |
|---------|-----------|----------|
| Off | 1× | Default — no amplification |
| +6 dB | 2× | Quiet sources at moderate distance |
| +12 dB | 4× | Very quiet sources, distant speakers |

**How it works:** Each PCM sample is multiplied by the gain factor and clamped to the 16-bit range (±32767). Clamping prevents integer overflow (wrap-around), but any sample that reaches the limit is hard-clipped.
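
The multiply-and-clamp step can be sketched as follows, assuming 16-bit PCM held in a `short[]`. The class and method names (`GainBoostSketch`, `applyGain`, `dbToLinear`) are hypothetical and are not taken from FRecorder's source.

```java
final class GainBoostSketch {
    /** Converts a gain in decibels to a linear amplitude multiplier. */
    static double dbToLinear(double db) {
        return Math.pow(10.0, db / 20.0);
    }

    /** Multiplies each 16-bit sample by the factor and clamps the result to ±32767. */
    static void applyGain(short[] samples, int factor) {
        for (int i = 0; i < samples.length; i++) {
            int boosted = samples[i] * factor; // int arithmetic, so no 16-bit wrap-around
            samples[i] = (short) Math.max(-32767, Math.min(32767, boosted));
        }
    }

    public static void main(String[] args) {
        short[] buffer = {1000, -1000, 20000, -20000};
        applyGain(buffer, 4); // the +12 dB setting
        System.out.println(java.util.Arrays.toString(buffer)); // prints [4000, -4000, 32767, -32767]
    }
}
```

The 2× and 4× multipliers in the table are rounded: `dbToLinear(6.0)` is about 1.995 and `dbToLinear(12.0)` is about 3.981.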

**Recommendations:**
- Start with Off. Only increase if the waveform is visibly too small.
- +12 dB amplifies noise by 4× as well — combine with the noise gate or noise reduction to compensate.
- If you hear distortion or see the waveform hitting the rails, reduce the boost.

---

## 2. High-Pass Filter (HPF)

Removes low-frequency content below a cutoff frequency. Implemented as a second-order Butterworth (biquad) filter applied in real-time.

| Setting | Cutoff | Slope |
|---------|--------|-------|
| Off | — | — |
| HPF 80 Hz | 80 Hz | 12 dB/octave |
| HPF 120 Hz | 120 Hz | 12 dB/octave |

**How it works:** A biquad high-pass filter (Q = 0.7071, Butterworth) attenuates frequencies below the cutoff at 12 dB per octave. The filter state is maintained across audio buffers for seamless operation.
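
A minimal sketch of such a filter, assuming the coefficient formulas from the widely used Audio EQ Cookbook and a Direct Form I update. Whether FRecorder uses exactly these formulas is an assumption, and the class name `BiquadHighPassSketch` is hypothetical. The four state variables are what must persist between buffers.

```java
final class BiquadHighPassSketch {
    private final double b0, b1, b2, a1, a2; // coefficients normalized by a0
    private double x1, x2, y1, y2;           // previous inputs/outputs, kept across buffers

    BiquadHighPassSketch(double sampleRate, double cutoffHz, double q) {
        double w0 = 2.0 * Math.PI * cutoffHz / sampleRate;
        double alpha = Math.sin(w0) / (2.0 * q);
        double cosW0 = Math.cos(w0);
        double a0 = 1.0 + alpha;
        b0 = (1.0 + cosW0) / 2.0 / a0;
        b1 = -(1.0 + cosW0) / a0;
        b2 = (1.0 + cosW0) / 2.0 / a0;
        a1 = -2.0 * cosW0 / a0;
        a2 = (1.0 - alpha) / a0;
    }

    /** Filters one sample and updates the internal state. */
    double process(double x) {
        double y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
        x2 = x1;
        x1 = x;
        y2 = y1;
        y1 = y;
        return y;
    }

    public static void main(String[] args) {
        BiquadHighPassSketch hpf = new BiquadHighPassSketch(48000.0, 80.0, 0.7071);
        double last = 0.0;
        for (int n = 0; n < 48000; n++) {
            last = hpf.process(1.0); // constant (0 Hz) input
        }
        System.out.println("DC after 1 s: " + last);
    }
}
```

Feeding a constant input shows the high-pass behaviour: the output decays toward zero, while a tone well above the cutoff passes at close to full amplitude.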

**Recommendations:**
- **80 Hz** — Removes deep rumble (HVAC, traffic, wind) while preserving bass in music and speech.
- **120 Hz** — More aggressive. Good for speech-only recordings where bass content is unwanted. Removes handling noise and proximity effect from close-miked vocals.
- Use HPF in almost all field recording situations. Low-frequency rumble is rarely useful and wastes dynamic range.

---

## 3. Low-Pass Filter (LPF)

Removes high-frequency content above a cutoff frequency. Same biquad implementation as the HPF.

| Setting | Cutoff | Slope |
|---------|--------|-------|
| Off | — | — |
| LPF 9.5 kHz | 9,500 Hz | 12 dB/octave |
| LPF 15 kHz | 15,000 Hz | 12 dB/octave |

**How it works:** A biquad low-pass filter (Q = 0.7071, Butterworth) attenuates frequencies above the cutoff.
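
The low-pass case differs from the high-pass only in the numerator coefficients. The sketch below again assumes the Audio EQ Cookbook formulas (an assumption, not confirmed from FRecorder's source) and adds a hypothetical `magnitudeAt` helper that evaluates the filter's gain at a given frequency.

```java
final class BiquadLowPassSketch {
    /** Returns {b0, b1, b2, a1, a2}, already normalized by a0. */
    static double[] coefficients(double sampleRate, double cutoffHz, double q) {
        double w0 = 2.0 * Math.PI * cutoffHz / sampleRate;
        double alpha = Math.sin(w0) / (2.0 * q);
        double cosW0 = Math.cos(w0);
        double a0 = 1.0 + alpha;
        return new double[] {
            (1.0 - cosW0) / 2.0 / a0,
            (1.0 - cosW0) / a0,
            (1.0 - cosW0) / 2.0 / a0,
            -2.0 * cosW0 / a0,
            (1.0 - alpha) / a0
        };
    }

    /** Magnitude of the transfer function H(z) evaluated at z = e^{jw}. */
    static double magnitudeAt(double[] c, double sampleRate, double freqHz) {
        double w = 2.0 * Math.PI * freqHz / sampleRate;
        double numRe = c[0] + c[1] * Math.cos(w) + c[2] * Math.cos(2.0 * w);
        double numIm = -(c[1] * Math.sin(w) + c[2] * Math.sin(2.0 * w));
        double denRe = 1.0 + c[3] * Math.cos(w) + c[4] * Math.cos(2.0 * w);
        double denIm = -(c[3] * Math.sin(w) + c[4] * Math.sin(2.0 * w));
        return Math.hypot(numRe, numIm) / Math.hypot(denRe, denIm);
    }

    public static void main(String[] args) {
        double[] c = coefficients(48000.0, 9500.0, 0.7071);
        System.out.println("gain at cutoff: " + magnitudeAt(c, 48000.0, 9500.0));
    }
}
```

With Q = 0.7071 the gain at the cutoff frequency is about 0.707 (−3 dB) and the gain at 0 Hz is 1, which is the Butterworth behaviour described above.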

**Recommendations:**
- **15 kHz** — Gentle rolloff. Attenuates very high-frequency hiss and interference while keeping nearly all audible content.
- **9.5 kHz** — Aggressive. Useful for speech-only recordings (intelligibility lives below 8 kHz). Reduces hiss, high-frequency interference, and sibilance.
- Leave Off for music or any recording where high-frequency detail matters.

---

## 4. Noise Gate

Silences the audio when the signal level drops below a threshold. Prevents recording room tone, hiss, or background noise during pauses in speech.

| Parameter | Value | Description |
|-----------|-------|-------------|
| Threshold | 400 RMS | Signal level that opens the gate |
| Hysteresis | 50% of threshold | Lower level that triggers gate closing (prevents chatter) |
| Attack | 0.1 ms | Time to fully open (nearly instant) |
| Hold | 300 ms | Time the gate stays open after the signal drops below the hysteresis level |
| Release | 500 ms | Fade-out time from open to closed |

**How it works:** The gate operates as a state machine: CLOSED → ATTACK → OPEN → HOLD → RELEASE → CLOSED. When the RMS level of an audio chunk exceeds the threshold, the gate opens. When it drops below the hysteresis level, the gate enters a hold period before fading out. The envelope multiplier (0.0–1.0) is applied to every sample.
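
The state machine can be sketched as below, using the parameter values from the table. This is an illustration under stated assumptions rather than FRecorder's implementation: the class name `NoiseGateSketch`, the linear attack and release ramps, and the handling of levels that fall between the two thresholds are all hypothetical choices.

```java
final class NoiseGateSketch {
    enum State { CLOSED, ATTACK, OPEN, HOLD, RELEASE }

    private static final double OPEN_THRESHOLD = 400.0;               // RMS level that opens the gate
    private static final double CLOSE_THRESHOLD = OPEN_THRESHOLD / 2; // 50% hysteresis

    private final int holdSamples;
    private final double attackStep;
    private final double releaseStep;

    private State state = State.CLOSED;
    private double envelope = 0.0; // 0.0 = fully closed, 1.0 = fully open
    private int holdRemaining = 0;

    NoiseGateSketch(int sampleRate) {
        holdSamples = (int) (0.300 * sampleRate);              // 300 ms hold
        attackStep = 1.0 / Math.max(1.0, 0.0001 * sampleRate); // 0.1 ms attack
        releaseStep = 1.0 / (0.500 * sampleRate);              // 500 ms release
    }

    State state() {
        return state;
    }

    static double rms(short[] chunk) {
        double sumSquares = 0.0;
        for (short s : chunk) {
            sumSquares += (double) s * s;
        }
        return chunk.length == 0 ? 0.0 : Math.sqrt(sumSquares / chunk.length);
    }

    /** Gates one chunk in place; state and envelope carry over to the next call. */
    void process(short[] chunk) {
        double level = rms(chunk);
        if (level > OPEN_THRESHOLD) {
            if (state == State.CLOSED || state == State.RELEASE) {
                state = State.ATTACK;
            } else if (state == State.HOLD) {
                state = State.OPEN;
            }
        } else if (level < CLOSE_THRESHOLD && state == State.OPEN) {
            state = State.HOLD;
            holdRemaining = holdSamples;
        }
        for (int i = 0; i < chunk.length; i++) {
            switch (state) {
                case ATTACK:
                    envelope = Math.min(1.0, envelope + attackStep);
                    if (envelope >= 1.0) state = State.OPEN;
                    break;
                case HOLD:
                    if (--holdRemaining <= 0) state = State.RELEASE;
                    break;
                case RELEASE:
                    envelope = Math.max(0.0, envelope - releaseStep);
                    if (envelope <= 0.0) state = State.CLOSED;
                    break;
                default: // CLOSED keeps the envelope at 0.0, OPEN keeps it at 1.0
                    break;
            }
            chunk[i] = (short) Math.round(chunk[i] * envelope);
        }
    }
}
```

Because the RMS level is measured per chunk while the envelope moves per sample, the gate's reaction time in this sketch is bounded by the chunk length.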

**Recommendations:**
- Best for **speech recordings** in quiet environments where you want dead silence between phrases.
- **Not recommended** for music, ambient recordings, or environments with continuous background sound — the gate will chop the audio unnaturally.
- Works well combined with HPF (remove rumble first, then gate on the cleaner signal).
- The fixed threshold (400 RMS) is tuned for typical phone microphone levels. Very quiet sources may never open the gate — use gain boost to compensate.

---

## 5. Noise Reduction (post-processing)

Spectral-gating noise reduction inspired by Audacity's Noise Reduction effect. Applied **after recording stops**, processing the entire WAV file. A progress dialog is shown during processing.

### How it works

1. **Noise profile** — The first N seconds of the recording are analyzed to build a per-frequency-bin noise profile (mean + standard deviation of spectral magnitude).
2. **Threshold** — For each frequency bin, a threshold is computed: `mean + scale × std`. The sensitivity parameter controls the scale factor.
3. **Spectral subtraction** — For each FFT frame of the full recording, bins whose magnitude is below the noise threshold are attenuated. The reduction amount is controlled by the reduction dB parameter.
4. **Frequency smoothing** — The gain mask is averaged across neighboring frequency bins to avoid musical noise (isolated tonal artifacts).
5. **Temporal smoothing** — The gain mask is smoothed over time with attack (20 ms) and release (100 ms) constants to prevent abrupt transitions.
6. **Overlap-add reconstruction** — Processed frames (2048-sample Hann-windowed, 50% overlap) are combined back into a continuous signal and written to the WAV file in place.
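
Steps 1 to 3 can be sketched on magnitude spectra that are assumed to have been computed already by an FFT elsewhere. The names here (`SpectralGateSketch`, `noiseThreshold`, `gainMask`) are hypothetical, the standard deviation is the population form, and `floorGain` stands in for whatever attenuation the reduction setting produces.

```java
final class SpectralGateSketch {
    /** Per-bin threshold = mean + scale * std, measured over the noise-profile frames. */
    static double[] noiseThreshold(double[][] noiseFrames, double scale) {
        int bins = noiseFrames[0].length;
        double[] threshold = new double[bins];
        for (int k = 0; k < bins; k++) {
            double sum = 0.0;
            double sumSquares = 0.0;
            for (double[] frame : noiseFrames) {
                sum += frame[k];
                sumSquares += frame[k] * frame[k];
            }
            double mean = sum / noiseFrames.length;
            double variance = Math.max(0.0, sumSquares / noiseFrames.length - mean * mean);
            threshold[k] = mean + scale * Math.sqrt(variance);
        }
        return threshold;
    }

    /** Gain mask for one frame: bins below the threshold get floorGain, the rest pass unchanged. */
    static double[] gainMask(double[] magnitudes, double[] threshold, double floorGain) {
        double[] gain = new double[magnitudes.length];
        for (int k = 0; k < magnitudes.length; k++) {
            gain[k] = magnitudes[k] < threshold[k] ? floorGain : 1.0;
        }
        return gain;
    }

    public static void main(String[] args) {
        double[][] noise = {{1.0, 2.0}, {3.0, 2.0}}; // two frames, two bins
        double[] threshold = noiseThreshold(noise, 1.5);
        double[] gain = gainMask(new double[] {3.0, 5.0}, threshold, 0.25);
        System.out.println(java.util.Arrays.toString(gain)); // prints [0.25, 1.0]
    }
}
```

The resulting mask is what the frequency and temporal smoothing steps would then operate on before it is multiplied into each frame's spectrum.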

### Configurable parameters

| Parameter | Range | Default | Description |
|-----------|-------|---------|-------------|
| Reduction (dB) | 0–24 | 12.0 | How much noise to remove. Higher = more aggressive removal. |
| Sensitivity | 0–24 | 12.0 | How aggressively bins are classified as noise. Higher = more bins treated as noise. |
| Freq smoothing | 0–6 bands | 3 | Number of neighboring frequency bins averaged on each side. Reduces musical noise artifacts. |
| Noise profile | 0.5–5.0 s | 1.0 | Duration of audio from the start used to learn the noise signature. |

### Parameter details

**Reduction (dB):**
- 0 dB — No reduction (pass-through).
- 6 dB — Light cleanup. Subtle hiss removal.
- 12 dB — Moderate reduction. Good general-purpose setting.
- 18–24 dB — Heavy reduction. May introduce artifacts on transients.

The value maps to a reduction strength multiplier: `strength = dB / 12.0`. At 12 dB, the full estimated noise floor is subtracted. At 24 dB, twice the noise floor is subtracted.
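
As a worked illustration of this mapping, the sketch below subtracts `strength × noiseFloor` from a single bin magnitude. Flooring the result at zero is an assumption, and the names (`ReductionStrengthSketch`, `strength`, `subtract`) are hypothetical.

```java
final class ReductionStrengthSketch {
    /** Maps the reduction setting in dB to a strength multiplier. */
    static double strength(double reductionDb) {
        return reductionDb / 12.0;
    }

    /** Subtracts strength * noiseFloor from a bin magnitude, never going below zero. */
    static double subtract(double magnitude, double noiseFloor, double reductionDb) {
        return Math.max(0.0, magnitude - strength(reductionDb) * noiseFloor);
    }

    public static void main(String[] args) {
        System.out.println(subtract(10.0, 3.0, 12.0)); // prints 7.0
        System.out.println(subtract(10.0, 3.0, 24.0)); // prints 4.0
    }
}
```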

**Sensitivity:**
- 0 — Conservative. Only the most obvious noise bins are affected. Threshold = `mean + 3×std`.
- 12 — Balanced. Good default.
- 24 — Aggressive. Everything near the noise floor is treated as noise. Threshold = `mean + 0×std`. Risk of removing quiet signal content.
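
Only the two endpoints of the sensitivity-to-scale mapping are stated above. The sketch below assumes a linear interpolation between them, which is a guess rather than a documented behaviour; `SensitivityScaleSketch` and `scale` are hypothetical names.

```java
final class SensitivityScaleSketch {
    /** Assumed linear mapping: sensitivity 0 gives scale 3.0, sensitivity 24 gives scale 0.0. */
    static double scale(double sensitivity) {
        double clamped = Math.max(0.0, Math.min(24.0, sensitivity));
        return 3.0 * (1.0 - clamped / 24.0);
    }

    public static void main(String[] args) {
        System.out.println(scale(12.0)); // prints 1.5 under the linear assumption
    }
}
```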

**Frequency smoothing:**
- 0 — No smoothing. Each bin is treated independently. May produce "musical noise" (random tonal blips).
- 3 — Default. Averages the gain mask across 3 neighboring bins on each side. Good balance.
- 6 — Maximum smoothing. Very smooth but may blur frequency detail.
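
Averaging the gain mask across neighboring bins can be sketched as a simple moving average. The unweighted window and the way it shrinks at the edges of the spectrum are assumptions, and `GainSmoothingSketch` and `smooth` are hypothetical names.

```java
final class GainSmoothingSketch {
    /** Averages each gain value with up to `radius` neighbors on each side. */
    static double[] smooth(double[] gain, int radius) {
        if (radius <= 0) {
            return gain.clone(); // setting 0: every bin is treated independently
        }
        double[] out = new double[gain.length];
        for (int k = 0; k < gain.length; k++) {
            int lo = Math.max(0, k - radius);
            int hi = Math.min(gain.length - 1, k + radius);
            double sum = 0.0;
            for (int j = lo; j <= hi; j++) {
                sum += gain[j];
            }
            out[k] = sum / (hi - lo + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] mask = {1.0, 0.0, 1.0, 1.0, 1.0};
        System.out.println(java.util.Arrays.toString(smooth(mask, 1)));
    }
}
```

In the example the single fully attenuated bin is spread across its neighbors, which is what suppresses isolated tonal blips at the cost of some frequency resolution.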

**Noise profile duration:**
- The algorithm assumes the **first N seconds of the recording contain only noise** (no speech or signal). Record a moment of silence at the start.
- 1.0 s — Default. Sufficient for stationary noise (fan, hiss, hum).
- 2.0–5.0 s — Better for non-stationary noise. Captures more variation in the noise floor.
- 0.5 s — Minimum useful. Use when you can't afford a long silence at the start.

### Recommendations

- **Always record 1–2 seconds of silence** at the beginning before speaking. This gives the algorithm a clean noise profile.
- Start with defaults (12 dB / 12 sensitivity / 3 bands / 1.0 s). Adjust only if results are unsatisfactory.
- If you hear "musical noise" (watery, bubbly artifacts), increase frequency smoothing or decrease sensitivity.
- If speech sounds muffled or thin, reduce the reduction dB or sensitivity.
- Noise reduction works best on **stationary noise** (constant hiss, fan, hum). It struggles with intermittent noise (traffic, people talking nearby).
- Combine with HPF: remove low-frequency rumble with the filter during recording, then clean up remaining hiss with noise reduction after.
- Noise reduction currently supports only **16-bit WAV** files. 24-bit recordings are not processed.

---

## Recommended setups

### Speech / interview in a quiet room
- HPF: 120 Hz
- LPF: Off
- Noise gate: On
- Gain boost: Off
- Noise reduction: Off (or 6 dB if there's audible hiss)

### Speech in a noisy environment (street, café)
- HPF: 80 Hz
- LPF: 15 kHz
- Noise gate: Off (continuous noise would cause choppy gating)
- Gain boost: Off or +6 dB
- Noise reduction: On (12 dB / 12 sensitivity / 3 bands / 1.0 s)

### Quiet source at distance
- HPF: 80 Hz
- LPF: Off
- Noise gate: Off
- Gain boost: +6 dB or +12 dB
- Noise reduction: On (12–18 dB / 12 sensitivity / 3 bands / 1.0 s)

### Music / ambient recording
- HPF: Off (or 80 Hz if there's rumble)
- LPF: Off
- Noise gate: Off
- Gain boost: Off
- Noise reduction: Off (or very light: 6 dB / 6 sensitivity)
6 changes: 4 additions & 2 deletions README.md
@@ -15,9 +15,9 @@ Fork of [Dimowner/AudioRecorder](https://github.com/Dimowner/AudioRecorder) with
 - **USB Audio Input** — Record from external USB microphones, audio interfaces, and other USB audio devices. Automatically detects connected devices and lets you select them as the recording source. Use a USB audio interface like the Rode AI-Micro or a TRS-to-USB-C adapter like the BOYA BY-K4 to plug in contact mics, lavaliers, or any standard 3.5mm audio source.
 - **Live Monitoring** — Listen to what's being recorded in real-time through Bluetooth headphones or the built-in speaker. Toggle on/off before or during recording.
 - **Gain Boost** — Adjustable input gain (+6 dB / +12 dB) to amplify quiet sources. Applied in real-time with clipping protection.
-- **Noise Reduction** — Optional spectral noise reduction applied on save (WAV only).
+- **Noise Reduction** — Optional spectral noise reduction applied on save (WAV only). Configurable parameters.
 - **High/Low-Pass Filters** — Configurable HPF (80/120 Hz) and LPF (9.5/15 kHz) for cleaning up recordings.
-- **Noise Gate** — Monitor-only noise gate to cut background noise during live monitoring.
+- **Noise Gate** — Noise gate to cut background noise during silence.
 - **Save Formats** — WAV (16-bit or 24-bit), MP3 (320 kbps via LAME), and FLAC (lossless). Recording is always done internally in WAV for maximum quality; conversion happens on save.
 - **Bit Depth** — Selectable 16-bit or 24-bit WAV output.
 - **Configurable Audio** — Sample rate (8–48 kHz), mono/stereo, audio input device selection.
@@ -27,6 +27,8 @@ Fork of [Dimowner/AudioRecorder](https://github.com/Dimowner/AudioRecorder) with
 - **File Management** — Rename, share, import, bookmark, trash/restore recordings. Built-in file browser.
 - **Themes** — Multiple color themes to personalize the app.
 
+See **[Audio FX & Noise Reduction](AUDIO_FX.md)** for a detailed explanation of all audio effects, how they work, and recommended setups.
+
 ## Format Changes from Upstream
 
 Removed support for **3GP** and **M4A** recording formats. These were low-quality legacy formats not suited for field recording. All recording is now done in WAV internally, with the user choosing the output/save format (WAV, MP3, FLAC).
4 changes: 4 additions & 0 deletions app/src/main/java/com/vdo/frecorder/app/RecordingService.java
@@ -453,6 +453,10 @@ private void startRecording(String path) {
         // Set noise reduction and filter preferences on WavRecorder
         if (recorder instanceof WavRecorder) {
             ((WavRecorder) recorder).setNoiseReductionEnabled(prefs.isNoiseReductionEnabled());
+            ((WavRecorder) recorder).setNoiseReductionDb(prefs.getNoiseReductionDb());
+            ((WavRecorder) recorder).setNoiseReductionSensitivity(prefs.getNoiseReductionSensitivity());
+            ((WavRecorder) recorder).setNoiseReductionFreqSmoothing(prefs.getNoiseReductionFreqSmoothing());
+            ((WavRecorder) recorder).setNoiseProfileSeconds(prefs.getNoiseProfileSeconds());
             ((WavRecorder) recorder).setHpfMode(prefs.getHpfMode());
             ((WavRecorder) recorder).setLpfMode(prefs.getLpfMode());
             ((WavRecorder) recorder).setNoiseGateEnabled(prefs.isNoiseGateEnabled());
Original file line number Diff line number Diff line change
@@ -776,9 +776,15 @@ public void importAudioFile(final Context context, final Uri uri) {
             @Override
             public void run() {
                 try {
+                    if (uri.getScheme() == null || !uri.getScheme().equals("content")) {
+                        throw new SecurityException("Only content:// URIs are supported for import");
+                    }
                     ParcelFileDescriptor parcelFileDescriptor = context.getContentResolver().openFileDescriptor(uri, "r");
                     FileDescriptor fileDescriptor = parcelFileDescriptor.getFileDescriptor();
-                    String name = extractFileName(context, uri);
+                    String name = FileUtil.sanitizeFileName(extractFileName(context, uri));
+                    if (name == null) {
+                        throw new IOException("Invalid file name extracted from URI");
+                    }
 
                     File newFile = fileRepository.provideRecordFile(name);
                     if (FileUtil.copyFile(fileDescriptor, newFile)) {
Original file line number Diff line number Diff line change
@@ -34,11 +34,14 @@
 import android.widget.Button;
 import android.widget.CompoundButton;
 import android.widget.LinearLayout;
+import android.widget.SeekBar;
 import android.widget.Spinner;
 import android.widget.Switch;
 import android.widget.TextView;
 import android.widget.Toast;
 
+import android.app.AlertDialog;
+
 import com.vdo.frecorder.ARApplication;
 import com.vdo.frecorder.AppConstants;
 import com.vdo.frecorder.ColorMap;
@@ -48,6 +51,7 @@
 import com.vdo.frecorder.app.trash.TrashActivity;
 import com.vdo.frecorder.app.widget.SettingView;
 import com.vdo.frecorder.audio.AudioDeviceManager;
+import com.vdo.frecorder.data.Prefs;
 import com.vdo.frecorder.util.AndroidUtils;
 import com.vdo.frecorder.util.FileUtil;
 import com.vdo.frecorder.util.RippleUtils;
@@ -78,6 +82,7 @@ public class SettingsActivity extends Activity implements SettingsContract.View,
     private Switch swKeepScreenOn;
     private Switch swAskToRename;
     private Switch swNoiseReduction;
+    private TextView btnNoiseReductionConfigure;
 
     private Spinner nameFormatSelector;
     private Spinner audioSourceSelector;
@@ -262,6 +267,9 @@ protected void onCreate(Bundle savedInstanceState) {
         swNoiseReduction = findViewById(R.id.swNoiseReduction);
         swNoiseReduction.setOnCheckedChangeListener((btn, isChecked) -> presenter.setNoiseReductionEnabled(isChecked));
 
+        btnNoiseReductionConfigure = findViewById(R.id.btnNoiseReductionConfigure);
+        btnNoiseReductionConfigure.setOnClickListener(v -> showNoiseReductionSettingsDialog());
+
         audioSourceSelector = findViewById(R.id.audio_source_selector);
         audioSourceSelector.setOnItemSelectedListener(new AdapterView.OnItemSelectedListener() {
             @Override
@@ -654,6 +662,96 @@ public void disableAudioSettings() {
         gainBoostSetting.setEnabled(false);
     }
 
+    private void showNoiseReductionSettingsDialog() {
+        Prefs prefs = ARApplication.getInjector().providePrefs(getApplicationContext());
+
+        float currentDb = prefs.getNoiseReductionDb();
+        float currentSensitivity = prefs.getNoiseReductionSensitivity();
+        int currentFreqSmoothing = prefs.getNoiseReductionFreqSmoothing();
+        float currentProfileSeconds = prefs.getNoiseProfileSeconds();
+
+        LinearLayout layout = new LinearLayout(this);
+        layout.setOrientation(LinearLayout.VERTICAL);
+        int pad = (int) getResources().getDimension(R.dimen.spacing_normal);
+        layout.setPadding(pad, pad, pad, 0);
+
+        // Reduction dB slider (0-24, step 0.5)
+        TextView txtDb = new TextView(this);
+        txtDb.setText(getString(R.string.noise_reduction_db, currentDb));
+        layout.addView(txtDb);
+        SeekBar seekDb = new SeekBar(this);
+        seekDb.setMax(48); // 0-24 in 0.5 steps
+        seekDb.setProgress((int) (currentDb * 2));
+        seekDb.setOnSeekBarChangeListener(new SeekBar.OnSeekBarChangeListener() {
+            @Override public void onProgressChanged(SeekBar sb, int progress, boolean fromUser) {
+                txtDb.setText(getString(R.string.noise_reduction_db, progress / 2.0f));
+            }
+            @Override public void onStartTrackingTouch(SeekBar sb) {}
+            @Override public void onStopTrackingTouch(SeekBar sb) {}
+        });
+        layout.addView(seekDb);
+
+        // Sensitivity slider (0-24, step 0.5)
+        TextView txtSensitivity = new TextView(this);
+        txtSensitivity.setText(getString(R.string.noise_reduction_sensitivity, currentSensitivity));
+        layout.addView(txtSensitivity);
+        SeekBar seekSensitivity = new SeekBar(this);
+        seekSensitivity.setMax(48); // 0-24 in 0.5 steps
+        seekSensitivity.setProgress((int) (currentSensitivity * 2));
+        seekSensitivity.setOnSeekBarChangeListener(new SeekBar.OnSeekBarChangeListener() {
+            @Override public void onProgressChanged(SeekBar sb, int progress, boolean fromUser) {
+                txtSensitivity.setText(getString(R.string.noise_reduction_sensitivity, progress / 2.0f));
+            }
+            @Override public void onStartTrackingTouch(SeekBar sb) {}
+            @Override public void onStopTrackingTouch(SeekBar sb) {}
+        });
+        layout.addView(seekSensitivity);
+
+        // Frequency smoothing slider (0-6, step 1)
+        TextView txtFreqSmoothing = new TextView(this);
+        txtFreqSmoothing.setText(getString(R.string.noise_reduction_freq_smoothing, currentFreqSmoothing));
+        layout.addView(txtFreqSmoothing);
+        SeekBar seekFreqSmoothing = new SeekBar(this);
+        seekFreqSmoothing.setMax(6);
+        seekFreqSmoothing.setProgress(currentFreqSmoothing);
+        seekFreqSmoothing.setOnSeekBarChangeListener(new SeekBar.OnSeekBarChangeListener() {
+            @Override public void onProgressChanged(SeekBar sb, int progress, boolean fromUser) {
+                txtFreqSmoothing.setText(getString(R.string.noise_reduction_freq_smoothing, progress));
+            }
+            @Override public void onStartTrackingTouch(SeekBar sb) {}
+            @Override public void onStopTrackingTouch(SeekBar sb) {}
+        });
+        layout.addView(seekFreqSmoothing);
+
+        // Noise profile seconds slider (0.5-5.0, step 0.5)
+        TextView txtProfile = new TextView(this);
+        txtProfile.setText(getString(R.string.noise_profile_seconds, currentProfileSeconds));
+        layout.addView(txtProfile);
+        SeekBar seekProfile = new SeekBar(this);
+        seekProfile.setMax(9); // 0.5-5.0 in 0.5 steps → 0..9 maps to 0.5..5.0
+        seekProfile.setProgress((int) ((currentProfileSeconds - 0.5f) * 2));
+        seekProfile.setOnSeekBarChangeListener(new SeekBar.OnSeekBarChangeListener() {
+            @Override public void onProgressChanged(SeekBar sb, int progress, boolean fromUser) {
+                txtProfile.setText(getString(R.string.noise_profile_seconds, (progress + 1) / 2.0f));
+            }
+            @Override public void onStartTrackingTouch(SeekBar sb) {}
+            @Override public void onStopTrackingTouch(SeekBar sb) {}
+        });
+        layout.addView(seekProfile);
+
+        new AlertDialog.Builder(this)
+                .setTitle(R.string.noise_reduction_settings)
+                .setView(layout)
+                .setPositiveButton(R.string.btn_save, (dialog, which) -> {
+                    prefs.setNoiseReductionDb(seekDb.getProgress() / 2.0f);
+                    prefs.setNoiseReductionSensitivity(seekSensitivity.getProgress() / 2.0f);
+                    prefs.setNoiseReductionFreqSmoothing(seekFreqSmoothing.getProgress());
+                    prefs.setNoiseProfileSeconds((seekProfile.getProgress() + 1) / 2.0f);
+                })
+                .setNegativeButton(R.string.btn_cancel, null)
+                .show();
+    }
+
     @Override
     public void showProgress() {
         // TODO: showProgress